07-05-2011 09:25 AM
I run EV 9.0.1 and Exchange 2007. I have over 2000 mailboxes and set up EV to archive all items older than 30 days, nightly, from midnight to 8 am. I run the Items Archived per Hour report every morning to see how many archives were processed the previous night. The total archives processed number ranges anywhere from 800 to 1200 and the amount of data archived is usually 15 to 30 GB. This has been consistent for over a year, possibly longer.
About a week ago the total number of items archived between midnight and 8 am dropped into the teens and recently the single digits. To my knowledge nothing has changed with EV, no patches or upgrades. All policies and settings are still the same.
I checked my mailbox and I now have email older than 30 days that is not archived. Other users' mailbox sizes have also grown. We usually have 1 or 2 mailboxes that are over 2 GB in size and now I have close to 40, and growing.
There are no errors in the apps or EV logs. I thought maybe it was a problem with the MSMQ or the archiving task, so I recreated the task, which created new queues, and I deleted the old ones. I really won't know if this fixes the problem until tomorrow morning when I run the Items Archived per Hour report, but I'm not confident it will. There are over 2000 items in the A7 queue; synchronization runs at 11 am daily. It's moving very slowly. It has not even gone down by 100 after 1 hour.
I ran a manual archive of my mailbox an hour ago, but nothing has archived yet and the A3 queue still shows a 1 in it, me.
The vault is on a SAN and access to it is fine. Everyone can open their archived email without a problem. The EV system account is not locked and everything seems ok with it. No permissions have been changed to my knowledge.
What can be wrong? I'm pulling my hair out with this one. :)
Thank you
07-05-2011 07:40 PM
Sounds like you're in read only or report mode
07-06-2011 07:16 AM
There are no reports being generated, so maybe something is putting the vault into read-only mode. We only run backups (using NetBackup) on Saturdays, so I'm not sure what else it could be. I'm upgrading tonight from 9.0.1 to 9.0.2, so I'll be able to check into that.
thanks
07-06-2011 10:10 AM
Since your performance has been decreasing and has not just stopped, that would rule out "read-only" or the archive task in report mode.
Have you checked to see if you have space available for the Vault Store partitions and Index locations?
In your hourly archive report, do you see only a few messages each hour during the archiving run? This would indicate some type of throttling or a bottleneck in either Exchange or the Storage / Index processes.
When you access a mailbox, especially a large one, (not in OL cached mode) does it take longer to fully populate OL than normal?
Is your EV server physical or virtual, and what are the specs? A virtual server with its resources reduced could account for the diminished performance.
07-07-2011 03:37 AM
Have you done a Run Now in Report mode, against your mailbox to just verify that EV thinks it is going to archive stuff?
The A7 queue is for syncs, but it shouldn't be backed up like that - I wonder if you have a network issue of some kind? I would have thought that you would see errors like 3310's in the EV event log if it was experiencing comm issues though.....odd.
You can purge the A7 queue (and then the associated Admin queue) without any worries, but it would normally be 0. Also, odd that the A3 job didn't move off, but bear in mind that you can sometimes get "ghost" entries shown in the MSMQ page. An acid test would be to do a run now (either real or report) and get a DTRACE running.....
07-07-2011 06:43 AM
I upgraded last night from 9.0.1 to 9.0.2. The upgrade went well. I unscheduled the nightly archive run and kicked off a Run Now of all mailboxes, archiving up to 5000 items per mailbox. I have 2300+ mailboxes. This was at about 1 am. It's 9:30 am now and the A3 queue still shows 2300+ mailboxes. I think only a handful were processed. The vault and the index are not in read-only mode; like you said, if they were, nothing would get processed.
There are no errors in the EV log or the system's apps log for that matter.
I have plenty of space left on my partitions drive, 870 GB.
I can successfully ping and map drives from my EV server to my Exchange server.
I can't find any settings that have changed.
I'm guessing the Run Now is still processing, so if I start a dtrace now, what should I look for?
07-07-2011 06:52 AM
It's prob an idea to look at the Performance Guide and get some counters into PerfMon - then just leave it running. We don't normally recommend doing a Run Now against all mailboxes.
07-07-2011 07:02 AM
Also, as an FYI, I think I am right in saying that a "Run Now" just does one pass of the mailbox, whereas Scheduled will go through the mailbox more than once if it has time in the schedule.
You have a large number of items per pass - this may show up as some mailboxes not getting archived. I would advise that you drop it to the default, at least that way each mailbox will get touched.
07-07-2011 07:35 AM
I'm setting up a perf mon tonight to run between midnight and 8 am, my archive schedule, on both my exchange server and ev server to find out if there are any network problems. As for the run now, yes, I know it only does one pass, but it was more of a test to see if there were any ev problems after the upgrade last night. I've had ev set up to process up to 5000 items per mailbox and 5 mailboxes at once for a long time and it has always processed around 1000 mailboxes nightly. Hopefully the perf mon will tell me what's going on.
thanks to all for the help, thus far.
07-07-2011 07:50 AM
I just completed a DTrace. What should I look for in the log? It's 55 MB in size.
07-07-2011 08:06 AM
Things to look for in a DTrace are:
EV~E = Error that would be thrown to the event log
EV~W = Warning that would be thrown to the event log
EV~I = Information that would be thrown to the event log
0x8004 = Typical beginning of a MAPI error (for instance 0x8004011B = Corrupt Data)
You can find a list of MAPI Errors and their meanings here: http://support.microsoft.com/kb/238119
0x8007 = Typical beginning of a Windows error (for instance 0x80070005 = Access Denied)
0xC00 = Typical internal Enterprise Vault error code (for instance 0xC0040BFB)
"HRXEX fn in trace" - would indicate an error has been thrown
" .cpp" - usually the only time a CPP source file shows up in the trace is when an error has been thrown; this is used for support and dev purposes
Also look for "Eligible" as you will see phrases such as
"Message: [title] is eligible for archiving" or "Message: [title] is not eligible for archiving"
and then also search for the obvious "fatal" "error" and "Exception"
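With a 55 MB trace, searching for each of these markers by hand gets tedious. The following is a minimal Python sketch of how the list above could be turned into a single pass over the log. It is not an Enterprise Vault tool: the function names, the labels, the file name in the usage comment, and the text encoding are all assumptions, and the hex patterns simply assume eight-digit codes like the examples given.

```python
import re
from collections import Counter

# Markers taken from the list above. The labels are our own shorthand,
# and the hex patterns assume 8-digit codes such as 0x8004011B.
MARKERS = [
    ("event-log error", re.compile(r"EV~E")),
    ("event-log warning", re.compile(r"EV~W")),
    ("event-log info", re.compile(r"EV~I")),
    ("MAPI error", re.compile(r"0x8004[0-9A-Fa-f]{4}")),
    ("Windows error", re.compile(r"0x8007[0-9A-Fa-f]{4}")),
    ("EV internal error", re.compile(r"0xC00[0-9A-Fa-f]{5}")),
    ("eligibility", re.compile(r"eligible", re.IGNORECASE)),
    ("keyword", re.compile(r"fatal|error|exception", re.IGNORECASE)),
]


def scan_lines(lines):
    """Return (counts per label, list of (line number, label, text))."""
    counts = Counter()
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in MARKERS:
            if pattern.search(line):
                counts[label] += 1  # counted once per line, per label
                hits.append((number, label, line.rstrip()))
    return counts, hits


def scan_file(path, encoding="utf-8"):
    """Scan a trace file; the encoding is a guess, adjust it if needed."""
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        return scan_lines(handle)


# Hypothetical usage (the file name is an assumption):
#   counts, hits = scan_file("dtrace.log")
#   for label, total in counts.most_common():
#       print(label, total)
```

The "keyword" pattern is deliberately broad, so expect it to flag many lines; the hex-code and EV~E/EV~W labels are usually the better place to start reading.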
07-07-2011 09:03 AM
Could something like this be a problem? Although I confirmed that the path does exist.
139784 10:15:54.688 [3132] (StorageArchive) <436> EV:M CSavesetPersist::Save File: Y:\Enterprise Vault\Vault Stores\ATL01MailVaultStore01\2011 Partition 05\2011\07-07\8\065\806524459608682C3AD6F7DF7CC964A1.DVS Remember: True
139785 10:15:54.688 [3132] (StorageArchive) <436> EV:L CSavesetPersist::AllocateFile File: Y:\Enterprise Vault\Vault Stores\ATL01MailVaultStore01\2011 Partition 05\2011\07-07\8\065\806524459608682C3AD6F7DF7CC964A1.DVS
139786 10:15:54.688 [3132] (StorageArchive) <436> EV:H CSaveset2::Save _com_error exception. hr=The system cannot find the path specified. (0x80070003)
139787 10:15:54.688 [3132] (StorageArchive) <436> EV:H {CSaveset2::Save} (Exit) Status: [The system cannot find the path specified. (0x80070003)]
139788 10:15:54.688 [3132] (StorageArchive) <436> EV:H CSaveset2::WriteFile _com_error exception. hr=The system cannot find the path specified. (0x80070003)
139789 10:15:54.688 [3132] (StorageArchive) <436> EV:H {CSaveset2::WriteFile} (Exit) Status: [The system cannot find the path specified. (0x80070003)]
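For reference, the hr value in these trace lines (0x80070003) can be split into the standard HRESULT fields: a severity bit, a facility, and a 16-bit code. The Python sketch below is only an illustration of that bit layout; the function names and the two small lookup tables are hypothetical and list just the codes mentioned in this thread.

```python
# Small lookup tables limited to values that appear in this thread.
FACILITY_NAMES = {4: "FACILITY_ITF", 7: "FACILITY_WIN32"}
WIN32_CODES = {3: "ERROR_PATH_NOT_FOUND", 5: "ERROR_ACCESS_DENIED"}


def decode_hresult(value):
    """Split a 32-bit HRESULT into (severity, facility, code)."""
    severity = (value >> 31) & 0x1     # 1 = failure, 0 = success
    facility = (value >> 16) & 0x7FF   # 11-bit facility field
    code = value & 0xFFFF              # low 16 bits
    return severity, facility, code


def describe(value):
    """Build a short human-readable summary of an HRESULT."""
    severity, facility, code = decode_hresult(value)
    facility_name = FACILITY_NAMES.get(facility, f"facility {facility}")
    if facility == 7:
        # For FACILITY_WIN32 the low word is an ordinary Win32 error code.
        code_name = WIN32_CODES.get(code, f"Win32 code {code}")
    else:
        code_name = f"code 0x{code:04X}"
    status = "failure" if severity else "success"
    return f"0x{value:08X}: {status}, {facility_name}, {code_name}"


print(describe(0x80070003))
# 0x80070003: failure, FACILITY_WIN32, ERROR_PATH_NOT_FOUND
```

So the value is a Win32 "path not found" error wrapped in an HRESULT, which matches the message text StorageArchive logs next to it for the .DVS path.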