Strange errors with EV 8 - assistance appreciated

biosphere
Level 4
Partner

Hi all,

I've inherited a client with an EV 8 install that is displaying some strange behavior.

Environment:

EV Server (Win 2003 Ent SP2, EV 8.0 SP4, does Exchange Mailbox archiving and PST migration)
SQL Server (Win 2003 Ent X64, SQL 2005)
Several Exchange 2007 Servers

In short, the "strange behavior" manifests as everything looking dandy, but nothing actually happening.

So, all services are running and functional, all mailbox tasks are running and functional, and there are no error messages in Event Viewer. However, the server just doesn't seem to be processing anything.

From what I can see, what happens is this:

* MSMQ fills up with roughly as many messages as there are targeted mailboxes, and the count stays there. It's been a while since I worked with EV, but I seem to recall that this number normally changes almost immediately as processing/archiving occurs.

* The EV cache folder also fills up with data. I've had to increase the cache size a few times due to event messages stating that the cache is full.

* Both the PSTHolding and PSTTemp folders are stuffed with a large number of files, some extremely large.

* Several instances of ArchiveTask.exe and MigratorServer.exe pull pretty much all available CPU constantly (rarely below 95% util)

Any pointers/tips on how to start troubleshooting, where to look etc would be hugely appreciated.

Accepted Solution

biosphere
Level 4
Partner
Well, at least I've gotten the server to process something. Made a series of changes, but slightly unsure of what actually made it start to tick:

* Set HKLM\Software\KVS\Enterprise Vault\Storage\ThrottleLowerThreshold = 5000 and HKLM\Software\KVS\Enterprise Vault\Storage\ThrottleUpperThreshold = 10000 for MSMQ

* Went through the best practice reg keys for memory management and so forth once more

* Increased the number of archiving processes for the storage service to 10.

* Reinstalled MSMQ (again), and worked through all permissions on the MSMQ volume

* Increased MSMQ message and journal storage

* Closed old partition and enabled new one, closed off existing index location and added new ones

The main change is that now messages actually make it to the storage archive queue.
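
For anyone wanting to reproduce the two throttle values from the first bullet, here is a minimal Python sketch that only prints the text of a .reg file; it does not touch the registry. The key path and value names are the ones quoted above. The DWORD type and the check that the lower threshold is below the upper one are my assumptions, and `build_reg_file` is just an illustrative helper name, so please verify against the EV documentation before importing anything.

```python
# Sketch: build .reg file text for the two MSMQ throttle values named above.
# Assumption: both values are REG_DWORD and lower < upper. Not confirmed by the thread.

KEY = r"HKEY_LOCAL_MACHINE\Software\KVS\Enterprise Vault\Storage"


def build_reg_file(lower: int, upper: int) -> str:
    """Return .reg text that sets ThrottleLowerThreshold and ThrottleUpperThreshold."""
    if not 0 < lower < upper <= 0xFFFFFFFF:
        raise ValueError("expected 0 < lower < upper, both fitting in a DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        # .reg files store DWORDs as 8 hex digits
        f'"ThrottleLowerThreshold"=dword:{lower:08x}',
        f'"ThrottleUpperThreshold"=dword:{upper:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Values taken from the post: 5000 and 10000
    print(build_reg_file(5000, 10000))
```

Review the printed text and take a registry backup first; a restart of the EV services is probably needed for the values to be picked up, but I have not confirmed that.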

However, seeing as the server has probably been sitting in the previous state for quite some time (again, I inherited this client/install, and very little is documented), the backlog is HUGE, to say the least. I've been trying to squeeze in as much archiving time as possible over weekends etc., but the server still seems to be lagging behind quite miserably, and does not seem to be able to reach a "normal" balance any time soon.

I have some hardware I can use to set up additional EV servers. What is the best practice for adding additional EV servers for offloading purposes?

Cheers

Mohawk_Marvin
Level 6
Partner

Is the server in backup mode?

biosphere
Level 4
Partner

Had to leave the office, but I'll check tomorrow morning. Thanks for the input. :)

Patrick_Kenevan
Level 4
If the server is in backup mode in EV 8, MSMQ queues are not populated unless you are using the legacy backup method (event ID 7077). Can you create two Dtraces, one for PSTMigratorTask and MigratorServer and one for ArchiveTask and StorageArchive, when the archiving is scheduled to run? Have you tried a manual "Run Now" against any mailbox?

biosphere
Level 4
Partner
Neither stores nor indexes were in backup mode. However, the pre- and post-backup scripts (using the registry keys) that the TSM backup scripts trigger were indeed generating event ID 7077. I replaced these with properly configured PowerShell scripts, which worked fine. The backup is still running from last night though (probably quite a bit of backlog to take care of :) ), so I'll have to wait and see how things look when it's finished.

Thanks for your input so far, will report back later. :)

Mohawk_Marvin
Level 6
Partner

Make sure the legacy reg keys are deleted as well. I think the event (ID 7077) lists the keys to remove.

biosphere
Level 4
Partner
However, TSM seems to be struggling with backing up EV. The backup job started at 6 pm last night and had only gotten as far as 1.3 GB by the time I cancelled it just now, which obviously is quite ridiculous, performance-wise. A quick googling points to some others having issues with the TSM / EV combo, but no solutions posted as far as I can tell. Do any pointers / best practices exist for this sort of setup?

Cheers

biosphere
Level 4
Partner
So, essentially getting nowhere with this...

Have confirmed that:

* Legacy reg keys have been removed
* New PS pre/post backup scripts function correctly

We are having issues with the backup software (TSM) against the EV server, backup guys are researching this now.

However, the initial issues persist:

* The a5 and a7 queues for each Mailbox archiving task are "stuck" at the number of enabled mailboxes, and refuse to budge. I have attempted to purge these (along with the relevant admin queues), but as soon as I start the services again and the tasks fire up, the queues fill up again and stay there.

* ArchiveTask.exe, StorageOnlineOpns.exe and MigratorServer.exe collectively eat up all available CPU - the server is running steadily at 99-100% CPU util

* No significant errors are logged in eventvwr

(Concerning the last point: is it just me, or is it quite hard to figure out what EV8 is actually _doing_? Yes, there are loads of events logged in event viewer, but apart from that very little information is provided in general, IMHO. For example, trying to figure out how many items are being processed/archived etc is nigh on impossible, or some sort of best effort exercise based on collectively ogling MSMQ queues, eventvwr items and feedback from users on icon statuses.)
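
To make the "ogling MSMQ queues" part a bit less of a guessing game, here is a small Python sketch that turns a handful of queue-depth readings into a rough verdict. It assumes you note the depths down yourself (from the MSMQ console or however you normally read them); `Sample`, `drain_rate`, `classify` and the tolerance of 1 message per minute are all made up for illustration and are not anything EV provides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    seconds: float  # time of the reading, in seconds since monitoring started
    depth: int      # number of messages in the queue at that time


def drain_rate(samples: List[Sample]) -> float:
    """Net messages removed per minute between first and last sample.

    A negative result means the queue grew over the interval.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    first, last = samples[0], samples[-1]
    elapsed_min = (last.seconds - first.seconds) / 60
    if elapsed_min <= 0:
        raise ValueError("samples must be in increasing time order")
    return (first.depth - last.depth) / elapsed_min


def classify(samples: List[Sample], tolerance: float = 1.0) -> str:
    """Label a queue as 'draining', 'growing' or 'stuck' (tolerance in msgs/min)."""
    rate = drain_rate(samples)
    if rate > tolerance:
        return "draining"
    if rate < -tolerance:
        return "growing"
    return "stuck"


if __name__ == "__main__":
    # Hypothetical readings: depth unchanged after ten minutes
    readings = [Sample(0, 1200), Sample(600, 1200)]
    print(classify(readings), drain_rate(readings))
```

Keep in mind this only sees the net change: a queue that is refilled as fast as it is emptied will also look "stuck", so it is a hint rather than a measurement of throughput.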

I found the following post which quite resembles my situation: https://www-secure.symantec.com/connect/forums/msmq-not-clearing

However, removing the "delete orphaned shortcuts" tick didn't change anything in my case.

Patrick: "Run Now" against a mailbox does not work, from what I can tell. I can see the message sitting in the relevant a3 queue. Also, I have run the Dtraces you noted, but can't see anything obviously wrong.

Cheers

Also, the EVCache folder is filling up like crazy. I'm consistently getting event 41099 ("the specified cache location has reached the specified maximum"), no matter how high I set the max value (it's currently at 200 GB).

GertjanA
Moderator
Partner    VIP    Accredited Certified
Hi biosphere:

Disable Vault Cache and Virtual Vault for the clients (if enabled).
Also disable Moved Items (not only the orphaned shortcuts, but also the moved items).

Run provisioning, sync mailboxes, restart services.

Check MSMQ.

When OK, enable moved items for a small group of people via provisioning (new policy, targeting a new provisioning group).

Verify that Vault Cache is correctly configured (do all users need it?).
Regards. Gertjan

biosphere
Level 4
Partner
Hi GertJan :)

Sure, but what would this accomplish (apart from disabling quite a lot of functionality ;)?

Cheers

GertjanA
Moderator
Partner    VIP    Accredited Certified
Hello biosphere.

You write in one of your postings here that the a7 queues for each Mailbox archiving task are "stuck" at the number of enabled mailboxes.

a7 is (if I recall correctly) the queue that processes the moved shortcuts.
What I would like to achieve with my advice to disable these features is to rule out issues with Virtual Vault/Vault Cache/moved items. If this is a recently upgraded environment, the moved shortcuts and Virtual Vault may choke the system. Hence, disabling the new features, checking whether EV then works as expected, and then turning the features back on one by one helps in determining the root cause of the issue you are seeing.

Regards. Gertjan

biosphere
Level 4
Partner
Well, from the admin guide it seems that the A7 queue is responsible for synchronization requests, and A6 for moved items updates. A5, obviously, is the main queue for scheduled mailbox processing. So I believe disabling moved items etc. may not have an impact on the issues I'm experiencing.

One thing I noticed from the admin guide:

"The A7 queue is processed at all times but is always the lowest priority task. This means that scheduled background archives always take precedence over a synchronize."

So essentially, seeing as my A5 queue fills up and rarely goes down, does this mean that synchronization never takes place?

Michael_Bilsbor
Level 6
Accredited
Hi,

The A5 queue is only processed during the archiving schedule, so outside of the archiving schedule the A6 and A7 requests should be getting processed.
If items remain on the A5 queue, it sounds like an issue with the storage archive process, or perhaps Exchange connectivity or performance issues.
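
As a toy illustration of the point above (this is not how EV is actually implemented), the following Python sketch picks which queue would be serviced next. It encodes only what has been said in this thread: A5 is eligible only inside the archiving schedule, and A7 is always the lowest priority. Placing A6 between A5 and A7 is an assumption, and `next_queue` is a made-up name.

```python
from typing import Dict, Optional

# Assumed priority order, highest first. Only "A7 is lowest" and
# "A5 beats A7 during the schedule" come from the thread.
PRIORITY = ("A5", "A6", "A7")


def next_queue(depths: Dict[str, int], in_schedule: bool) -> Optional[str]:
    """Return the name of the queue that would be serviced next, or None if idle."""
    for name in PRIORITY:
        if name == "A5" and not in_schedule:
            continue  # A5 is only processed during the archiving schedule
        if depths.get(name, 0) > 0:
            return name
    return None


if __name__ == "__main__":
    backlog = {"A5": 500, "A6": 0, "A7": 500}
    print(next_queue(backlog, in_schedule=True))   # A5 wins while the schedule runs
    print(next_queue(backlog, in_schedule=False))  # A7 gets its turn outside the schedule
```

Under this model, a permanently full A5 queue would starve A7 only while the schedule is active; outside the schedule the synchronization requests should still move.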

Mike

MichelZ
Level 6
Partner Accredited Certified
You could run the task in reporting mode and check how many items still need archiving, or just let it process the backlog. If you want to add servers, then you need to have multiple Exchange servers to effectively load balance it, or you would need to add servers just for indexing/storage processing, but both scenarios require moving archives using MoveArchive. Cheers
