
EVSVR - Replay Storage Queue

Level 4

Hi, I am hoping someone can help with this. First, some background on our environment. We have a 2-node Enterprise Vault 11.01 cluster using SAN storage. We keep our EV partitions quite small to make backups manageable, and we use volume mount points for the vault store partitions and the indexes. We can present more storage and mount it to the folder containing the mount points, which works well.

Now to our problem.

We are in the process of updating our antivirus software to a later version from the same provider, in this case Trend OfficeScan. The exclusions in place on the old server were replicated on the new server. We added exclusions for the folder containing the mount points for the vault store partitions and indexes, as well as the usual EV exclusions (Storage Queue etc.). Only the partitions and indexes use volume mount points; the Storage Queue and clustered MSMQ use regular drive letters. The exclusions worked without issue on the old version of the AV software, and they should exclude all files and folders below the path specified. We updated the passive node, then failed over and updated the previously active node. We then observed the AV real-time monitor scanning the vault store partition and index locations, but it was not referencing the mount point path, i.e. E:\VaultStores\VStore1PT16\Data, but the physical disk path with a long GUID, something along the lines of \\devideharddisk1\longguid\data\....

We resolved this by adding wildcard exclusion paths for the vault store partitions, e.g. *\VStore1*\Data

After a couple of days we noticed items building up in the Storage Queue. At first we thought this was because items had been queued on one node and the active node was then moved, so backups were taken and then the cluster was moved. I then remembered that this issue was resolved in hotfix 5.

I then realised that the AV must have caused some kind of corruption to the vault store partition between the 12th and the 15th of May. Checking the event log, I noticed event ID 29054:

"Watch file scan has found more than 100 invalid savesets within 120 seconds for the vault Store. The scan for this vault store will be stopped."

This is why the Storage Queue is not clearing.

There was nothing in the AV logs to say that it deleted, removed or quarantined any files.

With the Storage Queue, items are added to the .EVSQ file in the Storage Queue location. Storage then takes each item from the EVSQ file and adds it to the archive. Items are not removed from the Storage Queue until all archived items are secure.

Logically, if the AV has caused some corruption in the vault store partition, the items should still be in the Storage Queue. Isn't the idea that items stay in the Storage Queue in case the disk containing the partition fails?

I ran a verify on the vault store partition for the period in question using EVSVR, and the log file had lots of entries saying SavesetID ...... file not found, and also fingerprint validation failed.

I then ran a report on the Storage Queue. Once it completed, I cross-referenced the first 20 or so saveset IDs in the first log file against the Storage Queue report, and all the savesets that could not be found in the partition are present in the Storage Queue report.
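For anyone wanting to do the same comparison across the whole log rather than the first 20 entries, here is a rough Python sketch. It is not an EVSVR feature and the log layouts below are invented for illustration: it assumes saveset IDs appear in both files as unbroken tokens of letters, digits and "~" at least 20 characters long, and that missing items are flagged with the words "not found". Adjust the pattern and the sample lines to whatever your real logs contain.

```python
import re

# ASSUMPTION: a saveset ID is an unbroken run of letters, digits and '~'
# that is at least 20 characters long. Change this to match the real logs.
ID_PATTERN = re.compile(r"[A-Za-z0-9~]{20,}")


def extract_ids(lines, pattern=ID_PATTERN):
    """Collect every token that looks like a saveset ID from the given lines."""
    ids = set()
    for line in lines:
        ids.update(pattern.findall(line))
    return ids


def cross_reference(verify_lines, queue_lines):
    """Split the 'not found' savesets into those present in the queue report
    and those that are not."""
    missing = extract_ids(
        line for line in verify_lines if "not found" in line.lower()
    )
    queued = extract_ids(queue_lines)
    return sorted(missing & queued), sorted(missing - queued)


# Invented sample data; real EVSVR output will look different.
verify_log = [
    "SavesetID 20150512000000000001~A file not found",
    "SavesetID 20150513000000000002~B file not found",
    "SavesetID 20150514000000000003~C fingerprint validation failed",
]
queue_report = [
    "20150512000000000001~A,queued",
    "20150514000000000003~C,queued",
]

in_queue, not_in_queue = cross_reference(verify_log, queue_report)
print("Missing from partition but in Storage Queue report:", in_queue)
print("Missing from partition and not in the report:", not_in_queue)
```

With real files, the two lists would be replaced by the lines read from the verify log and the Storage Queue report. Anything that lands in the second list is not covered by the queue and would have to come from backups.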

So my question: how do I use EVSVR to replay the contents of the Storage Queue? I already tried a repair on the Storage Queue and that did not accomplish anything.

I logged a call with Veritas but they could not help as EV 11.01 is end of life. We have active maintenance but not extended support, and we cannot go to EV 12 yet as we are still running Exchange 2007, which we are just about to upgrade/migrate from.

Can anybody help?










Partner    VIP    Accredited

Based on my understanding of the issue, I don't think the storage queue will help because you're referring to items that have already made it all the way to your vault store partition. If EVSVR is reporting on missing or corrupt items and you suspect that AV was the cause, then I think the first place to look would be your backups. However, the window of risk exists between the time that the item completed archiving and the time that your backups run.