Most likely the issue revolved around a user who was no longer on that Exchange server attempting to restore items. That blocks the R1 queue, causes the Archiving Task to throw 3310 errors, and then any subsequent restores also fail.
For example:
- User A is on Exchange Server A and is enabled
- They restore an item and it goes to the R1 queue for Exchange Server A
- User A then gets moved to Exchange Server B
- Provisioning and synchronization don't run, or don't pick up the change
- User A then tries to restore an item
- The item goes to the R1 queue for Exchange Server A (NOT Exchange Server B)
- Enterprise Vault then tries to connect to Exchange Server A to find User A
- An error gets thrown in the background saying User A no longer exists on Exchange Server A
- The Archiving Task then goes to sleep for 10 minutes, throwing a 3310 error
- The Archiving Task then wakes up and tries the same process, the user still can't be found, and it sleeps for another 10 minutes
This continues until you manually purge the queue and make sure the user is provisioned and synced correctly on Exchange Server B. Then the problem goes away.
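The sequence above can be sketched as a small model. This is only an illustration of the behaviour described, not Enterprise Vault code: the function name, the directory dictionary and the `max_wakeups` cap (standing in for the repeated 10 minute sleeps) are all made up for the example.

```python
from collections import deque

# Hypothetical model of the scenario, not Enterprise Vault code.
def process_r1_queue(queue, mailboxes_on_server, max_wakeups=3):
    """Process restore requests in order; stall on the first unknown user.

    Returns (restored, wakeups_used). An item whose user is not on this
    server stays at the head of the queue and blocks everything behind it.
    """
    restored = []
    wakeups = 0
    while queue and wakeups < max_wakeups:
        user, item = queue[0]
        if user in mailboxes_on_server:
            queue.popleft()
            restored.append((user, item))
        else:
            # The real task would log a 3310 error and sleep for 10 minutes.
            wakeups += 1
    return restored, wakeups

# The vault directory is stale: it still routes User A to Server A.
stale_directory = {"User A": "Server A", "User C": "Server A"}
server_a_mailboxes = {"User C"}  # User A has already moved to Server B

r1_server_a = deque()
for user, item in [("User A", "msg-1"), ("User C", "msg-2")]:
    if stale_directory[user] == "Server A":
        r1_server_a.append((user, item))

# User A's request blocks the queue, so User C's restore never runs.
blocked_result = process_r1_queue(r1_server_a, server_a_mailboxes)

# Manual fix: purge User A's request, then the queue drains normally.
r1_server_a = deque(m for m in r1_server_a if m[0] != "User A")
fixed_result = process_r1_queue(r1_server_a, server_a_mailboxes)
```

In this model `blocked_result` shows nothing restored and every wake-up used, while `fixed_result` shows User C's item restored with no wake-ups, which mirrors why purging the queue and re-syncing the user clears the problem.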
This has been resolved in EV9 SP1.
On top of that, if you have items stuck in your Storage Spool, Storage Archive or Storage Restore queues, remember that these are the actual messages, not just small 1 KB or 2 KB pointers to where the message is. So if someone restores a 10 MB item, you will have about 5 messages in the Storage queues, because the 10 MB is split into chunks.
So it is possible that, with 300,000 messages in a storage queue, the 1 GB quota limit on MSMQ has been breached. In Windows 2000 and early versions of 2003 the quota was 8 GB; by SP2 at the latest it had been brought down to 1 GB.
Between large items in the Storage queues, anything going to the dead letter queues, large backlogs in the Outgoing queues, and journaling on MSMQ if someone has turned it on, that quota can be breached quite easily. It works just like an Exchange quota: once it has been breached, you have to clear items out before it becomes usable again.
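To put rough numbers on the quota point, here is a back-of-the-envelope calculation. The queue names and the 4 KB and 2 MB average sizes are assumed purely for illustration; only the 1 GB quota and the 300,000 message count come from the discussion above.

```python
# Illustrative arithmetic only; check the real quota and queue counts
# on your own server.
QUOTA_BYTES = 1 * 1024**3  # the 1 GB MSMQ quota mentioned above

def average_size_to_breach(message_count, quota_bytes=QUOTA_BYTES):
    """Average message size (bytes) at which the total reaches the quota."""
    return quota_bytes / message_count

def total_queue_bytes(queues):
    """Sum usage across queues: dict of name -> (message_count, avg_bytes)."""
    return sum(count * avg for count, avg in queues.values())

threshold = average_size_to_breach(300_000)
print(f"{threshold:.0f} bytes")  # prints "3579 bytes"

# Assumed backlog: 300,000 messages averaging 4 KB, plus 100 dead letters
# averaging 2 MB.
backlog = {
    "storage restore": (300_000, 4 * 1024),
    "dead letter": (100, 2 * 1024**2),
}
over_quota = total_queue_bytes(backlog) > QUOTA_BYTES
```

So with 300,000 messages queued, an average of only about 3.5 KB per message is enough to fill 1 GB, and since these are real message chunks rather than pointers, that average is easy to exceed.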
In future cases, what I would suggest is this: if you have a large backlog in your R2s/R1s and storage queues, first make sure you haven't breached your quota on MSMQ.
Second, actually look at the messages that are "stuck". Each one gives the DN of the user it is trying to store or restore to, so make sure that user belongs on that queue (that is, the user is still on that queue's Exchange server).
Also, if it's the same user trying to store or restore a huge number of items, it may be best to contact them, find out why they are trying to restore that much email, and see if there is an alternative method you can use.
For instance, if they're tired of the shortcuts and of using Search or Archive Explorer, maybe offer them Virtual Vault and Vault Cache. Then they have their email local on their machine, it looks like a PST file, and they don't end up restoring email every time it gets archived.
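If you note down the DNs from the stuck messages, a short script can do the two checks suggested above: which users account for most of the backlog, and which users don't belong on that queue's server. This is a sketch under assumptions: the DNs, server names and the `directory` mapping are invented sample data, and getting the real values out of MSMQ and Enterprise Vault is left to whatever tooling you already use.

```python
from collections import Counter

def summarize_stuck_messages(user_dns, directory, queue_server):
    """Summarize the user DNs seen on stuck messages in one queue.

    user_dns: one DN per stuck message (hypothetical sample data here).
    directory: DN -> Exchange server the user is currently on.
    Returns (counts per DN, most frequent first; DNs not on queue_server).
    """
    counts = Counter(user_dns)
    misplaced = sorted(
        dn for dn in counts if directory.get(dn) != queue_server
    )
    return counts.most_common(), misplaced

# Hypothetical sample: User A was moved to EXCH-B but is still queued on EXCH-A.
stuck = (
    ["/o=Org/cn=Recipients/cn=UserA"] * 3
    + ["/o=Org/cn=Recipients/cn=UserC"]
)
directory = {
    "/o=Org/cn=Recipients/cn=UserA": "EXCH-B",
    "/o=Org/cn=Recipients/cn=UserC": "EXCH-A",
}

top_users, misplaced = summarize_stuck_messages(stuck, directory, "EXCH-A")
```

Here `top_users` puts User A first with three stuck messages, and `misplaced` lists User A as the one DN that no longer matches the queue's server, which is the user to re-provision, sync, and possibly contact.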