When people implement Enterprise Vault features like collections, and implement best practices like periodically closing partitions, they are often surprised that disk space can still be consumed on those partitions. Managing an Enterprise Vault mailbox archiving environment often involves spinning many different plates at once, and one of the most common ones that requires attention from time to time is storage.
So how can it be that a closed partition, with collections enabled, still requires some disk space?
A closed partition, as we know, is no longer being written to by Enterprise Vault archiving tasks. No new data is being written to that particular location. Some people initially suspect that Enterprise Vault's single instancing operations cause storage usage to increase. In old versions of Enterprise Vault this might have been the case, but since Enterprise Vault 8 all of the single instancing, or sharing as it is sometimes known, takes place at the database level rather than on the file system.
The operation of building new collections (or CAB files) is also sometimes suspected of consuming disk space. This is not true either: CAB files don't involve any kind of compression, so adding new files to a CAB file doesn't involve the same sorts of disk processes as (for example) zipping a file on a file system.
So what can be causing the usage of more disk space?
Here are four Enterprise Vault operations that would still result in disk space being used on these types of storage partitions:
The first is user retrieval. When an end-user retrieves an item, the Enterprise Vault server will reach into the CAB file and extract the item that the user has requested. That extracted file has to be stored, at least temporarily, somewhere. Instead of storing it in, say, the system temp location, Enterprise Vault storage processes will restore that item to the main partition file system and give the file a special file extension of the form .ARCH*.
The extracted item is then presented back to the user who performed the retrieval.
This means that as well as having the .CAB file on disk, we've now used up some additional storage space to hold the .ARCH* files.
Retrieving one item like this is of course only going to involve a very small amount of disk space, but consider what happens when hundreds of users are retrieving thousands of items.
The space usage then becomes significant.
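To get a feel for how much space these temporary files are taking, a short script can walk a partition and total the size of every file whose extension starts with .ARCH. This is only a minimal sketch: the function name `temp_file_usage` and the partition path in the usage note are hypothetical, and the script only reads file sizes; it does not touch the files.

```python
import os
from collections import defaultdict


def temp_file_usage(partition_root, prefix=".arch"):
    """Return total bytes per extension for files whose extension
    starts with `prefix` (case-insensitive) under partition_root."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(partition_root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if not ext.startswith(prefix):
                continue
            try:
                totals[ext] += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # The file may have been removed while we were scanning.
                continue
    return dict(totals)
```

Calling `temp_file_usage(r"E:\EVPartitions\Ptn1")` (an assumed path, substitute your own) returns a dictionary mapping each matching extension to the number of bytes it currently occupies, which can be recorded daily to see how retrieval activity affects a closed partition.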
The second is archive export. From time to time Enterprise Vault administrators may need to restore archives back to a mailbox or to a PST. Whilst the reasons vary, the net result in relation to storage usage on closed partitions is that all of the items in the user's archive need to be retrieved, temporarily, so that they can be added to the mailbox (or PST file).
As with the ordinary user retrievals, doing this for one or two archives might not result in a huge amount of additional space being used, depending on the size of the archives, of course. But doing this for 10 or 20 or more archives might start to have a significant impact on the partition space usage.
The third is Discovery Accelerator. If it is used to perform eDiscovery style searches and enquiries into archived data, this too can generate disk space usage on closed partitions. The usage does not occur during the process of searching, or building the information required for a legal discovery (that uses the Enterprise Vault Index data). It occurs when items are reviewed, because those items need to be restored, temporarily.
The final thing that might use disk space on closed partitions with collections enabled is Virtual Vault, and specifically Vault Cache. The cache needs, depending on the policy which is implemented, a copy of every item in the user's archive.
Usually, something like Virtual Vault or Vault Cache is deployed to all or almost all of an organisation, so this disk space usage can be considerable.
So what can be done?
We've seen that there are several ways that these .ARCH* files can become a problem on the disk storage that is used by Enterprise Vault, but what can be done about that?
The first thing is that you need to continue running collections. It is the daily collections process which performs the maintenance task of removing the temporary .ARCH* files. If they have not been accessed in the last 24 hours, they will be removed. Many people think that once they have closed a partition they no longer need to do anything else with it, but that is not true.
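If you want to check whether that clean-up is keeping pace, the 24-hour rule can be approximated with a report of .ARCH* files whose last-access time is older than the cut-off. The sketch below is an assumption about how you might monitor this yourself, not how Enterprise Vault implements it: the function name `stale_temp_files` is hypothetical, it only reports and never deletes (removal should be left to the collections process), and last-access timestamps are not reliably updated on every file system configuration, so treat the result as an indication only.

```python
import os
import time


def stale_temp_files(partition_root, max_age_hours=24, prefix=".arch", now=None):
    """List (path, size_in_bytes) for files whose extension starts with
    `prefix` and whose last-access time is older than max_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    for dirpath, _dirnames, filenames in os.walk(partition_root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if not ext.startswith(prefix):
                continue
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                # The file may have been removed while we were scanning.
                continue
            if info.st_atime < cutoff:
                stale.append((path, info.st_size))
    return stale
```

A report that stays empty, or nearly so, suggests collections are still running against the closed partition. A report that keeps growing suggests the temporary files are being left behind.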
The second thing which can be done is to make sure that, as an Enterprise Vault administrator of this environment, you understand what the environment is used for. Know when DA searches are being done, and monitor and maintain information relating to when exports of archives are performed.
Managing Enterprise Vault storage does not simply stop when you decide that a partition is now closed. Care must still be taken with those partitions as there are several regular operations that might lead to disk space still being consumed, and without background tasks being kept in place this temporary usage may become more permanent than you would like.