When it comes to Enterprise Vault, there are a number of options that an administrator can choose from and implement in order to improve the availability of the service. I'm talking specifically of 'high availability' options: things like clustering, replication and so on. In this article I'll explain a little bit about how to make some small changes to your Enterprise Vault environment which can dramatically improve the availability and expandability of your service. The article focuses on storage, and specifically on ways in which that can be made highly available. Clustering is of course very important, but that is server-based redundancy and does nothing for your core Enterprise Vault data files.
Use Off-Server Storage
It is all too easy to set up Enterprise Vault with all your Enterprise Vault data stored locally. This is a viable option for very small implementations, but as soon as you start talking about the data needed to support several hundred, or even several thousand, users, having this storage locally inside a server-class machine starts to become an expensive lifestyle.
The data for all these users that I'm referring to here is the Vault Store partition data and the indexing data. It is always worth remembering that the index data *can* be rebuilt, even though it might take a very long time to do so, whereas the Vault Store partition data most likely can't. This might have some influence down the road if it comes to budgetary constraints and having to choose what to do with each type of data. In an ideal world, of course, we'd treat both types of data as needing some sort of 'off-server storage'. Remember that indexing data can be 12%+ of the overall size of your Vault Store partition data, so if you have around 5 TB of archived data then you're going to need 600 GB+ of storage space for your indexes, and this needs to have fast random access capabilities in order to support archiving and retrieval.
If we make the decision to move to off-server storage there are a number of options that we can explore. There is everything from relatively 'cheap' NAS devices promising almost limitless storage, to high-end SAN devices offering lightning-fast access (due to the nature of striping data across many physical spindles, as well as advanced and costly disk caches), and just about everything in between.
My personal preference is to go somewhere in between when it comes to price versus performance, and to stick with simpler technology like NAS devices rather than high-end SAN devices. As we'll see below, there are additional changes that we can make when venturing down this road of 'off-server storage' which make even relatively cheap off-server storage a viable option, and a protected one as well.
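The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_index_storage_gb` is made up for this example, the 12% ratio is the rule of thumb quoted above rather than a fixed property of Enterprise Vault, and decimal units (1 TB = 1000 GB) are assumed.

```python
def estimate_index_storage_gb(archive_tb: float, index_ratio: float = 0.12) -> float:
    """Estimate index storage in GB for a given amount of Vault Store partition data.

    archive_tb  -- size of the Vault Store partition data in TB (1 TB = 1000 GB assumed)
    index_ratio -- index size as a fraction of the archive size (0.12 is a rule of thumb)
    """
    if archive_tb < 0 or index_ratio < 0:
        raise ValueError("archive_tb and index_ratio must be non-negative")
    return archive_tb * 1000 * index_ratio


if __name__ == "__main__":
    # The example from the text: around 5 TB of archived data.
    needed = estimate_index_storage_gb(5)
    print(f"Plan for at least {needed:.0f} GB of index storage")
```

Since the ratio is a floor rather than a ceiling ("12%+"), it is worth treating the result as a minimum and adding headroom for growth.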
What do we have left locally?
Once we move all the Vault Store partition data, and the Indexing data off the server on to our new 'limitless' NAS device, the question then becomes 'What do we have left locally'? There are a few things that might be left locally such as:
- Indexing metadata location
- Temporary files area, e.g. %TEMP%
- PST Holding Area
- PST Temp Area
- Server cache folder
- Vault Cache build area
Many of these you can live without in the event of a disaster. To my mind the one critical item in this list is the PST Holding Area, and of course if you aren't doing PST migrations then it isn't something that you need to worry about. If you are doing PST migrations, it's possible that you can move this storage area to a 'cheap' NAS device too. As it's likely to be transient, it can even be an older, slower model with less storage. It may be that it can be removed once the PST migration has finished.
Which Network Storage is best?
There are many options for EV and storage administrators to choose from. Sometimes the type of device is dictated by corporate standards or by a non-EV team. But let me surprise you by saying that you can go 'cheap and cheerful' here. There is no need to buy expensive NetApp or EMC devices; you can get really good quality, cheap NAS devices from many vendors. Which specific device to choose is down to you and your own research.
"But what if that NAS device that I just bought fails?" It's a question which is rightly asked by people purchasing relatively unknown (and relatively cheap) hardware, but there is also a chance that the really expensive alternatives will fail too. For that reason many companies opt for replication of their storage data.
To my mind there are two replication options: hardware-based, which tends to be very expensive, and software-based, which tends to be cheaper. In fact the reason we invented EVnearSync was precisely because hardware-based replication was so expensive. With a product like EVnearSync you can replicate Vault Store partition data in real time, and in addition you can replicate index data, SQL database dumps and in fact any path you like to additional remote storage.
EVnearSync has a really nice interface, which you can see below:
Of course, as well as doing this replication, there are components of EVnearSync which watch the storage that Enterprise Vault is using, and if they notice that the storage is not available they perform a storage failover to the replica copy. You can also do this manually if you are planning maintenance on the primary storage. There are lots of other nice features in EVnearSync, and it is great that it is built to suit Enterprise Vault's data needs perfectly. It integrates tightly with Enterprise Vault's storage safety copy mechanisms too.
In summary, then, the main thing to take away from this article is that you, as a diligent Enterprise Vault administrator, have to think outside the box and come up with solutions to protect your data in the event of a disaster. Aside from the uber-expensive hardware options, it is definitely worth investigating replication products like EVnearSync, getting a demo, and considering them for yourself.
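The basic idea of checking whether storage is reachable and falling back to a replica can be sketched in Python. This is a minimal illustration only, not how EVnearSync is implemented: the function name `pick_available_storage` and the UNC paths are hypothetical, and a real failover would also have to repoint Enterprise Vault at the replica, which this sketch does not attempt.

```python
import os
from typing import Optional, Sequence


def pick_available_storage(paths: Sequence[str]) -> Optional[str]:
    """Return the first path, in priority order, that is a reachable, writable directory."""
    for path in paths:
        # isdir() returns False when the share is offline or the path is missing.
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None


if __name__ == "__main__":
    # Hypothetical UNC paths: primary storage first, replica second.
    candidates = [r"\\nas-primary\EVPartitions", r"\\nas-replica\EVPartitions"]
    target = pick_available_storage(candidates)
    print(target if target else "No storage location is currently available")
```

Note that `os.access` is a coarse check (on Windows it mainly reflects the read-only attribute), so a production monitor would more likely attempt a small test write and retry a few times before declaring the primary storage unavailable.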