Replicating Enterprise Vault Data
There are a few products on the market that deal with software-based replication of data. The ones that spring to mind are:
There is at least one other, as we'll see. In this article, though, I want to step back a little and look at what a replication product should give you when it comes to replicating Enterprise Vault data.
Simple to Configure
The first thing that comes to my mind when choosing a product for data replication is that it should be simple to configure. You don't want to have to open some monolithic application (the vSphere client springs to mind!) or, worse still, a command-line system to configure or check on the replication. I, like many people, want to be able to configure replication and see it happening at a glance, and I want to be able to do it from anywhere, using pretty much any web browser. That's quite a tall order, I know, but this is my wish list for the 'perfect' software product.
Once I can access the configuration, I want a simple set of choices for what I can replicate from one location to another. I also want to be able to select multiple source locations and replicate them to a target location without having to configure each individual Vault Store or partition. In other words, I want to be able to group parts of my source together and replicate them to a common target.
Once all the configuration has been done, it is important to be able to fail over the storage manually. This is partly to test that failover works, but it is also a necessity when maintenance work is needed on the primary storage. During this failover, which I want to be really quick, I don't want to have to do much other 'stuff' by hand to assist; ideally, the failover should do all the work for me.
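To make the grouping idea concrete, here is a minimal sketch in Python. It is purely illustrative: the ReplicationGroup class, the paths and the rule for laying out the target are my own assumptions, not how any particular product models its configuration.

```python
from dataclasses import dataclass, field
from pathlib import PureWindowsPath


@dataclass
class ReplicationGroup:
    """A named set of source locations sharing one target (hypothetical model)."""
    name: str
    target: PureWindowsPath
    sources: list[PureWindowsPath] = field(default_factory=list)

    def add_source(self, path: str) -> None:
        self.sources.append(PureWindowsPath(path))

    def target_for(self, source: PureWindowsPath) -> PureWindowsPath:
        # Assumption: keep the last two folder levels (store\partition) so that
        # partitions with the same name in different stores do not collide.
        return self.target.joinpath(*source.parts[-2:])


# Illustrative paths only.
group = ReplicationGroup("MailboxArchives", PureWindowsPath(r"R:\EVReplica"))
group.add_source(r"E:\VaultStores\VS1\Ptn1")
group.add_source(r"F:\VaultStores\VS2\Ptn1")

for src in group.sources:
    print(src, "->", group.target_for(src))
```

The point of the sketch is that the target is set once on the group, and each source inherits it rather than being configured individually.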
Real Time Replication
For the majority of the data, for example saveset data, I want the replication between my source and target locations to happen in real time. The Vault Store data on disk is critical to the Enterprise Vault system, so replicating it hourly, daily or on any similar schedule just doesn't fit the bill. Importantly, it would not meet the needs of any kind of disaster recovery scenario. It's not necessary to replicate all of the data in real time, though: data such as Index Volume data (which can of course be rebuilt) may only need to be replicated daily, or even weekly - I want to be able to choose that level of granularity when setting up the replication from the source locations.
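As a rough illustration of that granularity, the sketch below maps each class of data to a replication mode and decides whether a replication pass is due. The data class names, the POLICY table and the is_due function are assumptions made up for this example, not settings from Enterprise Vault or any replication product.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Mode(Enum):
    REAL_TIME = "real-time"  # replicate every change as it happens
    DAILY = "daily"
    WEEKLY = "weekly"


# Hypothetical policy: data class -> replication mode.
POLICY = {
    "saveset": Mode.REAL_TIME,
    "index": Mode.WEEKLY,
}

INTERVALS = {Mode.DAILY: timedelta(days=1), Mode.WEEKLY: timedelta(weeks=1)}


def is_due(data_class: str, last_run: Optional[datetime], now: datetime) -> bool:
    """Return True if this class of data should be replicated now."""
    mode = POLICY[data_class]
    if mode is Mode.REAL_TIME or last_run is None:
        return True
    return now - last_run >= INTERVALS[mode]
```

With this policy, saveset data is always due, while index data is only picked up once a week has passed since its last run.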
Monitoring
Once I've configured the replication schedules and capabilities for all of the Enterprise Vault data that I need to replicate in case of a disaster, the main thing I want to be able to do is monitor the replication. I think it is crucial to be able to see what data has been replicated, and that the source and target are 'in sync'. Another thing that is good to know is how much data is being replicated over time; it's another visual way of seeing that replication is in fact taking place.
To me though, monitoring doesn't stop at me watching the system replicate data between the source and the target. Monitoring also means that the system is watching 'itself'. In case of storage not being available I want the system to be able to handle a failover of the storage, pretty much without any interaction from me, or any of the other admins that monitor and maintain the Enterprise Vault environment.
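One simple way to picture the system watching 'itself' is a monitor that switches to the secondary storage after a number of consecutive failed availability checks. This is only a sketch under my own assumptions: the FailoverMonitor class, the threshold of three and the choice to leave failback as a manual step are all hypothetical.

```python
class FailoverMonitor:
    """Switch to secondary storage after repeated failed checks (hypothetical)."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold  # consecutive failures before failing over
        self.failures = 0
        self.active = "primary"

    def record_check(self, primary_available: bool) -> str:
        """Record one availability check and return the storage now in use."""
        if self.active != "primary":
            # Assumption: failing back is a deliberate, manual decision.
            return self.active
        # A successful check resets the counter, so only consecutive
        # failures count towards the threshold.
        self.failures = 0 if primary_available else self.failures + 1
        if self.failures >= self.threshold:
            self.active = "secondary"
        return self.active
```

Requiring several consecutive failures is a design choice to avoid failing over on a single brief glitch.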
Data Consistency
A product that replicates Enterprise Vault data in a high-performance environment also needs to be able to check that the data on the source and the target is consistent, especially when an environment is failed over to the standby storage and then failed back. If data is written to the standby storage during the failover, that data will need to be copied back when the failback happens.
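A generic way to check this kind of consistency is to compare checksums of the files on both sides. The sketch below does that with SHA-256 over two folder trees; it is an illustration of the general technique, not a description of how any replication product or Enterprise Vault itself verifies data, and the function names are my own.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """SHA-256 of a file, read in chunks so large files are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path relative to root onto its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare(source: Path, target: Path) -> dict[str, list[str]]:
    """Report files missing on either side and files whose content differs."""
    src, tgt = snapshot(source), snapshot(target)
    return {
        "missing_on_target": sorted(src.keys() - tgt.keys()),
        "missing_on_source": sorted(tgt.keys() - src.keys()),
        "different": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }
```

After a failover, anything listed under missing_on_source is data that was written to the standby storage and would need to be copied back during failback.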
It's worth thinking, too, about how much data loss can be tolerated. A product like SnapMirror, if configured to take an hourly snapshot of the data, runs the risk of losing up to an hour's worth of data, which can be incredibly difficult to resolve and reconcile when a product like Enterprise Vault is the consumer of that data.
Tight Integration with Enterprise Vault
Some products on the market, which I would describe as generic replication products, don't really have this tight integration with Enterprise Vault. They replicate the data, which is great, but they have no underlying concept of what Enterprise Vault is, or what it does with its data. For example, the products mentioned at the start of this article don't know what saveset data is, nor index data; they're generic. Whilst being generic is good in many ways, I think that when it comes to the complexities of Enterprise Vault data, specialisation is best.
Conclusion
If you're looking for a product that does all these things, and more, then you should strongly consider looking at EVnearSync, from QUADROtech. We have an intimate knowledge of what Enterprise Vault does with its data, and what it needs to 'survive' a replication situation or disaster recovery. EVnearSync has a great, simple, web-based interface that covers all the management, monitoring and configuration needs when it comes to replicating Enterprise Vault data. Yes, some of the other products mentioned at the beginning of the article have similar features to EVnearSync, but none come with the many years of Enterprise Vault experience that EVnearSync does.
Do you replicate your Enterprise Vault data? What do you use to do it?