Large Journal

Musa_Timur_Sari
Level 4
Partner Accredited

I'm about to create a large journal archive. By large I mean there will be six EV archiving servers dedicated to journaling, and 300 GB of mail data is created on the Exchange servers every day. EV 10 will be used, with a single NAS device at the back end over 10G Ethernet.

 

 

How should I position the indexes and the journal storage?

 

Is it a good idea to roll over to a new partition every six months?

 

Is it a good idea to split the journal archive, which will be in a single vault store (multiple partitions)? For example, by creating a new archive every year.

 

Is it OK for storage and indexes to be on the same NAS device, which will be quite large in terms of disk capacity?

 

How should we back up this much data? We will replicate it at the storage level (index and storage data will be on the same NAS) to another NAS of the same size, and SQL is backed up daily. Is any additional backup necessary? For example, would it be wise to take weekly offline backups of the indexes and storage, or storage-level snapshots?

 

Any comments, please?

1 ACCEPTED SOLUTION


JesusWept3
Level 6
Partner Accredited Certified

In really large financial institutions that archive 2 million+ per day, the usual approach is to create a new archive every quarter. The reason is that when they run Discovery searches with an exact time range, they can specify just the archives that cover those quarters.
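As a rough illustration of why quarterly archives narrow a search, the sketch below lists which quarters a date range touches. The `YYYY-Qn` labels and the function name are hypothetical, chosen only for this example; this is not an Enterprise Vault or Discovery Accelerator API.

```python
from datetime import date


def quarters_in_range(start: date, end: date) -> list[str]:
    """Return labels such as '2013-Q1' for every quarter touched by [start, end]."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    labels = []
    # Quarter number (1-4) derived from the calendar month.
    year, quarter = start.year, (start.month - 1) // 3 + 1
    end_key = (end.year, (end.month - 1) // 3 + 1)
    while (year, quarter) <= end_key:
        labels.append(f"{year}-Q{quarter}")
        quarter += 1
        if quarter > 4:
            year, quarter = year + 1, 1
    return labels


# A search from February to August 2013 only needs three quarterly archives.
print(quarters_in_range(date(2013, 2, 10), date(2013, 8, 1)))
```

If the journal archives are named after their quarter, the resulting list maps directly to the archives to select for the search.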

Also, if you ever have to rebuild an index for whatever reason, it's faster to rebuild an archive that's split into quarters than one that has never been split, where you have to reindex from day one. That could cause severe problems if you are facing legal deadlines.

As for partition rollover, that's purely up to you. Typically you want to make sure a partition doesn't get too big, because EV creates thousands and thousands of small files, and the sheer number of them can severely impact performance.

As for indexes, just make sure you create plenty of index root paths with plenty of space, as your indexes are going to roll over fairly frequently given the amount of data being archived per day.

Also, if you are going to have six EV servers, I would advise against having one single archive. That creates a bottleneck on storage, with five servers sending all their data to one server's storage service, and then another server for indexing trying to retrieve items from it to get them indexed.

So you'd want to have several servers, each with its own vault store and a journal archive that writes to the vault store hosted on that server. The vault stores should then all be joined to one vault store group to take advantage of OSIS.
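To get a feel for the load each vault store would carry, here is a back-of-the-envelope sketch using the figures from the question (300 GB per day, six servers). It assumes journaling is spread evenly and uses raw mail volume before any EV compression or OSIS savings, so treat the numbers as an upper bound. The function name is made up for this example.

```python
def per_server_load(daily_gb: float, servers: int) -> dict[str, float]:
    """Split the daily journal volume evenly and project one year of growth."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    per_day = daily_gb / servers
    return {
        "gb_per_server_per_day": per_day,
        # 365 calendar days, 1024 GB per TB; raw volume, no single instancing.
        "tb_per_server_per_year": per_day * 365 / 1024,
    }


load = per_server_load(daily_gb=300, servers=6)
print(f"{load['gb_per_server_per_day']:.0f} GB/day, "
      f"{load['tb_per_server_per_year']:.1f} TB/year per server")
```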

Then have several dedicated index servers, as it sounds like indexing might become heavily utilized by discovery processes at some point.

https://www.linkedin.com/in/alex-allen-turl-07370146


REPLIES

WiTSend
Level 6
Partner

Lots of management questions...

1)  I would recommend setting the vault store partition rollover at no more than 1 TB, for backup and restore purposes.

2)  If you set up 8 index locations, EV will roll over the indexes as necessary when they reach the default size.

3)  Creating a new journal archive every year helps with data management, but takes a bit more effort when constructing eDiscovery searches.

4)  Backups in a very large environment are best done using some type of disk-to-disk backup or replication. I wouldn't recommend keeping index backups for any extended length of time; they should only be used in the event of a disaster recovery. I would think that 2 weeks would be more than sufficient.

5)  Backups of the vault store partitions should also be done only for disaster recovery purposes. The backups are not a historical repository, just a DR solution.
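To put the 1 TB rollover suggestion in context, the sketch below estimates how often a partition would fill at a given ingest rate. The 50 GB per day figure assumes the 300 GB from the question is split evenly over six servers, and `stored_fraction` is a hypothetical knob for whatever reduction compression and single instancing actually deliver; neither number comes from EV itself.

```python
def days_until_rollover(partition_cap_gb: float,
                        daily_ingest_gb: float,
                        stored_fraction: float = 1.0) -> float:
    """Estimate how many days of journaling fit in one partition.

    stored_fraction is the share of ingested data that actually lands on
    disk (1.0 means no savings); it is an assumption to tune per site.
    """
    if daily_ingest_gb <= 0 or not 0 < stored_fraction <= 1:
        raise ValueError("ingest must be positive and stored_fraction in (0, 1]")
    stored_per_day = daily_ingest_gb * stored_fraction
    return partition_cap_gb / stored_per_day


# 1 TB partition, 50 GB/day per server, no storage savings assumed.
print(f"{days_until_rollover(1024, 50):.1f} days per partition")
```

At these rates a size-based rollover would trigger far more often than every six months, which is worth factoring into how many partitions each vault store will accumulate.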

GertjanA
Moderator
Partner    VIP    Accredited Certified

Don't forget your transaction log backup.

I am not going to advise; there is too little info. How many journal mailboxes? Different retention? Will you use expiry? Will you use CAB files and migration? Do you expect many DA searches, and will they be large or small?

Regards. Gertjan
