Enterprise Vault - CAB File Questions

vanadasd
Level 4
Partner

Hello Again!

I'm looking for more information about your experience with the Enterprise Vault collector that creates CAB files. I haven't used it before, and I'd like some feedback on what you have been seeing.

Some background information:

For better or for worse, we moved our Enterprise Vault (11) environment to a Data Domain (DD2500). Recall performance is reasonable right now, but our backup speeds aren't great at all (200 MB/minute on average). We think that switching to CAB files might really help the backup situation, but we're worried about a possible performance penalty when accessing the CAB files down the line.

If anybody can throw in their two cents about using CAB files and answer some or all of these questions, that would be extremely helpful!

1. Does anybody have any "best practice" guidelines on how large the collection file size should be? I know the default is 10MB, but I want to know if people have had any luck with other values and the tradeoff between backup performance and retrieving archived files.

2. After selecting the "Use collection files" checkbox on a vault store (our vault stores are ~800 GB in size), should I worry about the data being inaccessible during the collection process? I'm totally okay with the collection process taking a long time to complete, but I want to be sure that the data will be generally accessible while it runs. If anybody has ballpark figures for how long their collection process took, that would be helpful as well.

3. Did anybody see a dramatic improvement in backup times after switching to a CAB file setup? I've seen some Symantec blogs that cite up to 33% improvement, but I'd love to get some additional feedback from others.

4. Has anybody switched to CAB files and noticed any performance degradation in rehydrating archived files?
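To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes a full backup of a single ~800 GB vault store at the quoted 200 MB/minute, and it reads the "up to 33% improvement" figure as 33% less backup time, which is only one possible interpretation. The function name is made up for illustration.

```python
def backup_hours(store_gb: float, rate_mb_per_min: float) -> float:
    """Hours needed to back up store_gb at rate_mb_per_min (1 GB = 1024 MB)."""
    return store_gb * 1024 / rate_mb_per_min / 60


# Figures from the post: ~800 GB vault store, ~200 MB/minute backup rate.
baseline = backup_hours(800, 200)

# Assumption: "33% improvement" means the backup takes 33% less time.
improved = baseline * (1 - 0.33)

print(f"baseline: {baseline:.1f} h, with 33% less time: {improved:.1f} h")
# prints: baseline: 68.3 h, with 33% less time: 45.7 h
```

Even under that optimistic reading, a full pass over one store would still take well over a day, so the gain is worth measuring on a test partition before committing.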

Thanks in advance for all of your help!

Adam

2 ACCEPTED SOLUTIONS

WiTSend
Level 6
Partner

The main complaints about CAB files have to do with unpacking the files in large data migrations and with a perceived performance delay when retrieving cabbed messages.

The first is just a matter of planning. When you have to retrieve, move, export, etc. large numbers of emails that are in CABs, the CAB files have to be fully expanded, which can temporarily take additional space. During large data migrations, especially when changing archive platforms, this can be challenging, but it is not a serious problem if you plan for it.

Second, depending on how old the data is when you cab it, there can be some performance impact on accessing messages, because the CAB must be fully expanded to retrieve a single email. Generally I don't collect emails until they are 6 months old. This mitigates the impact, since the majority of access to archived data happens within the first 6 months.
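The age rule described above can be sketched as a simple date check. This is only an illustration of the policy, not how Enterprise Vault implements it; the function name and the 183-day threshold (roughly six months) are assumptions.

```python
from datetime import date, timedelta

# Hypothetical threshold: roughly six months, per the rule of thumb above.
MIN_AGE_DAYS = 183


def eligible_for_collection(archived_on: date, today: date,
                            min_age_days: int = MIN_AGE_DAYS) -> bool:
    """Return True once an item is old enough to be packed into a CAB."""
    return (today - archived_on) >= timedelta(days=min_age_days)


# An item archived seven months ago qualifies; one archived two months ago does not.
print(eligible_for_collection(date(2015, 1, 1), date(2015, 8, 1)))  # True
print(eligible_for_collection(date(2015, 6, 1), date(2015, 8, 1)))  # False
```

The idea is that items still in their busy period stay as individual files, so most retrievals never touch a CAB.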

IMHO, the benefits of collections far outweigh the costs.

AndrewB
Moderator
Partner    VIP    Accredited

I would point out that there's only one benefit to collections, and that's to help with backups. By help I mean mitigate issues due to design flaws or changes in the environment, for example, as the OP mentioned, having to move the EV data to Data Domain.

The drawbacks begin with what you outlined around performance and data migrations. To add to that, the other issues arise in any situation that requires reindexing, in exports of data for eDiscovery, and in storage on closed partitions, where you either don't have enough free space for unpacked CABs or need to waste space to accommodate the potential for unpacked CABs.

As with massive migrations, the Move Archive process has the same drawback: it has to unpack every single CAB file related to an archive, even if it's for just 1 email item.

If you lose a CAB file or a CAB becomes corrupt, all the DVS files inside it are impacted, which could be on the order of hundreds of times the impact of a single corrupt DVS.
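A quick sketch of where the "hundreds of times" figure comes from. It uses the default 10 MB collection size mentioned in the original question and an assumed average DVS size of 100 KB (a made-up number; measure your own). It also ignores any compression applied by the CAB itself, so treat the result as a rough order of magnitude.

```python
def items_per_cab(cab_size_mb: float, avg_item_kb: float) -> int:
    """Approximate number of DVS files that fit in one collection file."""
    return int(cab_size_mb * 1024 // avg_item_kb)


# 10 MB collection file, assumed 100 KB average saveset.
print(items_per_cab(10, 100))  # 102
```

Raising the collection file size increases this number proportionally, so larger CABs mean fewer files to back up but more items affected when a single CAB is lost.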

And remember, there's no going back once you've enabled collections.

7 REPLIES

WiTSend
Level 6
Partner

I don't use backup, but I do use replication, and my storage platform was bogged down by billions of little files. By turning on collections I have significantly enhanced the storage platform's performance as well as replication performance. I've seen no issues with access to archived data.

AndrewB
Moderator
Partner    VIP    Accredited

Just curious to know if you've seen the posts on this forum where we generally recommend against collections (aka CABs), and the reasons why?

vanadasd
Level 4
Partner

If you could summarize those or provide a link to the other posts, that would be extremely helpful.

Thanks!

AndrewB
Moderator
Partner    VIP    Accredited

Absolutely, and glad to help. I just didn't want to repeat stuff to you that you already knew. Check this out:

https://www-secure.symantec.com/connect/forums/enabling-ev-collection-consideration

vanadasd
Level 4
Partner

Thanks for your input, guys. I really appreciate all of the extra feedback.

I don't particularly feel comfortable with the drawbacks outlined by AndrewB, especially from a CAB corruption/reindexing perspective, but my underlying infrastructure changes are forcing my hand, and I have no choice but to implement collections anyway. I wish I had another option, but the backup times I'm experiencing now are unacceptable, and this is the only change I can make from an Enterprise Vault perspective to improve them.

Thanks Again!

Adam