Tough Questions on EV file archiving

Samuel_Lee
Level 4
I am evaluating EV file archiving for my company. Does anyone here know the answers to any of the questions below?

How to determine SIS for file archiving?

How does the collector compression work?

If part of the vaulted files (place holder) are moved to a different file server, will they work?

If the file archiving source is NetApp OnTap 7.0, will EV work?

Does the file server still require FSA agent if we just want to archive files from it?

How does the SQL database size compare to the size of the vaulted file storage?

If EV is down, will file retrieval work? What about if SQL is down?


Thanks for your help,
Sam
6 REPLIES

Micah_Wyenn
Level 6
Okay,
I'll take a quick stab, but just a warning...FSA isn't often asked for so I'm guessing in a lot of spots...Sym support folks correct me where I'm wrong. :)

SIS for file archiving still holds to the "email archiving rules", aka same partition, same vault only. With that said, I believe (haven't tested it) that SIS will only shortcut when the two files are exactly the same...size, mod date, etc. If you mod it you'll create a version, which I don't think gets SIS'd.

The collector won't compress anything that's over its size limit. In fact, I think anything over 50 MB doesn't even get indexed, and becomes a .dfs file instead of a .dvs file. So if you've got a 20 MB file, and you're collecting at 10 MB, it ignores it and looks for little things (I think).

The placeholder should still work if the server has the placeholder service on it (I believe). If it doesn't, ya sorta screwed. Symantec does recommend that before you do any sorta move, that you do a recall first and then move the file to preserve any policy/folder settings you may have in the new location.

Ontap7.0 will work, but you need to have a placeholder service on a fileserver somewhere. I know this is in the install guide somewhere but I'm way too lazy to dig it up.

Um, technically it doesn't "require" the agent, but you'll want to install it if you ever want to get those files back. If you're just doing a pull in to the archive and not leaving stubs, then pull away (I think). Safest bet might be to not leave placeholders, and instead just go with the goofy URL shortcuts. Hey, macs are making a comeback right?

As far as I know, the DB's size is about equal to the email archiving measurements, as it's still just doing metadata. I can't see any real diff on the sizes between them in my demo.

If SQL is down, you're automagically hosed no matter what. If EV is down, and you've got multiple boxes, you can use USL to failover. Other than that, ya kinda hosed. I don't think it "caches" itself or leaves anything behind the way a message body stub in email archiving can.

So okay, it might not be the best answers, but at least ya getting some weight in there. :)

micah

TonySterling
Moderator
Partner    VIP    Accredited Certified
> How to determine SIS for file archiving?

A checksum is performed on the files. A document must be at least 50KB uncompressed in order to be considered for sharing. It has been determined that the additional expense of performing an MD5 hash, plus the extra database size/performance overhead is simply not worth it for documents smaller than 50KB uncompressed.
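To make the checksum-plus-threshold idea concrete, here is a minimal Python sketch of how such a sharing decision could look. It is not EV's actual implementation: the function names are hypothetical, and the only details taken from this reply are the MD5 hash and the 50KB uncompressed minimum (assumed here to mean 50 * 1024 bytes).

```python
import hashlib

# Threshold quoted in the reply above; assumed to mean 50 * 1024 bytes.
MIN_SHARE_BYTES = 50 * 1024


def sharing_key(data: bytes):
    """Return an MD5 hex digest if the item is large enough to be
    considered for sharing, otherwise None (hypothetical helper)."""
    if len(data) < MIN_SHARE_BYTES:
        return None
    return hashlib.md5(data).hexdigest()


def count_stored_copies(items):
    """Count how many copies would be stored if eligible items with an
    identical checksum are stored only once."""
    seen = set()
    stored = 0
    for data in items:
        key = sharing_key(data)
        if key is None or key not in seen:
            stored += 1  # too small to share, or first copy of this content
        if key is not None:
            seen.add(key)
    return stored


big = b"x" * (60 * 1024)   # above the threshold, shared after the first copy
small = b"y" * 1024        # below the threshold, every copy is stored
print(count_stored_copies([big, big, small, small]))
```

In this sketch the two large identical files cost one stored copy, while the two small identical files are both stored because they never get hashed.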

>
> How does the collector compression work?

Are you talking about Collection and Migration to move archived files to secondary storage?

>
> If part of the vaulted files (place holder) are moved
> to a different file server, will they work?

Yes, they will work if you move them, as in essence they are shortcuts. However, if you use ArchiveExplorer to copy them back they will go to the original location.

>
> If the file archiving source is NetApp OnTap 7.0,
> will EV work?

Yes, it works. The placeholder service in this case is installed on the EV server itself. The Vault admin help has some info on this.

>
> Does the file server still require FSA agent if we
> just want to archive files from it?

You do not have to put a Placeholder service on the file server. If you do not, your options are to leave an Internet Shortcut or to leave no shortcut at all. There is more information on this in the Admin guide as well.

>
> How does the SQL database size compare to the
> size of the vaulted file storage?

Allow 250 bytes for every item archived. More space should be allowed to hold
static data and to allow for temporary growth and the transaction logs. It is
suggested that 5 GB is allowed for the Directory database and 10 GB for each
Vault Store database.
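
As a rough worked example of those figures, the Python sketch below turns them into numbers. The function name and the choice of 1 GB = 1024^3 bytes are assumptions for illustration. The guidance does not say whether the 10 GB per Vault Store database already covers per-item growth, so the sketch reports the two figures separately rather than adding them.

```python
BYTES_PER_ITEM = 250   # per archived item, from the guidance above
GB = 1024 ** 3         # assumption: binary gigabytes


def estimate_sql_bytes(items_archived: int, vault_stores: int = 1):
    """Return (per_item_bytes, suggested_allocation_bytes).

    per_item_bytes: 250 bytes for every archived item.
    suggested_allocation_bytes: 5 GB for the Directory database plus
    10 GB for each Vault Store database.
    """
    per_item = items_archived * BYTES_PER_ITEM
    suggested = 5 * GB + vault_stores * 10 * GB
    return per_item, suggested


per_item, suggested = estimate_sql_bytes(10_000_000, vault_stores=1)
print(f"per-item metadata: {per_item / GB:.2f} GB")
print(f"suggested allocation: {suggested / GB:.0f} GB")
```

For ten million archived files, the per-item metadata comes to about 2.5 billion bytes, which sits comfortably inside the suggested allocation for a single Vault Store.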

>
> If EV is down, will file retrieval work?

If EV is down your placeholders will not work.

> What about if SQL is down?

If SQL is down you might be able to recall files, as long as the Directory service is running.

Lee_Allison
Level 6
> A checksum is performed on the files. A document
> must be at least 50KB uncompressed in order to be
> considered for sharing.

Tony, about a year ago dev stated that if the storage was Centera and the file was less than 100k it wouldn't be considered for sharing. However the size wasn't an issue on NTFS storage.

Did they expand the same "We won't bother SIS'ing something less than XYZ size" to NTFS storage as well? If so doesn't that present an issue in CA/DA export instances where our customers are relying on SIS to reduce legal team's workload?

TonySterling
Moderator
Partner    VIP    Accredited Certified
How would SIS relate to export for DA/CA?

Even if an item isn't SIS'd it should be De-duplicated. Example, even if you do not turn on Share Archived items, we will still De-duplicate. Is that the reduction you are referring to for the legal team?

Lee_Allison
Level 6
SIS applies to dedupe because the dedupe is literally nothing more than "Do I have duplicate SSID's in this search result set?" If EV doesn't SIS then CA/DA won't dedupe, period.

For those playing along with the home game... in Compliance or Discovery Accelerator, when you run a search and accept those results the Accelerator will 'dedupe' the results. So your search may return 100 items, but if 10 are duplicates of other archived mail in the same result set, then when you accept the results there are only 90 items there.

This dedupe is nothing more than a check for duplicate SSID's. No content is looked at, nothing more is processed.
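
A minimal Python sketch of that check, assuming each search hit is a dict with a hypothetical `saveset_id` field (the real Accelerator data model is not described in this thread):

```python
def dedupe_by_saveset_id(results):
    """Keep the first hit for each saveset ID and drop later repeats.
    No content is compared, only the IDs."""
    seen = set()
    unique = []
    for item in results:
        ssid = item["saveset_id"]
        if ssid not in seen:
            seen.add(ssid)
            unique.append(item)
    return unique


# 100 hits, where the last 10 repeat saveset IDs already in the set.
hits = [{"saveset_id": f"ssid-{n}"} for n in range(90)]
hits += [{"saveset_id": f"ssid-{n}"} for n in range(10)]
print(len(hits), "->", len(dedupe_by_saveset_id(hits)))
```

Two items with identical content but different saveset IDs would both survive this check, which is why sharing at archive time matters for the result count.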



Also, a quick lab check showed that a single email of 3kb prior to archiving did get SIS'ed when archived from more than one mailbox into the same store.

TonySterling
Moderator
Partner    VIP    Accredited Certified
Right, the 50kb is for Centera storage.

Also, good point on de-duplication. It is de-duplicating items which have the same saveset ID.