Enterprise Vault Sizing Tool output - is this right?

bjorn_b
Level 6
Partner Accredited Certified

Hello

This is for those of you with access to the Sizing Estimator for Enterprise Vault (I've used the latest version)

I have run the mailbox analyzer tool and exported the results to an XML file. This XML file is imported into Excel, into this "fabulous" worksheet that should tell us how much is in the backlog (to archive) and how much will be archived over the next three years.

So far so good, but I think these numbers are strange (see attached image) and I find it extremely strange that

  1. The backlog size is smaller than the 1st year (and 2nd and 3rd year, but that is obviously because we expect data in Exchange to grow from Year 1)
  2. Index size is 3/4 of the vault store size.

I have used default values more or less.

Has anyone else experienced strange output, or could I be missing something completely here?

 

Any (useful) help is much appreciated

Accepted Solutions

StephenWatts
Level 2
Employee Accredited Certified

Andy Joyce actually taught me this one, when I had the same question......

Indexes are sized per user archive, not per item.

For example, if 2 users each have the same 1 MB email, each user's index will be roughly 130 KB, for a total of 260 KB of index against a single 1 MB item. (See your message attachment SIS ratio.)

So total storage is

800 KB for the actual email (stored once, compressed to 80% of its original size)

260 KB for the index of the 2 users

So it does look like the indexes are larger than 13%, more like 26% of the original item, but you have to take into account each user's individual index.

The use of SIS allows you to reduce the amount of storage required for the original item, but does not change the amount of space required for each user's index.
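The two-user example can be sketched in a few lines. This is only an illustration of the arithmetic in this post, assuming the 13% index ratio and the 80% compressed size used above and treating 1 MB as 1000 KB to match the rounded figures; the function and parameter names are made up and are not part of the Sizing Estimator.

```python
def estimate_storage_kb(item_kb, recipients, index_ratio=0.13, compressed_ratio=0.80):
    """Return (vault_store_kb, index_kb) for one item shared by several user archives.

    Hypothetical helper: the ratios are the example values from this thread.
    """
    # SIS keeps a single compressed copy, however many archives reference the item.
    vault_store_kb = item_kb * compressed_ratio
    # Every user archive indexes the item on its own, based on the original size.
    index_kb = item_kb * index_ratio * recipients
    return vault_store_kb, index_kb


store_kb, index_kb = estimate_storage_kb(item_kb=1000, recipients=2)
print(f"vault store: {store_kb:.0f} KB, index: {index_kb:.0f} KB")
print(f"index vs original item: {index_kb / 1000:.0%}")
print(f"index vs vault store:   {index_kb / store_kb:.1%}")
```

The more archives share an item, the larger the index looks next to the single-instanced vault store, even though the per-archive index ratio is still 13%.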

The advantage of having a per user archive is that in the event of a corruption or disk failure, you only need to rebuild some indexes instead of one big index (which could take an extended period of time). Smaller indexes are less prone to corruption also.

You also have some monster-size messages with lots of attachments to index.

758 KB for average message size is huge.

So be aware that your performance metrics won't be anywhere close to the published ones. EV performance metrics are based on an average message size of 70 KB; the performance guide puts an 8-core CPU at roughly 50,000 items per hour at that size. For each doubling of the size, throughput drops by 1/3.

Without having to relearn calculus for the formula to figure out your expected performance, keep in mind it may look like EV is slow if you look at message throughput per hour.
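For a rough feel of the numbers, here is a sketch of that rule of thumb. It assumes the baseline quoted above (50,000 items per hour at 70 KB) and a one-third drop per doubling; applying the rule to fractional doublings through a logarithm is an assumption made here, not something taken from the performance guide, and the function name is hypothetical.

```python
import math


def estimated_items_per_hour(avg_message_kb,
                             baseline_items_per_hour=50_000,
                             baseline_kb=70,
                             drop_per_doubling=1 / 3):
    """Scale the baseline throughput by (1 - drop) for every doubling in size."""
    # Number of times the baseline size has to double to reach the given size.
    doublings = math.log2(avg_message_kb / baseline_kb)
    return baseline_items_per_hour * (1 - drop_per_doubling) ** doublings


for size_kb in (70, 140, 758):
    print(f"{size_kb:>4} KB -> about {estimated_items_per_hour(size_kb):,.0f} items/hour")
```

With a 758 KB average this lands at roughly a quarter of the baseline figure, which is why items per hour can look slow even when the amount of data moved is reasonable.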

10 REPLIES 10

LCT
Level 6
Accredited Certified

The backlog is based on what is currently on your Exchange servers. The growth for year 1 and year 2 looks OK, but the indexes should only be 14% max of the Vault Store; you might want to report that to Symantec and confirm whether it's a bug in the tool.

Do you have the old tool? You could run it with the same input data and compare the results.

TonySterling
Moderator
Partner    VIP    Accredited Certified

Can you double-check your index size percentage? The default is 13%.

 

BTW, I am using EV 10 Sizing Estimator version 1.5

FreKac2
Level 6
Partner Accredited Certified

The backlog is what is currently in the Exchange servers. So day 0 -> Whatever.

The year 1 steady state is calculated (if I remember correctly) from the average number of mails during the last week and extrapolated from there (so the average number of mails per day over days 0->7 multiplied by the number of working days, etc.).

Years 2 and 3 are just the "growth" added to the steady state year 1 figures.

The index size is the index-level percentage applied to the original message/attachment size (so it is not based on the vault store size, which is the SIS'ed and compressed size).

But the indexing figure seems quite large compared to the VS size; 60-70 GB would make more sense.

Make sure you've got the latest version of the estimator. If you have access to symiqforpartners, drop a message there, and Andy Joyce, who created the spreadsheet, will most likely comment on this.

bjorn_b
Level 6
Partner Accredited Certified

I have version 10.5 of the estimator, and the index size is set to 13% (for full index); the rest is set as shown in the attached picture.

I completely agree that the index estimate seems to be way too large.

Andy_Joyce_VERI
Level 6
Partner Employee Accredited

I suppose I should weigh in here  :)

The previous responders are completely correct:

 

1) backlog is based on what is in the Exchange mailboxes currently - there is no direct correlation between this and the steady-state archiving

2) indexing is 13% (on average) of the original size of the items - unshared, as each mailbox archive has its own index.

 

A

 

FreKac2
Level 6
Partner Accredited Certified

As Stephen noted the average message size is quite large.

But what actually seems a bit strange is the vault store size :)

If you just take "ballpark" numbers for storage reduction with EV, the stored data ends up at 40-50% of the original size of the messages.

So if you take:

Number of mailboxes * Avg. Messages Per Day * Number of Working Days * Avg. Size of Messages (in KB), and divide that by 1024*1024 (to get a GB figure).

You would get.

663 * 19 * 260 * 758 / 1024 / 1024 = approx. 2368 GB

With the average "ballpark" SIS+Compression that would give you roughly 1.1 TB of VS size.

If you take a 13% index of 2368 GB, that would give you approx. 308 GB, which is about what you have in your sheet for the index size.
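The same back-of-the-envelope check as a small script, in case someone wants to plug in their own numbers. It uses the figures quoted in this thread (663 mailboxes, 19 messages per day, 260 working days, 758 KB average size, 13% index, 40-50% stored size); the function name is made up for illustration and is not part of the estimator.

```python
def yearly_original_gb(mailboxes, messages_per_day, working_days, avg_message_kb):
    """Original (uncompressed, unshared) mail volume per year in GB."""
    total_kb = mailboxes * messages_per_day * working_days * avg_message_kb
    return total_kb / 1024 / 1024  # KB -> MB -> GB


original_gb = yearly_original_gb(663, 19, 260, 758)
index_gb = original_gb * 0.13          # index is a share of the ORIGINAL size
vs_low_gb = original_gb * 0.40         # ballpark stored size after SIS + compression
vs_high_gb = original_gb * 0.50

print(f"original volume:  {original_gb:.0f} GB")
print(f"index at 13%:     {index_gb:.0f} GB")
print(f"vault store size: {vs_low_gb:.0f}-{vs_high_gb:.0f} GB")
```

If the sheet's index figure matches this but its vault store figure is far below the 40-50% band, the odd number is the vault store size rather than the index.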

 

So there is something not quite "normal" with the results.

I would check a couple of things.

The sample size number: with that small a number of mailboxes I would run the EMA/ESR tool with 100% sample size.

Check whether there is an anomaly in the results from the tool.

E.g. a very large number or size of mails during the last week (day 0->7).

Andy might have other ideas on what to check.

 

Londonnet
Not applicable

Can anyone help ?  I have Symantec Antivirus 10.1.5000.5   Normally I am prompted to update and this happens without any problem.

Over the last month I have tried to update using LiveUpdate and get error message telling me Live Update has an internal error and recommending that I reinstall Live Update.

Does anyone else have this problem ?

How do I reinstall Live Update ?

bjorn_b
Level 6
Partner Accredited Certified

Thanks a lot for your replies guys! I really appreciate it! :)

I think the marketing division puts a lot of huge e-mails into this system, but I agree that the average attachment size is way bigger than what we should expect.

FreKac: When you say 100% sample size, do you mean full analysis mode? We did that in the first run, but that one failed, so I turned it off :)

/bjorn

FreKac2
Level 6
Partner Accredited Certified

What I meant by 100% is that the sample rate is 100%, meaning each mailbox is part of the analysis and not just 10 or 30%, since that may skew the results. E.g. at 10% that is just approx. 60 mailboxes; if those are really large or really small compared to the organisation's average, you may get a skewed result.

If I remember correctly, Full Analysis means that the tool checks for duplicates (to get the SIS rate).

But it takes quite some time, so it's not often used, AFAIK.