Index size and reporting

Tobbe
Level 5
Hi.

I have an EV 2007 solution that has been up and running for approximately 15 months. A couple of servers handle file archiving and DA, but primarily we have two servers for mail-related archiving: one for journaling and one for mailbox archiving and PST migrations.
From day one the indexing level has been set to FULL for both the journaling and mailbox archiving functions, and according to the sizing guides the index should take up approximately 12% of the archived data.
The journaling server has archived 3.280GB of data and its index is currently 120GB. That is only 3.6%, if I calculate this correctly.
The archiving server has archived 3.610GB of data and its index is currently 233GB, which is only 6.4%, again if I calculate this the correct way.

The sizes I mention are taken directly from the storage location, which is regular NTFS storage, so they are after EV's single instancing and compression but without any hardware-based reduction.
I understand that the official 12% estimate is before SIS and compression, i.e. 12% of the original size.
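The arithmetic behind those two percentages can be checked with a minimal sketch like the one below. The function name and dictionary layout are illustrative assumptions, and the inputs are the on-disk figures quoted above, so the result is a ratio against the stored size, not against the original size that the 12% guideline refers to. With ordinary rounding the values come out at roughly 3.7% and 6.5%, marginally above the truncated figures in the post.

```python
def index_ratio_percent(index_gb: float, archive_gb: float) -> float:
    """Return index size as a percentage of archived data size."""
    if archive_gb <= 0:
        raise ValueError("archive_gb must be positive")
    return index_gb / archive_gb * 100


# On-disk figures quoted in the post (GB, after SIS and compression).
servers = {
    "journaling": {"archive_gb": 3280, "index_gb": 120},
    "mailbox": {"archive_gb": 3610, "index_gb": 233},
}

# Percentage per server, rounded to one decimal place.
ratios = {
    name: round(index_ratio_percent(s["index_gb"], s["archive_gb"]), 1)
    for name, s in servers.items()
}

for name, pct in ratios.items():
    print(f"{name}: index is {pct}% of archived data on disk")
```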
I know that Report Manager's “archived items per hour” template gives a figure for space saved by compression. Is there an easy way to get a figure, per vault store, for how much space SIS and compression have saved since the initial implementation?

Also, how do these figures compare to other implementations?

Thanks for your efforts.

/Tobbe

Edit: GB...not MB.
Accepted Solutions

Liam_Finn1
Level 6
Employee Accredited Certified
Tobbe,

You must remember that not every item can be indexed. Encrypted items are not indexed, and the same goes for some item types such as certain images. Any item over 50MB is also skipped by indexing.

The percentage of index to actual storage is a moving target.

For example, we have total storage of 33,253,084MB and our indexes are 1,107,763.2MB, which is 3.33% of vault storage, and we are running full indexing on all of our indexes.




15 REPLIES

Wayne_Humphrey
Level 6
Partner Accredited Certified
Hi Tobbe,

Could you please confirm the indexing levels that have been set on those? Typically you are looking at Brief 3%, Medium 7%, and Full 12%, so it looks like you might have Brief indexing on journaling and Medium on mailbox archiving.

--wayne
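The inference in this reply, matching an observed ratio to the nearest guideline level, can be sketched as follows. The percentages are the ones quoted in the reply, and the function names are hypothetical. Note the caveat from the original post: the observed ratios are measured against on-disk size, while the guideline refers to original size, so a nearest-level match is only a rough hint and does not prove which level is configured.

```python
# Guideline percentages as quoted in this reply; rough planning figures only.
GUIDELINE_PERCENT = {"brief": 3.0, "medium": 7.0, "full": 12.0}


def expected_index_gb(original_gb: float, level: str) -> float:
    """Index size the guideline would predict for a given indexing level."""
    return original_gb * GUIDELINE_PERCENT[level] / 100


def closest_level(observed_percent: float) -> str:
    """Return the guideline level whose percentage is nearest the observation."""
    return min(
        GUIDELINE_PERCENT,
        key=lambda level: abs(GUIDELINE_PERCENT[level] - observed_percent),
    )


# Observed ratios from the figures in the original post (index GB / archive GB).
journaling_pct = 120 / 3280 * 100
mailbox_pct = 233 / 3610 * 100

print("journaling looks closest to:", closest_level(journaling_pct))
print("mailbox looks closest to:", closest_level(mailbox_pct))
```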


Tobbe
Level 5
Hi Wayne.

Yes, they are definitely set to full indexing.

Wayne_Humphrey
Level 6
Partner Accredited Certified
Tobbe,

Could you please run this on SQL for me?

-- Group savesets that share the same identity columns.
-- cc = copies in the group, ii = summed size, jj = average size per copy
SELECT cc = COUNT(*),
       ii = SUM(itemsize / 1024),
       jj = SUM(itemsize / 1024) / COUNT(*)
INTO #temptable
FROM saveset
GROUP BY iddatetime, idchecksumhigh, iduniqueno, idchecksumlow
ORDER BY cc DESC

-- Summarize the groups: totals before and after sharing
SELECT "Total files archived" = SUM(cc),
       "Number of files after sharing" = COUNT(cc),
       "Total size before sharing" = SUM(ii),
       "Total size after sharing" = SUM(jj),
       "Estimated size after archiving" = SUM(jj) + (COUNT(*) * 5) + (SUM(cc) * 2)
FROM #temptable

DROP TABLE #temptable
This SQL script is meant to be run against each vault store database to gather the relevant information.
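To make the aggregation in that script easier to follow, here is a small Python sketch that mirrors its logic on made-up rows. The sample data and the single-letter keys are hypothetical, integer division is an assumption (it matches T-SQL behaviour only if `itemsize` is an integer column), and the thread does not state the units of `itemsize`, so the numbers are unitless here.

```python
from collections import defaultdict

# Hypothetical rows: (identity key, itemsize). In the real table the key is
# the tuple (iddatetime, idchecksumhigh, iduniqueno, idchecksumlow).
rows = [
    ("A", 2048), ("A", 2048), ("A", 2048),  # one item shared three times
    ("B", 4096),                            # unshared item
    ("C", 1024), ("C", 1024),               # one item shared twice
]

# Equivalent of GROUP BY on the identity columns.
groups = defaultdict(list)
for key, itemsize in rows:
    groups[key].append(itemsize)

# Per-group columns of the temporary table: cc, ii, jj.
per_group = []
for sizes in groups.values():
    cc = len(sizes)                           # COUNT(*)
    ii = sum(size // 1024 for size in sizes)  # SUM(itemsize / 1024)
    jj = ii // cc                             # size counted once per group
    per_group.append((cc, ii, jj))

# Equivalent of the second SELECT over the temporary table.
summary = {
    "total_files_archived": sum(cc for cc, _, _ in per_group),
    "files_after_sharing": len(per_group),
    "size_before_sharing": sum(ii for _, ii, _ in per_group),
    "size_after_sharing": sum(jj for _, _, jj in per_group),
}
summary["estimated_size_after_archiving"] = (
    summary["size_after_sharing"]
    + summary["files_after_sharing"] * 5
    + summary["total_files_archived"] * 2
)
print(summary)
```

The last expression adds fixed per-group and per-copy constants (5 and 2) on top of the shared size, exactly as the query does; what those constants represent is not explained in the thread.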

Tobbe
Level 5
Thank you Wayne.

This is the result from the vault store that handles mailbox archiving, i.e. the one that has 3.610GB of data when looking through Explorer.

I have to admit that these figures confuse me. Are "Total size before sharing" and "Total size after sharing" in KB, and what unit is "Estimated size after archiving" in?

Again, thanks for the efforts.

/Tobbe
MailboxArchive.JPG

Wayne_Humphrey
Level 6
Partner Accredited Certified
Tobbe,

I don't quite understand where you got your figures from.

"The Journaling server has archived 3.280GB of data and its index is currently 120GB large" 

Are you trying to say that you only have 3.2GB of data in your journal vault store? That is impossible. I think you should have about 1.5TB, which would then also tie up with the SQL DB and your index of 120GB.

So please elaborate on where you got your figures from.

Tobbe
Level 5
Hi again Wayne.

No, I'm trying to say... hmm, let's put it this way: I have three thousand two hundred and eighty GB of data in the journaling store, and that figure is what Windows Explorer shows me.
These are LUNs presented to the journaling server from SATA drives on the SAN.
The index is also located on the SAN, but on FC disks, and Explorer reports that it currently occupies 120GB.

The vault LUNs contain only .dvs files, and the index LUN contains only the eight recommended index folders, which in turn contain the indexes, so no other data occupies these drives.

The same layout goes for the mailbox archiving server.

I did, however, find a report in Report Manager, "Vault store usage summary", that reports the original size of the data and the size of the archived data.

Now, to add a few more numbers to this confusion, that report states that EV has archived a total of 8.608GB of data in its original size, and that this takes up 7.383GB in the archives, journaling and mailbox archiving combined.
What I see through Explorer is 3.631GB from mailbox archiving and 3.280GB from journaling, which equals 6.911GB. That is a 472GB difference...?

Nevertheless, now that I've found the report on the total size of the original data, 8.608GB, and given that the total size of my indexes is 353GB (120+233), I again find that my indexes occupy only 4% of the original data size.
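The figures in this post can be reconciled with a short calculation. This is only a sketch using the numbers quoted above (all in GB); the variable names and the `percent` helper are illustrative assumptions, and the 12% value is the guideline figure discussed in the thread.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return part / whole * 100


original_gb = 8608            # original size per the Report Manager report
archived_reported_gb = 7383   # archived size per the same report
explorer_gb = 3631 + 3280     # mailbox + journaling, as seen in Explorer
index_gb = 120 + 233          # journaling index + mailbox index

summary = {
    # Gap between what EV reports as archived and what Explorer shows.
    "report_vs_explorer_gb": archived_reported_gb - explorer_gb,
    # Reduction from original to archived size according to the report.
    "reported_saving_pct": round(percent(original_gb - archived_reported_gb, original_gb), 1),
    # Index size relative to original data and to data on disk.
    "index_vs_original_pct": round(percent(index_gb, original_gb), 1),
    "index_vs_explorer_pct": round(percent(index_gb, explorer_gb), 1),
    # What a 12% guideline would predict for this amount of original data.
    "guideline_index_gb": original_gb * 12 / 100,
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

On these inputs the index comes to roughly 4.1% of original data and 5.1% of the data visible in Explorer, while a 12% guideline would predict an index of about 1,033GB rather than the 353GB actually observed.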

So to wrap this up: is the 12% estimate from Symantec reasonable and typical in other implementations? It is obviously not what I am seeing, unless I'm doing something fundamentally wrong here.

Wayne_Humphrey
Level 6
Partner Accredited Certified
"7383GB in the archives, journaling and mailbox archiving combined.
What I see through explorer is 3631GB from mailbox archiving and 3280GB from the journaling which equals 6911GB. That is a 472GB difference...?"

Compression and single instancing.....

Could you please include a screenshot of usage.asp? Are you 100% sure the archives were set to full indexing when they were created, and that someone did not just switch them to full later? Because that does not work... the indexes will still be brief.

Tobbe
Level 5
Hi again, and thanks for not giving up on me :)

The difference cannot be compression and single instancing, because if you look at my earlier post, the original size is reported as 8.608GB and the size in the archive is reported as 7.383GB by EV. That is where the SIS and compression show up.
However, looking in Explorer it only adds up to 6.911GB.

I know there are a lot of numbers and text in my posts, but I'm trying hard to be specific.

But I'm curious about your statement about changing the indexing level. Are you saying that even if you change the indexing level (and I'm pretty sure the default is medium), it won't make any difference unless you set it during initial setup? If that is correct, it doesn't make any sense to have radio buttons for setting the indexing level...?

If you mean that already archived items won't get re-indexed, I totally agree. But even though I'm not the one who did the initial setup, I got involved not many weeks after it, so both vault stores have been set to Full indexing for a long time anyway.

I've attached a slightly anonymized screen shot of the report.

2009-11-26 17-58-42.jpg

Wayne_Humphrey
Level 6
Partner Accredited Certified
Tobbe,

Please go to http://evserve/EnterpriseVault/usage.asp  I want those figures....

Tobbe
Level 5
Sure thing.



2009-11-26 20-43-45.jpg

Wayne_Humphrey
Level 6
Partner Accredited Certified
Tobbe,

I'll take a look at this tomorrow if I can. I am not on my work PC, so I don't have access to my VM lab.


Tobbe
Level 5

Thanks Wayne.

I'll also try to get some figures from other implementations to see how well they match the 3, 8, 12 percent guideline.
If others have had the strength to read all the way to this post, I'd appreciate some reference data.

Thanks again,
/Tobbe

Tobbe
Level 5

Hi again.
Now I have more reference numbers from three other implementations, but they don't make me any wiser. All of these implementations are running, and have been running, Full indexing from the start, but the resulting size varies greatly.

Implementation 1: Index size is 6.5% of vault storage
Implementation 2: Index size is 5.8% of vault storage
Implementation 3: Index size is 15% of vault storage

This makes future capacity planning... hmm... fuzzy?
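One way to live with that fuzziness is to plan for a range instead of a single percentage. The sketch below takes the ratios reported so far in this thread (the two from the original post plus the three implementations above) and projects a low and high index size for a planned amount of vault storage. The function name and the 1,000GB planning figure are hypothetical, and five data points are far too few to treat the range as a reliable bound.

```python
def index_size_range(vault_gb: float, observed_percents: list[float]) -> tuple[float, float]:
    """Project (low, high) index size in GB from observed index/vault ratios."""
    if not observed_percents:
        raise ValueError("need at least one observed percentage")
    low = vault_gb * min(observed_percents) / 100
    high = vault_gb * max(observed_percents) / 100
    return low, high


# Full-indexing ratios quoted in this thread, as a percentage of vault storage.
observed = [3.6, 6.4, 6.5, 5.8, 15.0]

planned_vault_gb = 1000  # hypothetical planning figure
low, high = index_size_range(planned_vault_gb, observed)
print(f"plan for roughly {low:.0f} to {high:.0f} GB of index storage")
```

Sizing the index volume towards the upper end, and re-measuring the ratio as data accumulates, is the cautious reading of such a wide spread.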

Liam_Finn1
Level 6
Employee Accredited Certified
Tobbe,

You must remember that not every item can be indexed. Encrypted items are not indexed, and the same goes for some item types such as certain images. Any item over 50MB is also skipped by indexing.

The percentage of index to actual storage is a moving target.

For example, we have total storage of 33,253,084MB and our indexes are 1,107,763.2MB, which is 3.33% of vault storage, and we are running full indexing on all of our indexes.



Tobbe
Level 5

Thanks Liam.

It's nice to have more references indicating that full indexing can produce a ratio as low as 3%.