Stanley_Korivi
13 years ago
Solved

EV 9.0 Reports Discrepancy

Hi, when I ran the Vault Store Usage Summary report in Enterprise Vault SQL Server Reporting Services, it showed the size as approximately 6.5 TB. But when I check the two drives that host our vault store partitions, their combined capacity is only about 4 TB, of which 3.5 TB is used. Any ideas why there is this discrepancy in the sizes? I need to provide a detailed report to my management, and now I am confused as to what the actual size of the archived items is.

Totals for All Archives in Vault Store(s)

Vault Store    Active Archives    Total Items    Total Archives Size (MB)
Store 1        3964               43416180       6,565,302.19

Total Number of Active Archives: 3970

Average original size of items (KB): 192.37

Total size of archived items (MB): 6,566,018.63

Average size of archived items (KB): 154.82

Total original size of items (MB): 8,158,351.57

  • OK, so what makes it harder is that you have multiple vault stores and multiple partitions, all sharing with each other.

    Typically, an email over a certain size is made up of three parts:

    *.DVS, which holds the email and user information
    *.DVSSP, which is the shareable part, usually an attachment
    *.DVSCC, which is the converted content: an HTML version of the attachment that is added to the indexes and such

    Let's say you have the following setup:


    VaultStoreGroup1 - Share across all vault stores
     - VaultStore1
      - VaultStore1Ptn1 - V:\Enterprise Vault Stores\VaultStore1\Ptn1
      - VaultStore1Ptn2 - W:\Enterprise Vault Stores\VaultStore1\Ptn2
    
     - VaultStore2
      - VaultStore2Ptn1 - E:\Enterprise Vault Stores\VaultStore2\Ptn1
      - VaultStore2Ptn2 - F:\Enterprise Vault Stores\VaultStore2\Ptn2
    

    Now let's say there's a company-wide email sent out to everyone with a new org chart or something like that, and it's an 8MB PDF file sent to 2000 users.

    1000 of these users are on VaultStore1, and the other 1000 are on VaultStore2.

    When Enterprise Vault first archives the item, it creates the DVS file, the DVSSP file and the DVSCC file.
    The DVS file will be about 21KB.
    The DVSSP will be the 8MB attachment compressed, which comes to 3,442KB.
    The DVSCC comes to 18KB (since a lot of the content in a PDF can't be converted to HTML).

    Altogether, this 8MB email becomes 3,481KB.

    So let's say this is archived by one person to begin with and it gets stored as
    V:\Enterprise Vault Stores\VaultStore1\Ptn1\2012\01-17\5\0C0
    50C09CD0440E1947264B68B8542BF0F1.DVS (21KB)
    50C09CD0440E1947264B68B8542BF0F1~25~544A2CFC~00~1.DVSCC (18KB)
    50C09CD0440E1947264B68B8542BF0F1~25~544A2CFC~00~1.DVSSP (3,442KB)

    Now, when everyone else archives it, Enterprise Vault will see that the shareable part has already been stored.
    Overall you will have 2000 x DVS files @ 21KB, 1 x DVSSP file @ 3,442KB, and 1 x DVSCC file @ 18KB.

    The uncompressed, un-SIS'd email for all users in Exchange would be
    2000 x 8MB = 16,000MB (15.6GB)

    With sharing and compression:
    about 41MB of DVS files (2000 x 21KB = 42,000KB)
    3.36MB of DVSSP files
    and 18KB of DVSCC files

    So let's say 500 people archived it in each partition:

    V:\Enterprise Vault Stores\VaultStore1\Ptn1 = 13.63MB (500 x DVS, 1 x DVSSP, 1 x DVSCC)
    W:\Enterprise Vault Stores\VaultStore1\Ptn2 = 10.25MB (500 x DVS)
    E:\Enterprise Vault Stores\VaultStore2\Ptn1 = 10.25MB (500 x DVS)
    F:\Enterprise Vault Stores\VaultStore2\Ptn2 = 10.25MB (500 x DVS)

    So now you can see that each vault store will report that it has an email that is 8MB original and 3.4MB after compression. However, due to Single Instance Storage, Vault Store 2 doesn't hold that shared part, as it belongs to Vault Store 1 Partition 1.

    This alone throws the metrics right out the door, and is a good reason why the SQL queries will not come close to what you actually see on disk. If you then add a device that does its own compression and deduplication, or NTFS-level compression as well, it becomes very, very messy.
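
For anyone who wants to replay the arithmetic in the example above, here is a rough Python sketch. It uses the 21KB / 3,442KB / 18KB file sizes from the post and 1MB = 1024KB. The idea that a report adds up the full compressed size once per recipient is an assumption taken from the explanation above, not a statement of how the Enterprise Vault report queries are actually written.

```python
# Sizes from the worked example above, in KB (1 MB = 1024 KB).
DVS_KB = 21       # per-recipient saveset file
DVSSP_KB = 3442   # compressed shareable part, stored once
DVSCC_KB = 18     # converted content, stored once
USERS = 2000


def to_mb(kb):
    """Convert KB to MB, rounded to two decimals."""
    return round(kb / 1024, 2)


# Assumption: a per-item report counts the full compressed item
# (DVS + DVSSP + DVSCC) once for every recipient.
reported_kb = USERS * (DVS_KB + DVSSP_KB + DVSCC_KB)

# With sharing, the DVSSP and DVSCC are written to disk only once.
on_disk_kb = USERS * DVS_KB + DVSSP_KB + DVSCC_KB

# 500 recipients per partition; the shared parts live on the first one.
partitions_kb = {
    r"V:\Enterprise Vault Stores\VaultStore1\Ptn1": 500 * DVS_KB + DVSSP_KB + DVSCC_KB,
    r"W:\Enterprise Vault Stores\VaultStore1\Ptn2": 500 * DVS_KB,
    r"E:\Enterprise Vault Stores\VaultStore2\Ptn1": 500 * DVS_KB,
    r"F:\Enterprise Vault Stores\VaultStore2\Ptn2": 500 * DVS_KB,
}

print("summed per item (MB):", to_mb(reported_kb))
print("actually on disk (MB):", to_mb(on_disk_kb))
for path, kb in partitions_kb.items():
    print(path, to_mb(kb), "MB")
```

On these numbers the per-item sum comes to roughly 6.6GB while the four partitions together hold about 44MB, which is the same kind of gap as between the report and the drives, just on a much smaller scale.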

5 Replies

  • The reports will never be accurate. You have to take into account the whole OSIS thing, and then there are things such as converted content, which will not be taken into consideration, and anything such as secondary migration, etc.

    Your best bet is probably to look at the Single Instancing report.

  • Does that mean that the data I see on the actual NTFS drives is the total compressed size of the archived data and not the total archived size? I ran the SIS Reduction Summary report; some of the output is below. I am assuming that I should present the compressed size data, since our total NTFS drive size is not more than 4 TB?

     

    Vault Store    Total Disk Size (GB)    Total Orig Size (GB)    Total Comp Size (GB)    Storage Reduction (GB)
    Store 1        6,412.128               8,633.847               2,693.217               5,940.630
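
As a quick sanity check on that report row, the storage reduction column appears to be simply the original size minus the compressed size. A minimal Python sketch, reading the original size as 8,633.847 GB (the variable names are only for illustration):

```python
def parse_gb(text):
    """Turn a report figure such as '8,633.847' into a float."""
    return float(text.replace(",", ""))


# Figures from the Store 1 row of the SIS reduction summary above.
orig_gb = parse_gb("8,633.847")
comp_gb = parse_gb("2,693.217")
reported_reduction_gb = parse_gb("5,940.630")

# Reduction as a difference and as a share of the original size.
reduction_gb = orig_gb - comp_gb
reduction_pct = 100 * reduction_gb / orig_gb

print(f"computed reduction: {reduction_gb:.3f} GB")
print(f"reported reduction: {reported_reduction_gb:.3f} GB")
print(f"reduction share:    {reduction_pct:.1f} %")
```

The computed difference matches the reported reduction, so at least those three columns are internally consistent.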

  • wow, that explanation is in true JW fashion. by far one of the best writeups i've seen this year. even worthy of being an article on its own. kudos to you, sir!