BE 2012: How to erase Deduplication space used?

MIXIT
Level 6
Partner Accredited

My BE media server was used last night to test Deduplication. It was my first try at it. The server has C: and F:, F: being the larger volume. In BE I set things up to run a full backup of the entire system (a physical server acting as Hyper-V host, and 4 VMs). My destination was the F: drive. I was hoping BE would recognize that, even though F: is part of the backup source, the dedupe folder on that same drive is the destination for this dedupe backup, and so would not go into a cycle and back up the destination as the source. From what I can tell, it didn't.

Three questions: 

1. What is the best way to determine how much space the dedupe backup used? A related question: is there any way to know exactly how well Deduplication worked? An independent tool, the Deduplication Assessment Tool, can predict how effective dedupe would be; does BE give information like this after a dedupe backup is done?

2. I would now like to reclaim the disk space used by this dedupe, as it ate up about 700GB and hasn't released it even after I cancelled the job (I ran it, but cancelled it this morning as it was still on Verify).

3. Will dedupe backup jobs act like tape jobs, whereby if you tell it to Overwrite media, it removes the previous dedupe job and puts in the new one? So for this 700GB job, had I let it finish so that it's considered intact, if I ran it again this evening would it first remove the existing backup, THEN start a new backup? Or is it like SSR, where it first has to write the new backup and then later delete the old one, requiring you to have double the storage space for this swap?

Thank you. 

Accepted Solutions

pkh
Moderator
VIP Certified

Very briefly, here is how dedup works. When you do a backup, the backup set is broken into small chunks and stored in the dedup folder. When you do the next backup, it too is broken up into small chunks. These chunks are compared to the chunks already stored in the dedup folder. If a chunk is identical to a chunk which is already in storage, then it is not stored again; instead, a pointer is created from the existing chunk to the second backup. Thus this chunk has 2 pointers, one to the first backup and the other to the second backup. This process is repeated for each subsequent backup. As the number of backups increases, you would expect more and more chunks not to be stored because identical chunks already exist. This is the space saving of dedup.
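
To make the chunk-and-pointer idea concrete, here is a minimal Python sketch. It is a toy model only and not how Backup Exec is implemented: the fixed 4-byte chunk size, the SHA-256 fingerprint, and the `DedupStore` class are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, purely for illustration


class DedupStore:
    """Toy chunk store: each unique chunk is kept once, plus the backups pointing to it."""

    def __init__(self):
        self.chunks = {}    # fingerprint -> chunk bytes (stored once)
        self.pointers = {}  # fingerprint -> set of backup names using the chunk

    def backup(self, name, data):
        """Chunk the data, store only unseen chunks, return how many were newly stored."""
        new_chunks = 0
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp not in self.chunks:
                self.chunks[fp] = chunk  # first time this chunk is seen
                new_chunks += 1
            # Whether new or existing, the chunk now also points to this backup.
            self.pointers.setdefault(fp, set()).add(name)
        return new_chunks


store = DedupStore()
first = store.backup("monday", b"AAAABBBBCCCC")    # all three chunks are new
second = store.backup("tuesday", b"AAAABBBBDDDD")  # only the last chunk is new
shared = store.pointers[hashlib.sha256(b"AAAA").hexdigest()]
print(first, second, len(store.chunks), sorted(shared))
```

The second backup adds only one chunk to storage, while the shared chunks each end up with two pointers, one per backup.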

When a backup set expires, its links to the chunks are removed.  When a chunk has no more links to any backup, then it is removed during the next dedup maintenance cycle.  This maintenance cycle also reclaims the space taken up by these deleted chunks by compacting the containers in the dedup folders.
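
The two-step nature of reclamation (links removed at expiry, chunks removed later by maintenance) can be sketched as follows. This is a simplified model with hypothetical names (`links`, `expire`, `maintenance`), not the product's actual data structures, and it leaves out container compaction.

```python
# Hypothetical state: chunk id -> set of backup sets that still link to it.
links = {
    "chunk-a": {"monday", "tuesday"},
    "chunk-b": {"monday"},
    "chunk-c": {"tuesday"},
}


def expire(links, backup_set):
    """Remove an expired backup set's links; the chunks themselves stay for now."""
    for owners in links.values():
        owners.discard(backup_set)


def maintenance(links):
    """Remove chunks that no backup set links to and return their ids."""
    orphaned = sorted(cid for cid, owners in links.items() if not owners)
    for cid in orphaned:
        del links[cid]
    return orphaned


expire(links, "monday")
after_expiry = sorted(links)  # nothing has been freed yet
removed = maintenance(links)  # only the chunk used solely by "monday" goes
print(after_expiry, removed, sorted(links))
```

In this model, expiring a backup set frees nothing by itself; space only comes back once the maintenance pass runs, and only for chunks that no other backup still uses.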

Thus a dedup folder is different from other media. To maintain all these pointers, etc., there is quite a bit of overhead, and the size of the dedup folder has no direct correlation with the size of the backup sets stored in it. So you cannot say that if the backup sets are 2TB, the dedup folder will be xTB. The documents quoted earlier will give you some idea how the space is used and how it is reclaimed.

To see how well your dedup process is working, interpret the dedup stats in the job log using this document:

http://www.symantec.com/docs/TECH146827

As I said previously, as more data is stored in the dedup folder, your dedup ratio should improve.
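
As a rough illustration of why the ratio improves over time, here is a small Python calculation. The figures are made up, the `dedup_ratio` helper is hypothetical, and it treats the first backup as having no savings at all, which is a simplification; the real numbers come from the job log.

```python
def dedup_ratio(protected_bytes, stored_bytes):
    """Data backed up divided by data actually written; higher is better."""
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return protected_bytes / stored_bytes


GB = 1024 ** 3
# Made-up cumulative totals (protected, stored) after each of three full backups
# of a similar data set. These are not measurements from any real system.
history = [(700 * GB, 700 * GB), (1400 * GB, 770 * GB), (2100 * GB, 840 * GB)]
ratios = [round(dedup_ratio(p, s), 2) for p, s in history]
print(ratios)
```

Each later backup adds a full 700GB to the protected total but only a little to what is stored, so the cumulative ratio climbs.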

It is not advisable to put your dedup folder together with other data.  The dedup folder will grow when necessary and you cannot impose a limit.  You cannot just delete some files in the dedup folder to reclaim disk space.  The disk space reclamation is internal to the dedup folder.  The overall size of the dedup folder may not shrink when space is reclaimed.

As to your last question, see my earlier explanation of the dedup process.

3 REPLIES

MIXIT
Level 6
Partner Accredited

Thank you guys. I have yet to read the links themselves but I will do that today. I'm also reading the BE admin guide, so far on page 140 of 1342 :) I think dedupe was discussed around page 500 or 700, can't remember. Anyway, I will get a new drive just to try the dedupe stuff, but I will also need to reclaim the disk space used... it ate up about 700GB. I am content with deleting the dedupe folder as a Storage Device, perhaps, but before taking any steps I will be sure to read the links provided.

Thanks again gentlemen.