how does NBU use/interpret hd space?

manatee
Level 6

NBU 7.6.0.3

a few days back, i had 3TB free on my master, where the backup images are stored. today i found i only have 600GB free!

upon checking, i found the pending duplication images are waaay bigger than the partition i have available for NBU backup images:

[root@ovmmanager admincmd]# ./nbstlutil report

Backlog of incomplete SLP Copies
        In Process (Storage Lifecycle State: 2):
                Number of copies:       8605
                Total expected size     64615866 MB

SLP Name: (state)                                 Number of copies: Size:
Daily_Policy (active)                                       7977    11515412 MB
Dec31_Policy (active)                                         51      529501 MB
Exchange_Daily_Policy (active)                                36    15424821 MB
Oracle_Daily_Policy (active)                                 520    37102487 MB
SLP-PureDisk-to-Tape (active)                                  5        4001 MB
Weekly_Policy (active)                                        16       39643 MB

Total:                                                      8605    64615865 MB

here is the disk space on my master server:

[root@ovmmanager admincmd]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              77G   30G   44G  41% /
tmpfs                  24G  112K   24G   1% /dev/shm
/dev/sda1             504M   61M  418M  13% /boot
/dev/sda3              68G  448M   64G   1% /opt
/dev/sda7              32G  176M   30G   1% /u01
/dev/sda5              68G  678M   64G   2% /var
/dev/mapper/mpathlp1  9.8T  9.2T  678G  94% /NBU_DSU1

i can't understand why my pending duplication is 37TB just for Oracle alone!?
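
For anyone reading the report above, a minimal Python sketch that re-adds the per-SLP backlog rows and prints them in rough terabytes may help put the numbers side by side. The `REPORT` text is pasted from the post; the `parse_backlog` helper and the regex are illustrative names, and treating 1 TB as 1,000,000 MB is an assumption about the unit that `nbstlutil` reports.

```python
import re

# Per-SLP rows copied from the nbstlutil report in the post.
REPORT = """\
Daily_Policy (active)                                       7977    11515412 MB
Dec31_Policy (active)                                         51      529501 MB
Exchange_Daily_Policy (active)                                36    15424821 MB
Oracle_Daily_Policy (active)                                 520    37102487 MB
SLP-PureDisk-to-Tape (active)                                  5        4001 MB
Weekly_Policy (active)                                        16       39643 MB
"""

# Hypothetical pattern: name, (state), number of copies, size in MB.
ROW = re.compile(r"^(\S+)\s+\((\w+)\)\s+(\d+)\s+(\d+)\s+MB$")


def parse_backlog(text):
    """Return (slp_name, copies, size_mb) for every row that matches ROW."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append((match.group(1), int(match.group(3)), int(match.group(4))))
    return rows


rows = parse_backlog(REPORT)
total_copies = sum(copies for _, copies, _ in rows)
total_mb = sum(size_mb for _, _, size_mb in rows)

for name, copies, size_mb in rows:
    # Assumption: 1 TB is treated as 1,000,000 MB for a rough figure.
    print(f"{name:<24}{copies:>6}{size_mb / 1_000_000:>9.2f} TB")
print(f"{'Total':<24}{total_copies:>6}{total_mb / 1_000_000:>9.2f} TB")
```

The sums agree with the report's own Total line (8605 copies, 64615865 MB), so the backlog is being reported in full image sizes rather than in space actually consumed on /NBU_DSU1.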

manatee
Level 6

to add, in the directory where the backup images are stored, i see old directories going back to 2015 that occupy GBs of space. the files end in .bhd and .bin

what are they? can i delete them?

Marianne
Level 6
Partner    VIP    Accredited Certified

It's called deduplication...

In Job details you should see something like this:

... scanned: 493776469 KB, CR sent: 11777370 KB, dedup: 97.6%

In the above example, the image size is 493GB, but only 11GB is written to the MSDP pool because 97.6% of the data already exists there.

Data is rehydrated when written to tape, therefore the full image size is reported in nbstlutil output.

All explained in NetBackup Deduplication Guide 

 You may also want to read up on this topic in the same manual: 

About MSDP storage capacity and usage reporting

And this recent post: 
https://vox.veritas.com/t5/NetBackup/How-to-clean-up-storage-on-the-DSU-for-puredisk-linux-storage/m...
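
As a quick check of the arithmetic in the job detail line quoted above, here is a small Python sketch. The line itself is copied from this post; the regex and variable names are only illustrative, and the formula (one minus sent over scanned) is an assumption about how the percentage is derived, although it does reproduce the reported 97.6%.

```python
import re

# Job detail line as quoted in the post.
JOB_LINE = "... scanned: 493776469 KB, CR sent: 11777370 KB, dedup: 97.6%"

match = re.search(r"scanned:\s*(\d+)\s*KB,\s*CR sent:\s*(\d+)\s*KB", JOB_LINE)
scanned_kb = int(match.group(1))  # full image size received from the client
sent_kb = int(match.group(2))     # unique data actually written to the pool

# Assumed formula: share of the scanned data that did not need to be written.
dedup_pct = (1 - sent_kb / scanned_kb) * 100

print(f"scanned: {scanned_kb / 1_000_000:.1f} GB")
print(f"written: {sent_kb / 1_000_000:.1f} GB")
print(f"dedup:   {dedup_pct:.1f}%")
```

The same image counts as roughly 494 GB when it is duplicated to tape, which is the figure that ends up in the nbstlutil backlog.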

sdo
Moderator
Moderator
Partner    VIP    Certified

Do not delete the bhd container files.  They contain chunks of finger-printed and deduped backup image data.  See this post:

https://vox.veritas.com/t5/NetBackup/How-to-clean-up-storage-on-the-DSU-for-puredisk-linux-storage/m...

If you manually delete them then you will definitely corrupt at least one backup image, but possibly all of your backups.

manatee
Level 6

still reading the links. lots to process.

but what is the difference between "Netbackup backup images" and "Puredisk backup images"? my backups are going to a Puredisk disk using Netbackup so i don't see the point differentiating them.

sdo
Moderator
Moderator
Partner    VIP    Certified

Unless you have an actual old PDDO appliance from the 50x0 range (long discontinued), you are using NetBackup MSDP, either standalone or on 52x0 or 53x0 appliance(s). In that case you are using PureDisk within MSDP, and the distinction between the two has merged.

i.e.

standalone PDDO, or 50x0 Appliances = (raw) PureDisk

standalone MSDP, or 52x0/53x0 Appliances = MSDP (which uses PureDisk under the hood)

This is why you see the term "PureDisk" as a disk pool type, or storage server type when working with MSDP.

What I mean is, it is unlikely that you have the old (raw) PureDisk, and much more likely that you have MSDP, and so... for you... the terms MSDP and PureDisk are synonymous.

manatee
Level 6

regarding the bhd containers, while i may agree they may refer to backups on tape, why do they have to be so huge? like i have 250GB for one. and there are several directories whose sizes are almost the same. i thought the benefit of tape backup is to place everything on tape, so why do i need to keep such huge directories on disk?

Marianne
Level 6
Partner    VIP    Accredited Certified

"... but what is the difference between "Netbackup backup images" and "Puredisk backup images"?"

This is what I was trying to explain... 

... scanned: 493776469 KB, CR sent: 11777370 KB, dedup: 97.6%

In above example, the image size is 493GB. 
A total of 493GB was received from the client.

Only 11GB of the data is unique. The other 97.6% already exists in the "old directories since 2015".
Only 11GB is written, along with pointers to previous backups (in the old directories) that make up the full image size. 

When you duplicate to tape, data needs to be rehydrated - the full 493GB must be written to tape.

PureDisk images are stored on disk in 'container files'.

More about this in dedupe guide under these topics:
■ About MSDP storage capacity and usage reporting
■ About MSDP container files
■ Viewing storage usage within MSDP container files 
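
To make the "only unique data is written, but the full image is rehydrated" point concrete, here is a toy Python model. It is a sketch under loud assumptions: fixed 4-byte chunks, SHA-256 fingerprints, and an in-memory dictionary stand in for whatever MSDP really does, and `ToyDedupPool` is a hypothetical name, not a NetBackup interface.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; this is not the chunk size MSDP uses


class ToyDedupPool:
    """Toy fingerprint-indexed store; not how MSDP lays out containers."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes (stored once)
        self.images = {}  # image name -> ordered list of fingerprints

    def backup(self, name, data):
        """Store an image and return how many new bytes had to be written."""
        written = 0
        recipe = []
        for start in range(0, len(data), CHUNK_SIZE):
            piece = data[start:start + CHUNK_SIZE]
            fingerprint = hashlib.sha256(piece).hexdigest()
            if fingerprint not in self.chunks:
                self.chunks[fingerprint] = piece
                written += len(piece)
            recipe.append(fingerprint)
        self.images[name] = recipe
        return written

    def rehydrate(self, name):
        """Rebuild the full image, as a duplication to tape would need."""
        return b"".join(self.chunks[fp] for fp in self.images[name])


pool = ToyDedupPool()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBCCCCEEEE"  # only the last chunk differs from monday

written_monday = pool.backup("monday", monday)
written_tuesday = pool.backup("tuesday", tuesday)

print("monday written: ", written_monday, "bytes")
print("tuesday written:", written_tuesday, "bytes")
print("tuesday rehydrated:", len(pool.rehydrate("tuesday")), "bytes")
```

Tuesday's backup writes only its one new chunk, yet rehydrating it yields the full 16 bytes. It also still depends on the chunks that Monday's backup stored first.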

sdo
Moderator
Moderator
Partner    VIP    Certified

Where you say "regarding the bhd containers, while i may agree they may refer to backups on tape"... this is perhaps not quite the best way to describe it.

The bhd container files do not refer to tape.  The entire set of bhd container files contains the entire set of "copies of backup images" residing within the dedupe disk pool.  The copies of backup images on tape are completely separate copies.  And, it is the "NetBackup Catalog" which knows about all backup images, and so... it is the NetBackup Catalog which therefore knows which backup images have copies within disk pools, and which backup images have copies on tape.

manatee
Level 6

after going over the dedup man pages over lunch, i think i got it now (i hope).

those containers are used by the dedup process and they contain subsets of backup images. as such, they may go back to the beginning when NBU was first deployed/used  and they are referred to during duplication process (and backup as well) so that NBU doesn't have to do another backup if such data (or part thereof) can be found in the bhd containers.

the only time a bhd container gets completely removed by NBU is when all the backups associated with it have expired.
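
The expiry rule described above can be sketched as simple reference tracking in Python. This is only an illustration of the idea: the image names, chunk ids and the `expire` helper are made up, and the real MSDP clean-up of container files is certainly more involved than a set difference.

```python
def expire(images, name):
    """Drop one image; return chunk ids that no surviving image references."""
    expired_chunks = images.pop(name)
    still_needed = set().union(*images.values())
    return expired_chunks - still_needed


# Hypothetical images and the chunk ids each one points at.
images = {
    "full_2015": {"c1", "c2", "c3"},
    "incr_mon": {"c1", "c2", "c4"},
    "incr_tue": {"c1", "c5"},
}

freed_first = expire(images, "full_2015")
freed_second = expire(images, "incr_mon")
freed_last = expire(images, "incr_tue")

print(sorted(freed_first))   # only the chunk nothing else uses
print(sorted(freed_second))
print(sorted(freed_last))
```

Expiring the 2015 full frees only `c3`, because the later images still point at `c1` and `c2`. That matches the old directories continuing to hold space for as long as newer backups reference their data.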

sdo
Moderator
Moderator
Partner    VIP    Certified

You got it!

Just one more minor point from me...

...where you say:

"so that NBU doesn't have to do another backup if such data (or part thereof) can be found in the bhd containers."...

...I would state this as:

"so that MSDP doesn't have to store/save/write chunks of backups if such data is found in the bhd containers."

Marianne
Level 6
Partner    VIP    Accredited Certified

Glad to hear.
Do you understand now why I've been trying to tell you for months now that you need longer retention on MSDP?

manatee
Level 6

yes. it affects performance.

i was kinda looking for ways to save on space, that's why my retention period on MSDP is short.

today i really hit the limit of my two tape drives so now i have a more compelling reason, plus the MSDP data usage, to get more hardware resources.

thanks guys for all the help!