
High disk space usage - tried manual reclaim

unkn0wnn
Level 4

Hi,

Disk space usage is high on my PureDisk media server. By my reckoning, Copy 1 should currently account for less than 1TB on disk, but a manual reclaim (processqueue, then garbage collection, then processqueue twice) freed only about 500MB.

Only copy 1 is stored on disk, and the oldest backup is from last week (according to the Catalog GUI).

I also spotted a lot of old files dating back to 2014 in the /storage/data directory, even though all copy 2 images are on tape:

-rw-r----- 1 root root 256M Jan 15 2014 6239.bin
-rw-r----- 1 root root 256M Jan 15 2014 6238.bin
-rw-r----- 1 root root 256M Jan 15 2014 6237.bin
-rw-r----- 1 root root 256M Jan 15 2014 6236.bin
-rw-r----- 1 root root 256M Jan 15 2014 6235.bin
-rw-r----- 1 root root 1.3M Jan 15 2014 6225.bhd
-rw-r----- 1 root root 256M Jan 15 2014 6225.bin
-rw-r----- 1 root root 694K Jan 15 2014 6224.bhd
-rw-r----- 1 root root 558K Jan 15 2014 6223.bhd
-rw-r----- 1 root root 828K Jan 15 2014 6222.bhd
-rw-r----- 1 root root 717K Jan 15 2014 6221.bhd
-rw-r----- 1 root root 256M Jan 15 2014 6224.bin
-rw-r----- 1 root root 256M Jan 15 2014 6223.bin
-rw-r----- 1 root root 256M Jan 15 2014 6222.bin
-rw-r----- 1 root root 256M Jan 15 2014 6221.bin
-rw-r----- 1 root root 867K Jan 15 2014 6209.bhd
-rw-r----- 1 root root 256M Jan 15 2014 6209.bin
-rw-r----- 1 root root 571K Jan 15 2014 6199.bhd
-rw-r----- 1 root root 256M Jan 15 2014 6199.bin
-rw-r----- 1 root root 410K Jan 15 2014 6198.bhd
-rw-r----- 1 root root 256M Jan 15 2014 6198.bin
-rw-r----- 1 root root 362K Jan 15 2014 6197.bhd
-rw-r----- 1 root root 256M Jan 15 2014 6197.bin
-rw-r----- 1 root root 256M Jan 3 2014 1596.bin
-rw-r----- 1 root root 256M Jan 3 2014 745.bin
-rw-r----- 1 root root 637K Sep 29 2013 3997.bhd
-rw-r----- 1 root root 16M Sep 29 2013 3997.bin

n5220w:/disk/data # du -h
8.0K ./749/0/_0
8.0K ./749/0
8.0K ./749
1003M ./journal
3.1T .

Below is the output of /usr/openv/pdde/pdcr/bin/crcontrol --dsstat:

************ Data Store statistics ************
Data storage   Raw    Size   Used   Avail  Use%
               4.5T   4.4T   3.3T   1.0T   76%

Number of containers : 18600
Average container size : 177409748 bytes (169.19MB)
Space allocated for containers : 3299821316678 bytes (3.00TB)
Space used within containers : 3205753046031 bytes (2.92TB)
Space available within containers: 94068270647 bytes (87.61GB)
Space needs compaction : 11411996203 bytes (10.63GB)
Reserved space : 199918100480 bytes (186.19GB)
Reserved space percentage : 4.0%
Records marked for compaction : 309850
Active records : 43121730
Total records : 43431580

What could be wrong? I'm concerned it could be due to orphaned images. Can anyone please advise?
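As a quick sanity check on the --dsstat figures above, here is a minimal Python sketch that parses the numbers and works out how much compaction could hand back at most. The input lines are copied from the output above; the function name `parse_dsstat` and the reading of each label (for example, that "Space needs compaction" is the upper bound on what compaction can free) are assumptions, not documented behaviour.

```python
import re

# Lines copied from the crcontrol --dsstat output above.
DSSTAT = """\
Number of containers : 18600
Space allocated for containers : 3299821316678 bytes (3.00TB)
Space used within containers : 3205753046031 bytes (2.92TB)
Space available within containers: 94068270647 bytes (87.61GB)
Space needs compaction : 11411996203 bytes (10.63GB)
Records marked for compaction : 309850
Active records : 43121730
Total records : 43431580
"""

# "label : integer", optionally followed by "bytes (...)".
LINE_RE = re.compile(r"\s*(.+?)\s*:\s*(\d+)(?:\s+bytes\b.*)?\s*$")


def parse_dsstat(text):
    """Return {label: integer value} for every line that carries one."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats


stats = parse_dsstat(DSSTAT)

GIB = 1024 ** 3
compaction_gib = stats["Space needs compaction"] / GIB
active_pct = 100 * stats["Active records"] / stats["Total records"]

print(f"Reclaimable by compaction (assumed upper bound): {compaction_gib:.2f} GiB")
print(f"Records still considered active: {active_pct:.1f}%")
```

If the labels mean what they appear to mean, the store itself regards nearly all records as still referenced, which would fit with a manual reclaim freeing very little.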


Michal_Mikulik1
Moderator

Hello,

This topic has been discussed several times here on the forum. In a deduplication environment, you cannot apply practices based on experience with non-deduplicated storage.


A few related points to mention here:
- when backups deduplicate well for a given client/DB, space is released only after you move/expire ALL of its backup images, not just SOME or even MOST of them
- space is released after moving/expiring individual images only if the client/DB deduplicates poorly
- data/blocks from 2014 can remain in the system until 2023 if they have continuously served as a dedup baseline for other backups
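To illustrate the points above, here is a toy reference-counting model in Python. It is only a sketch of the general idea that a chunk stays on disk while any image still references it; the class `ToyDedupStore`, the image names and the chunk numbers are all made up for illustration and say nothing about how PureDisk or MSDP is actually implemented.

```python
from collections import Counter


class ToyDedupStore:
    """Toy model: a chunk is freed only when no image references it."""

    def __init__(self):
        self.images = {}            # image name -> set of chunk fingerprints
        self.refcount = Counter()   # fingerprint -> number of images using it

    def backup(self, name, chunks):
        chunks = set(chunks)
        self.images[name] = chunks
        self.refcount.update(chunks)

    def expire(self, name):
        """Expire one image and return how many chunks were actually freed."""
        freed = 0
        for fp in self.images.pop(name):
            self.refcount[fp] -= 1
            if self.refcount[fp] == 0:
                del self.refcount[fp]
                freed += 1
        return freed

    def stored_chunks(self):
        return len(self.refcount)


store = ToyDedupStore()
# Three weekly backups of the same client; each shares most chunks with the last.
store.backup("week1", range(0, 100))
store.backup("week2", range(5, 105))
store.backup("week3", range(10, 110))

freed_week1 = store.expire("week1")   # only chunks 0-4 were unique to week1
freed_week2 = store.expire("week2")   # only chunks 5-9 were unique to week2
freed_week3 = store.expire("week3")   # last reference gone, everything is freed

print(freed_week1, freed_week2, freed_week3)
```

Expiring two of the three images frees 10 chunks out of 110 in this model; the bulk is released only when the last image sharing those chunks goes.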

Regards

M.

Thanks Michal. Once a week I have one DB server of about 3TB scanned, and the deduplication rate is around 75-80% (around 600MB sent).

I understand you are suggesting that there are dependent fragments in the DB and filesystem.

Apart from extending the storage, what's the best action to take next?

Hi 

I would suggest logging a case with support to check for orphaned images. They will provide a utility to perform a scan on the MSDP pool (and additionally collect data from the master server). From that they will look to identify backup images in the MSDP pool that no longer exist in the NetBackup catalog. These can then be deleted (which is done with a script and data file provided by support). 

The first phase is called PoGather - this will need to be run on your system. 

David

Thanks David. The catalog shows only the last 7 days' worth of backups. How about expiring the existing images first, in case there are any dependent images with accelerator/catalog data?

Hi @unkn0wnn 

You can do some investigation on the MSDP catalog to see if there are images contained that should have been expired. It may take some effort as you would need to review as many client/policy combinations as needed to determine if you have a problem. 

The command to use is catdbutil (/usr/openv/pdde/pdcr/bin/catdbutil). Point it at the MSDP catalog files, which should reside in
<path to MSDP>/databases/catalog/2/<client>/<policy>, using this syntax:

/usr/openv/pdde/pdcr/bin/catdbutil --list --dbpath <path to MSDP>/databases/catalog/2/<client>/<policy>

This will list the entries for that client/policy combination. The listing includes the backupid of each image, so you should be able to determine whether that backupid exists in NetBackup (there should be nothing in the MSDP catalog that isn't in the NetBackup catalog). If you find extra images in the MSDP catalog, call support for help: cleaning up is not simple to do.

For instance, in a lab I have, this is the start of the output from the above:

# /usr/openv/pdde/pdcr/bin/catdbutil --list --dbpath /mnt/msdp/databases/catalog/2/nbuclnt.local.ad/Linux-Sys | head
[/mnt/msdp/databases/catalog/2/babita.bable.ad/Bable-Linux]
2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_F1.fmk|0|c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a|010c09c98a2144b9d58447cbd2acc61b|||0|M|0|0100600|0||260|1687089617|1687089617|1687089617|0||||0||PDVFS_2_F_0_ID_3_RT_0|0|1|0|0|
2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_F1.hdr|0|308e5409452bc608d641f526d3859382239ae02ed562a5fd948ee179113ef793|01226f61f28d7f97e57159b7cf53b993|||0|N|0|0100600|0||260|1687089606|1687089617|1687089617|32768||||0||PDVFS_2_F_0_ID_6_RT_0|0|7309421|0|0|
2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_F1.img|0|c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a|01b0929e8714af2d1b48f22f58a2ff43|||0|M|0|0100600|0||260|1687089606|1687089606|1687089606|0||||0||PDVFS_2_F_0_ID_5_RT_0|0|1|0|0|
2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_F1.info|0|665bbbc6ed0b013afbae09f67a90cf8db728b99c35aad5352a4996d487cf1e76|011519ad17e6e29c18e97d04b14608a2|||0|N|0|0100600|0||260|1687089617|1687089617|1687089617|578||||0||PDVFS_2_F_0_ID_4_RT_0|0|7309421|0|0|
2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_F1.map|0|c1cd1d987ce396fc5f7bec325b88d2efe94774ab2ca711647b39c4ed6300b318|015e4375521cd64edab7cbdc71879020|||0|N|0|0100600|0||260|1687089606|1687089617|1687089617|32||||0||PDVFS_2_F_0_ID_7_RT_0|0|7309421|0|0|
2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_HDR.img|0|7e3597ebff91c68055433ceed3e34d75b953237b3ec672e5070ac8939c3942a1|01ab735b4d86fee0c44e3535c410e5a7|||0|N|0|0100600|0||260|1687089605|1687089605|1687089605|1024||/ver=1/pt=0/st=1||0||PDVFS_2_F_0_ID_2_RT_0|0|7309421|0|0|
2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_HDR.info|0|810ee0bd344bd8539dc89b6be96896effc5734f531a9b47216647d9f3f34fedc|01ba224ade068068f25f6ea383d2200d|||0|N|0|0100600|0||260|1687089605|1687089606|1687089606|578||||0||PDVFS_2_F_0_ID_1_RT_0|0|7309421|0|0|

From the above, I can see a backupid of  nbuclnt.local.ad_1687089601 which I could check against my NetBackup catalog to see if it exists. 
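If there are many client/policy combinations to go through, the backupids can be pulled out of the listing with a short script. Below is a minimal Python sketch. It assumes the third pipe-separated field always looks like `<backupid>_C<copy>_...`, which is inferred only from the lab output above, and `netbackup_ids` is a hypothetical set of backupids you have exported from the NetBackup catalog by whatever means you prefer. The sample lines are the ones above, trimmed after the fourth field.

```python
import re
from datetime import datetime, timezone

# Assumed pattern: <client>_<10-digit epoch>_C<copy>_<rest>
BACKUPID_RE = re.compile(r"^(?P<backupid>.+_(?P<ctime>\d{10}))_C\d+_")


def backup_ids(lines):
    """Map each backupid found in catdbutil --list output to its UTC time."""
    ids = {}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3:
            continue  # e.g. the "[/path/...]" header line
        m = BACKUPID_RE.match(fields[2])
        if m:
            ids[m.group("backupid")] = datetime.fromtimestamp(
                int(m.group("ctime")), tz=timezone.utc
            )
    return ids


sample = [
    "[/mnt/msdp/databases/catalog/2/babita.bable.ad/Bable-Linux]",
    "2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_F1.fmk|0|",
    "2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_F1.hdr|0|",
    "2|/nbuclnt.local.ad/Linux-Sys|nbuclnt.local.ad_1687089601_C1_HDR.img|0|",
]

msdp_ids = backup_ids(sample)

# Hypothetical: backupids known to the NetBackup catalog (empty here).
netbackup_ids = set()
suspects = sorted(set(msdp_ids) - netbackup_ids)

for backupid in suspects:
    print(backupid, msdp_ids[backupid].isoformat())
```

Anything the script prints is only a candidate for a closer look; as noted above, the actual cleanup should go through support.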

Hope this helps. 

David

Hi davidmoline,

Thanks for this, I will definitely try it.

Just checking: can catdbutil be run safely?

Hi @unkn0wnn 

If you just use the options I gave - yes it is safe.

David

Hi davidmoline,

Is the dbutil tool OK to try instead? I can't find the catdbutil tool on the media server.

Hi @unkn0wnn 

What NetBackup version are you using? I'm not aware of what dbutil does, nor where it is found: it is not on any system I have ready access to (these include 8.1.2, which does have catdbutil).

Is there even a /usr/openv/pdde/pdcr directory?

David

Hi @unkn0wnn 

Is your media server by chance a PDDO system and not a NetBackup media server with MSDP?

David

It is a separate server used for deduplication (PureDisk).

It's NetBackup 7.5. Yes, the directory exists on the media server; listing below:

5220netb:/usr/openv/pdde/pdcr/bin # ls -altr|more
total 180
-rwxr-xr-x 1 root bin 224 Feb 9 2013 wslog
-rwxr-xr-x 1 root bin 224 Feb 9 2013 wsget
-rwxr-xr-x 1 root bin 221 Feb 9 2013 stat
-rwxr-xr-x 1 root bin 227 Feb 9 2013 spoold
-rwxr-xr-x 1 root bin 236 Feb 9 2013 splogscan
-rwxr-xr-x 1 root bin 236 Feb 9 2013 spextract
-rwxr-xr-x 1 root bin 230 Feb 9 2013 spauser
-rwxr-xr-x 1 root bin 224 Feb 9 2013 spadb
-rwxr-xr-x 1 root bin 221 Feb 9 2013 spad
-rwxr-xr-x 1 root bin 230 Feb 9 2013 reroute
-rwxr-xr-x 1 root bin 248 Feb 9 2013 report62splog
-rwxr-xr-x 1 root bin 227 Feb 9 2013 prdate
-rwxr-xr-x 1 root bin 239 Feb 9 2013 pdstresscr
-rwxr-xr-x 1 root bin 230 Feb 9 2013 pddecfg
-rwxr-xr-x 1 root bin 227 Feb 9 2013 pddeDR
-rwxr-xr-x 1 root bin 227 Feb 9 2013 pdconf
-rwxr-xr-x 1 root bin 230 Feb 9 2013 nstpack
-rwxr-xr-x 1 root bin 218 Feb 9 2013 md5
-rwxr-xr-x 1 root bin 224 Feb 9 2013 genpo
-rwxr-xr-x 1 root bin 230 Feb 9 2013 fileput
-rwxr-xr-x 1 root bin 230 Feb 9 2013 fileget
-rwxr-xr-x 1 root bin 230 Feb 9 2013 filedel
-rwxr-xr-x 1 root bin 230 Feb 9 2013 dsiddel
-r-xr-xr-x 1 root bin 4853 Feb 9 2013 dr_createlist.sh
-rwxr-xr-x 1 root bin 236 Feb 9 2013 delayscan
-rwxr-xr-x 1 root bin 227 Feb 9 2013 dcscan
-rwxr-xr-x 1 root bin 233 Feb 9 2013 dcpcrypt
-rwxr-xr-x 1 root bin 227 Feb 9 2013 dbutil
-rwxr-xr-x 1 root bin 230 Feb 9 2013 crstats
-rwxr-xr-x 1 root bin 227 Feb 9 2013 crstat
-rwxr-xr-x 1 root bin 233 Feb 9 2013 crrepair
-rwxr-xr-x 1 root bin 236 Feb 9 2013 crrecover
-rwxr-xr-x 1 root bin 224 Feb 9 2013 crget
-rwxr-xr-x 1 root bin 221 Feb 9 2013 crfp
-rwxr-xr-x 1 root bin 239 Feb 9 2013 crdoreader
-rwxr-xr-x 1 root bin 236 Feb 9 2013 crcontrol
-rwxr-xr-x 1 root bin 236 Feb 9 2013 crcollect
-r-xr-xr-x 1 root bin 2606 Feb 9 2013 cdrinit.sh
-rwxr-xr-x 1 root bin 236 Feb 9 2013 cacontrol
-rwxr-xr-x 1 root bin 230 Feb 9 2013 ca_test
-rwxr-xr-x 1 root bin 230 Feb 9 2013 adler32
drwxr-xr-x 2 root bin 4096 May 1 2013 .bin
drwxr-xr-x 5 root bin 4096 May 1 2013 ..
drwxr-xr-x 3 root bin 4096 May 1 2013 .
5220netb:/usr/openv/pdde/pdcr/bin # ls *util*
dbutil

Hi @unkn0wnn 

Given it is a PureDisk server (PDDO), I am unable to assist (no in-depth knowledge of this retired technology). 

I also see why you haven't approached support. 

Good luck - maybe someone else may have an idea on PDDO systems.

David

Hi @davidmoline, so have things changed a lot when it comes to deduplication / PureDisk media servers?

Hi @unkn0wnn 

Not sure how much has changed, but I recall working as an admin on a PureDisk (PDDO) system back in 2013. At that stage it was nearing end of life. The technology is much the same, but I do not know about the management of the data, as I wasn't much involved with the day-to-day operations.

David

How do I know it is not a media server with MSDP? What is the difference between a PureDisk media server and a media server with MSDP?