Puredisk 6.6.1.2.EEB20 - compaction

gc_bus
Level 4

Hi!

We have two PureDisk 6.6.1.2.EEB20 SPAs, with seven content routers per SPA, and each SPA sits at around 59% utilisation of its 185TB of usable space. We only use PureDisk as back-end storage for NetBackup PDDO backups (NetBackup version 7.1.0.2).

My query is: why does it take so long for compaction to re-start on each of the content routers after a reboot? I'm currently sitting at around 2.5TB or more of "Space needs compaction" per content router (from the dsstat output), and rising, e.g.:

************ Data Store statistics ************
Data storage       Size   Used  Avail Use%
                  31.0T  18.3T  12.7T  59%
Number of containers             : 110550
Average container size           : 233847667 bytes (223.01MB)
Space allocated for containers   : 25851859603206 bytes (23.51TB)
Space used within containers     : 22641489320601 bytes (20.59TB)
Space available within containers: 3210370282605 bytes (2.92TB)
Space needs compaction           : 3125596138713 bytes (2.84TB)
Records marked for compaction    : 69014736
Active records                   : 354953541
Total records                    : 423968277
 
Use "--dsstat 1" to get more accurate statistics
 
After a reboot, compaction will eventually kick in, but only about two weeks later; after that it works fine. Compaction appears ready to run, as evidenced by the compactstate output, e.g.:
 
Data store compaction: ON, DeleteSpaceThreshold: 30%, CompactLBound: 4MB
Compaction busy: No
 
I can start compaction manually with --compactstart but that only processes one container and then stops.
 
I realise that CRQP stops compaction whilst it runs, but compaction simply refuses to do anything even when nothing else is running and no PDDO NetBackup jobs are active.
 
Has anyone else noticed this? Once compaction does start running regularly, there is a heap of accumulated data to compact and eventually release back to the OS, which often slows down the PDDO backups as a result.
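For reference, this is the sort of thing I'm running on each CR to keep an eye on it (nothing clever, just the crcontrol options mentioned above; /opt/pdcr/bin is the standard path, adjust if yours differs):

#!/bin/sh
# Quick per-CR check: how much space is waiting to be reclaimed,
# and whether compaction is actually doing anything right now.
CRCONTROL=/opt/pdcr/bin/crcontrol

echo "=== $(hostname) $(date) ==="
$CRCONTROL --dsstat | grep -E "Space needs compaction|Records marked for compaction"
$CRCONTROL --compactstate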
 
Thanks,
 
Glyn.

7 REPLIES

VirtualED
Level 5
Employee Accredited Certified

Compaction is not triggered by your crcontrol --dsstat figures. The 30% threshold that you saw is what kicks off compaction, and that 30% is based on the df -h output, not the dsstat output. If your disk usage grows beyond a certain point, compaction will kick off to bring it back down.
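To put it another way (purely illustrative, the real trigger logic is internal to the content router), the decision is conceptually along these lines:

# illustrative only: compare filesystem usage against the threshold;
# the data store mount point is assumed here, adjust to your layout
USE=$(df -P /Storage/data | awk 'NR==2 {print $5}' | tr -d '%')
[ "$USE" -gt 30 ] && echo "disk usage ${USE}% is over the threshold: compaction becomes eligible to run"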

Is EEB20 the latest binary you have, or do you have EEB21?

gc_bus
Level 4

Thanks for your reply. Certainly the 30% is already far exceeded on most CRs from a df -h perspective, so I don't think that's it. As I say, compaction eventually starts on one or two of the CRs in our SPA and then starts on all of them after a week or more; the bigger the CR, the longer it takes. It's almost as if there's something else running in the background that it's waiting for. We have EEB20 as the latest.

VirtualED
Level 5
Employee Accredited Certified

There is a chance that someone has modified the default values in your contentrouter.cfg. Also, at EEB21 there is a known issue with compaction, so you would need to upgrade to 6.6.3 to obtain an EEB that fixes that issue.

f25
Level 4

The solution was/is:

 

  1. Run the following commands to check the status:

# /opt/pdcr/bin/crcontrol --dsstat
# df -k 

  2. When there is a lot of reclaimable space within containers and we are short on disk space, or we simply need more free disk space, run the following command; the parameters are the key (a scripted version of these steps is sketched below, after step 4):

# /opt/pdcr/bin/crcontrol --compactstart 0 0 1

Data store compaction: ON, DeleteSpaceThreshold: 0%, CompactLBound: 0MB
Compaction busy: Yes 

  3. Use this command to check whether the compaction has completed:

# watch -n 20  /opt/pdcr/bin/crcontrol --compactstate

Once the compaction completes:

Compaction busy: No

 

  4. Run the following commands to check the status:

# /opt/pdcr/bin/crcontrol --dsstat
# df -k

Afterwards, the space allocated for containers should be close to the space used within containers, and df should reflect the freed space:

Space allocated for containers   : 11945175272705 bytes (10.86TB)
Space used within containers     : 11903081645497 bytes (10.83TB)
Filesystem            Size  Used Avail Use% Mounted on
                       21T   11T   10T  53% /Storage/data
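
If it helps, the whole sequence can be strung together like this (an untested sketch that only uses the commands above, with paths as per a standard install):

#!/bin/sh
# Sketch: force a full compaction on this CR and wait for it to finish.
CRCONTROL=/opt/pdcr/bin/crcontrol

echo "--- before ---"
$CRCONTROL --dsstat
df -k /Storage/data

# 0% DeleteSpaceThreshold, 0MB CompactLBound, 1 == "try hard"
$CRCONTROL --compactstart 0 0 1

# poll every 20 seconds until the CR reports it is idle again
while $CRCONTROL --compactstate | grep -q "Compaction busy: Yes"; do
    sleep 20
done

echo "--- after ---"
$CRCONTROL --dsstat
df -k /Storage/data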

 

Good Luck!

Michał

gc_bus
Level 4

Thanks Michal, I will try that :)

f25
Level 4

One more thing: I recently needed to rebuild a CR's storage (the /Storage/data volume), so it was necessary to "clean" it.

Simply removing the Content Router role from the node in the GUI -> Topology did not solve the issue.

I had CR utilisation at 0%, but disk utilisation around 40%.

Then running it with parameters like:

# /opt/pdcr/bin/crcontrol --compactstart 80 0 1

helped to vacuum the CR storage down to a "back-up-able" size (80% expected compaction ratio, 0, and 1 == true for "try hard").
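And as before, you can just kick it off and watch it until it reports idle again, e.g.:

# /opt/pdcr/bin/crcontrol --compactstart 80 0 1
# watch -n 20 /opt/pdcr/bin/crcontrol --compactstate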

Cheers!

gc_bus
Level 4

Moving up to PureDisk version 6.6.3a plus EEB bundle V6 has meant that compaction kicks in earlier after a restart/reboot, i.e. after a few days rather than weeks. Mind you, 6.6.3a has needed its fair share of fixes too, particularly regarding log rotation and CR performance.