Pre vs. Post Deduplication report that breaks down by client?

D_Flood
Level 6

Does anyone have a modified version of the Pre vs. Post Deduplication report that breaks it down (in tabular form) by client (and pulls the client list automatically)?


I know I can edit the report and put in a specific client (or clients), but if multiple clients are selected then it totals them all together.  I need (for Management) a report that tells them "where did all the space go and why can't you reclaim it by just moving some backups off to tape?"...

 

Accepted Solutions

payners
Level 4

 

SELECT
    clientName AS 'Client',
    -- Sizes are divided by 1024 per step, so the units are MiB and GiB.
    COALESCE(CAST(SUM(preSISSize)/1024.0/1024.0 AS NUMERIC(20,2)), 0) AS 'Pre Dedup Size (MiB)',
    COALESCE(CAST(SUM(preSISSize)/1024.0/1024.0/1024.0 AS NUMERIC(20,2)), 0) AS 'Pre Dedup Size (GiB)',
    COALESCE(CAST(SUM(bytesWritten)/1024.0/1024.0 AS NUMERIC(20,2)), 0) AS 'Post Dedup Size (MiB)',
    COALESCE(CAST(SUM(bytesWritten)/1024.0/1024.0/1024.0 AS NUMERIC(20,2)), 0) AS 'Post Dedup Size (GiB)',
    COALESCE(CAST((SUM(preSISSize) - SUM(bytesWritten))/1024.0/1024.0/1024.0 AS NUMERIC(20,2)), 0) AS 'Total Savings (GiB)'
FROM
    domain_JobArchive
WHERE
    -- Keep only jobs whose pre- and post-dedup sizes differ,
    -- and skip jobs whose pre-dedup size is exactly 32768 bytes.
    preSISSize != bytesWritten AND
    preSISSize != 32768
GROUP BY
    -- Returns one row per client and policy pair;
    -- drop policyName here to get a single row per client.
    clientName,
    policyName
ORDER BY
    clientName
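
For anyone who wants to see how the GROUP BY affects the row count before running this against OpsCenter, here is a minimal sketch using Python's built-in sqlite3 module. The table is a toy stand-in for domain_JobArchive with invented rows, and ROUND() replaces the CAST to NUMERIC(20,2) because SQLite does not round on that cast. Treat it as an illustration of the grouping only, not as the OpsCenter schema.

```python
import sqlite3

# Toy stand-in for OpsCenter's domain_JobArchive; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domain_JobArchive ("
    "clientName TEXT, policyName TEXT, preSISSize INTEGER, bytesWritten INTEGER)"
)
GIB = 1024 ** 3
rows = [
    ("alpha", "daily", 10 * GIB, 2 * GIB),
    ("alpha", "weekly", 20 * GIB, 3 * GIB),
    ("beta", "daily", 8 * GIB, 4 * GIB),
    ("beta", "daily", 32768, 32768),  # filtered out by the WHERE clause
]
conn.executemany("INSERT INTO domain_JobArchive VALUES (?, ?, ?, ?)", rows)

# Same filters and arithmetic as the query in this post, GiB columns only.
QUERY = """
SELECT clientName,
       COALESCE(ROUND(SUM(preSISSize) / 1024.0 / 1024.0 / 1024.0, 2), 0),
       COALESCE(ROUND(SUM(bytesWritten) / 1024.0 / 1024.0 / 1024.0, 2), 0),
       COALESCE(ROUND((SUM(preSISSize) - SUM(bytesWritten))
                      / 1024.0 / 1024.0 / 1024.0, 2), 0)
FROM domain_JobArchive
WHERE preSISSize != bytesWritten AND preSISSize != 32768
GROUP BY {group_cols}
ORDER BY clientName
"""


def savings_report(connection, group_cols):
    """Run the report grouped by the given comma-separated column list."""
    return connection.execute(QUERY.format(group_cols=group_cols)).fetchall()


# Grouping by client and policy yields one row per pair (alpha appears twice).
per_client_policy = savings_report(conn, "clientName, policyName")
# Grouping by client alone yields one row per client.
per_client = savings_report(conn, "clientName")

for row in per_client:
    print(row)
```

With the toy data, grouping by clientName alone gives one line each for alpha and beta, while grouping by clientName and policyName gives two lines for alpha.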


D_Flood
Level 6

Thank you Very, Very, Very much!

I do note a few entries at the top of the returned list with no client name.  Any idea what might be causing that?

And what's the field name for master server so I can expand the WHERE to filter by master server name?

And finally, is there a public document (or schema) that lists the database fields?

 

Again, Thanks!
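
The thread does not establish why some rows come back without a client name, or which column holds the master server. As a rough sketch only: the example below skips rows with a NULL or empty clientName and restricts the report to one master server through a join. The masterServerId column and the domain_MasterServer table with id and friendlyName columns are assumptions here, not confirmed in this thread, so check them against your own OpsCenter database. It runs on a toy SQLite database with invented rows.

```python
import sqlite3

# Toy schema. masterServerId, domain_MasterServer.id and friendlyName are
# ASSUMED names for illustration; verify them in your OpsCenter database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domain_MasterServer (id INTEGER PRIMARY KEY, friendlyName TEXT);
CREATE TABLE domain_JobArchive (
    masterServerId INTEGER,
    clientName TEXT,
    preSISSize INTEGER,
    bytesWritten INTEGER
);
""")
GIB = 1024 ** 3
conn.executemany(
    "INSERT INTO domain_MasterServer VALUES (?, ?)",
    [(1, "master-a"), (2, "master-b")],
)
conn.executemany(
    "INSERT INTO domain_JobArchive VALUES (?, ?, ?, ?)",
    [
        (1, "alpha", 10 * GIB, 2 * GIB),
        (1, None, 4 * GIB, 1 * GIB),  # no client name (NULL)
        (1, "", 2 * GIB, 1 * GIB),    # no client name (empty string)
        (2, "beta", 8 * GIB, 4 * GIB),
    ],
)

QUERY = """
SELECT j.clientName,
       ROUND((SUM(j.preSISSize) - SUM(j.bytesWritten))
             / 1024.0 / 1024.0 / 1024.0, 2)
FROM domain_JobArchive j
JOIN domain_MasterServer m ON m.id = j.masterServerId
WHERE j.preSISSize != j.bytesWritten
  AND j.preSISSize != 32768
  AND j.clientName IS NOT NULL
  AND j.clientName != ''
  AND m.friendlyName = ?
GROUP BY j.clientName
ORDER BY j.clientName
"""


def savings_for_master(connection, master_name):
    """Per-client savings in GiB for a single master server."""
    return connection.execute(QUERY, (master_name,)).fetchall()


report = savings_for_master(conn, "master-a")
print(report)
```

Filtering the blank rows out hides them rather than explaining them, so it may be worth listing those rows on their own first to see which jobs they belong to.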

 

Dan_Giberson
Level 5

Hey guys,

Is there a way to generate this report while backing up to a DataDomain? When I run it (or the canned report) I just get "The report did not return any data".

 

Appreciate the help.

 

Running OpsCenter 7.5.0.3 Analytics

new2nbu
Level 4

Hello,  

Thank you so much for sharing this SQL query. I too have Data Domains and have been wanting to use OpsCenter to expose some valuable reports, but have not been able to.  I ran the SQL query above, though, and got just one client, which we have configured with client-side dedup, but no Data Domain info.  Have you gotten it to report your DD information?  Thanks!

RonCaplinger
Level 6

Data Domains are, by nature, target-based deduplication devices, meaning that NetBackup is essentially unaware that deduplication is taking place.  This is similar to encryption of tapes on LTO4 and higher using library-based encryption and a non-NetBackup encryption key management system; NetBackup is unaware of the encryption and will not be able to report whether the data is encrypted or not.

In order to report deduplication percentages, the OST driver would need to communicate the deduplication data points back to NetBackup.  If you take a look at the Job Details of a backup to a Data Domain, there is no mention of deduplication factors or savings, but if you are using MSDP, NetBackup *is* aware of the amounts and is reporting them.  The way I believe it currently works is that the Data Domains simply report the overall, device-wide storage available and storage in use back to the NetBackup bpdm process. I believe there is currently no way to pull the data you are requesting from the Data Domains.

I could be wrong about this, as I don't know the inner workings of the OST drivers.