09-24-2015 05:50 PM
I'm being asked by a manager to supply a report that I don't think is possible to provide ... at least not in the way that person is hoping to see it. :\
Let's say that in a 24-hour period I back up 50 TB of data. That (to me...) means that the sum total of the jobs that complete successfully (or, well, partially successfully) is 50 TB.
I'm being asked to show what that amount of data occupies *on tape*, without factoring in tape hardware compression. Stay with me here.
In my mind, to get at least a reasonably accurate idea of how much tape this 50 TB backup set is using, I'd just need to know what my average tape compression ratio is, for my LTO5 tapes, and divide the 50 TB number by whatever my average per-tape capacity is (using that compression ratio.) But again, that involves factoring in compression, and I'm being told not to do that for this report.
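A rough sketch of that arithmetic, assuming the LTO5 native capacity of 1.5 TB per cartridge and a purely hypothetical site-average compression ratio of 2:1 (substitute whatever your own drives actually report):

```python
import math

LTO5_NATIVE_TB = 1.5  # native (uncompressed) capacity of one LTO5 cartridge


def tapes_needed(front_end_tb, compression_ratio):
    """Estimate how many cartridges a backup set consumes.

    compression_ratio is an assumed average (2.0 means 2:1), not a measured value.
    Returns (fractional tapes, whole cartridges touched).
    """
    effective_capacity_tb = LTO5_NATIVE_TB * compression_ratio
    tapes = front_end_tb / effective_capacity_tb
    return tapes, math.ceil(tapes)


exact, whole = tapes_needed(50, 2.0)
print(f"{exact:.1f} tapes' worth, so {whole} cartridges")
```

Setting the ratio to 1.0 gives the "no compression" figure the manager is asking about, but that number describes a tape layout that never actually exists.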
Aside from Accelerator or Dedupe backups, I cannot think of another way to do this.
Anyone else?
09-24-2015 10:47 PM
The first question I would ask is: "What problem does this report solve?" Useless questions and reports from managers are just that - "useless" and time wasters.
09-24-2015 11:54 PM
That made my head hurt for a bit.
There is no possible way to do this unless you've got a full tape, and even then only if you've used that tape entirely for one backup that didn't span onto another tape. When the tape is full, we know about it only because the tape drive reached the end of the tape and asked for another. So at that stage you can look at bpmedialist and see that you put 5TB on a 1.6TB tape. For a tape that is not full, we have no idea how much of the tape has been written on. Let's not get into multiplexing.....
09-25-2015 12:33 AM
There is no way of telling how much 50TB occupies on tape without compression. There are simply no tools to report that.
Will the report be used for invoicing?
Your process of using an average would also be my best advice.
09-25-2015 01:21 AM
Nicolai my friend .... you are missing the obvious.
50MB of data, written to tape without compression, occupies 50MB, does it not?
If, it writes without error that is - if there are 'recoverable' errors, the drive will re-write the data, invisible to NBU and the OS, but will use slightly more tape.
09-25-2015 01:26 AM
I'm confused though, and my head hurts too Riaan ...
Given that drives use compression, what use is a report that doesn't take compression into account ???
This info is obtainable from the bptm log, you can see the size of each fragment sent. As this is on the NBU side, it is the value before compression.
You could also get the size of the fragments for a given backupid from the catalog.
09-25-2015 01:47 AM
Absolutely :)
But as I understand the question, the user is asking how much space the data occupies on tape, the "backend" so to say, and thus how much is remaining.
The only thing we can report is front-end data, the amount of data we protect. How much space it occupies on a given medium - tape, MSDP, Data Domain - we can't say directly. They are "black boxes" to us.
And because compression is always on, how could we report the data as uncompressed without knowing the average compression per image (which we don't)?
The only way - very simple and stupid - I can think of is dividing the image size by 512 bytes, which is the SCSI sector size. Thus a 100MB backup would occupy roughly 200,000 blocks of data on tape. But what in the world would that piece of information be good for ...
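A quick sketch of that division, assuming 512-byte blocks and binary megabytes (both are assumptions for illustration, not a statement about what the drive really lays down):

```python
def blocks_for(size_bytes, block_size=512):
    """Number of fixed-size blocks needed to hold size_bytes (rounded up)."""
    return -(-size_bytes // block_size)  # ceiling division on integers


hundred_mb = 100 * 1024 * 1024  # 100 MB taken as 100 * 2**20 bytes
print(blocks_for(hundred_mb))   # 204800
```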
Hope this helps :)
09-25-2015 06:07 AM
It's a bit confusing, but what the manager wanted to know is how much each backup took up on tape. Well, that is what I understood. So assume the following:
An Oracle backup is put on LTO 4, and we manage to stick 2.4 TB on the 800GB rated media. Magically the data stream ended right at (or before) the end of tape marker. So we know the compression is 3:1 and the backup is taking up 800GB of storage.
But what if that job had 1 megabyte more to write? A new tape will be loaded and the backup will complete. But we won't be able to determine how much of the second LTO tape was used (or will we?).
To add to that, what happens to the 8 other backups we put on that tape? They all have different compression ratios (assuming they come from different backup sources). So we'll have 9 different backups with 9 different, and unknown, compression ratios sharing the tape.
When the end of tape marker rolls up and the last backup is finished there is no way to calculate the storage used (per backup) as in the first example.
So you can really only report this in increments of 800GB.
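To put numbers on it, here is a minimal sketch assuming a tape that filled exactly to its 800GB native capacity. The nine image sizes in the second call are made up for illustration:

```python
LTO4_NATIVE_GB = 800  # rated native capacity of one LTO4 cartridge


def aggregate_ratio(image_sizes_gb, native_gb=LTO4_NATIVE_GB):
    """Average compression ratio for a tape that filled completely.

    Only this tape-wide average can be recovered; the ratio of each
    individual image stays unknown.
    """
    return sum(image_sizes_gb) / native_gb


# One Oracle backup that exactly filled the tape
print(aggregate_ratio([2400]))

# Nine hypothetical backups sharing one full tape
print(aggregate_ratio([400, 300, 250, 200, 150, 100, 100, 60, 40]))
```

The second call gives a single 2:1 figure for the whole tape, with nothing to say which of the nine images compressed well and which did not.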
Unless I'm missing something.
09-25-2015 06:18 AM
But how many microns of magnetic flux were utilized?
That is the important question. 8>
09-25-2015 06:26 AM
The answer is always 27
09-25-2015 07:21 AM
What is the value in knowing this? That is the question that should be asked!
09-25-2015 07:33 AM
Not 42?
09-25-2015 08:05 AM
Since we know the answer is 27, the next question could be why ask silly questions (as Revarooo pointed out).
09-25-2015 12:59 PM
The block position of each fragment is recorded (at least the starting offset from the beginning of the tape) in the catalog, seen on the FRAG lines of bpimage output, so in theory you could work out how many blocks each fragment took, I think ...
You have also got some position data in the .f file, per file, which I think is how the fast block positioning works ... BUT
1. This is a wild idea I've just thought of, haven't tested, researched or thought through ...
2. If multiplexing is used, I suspect all bets are off ...
3. I might be wrong ...
Proving this could be fun. You need to take, say, one jpg image of a decent size (because jpgs are pretty much incompressible) or a good-sized .zip file (again, incompressible) and write it to an unassigned tape. A zip file is better as you can get a decent size easily; I'm talking something like a couple of gigs.
Then back it up again, to the same tape.
Next, backup (again to an unassigned tape) something nice and compressible (like a .txt file) of the same size as the zip file. As before back it up again (to the same tape).
Then compare the start block offset of the 1st fragment of the 2nd backup on each tape; it should be different, as one image compressed and one didn't.
Then, with some maths and knowing the block size, you might be able to come to some conclusion.
Not sure if this would show anything useful though ...
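The "some maths" part might look like the sketch below. The block offsets and the 256 KB block size are hypothetical values, not taken from any real catalog, and it assumes the recorded offset is a simple block count from the start of the tape:

```python
def logical_bytes_between(first_start_block, second_start_block, block_size_bytes):
    """Bytes implied by the gap between two fragment start offsets.

    Assumes both offsets count blocks of block_size_bytes from the
    beginning of the same tape.
    """
    return (second_start_block - first_start_block) * block_size_bytes


BLOCK_SIZE = 262144  # hypothetical 256 KB data buffer size

# Hypothetical offsets: 1st backup starts at block 2, 2nd at block 8194
gap = logical_bytes_between(2, 8194, BLOCK_SIZE)
print(gap)            # 2147483648
print(gap / 1024**3)  # 2.0 (GiB)
```

If the offsets turn out to count blocks as the host sent them, before the drive compresses anything, both test tapes would show the same gap and the comparison would say nothing about physical tape used.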
09-25-2015 10:58 PM
Too clever. Do we know how big a block is?
09-26-2015 07:31 AM
Yep, whatever SIZE_DATA_BUFFERS is set to ...
However, not sure it's going to help as it's still compressed ...
09-28-2015 12:25 AM
Interesting ... +1
09-28-2015 02:55 AM
09-28-2015 04:20 AM
Hi Martin
Thanks for that super technical explanation. Can you explain what each line refers to? I'm not certain I understand how we know how much tape it used. How do the blocks get laid out?
09-28-2015 06:43 AM