Query: Finding failed backup image fragments
Hi team,
I'm wondering if it is possible to identify both valid and invalid (e.g. from failed backups) image fragments on a tape?
Example scenario:
Take an MPX LTO5 tape (native capacity: 1.5TB) and assume no compression. The tape has two generic Windows file servers streaming to it. However, server two fails at some point - say 500GB down the road - for whatever reason. Server one completes without issue, sucking up the remaining 1TB.
At 1.5TB the tape is classified as full by NetBackup (again - for simplicity's sake, there is no compression). 1TB (server one) + 500GB (server two's failed backup) = 1.5TB total.
Now my understanding is that because server one's backup is successful, its images are valid and the entire tape will remain classified as full. The failed image fragments are not "freed" per se, given how the tape is written to in a linear fashion (https://vox.veritas.com/t5/Backup-Recovery-Community-Blog/Understanding-how-NetBackup-writes-to-a-tape/ba-p/784790) - and so "technically" these failed fragments are taking up capacity on that tape.
My questions are:
1. Is there a way to identify those failed image fragments somehow?
2. Or a way to determine which images on that tape are the only valid ones?
e.g. can we assume the "images on tape" report will return only the valid and successful images on that tape, so I know that if I wanted to move them to another tape, I could safely scratch the old one and reclaim the 500GB of lost tape capacity?
Yes - correct
If you have two data streams of 500GB each, perfectly multiplexed, and one completes while the other fails at 497GB, the 497GB of capacity is "lost" until the backup lifetime expires.
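To make the arithmetic concrete, here is a minimal Python sketch of the accounting described above. It is only an illustration: the `Stream` class, the `tape_usage` helper and the client names are hypothetical and not part of NetBackup, and it assumes no compression and that every fragment written by a failed job stays on the tape.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    client: str        # hypothetical client name
    written_gb: int    # data written to tape before the job ended
    succeeded: bool    # True if the backup job completed successfully


def tape_usage(streams, native_capacity_gb):
    """Split the tape's native capacity into valid, stranded and free space."""
    valid = sum(s.written_gb for s in streams if s.succeeded)
    stranded = sum(s.written_gb for s in streams if not s.succeeded)
    free = native_capacity_gb - valid - stranded
    return {"valid_gb": valid, "stranded_gb": stranded, "free_gb": free}


# The example above: two 500GB streams, one of which fails at 497GB.
two_streams = tape_usage(
    [Stream("server1", 500, True), Stream("server2", 497, False)],
    native_capacity_gb=1500,
)

# The original scenario: 1TB of valid data plus 500GB from the failed job.
full_tape = tape_usage(
    [Stream("server1", 1000, True), Stream("server2", 500, False)],
    native_capacity_gb=1500,
)
```

In both cases `stranded_gb` is the space that only comes back once the valid images have expired or been moved elsewhere and the tape can be reused.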
Matthew_Longmui wrote:
So I'm wondering if my logic here is sound. And thus, "images on tape" does provide the valid list of successful images on the tape that I can relocate (if I wanted - and that is its own question; effort vs just riding it out until all images expire vs buying new tapes) to another MPX tape with space to fit them, and reclaim all that lost 900GB+.
I agree with your logic.
You need to weigh up 'effort vs just riding it out'. I guess it boils down to how often this happens - 1 out of 20 tapes? Or more?
If more, the assumption is that you are seeing quite a high backup failure rate.
It might be worth your while to investigate the failures and address the cause rather than spending too much effort on duplicating tapes.
Another option might be to consider disk as backup storage, with duplication to tape afterwards.
Failed backups to disk will always be deleted by image cleanup.
Chances of failure during duplication are much lower than during backups.
Sorry, misread ... yes, my explanation is only true for non-MPX backups ....
I will edit my answer and add a note.
So, indeed, for MPX jobs the failed fragments would remain until the tape expires. There is no way around this.