NBU allows overwrite of tapes from another domain?

andrew_mcc1
Level 6
   VIP   

I have a customer with two NBU domains (two separate Master Servers) both with tape backups that are sent off-site. It seems that if they recall an unexpired tape for restore and put it in the library on the “wrong” site, it gets put into the scratch pool once the Robot Inventory and volume configuration have run which means it is very likely to be overwritten. Is this normal behaviour? 

Customer is concerned an operator error is likely to lead to data loss. Apart from the media write-protect switch, is there a way to stop NBU treating unknown NBU tapes as available new media? I think the Robot Inventory Advanced Options could help but I suspect this could also be error prone?

NBU seems to leave tar and CPIO etc. tapes alone by default and I always thought it did for NBU tapes if it wasn't sure. Thanks, Andrew

mph999
Level 6
Employee Accredited

NBU would treat the media as a new media and overwrite it.

If you consider this:

NBU writes to a tape - AA1234

The images on the tape expire, the tape moves back to scratch

NBU catalog contains no data about images on the tape (as they expired)

___________________

You add another tape into the library, it has data on it from another NBU domain, tape BB1234

You inventory the library, NBU adds the tape and puts it in the scratch pool

NBU catalog contains no data about images on the tape

Consider AA1234 and BB1234, from NBU viewpoint they are the same ...  no images in the catalog and 'data' on the tapes.

If NBU refused to write to tape BB1234 (as it was added from another system), it would also refuse to write to AA1234. You can see how this would be an issue: NBU would be unable to overwrite any tape that had expired and was valid for re-use.

The way around this is to use completely separate barcodes, AAxxxx for one site and BBxxxx for another, so it is clear where tapes go.
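
To make the 'catalog view' argument concrete, here is a minimal sketch in Python. It is a toy model and not NetBackup code: the Volume and Catalog classes and the is_reusable check are hypothetical names invented for illustration. The only point it makes is that a reuse decision based on "pool is scratch and this master's catalog holds no images" cannot tell the two tapes apart.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    media_id: str
    pool: str = "Scratch"
    data_on_tape: bool = True  # the physical tape still holds old backup data


@dataclass
class Catalog:
    # media_id -> list of unexpired images known to THIS master only
    images: dict = field(default_factory=dict)

    def is_reusable(self, vol: Volume) -> bool:
        # Toy rule: the data physically on the tape is never consulted,
        # only the pool and this master's own image records.
        return vol.pool == "Scratch" and not self.images.get(vol.media_id)


catalog_a = Catalog()          # catalog on master_server_A
aa1234 = Volume("AA1234")      # written by A, images have since expired
bb1234 = Volume("BB1234")      # written by B, then inventoried into A

print(catalog_a.is_reusable(aa1234), catalog_a.is_reusable(bb1234))
```

Both checks give the same answer, which is the behaviour described above: from master A's side there is nothing in the model that distinguishes the foreign tape from an expired local one.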

 

andrew_mcc1
Level 6
   VIP   

Thanks for this but I'm a bit unclear

"Consider AA1234 and BB1234, from NBU viewpoint they are the same"

I would have thought they are different as there is an existing Media ID record for AA1234 whereas BB1234 is completely unknown so may contain valid backup images. I would expect there could be at least an option to Freeze or Suspend unknown tapes with valid NBU headers if they are detected. This would be unlike new media which will be unlabeled and can safely have a Media ID generated and put into use.

Also does Phase I or 2 Import avoid this issue? Unfortunately the customer has tapes in the same barcode range across both sites. 

Andrew 

mph999
Level 6
Employee Accredited

Once you inventory the library, BB1234 becomes 'known' to the system (it appears in the list of media). Let's call this system master_server_A. At this point the tape is given a media ID, so it is no longer an unknown tape.

The tape contains images from the other system, master_server_B, but there are no images for it in the NBU catalog on master_server_A.

Now consider an expired tape from master_server_A (AA1234). It has no images in the catalog either, so from the 'catalog view' it is the same as BB1234. Both tapes have data, but no images in the catalog.

If you were to use either tape, it would be overwritten.

The NBU tape header is checked to make sure the media ID matches what it should be, but no other checks are made, and there is no way to tell from the tape header whether the tape 'really' contains valid data that should not be overwritten.

Phase 1/2 import does not really resolve the issue: the images would become 'valid' again, so the tape would not be overwritten until it next expires. At a guess, a phase 1 takes 15 or so minutes and a phase 2 several hours if the tape is full, so it is not really an option.

If the media IDs on system A use different characters of the barcode than on system B (first 6 instead of last 6), this would stop the tape being overwritten. However, it would also make it much harder to move and import (phase 1/2) tapes into the system if that ever became a requirement.
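
A rough Python sketch of why different media ID generation rules would block the overwrite. The function names, the rule names "first6" and "last6", and the 8-character barcode are all assumptions made for illustration; the only behaviour taken from the post is that the media ID in the tape header is compared with the media ID the domain expects.

```python
def media_id_from_barcode(barcode: str, rule: str) -> str:
    """Derive a 6-character media ID from a barcode (illustrative rules only)."""
    if rule == "first6":
        return barcode[:6]
    if rule == "last6":
        return barcode[-6:]
    raise ValueError(f"unknown rule: {rule}")


def header_check_passes(barcode: str, rule: str, header_media_id: str) -> bool:
    # The media ID recorded in the tape header must match the one this
    # domain derives from the barcode; otherwise the tape is not written.
    return media_id_from_barcode(barcode, rule) == header_media_id


barcode = "BB1234L4"                               # hypothetical barcode
header = media_id_from_barcode(barcode, "last6")   # as labelled by system B

same_rule = header_check_passes(barcode, "last6", header)
different_rule = header_check_passes(barcode, "first6", header)
print(same_rule, different_rule)
```

With the same rule on both systems the header matches and the tape is a write candidate; with different rules the derived ID ("BB1234") no longer matches the header ("1234L4"). The same mismatch is what makes moving tapes between the systems for import harder.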

 

andrew_mcc1
Level 6
   VIP   

Thanks again. I guess Phase 1 & 2 Imports will help as any tape could not be reused while Phase 2 is running so the exposure would be small. 

Thanks again, Andrew

sclind
Moderator
Moderator
   VIP   

I have set up my Media Rules on each server so that if the 'wrong' tape is put in a tape silo, it gets assigned a Media Type that we don't use. That is, we use HCART for our LTO4 tapes that start with "T". If any "W" tapes (from the other silo) get injected, they are defined as DLT. We don't have any DLT drives, so the tape won't get used (and I have time to fix the problem the next morning).
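
The idea can be sketched as a small prefix-to-media-type lookup. This is Python standing in for the rule table, not NetBackup configuration syntax; RULES, DRIVE_TYPES and both function names are hypothetical, and the barcodes are made up.

```python
# Hypothetical rule table for the site whose own tapes start with "T".
# Entries are (barcode prefix, media type); the first match wins.
RULES = [
    ("T", "HCART"),  # this site's LTO4 tapes
    ("W", "DLT"),    # the other site's tapes: a type with no drives here
]
DRIVE_TYPES = {"HCART"}  # media types this site actually has drives for


def media_type_for(barcode):
    for prefix, media_type in RULES:
        if barcode.startswith(prefix):
            return media_type
    return None  # no rule matched; left for an operator to decide


def can_be_written(barcode):
    # A tape is only usable if some drive at this site accepts its type.
    return media_type_for(barcode) in DRIVE_TYPES


print(can_be_written("T00042"), can_be_written("W00042"))
```

The foreign "W" tape still gets added to the configuration, but because no drive matches its assigned type it sits idle until someone notices it. This only works when the two sites' barcodes differ in a way a rule can key on.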

andrew_mcc1
Level 6
   VIP   

Thanks for this. Yes that seems a good plan, however the customer has tapes in the same barcode range across both sites. I guess their options are write-protecting media going offsite plus the Inventory Robot Advanced Setting to put new media into None Volume Pool for injects. However there still remains the possibility of operator error which is their real concern.

Andrew

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

Operator error should be eliminated through proper training, policies and procedures, along with documented consequences. 

Honestly - how does it happen that Master1 needs tape XXXXX for a restore, the tape comes onsite, and the operator 'accidentally' inserts it into the Master2 tape library? Do they not immediately realize that the restore tape is still reported as 'non-robotic' on Master1?

Your customer will also need to take responsibility for somehow splitting the media: slowly phase out the shared barcodes and move/duplicate to dedicated barcodes.
Some years ago we had a customer that went as far as purchasing different colour tapes for the different robots, along with different range labels.

Another customer had a large STK tape library that was shared amongst more than one master. The operators thought it good to 'borrow' tapes from other environments when they ran out. 
After some real data loss due to overwritten tapes, they were called in and explained the consequences. Only at this point did the 'accidents' stop.

An expansion of @Marianne's idea: once the barcodes are split, deliberately put the wrong tapes into the other library, inventory the library, and then assign them to an unused pool like "other_site_tapes". Then you don't have to rely on rules, phase 1/2 imports, or anything else. If a tape goes to the wrong site, it just drops into the existing pool and doesn't get reused. And should you ever need to phase 1/2 import it and then restore from it, it's there and in a pool already.

 

 

andrew_mcc1
Level 6
   VIP   

Thanks. I've suggested a split barcode range as this seems the best option, though I'm not sure this is possible with the customer's existing media. I'll also try to review operational procedures with them.

However I am still surprised NBU is overwriting "unknown" NBU labelled media, or at least there isn't a way to control this as there is for tar, CPIO etc... 

Thanks again, Andrew

sclind
Moderator
Moderator
   VIP   

It would be interesting to have an option in NB to accept 'new' tapes only if they were blank/unformatted and to reject them if they had *any* data at all on them.

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

@sclind wrote:

It would be interesting to have an option in NB to accept 'new' tapes only if they were blank/unformatted and to reject them if they had *any* data at all on them.


If this was the case, it would mean that someone would have to manually re-label each expired tape before it could be overwritten. 

If this 'someone' forgot, can you imagine all the Frozen tapes the next morning? Along with status 96's? 

sclind
Moderator
Moderator
   VIP   

Marianne - I think you have misunderstood what I wrote.  Tapes that are going through the normal use --> expire --> reuse process would not be affected.

Only when a new/unknown tape is introduced into the environment would this check come into play. If the company has inserted fresh new tapes (which are blank) - no problem. But if they have accidentally inserted tapes from another system (which have data), the customer would have the option to have them rejected.

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

The moment a tape from another environment, is added to NBU via Inventory, it becomes 'known' to NBU. 
If Barcode Rules add all new tapes to Scratch by default, their status is Unassigned, Scratch.

So, exactly like any other tape going through the normal use --> expire --> Unassigned, Scratch. 

andrew_mcc1
Level 6
   VIP   

Well, I have to (very respectfully) disagree, I'm afraid. Yes, the tape does suddenly become "known", but this NBU master cannot know whether it contains unexpired backup images or not, as it hasn't been in control of writing and expiring them. This is the exact point my customer is making and personally I think it is valid.

I'll post an enhancement request for some way to handle this (if I can find how enhancements are being logged this month...) Thanks, Andrew

mph999
Level 6
Employee Accredited

You will find that all backup software works in a similar fashion: it can only 'protect' you from so much. In NBU's case, it checks that the media ID written to the tape is the one expected, which prevents tapes being overwritten if, for example, the barcode labels were swapped (which, believe it or not, does occasionally happen).

NBU cannot protect in the way you suggest, as it would never be able to overwrite expired tapes.

An enhancement request to protect in the way you describe is not likely to happen, as there is a perfectly good way of preventing this, which as mentioned is to use split barcodes per site. Given there is a way around this, I think any enhancement request would be rejected.

Put bluntly, and I apologise for this, it comes down to system design.

You could introduce split barcodes, which resolves the issue: as tapes expire, delete them and change the barcode.

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified
I totally agree with @mph999.
Veritas/Symantec/Veritas have been warning against accidental overwriting of tapes brought in from other environments for imports or Recovery Without Import. These tapes must be write-protected and put into a non-backup pool. There is ample documentation about this.

NetBackup is working as designed.

Just a recap of media selection - extract from NBU Admin Guide II:

NetBackup searches the media catalog for a volume that is already mounted in a drive and meets the following criteria:

Configured to contain backups at the retention level that the backup schedule requires. However, if the NetBackup Media host property Allow multiple retentions per media is specified for the server, NetBackup does not search by retention level.

In the volume pool that the backup job requires.

Not in a FULL, FROZEN, IMPORTED, or SUSPENDED state.

Of the same density that the backup job requested, and in the robot that the backup job requested.

Not currently in use by another backup or a restore.

Not written in a protected format. NetBackup detects the tape format after the volume is mounted. If the volume is in a protected format, NetBackup unmounts the volume and resumes the search.

If a suitable volume is found, NetBackup uses it.
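
Read as a filter, the criteria in the extract can be sketched in Python like this. It is a simplified model and not NetBackup's implementation: the Volume and Job classes, the field names and the "ACTIVE" placeholder state are assumptions, and the "already mounted in a drive" condition is left out.

```python
from dataclasses import dataclass

# States that exclude a volume, as listed in the extract.
BLOCKED_STATES = {"FULL", "FROZEN", "IMPORTED", "SUSPENDED"}


@dataclass
class Volume:
    media_id: str
    pool: str
    retention_level: int
    density: str
    robot: int
    state: str = "ACTIVE"          # placeholder for "none of the blocked states"
    in_use: bool = False
    protected_format: bool = False  # e.g. a non-NetBackup format detected on mount


@dataclass
class Job:
    pool: str
    retention_level: int
    density: str
    robot: int


def is_candidate(vol, job, allow_multiple_retentions=False):
    # Retention level is only compared when multiple retentions are not allowed.
    if not allow_multiple_retentions and vol.retention_level != job.retention_level:
        return False
    return (vol.pool == job.pool
            and vol.state not in BLOCKED_STATES
            and vol.density == job.density
            and vol.robot == job.robot
            and not vol.in_use
            and not vol.protected_format)


job = Job(pool="Offsite", retention_level=3, density="hcart", robot=0)
ok = Volume("AA1234", "Offsite", 3, "hcart", 0)
frozen = Volume("AA1235", "Offsite", 3, "hcart", 0, state="FROZEN")
print(is_candidate(ok, job), is_candidate(frozen, job))
```

None of the listed conditions looks at whether the tape holds images written by another domain, which is why freezing the tape, write-protecting it or keeping it out of the backup pools are the controls that take effect.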

andrew_mcc1
Level 6
   VIP   

Marianne, mph999, thanks for the inputs.

I agree NBU is working as designed, I just think an option to stop overwrites in this specific scenario would be beneficial. As sclind points out, tapes going through the normal use --> expire --> reuse cycle should not be affected. Again, many thanks, Andrew

mph999
Level 6
Employee Accredited

Hi Andrew,

The problem is, there is no way to detect a tape that has come from another environment with valid images, as opposed to one that has expired.

Once the tape is inventoried into the library, it looks exactly the same as a tape that has expired, from both a 'data on tape' view, and from the NBU catalog view. 

The only way I can think of to spot such a tape is from the vmquery command (e.g. vmquery -m <media id> or vmquery -a) showing the number of mounts as 0. In that case the customer could perhaps put together a small procedure to run the command and look for tapes with 0 mounts.

e.g.

Set barcode rules so tapes move to an unused volume pool (stops NBU grabbing them as soon as they are available)

Review vmquery output for tapes with 0 mounts

Once happy, move tapes into scratch or another required volume pool

As I have mentioned, I cannot see Engineering making a code change for this, as split barcodes are the way to avoid such an issue.
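
The review step in the procedure above could be scripted along these lines. This is only a sketch: it assumes vmquery prints one "key: value" line per field with records separated by lines of "=" characters, and the SAMPLE text (including the "quarantine" pool name) is made up, so check it against real output before relying on it.

```python
# Made-up text in the assumed vmquery -a layout, for demonstration only.
SAMPLE = """\
================================================================================
media ID:              AA1234
volume pool:           quarantine (5)
number of mounts:      37
================================================================================
media ID:              BB1234
volume pool:           quarantine (5)
number of mounts:      0
================================================================================
"""


def parse_records(text):
    """Split the text into one dict per volume, keyed by lower-cased field name."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("===="):
            # A separator line closes the record collected so far.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip().lower()] = value.strip()
    if current:
        records.append(current)
    return records


def never_mounted(text):
    """Return media IDs whose mount count is reported as 0."""
    return [r["media id"] for r in parse_records(text)
            if r.get("number of mounts") == "0"]


print(never_mounted(SAMPLE))
```

In practice the text would come from running vmquery -a on the master and capturing its output. A mount count of 0 only says this domain has never mounted the tape, so brand-new blank media will show up in the same list and still needs a human decision.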