Forum Discussion

sriharishkandul
12 years ago

Media Freezing

Hi Guys,

I recently encountered a problem while backing up data.

The following error was observed in the detailed status of the job:

Error bptm (pid:3876) FREEZING media id <media id>, External event caused rewind during write, all data on the media is lost

Could someone please explain the reason for the above error? Is all the previous data on the media lost, and if so, how can I recover it?

Thanks

  • Sorry for the late reply, guys. I have been busy with a few audit checks.

    Thanks for your valuable support. The issue is now resolved.

    The issue was mainly due to a couple of faulty tape drives, which have been replaced. We also upgraded the library and drive firmware to the latest version.

    Everything is fine as of now. Hopefully it stays that way.

  • External event caused rewind during write?

    Can you list all the servers that have access to the tape library?

    If any server is zoned to the tape library but does not have the backup software installed, please remove that server from the zone.

    This issue is potentially serious and requires immediate investigation, as data can be lost. NetBackup displays this error if the block position it calculates does not match the position reported by the drive. This does not prove that a full rewind occurred (that is impossible to tell from a simple block check), but it does mean that the position check failed, most likely because the position reported by the drive is less than the expected position.

    For details, please refer to http://www.symantec.com/docs/TECH169477

  • That was the error I received in the detailed status of the job.

    Could you please let me know how to check which servers have access to the tape library, and how to check whether a server is zoned to the tape library?

  • This may be a SAN zoning issue. When zoning is performed in the SAN, the HBA should be zoned to each drive using a separate zone. Do not put the HBA and all the tape drives in the same zone.

    If the HBA and the drives are all in one zone, a SCSI bus reset, drive reset, etc. will propagate to all the other drives. If you deploy a one-HBA-to-one-drive zoning strategy, such events are contained within the zone.

    I would shelve the media and wait for the data to expire, then unfreeze the tape and put it back into rotation.
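The one-HBA-to-one-drive rule above can be checked mechanically once the zone membership has been exported from the switch. The following is only a minimal sketch: the zone names, member names, and the dictionary layout are hypothetical, and how the zoning is exported depends on the switch vendor.

```python
def zones_with_shared_drives(zones, tape_drives):
    """Return the names of zones that contain more than one tape drive.

    zones:       dict mapping a zone name to a list of member names
    tape_drives: iterable of member names that are tape drives
    Both structures are hypothetical; real exports look different.
    """
    drives = set(tape_drives)
    return sorted(
        name
        for name, members in zones.items()
        if len(drives & set(members)) > 1
    )


if __name__ == "__main__":
    # Made-up example zoning, for illustration only.
    example_zones = {
        "z_hba1_all_drives": ["hba1", "drive1", "drive2"],
        "z_hba1_drive3": ["hba1", "drive3"],
    }
    example_drives = ["drive1", "drive2", "drive3"]
    for zone in zones_with_shared_drives(example_zones, example_drives):
        print(f"zone {zone} contains more than one tape drive")
```

Any zone it reports is a candidate for splitting into one zone per drive.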

  • sriharishkandula, please confirm:

    Is this an NDMP backup or not?

    NetBackup version?

    OS?

    Is this a VTL?

    This is a serious problem. It is often caused by a hardware or firmware issue.

    1. Ensure that the proper tape drive drivers are installed and are the latest version.

    2. If the tape drives are SAN-connected, ensure that the HBA (Storport) drivers are updated to the latest version.

    External event caused rewind: in the worst case, the media was rewound mid-backup and the header was overwritten, which means the data is lost. This can have several causes, all outside NetBackup. A common cause is that the media server drives are shared with an NDMP filer and the SCSI reservation type is set differently on the filer than in NetBackup. If there is no NDMP filer, other possible causes include an HBA fault, a firmware issue, or a SAN issue.

    In the best case, there is no data loss and no rewind, only a positioning error, so the backup probably just failed.

  • If you look in the bptm log you will see lines like this:

    00:00:21.366 [10057] <2> io_terminate_tape: block position check: actual 1304, expected 1304

    NBU knows how many blocks of data should have been written to the tape drive. It asks the drive to report its position, and the two values should match. If they do not, NBU gives the error you see, which can be misleading.

    If there really was a SCSI rewind, then something sent it over the SAN, and the data will be lost: the drive rewound during the backup (invisibly to NBU and the operating system) and then continued writing from the beginning of the tape, overwriting the tape header. This is easy to check:

    bpmedialist -m <media id> -mcontents

    If this mounts the tape, there was no SCSI rewind.

    If there was a SCSI rewind, there are two likely causes:

    1. Something sent a SCSI rewind (unfortunately, this is difficult to trace).
    2. If you have NDMP devices sharing the drives with media servers and the SCSI reservation type set in NBU differs from the one set on the NDMP devices, this issue can occur.

    More likely (in my experience) is that there is just a position failure, in which case the cause will be either a tape driver issue or a drive firmware issue.

    Martin
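The block position check Martin describes can be pulled out of a bptm log with a short script. This is a rough sketch that assumes the log lines follow the format quoted in his post; the function name is made up, and the second sample line is invented purely to show a mismatch.

```python
import re

# Matches the position-check text quoted in the post above.
POSITION_CHECK = re.compile(
    r"block position check: actual (\d+), expected (\d+)"
)


def find_position_mismatches(log_lines):
    """Return (line, actual, expected) for each check where the values differ."""
    mismatches = []
    for line in log_lines:
        match = POSITION_CHECK.search(line)
        if match is None:
            continue  # not a position-check line
        actual = int(match.group(1))
        expected = int(match.group(2))
        if actual != expected:
            mismatches.append((line.strip(), actual, expected))
    return mismatches


if __name__ == "__main__":
    sample = [
        # Line quoted in the thread: positions match.
        "00:00:21.366 [10057] <2> io_terminate_tape: "
        "block position check: actual 1304, expected 1304",
        # Hypothetical failing line, invented for illustration only.
        "00:05:10.101 [10057] <2> io_terminate_tape: "
        "block position check: actual 12, expected 2048",
    ]
    for _line, actual, expected in find_position_mismatches(sample):
        print(f"position mismatch: actual={actual}, expected={expected}")
```

A mismatch only shows that the position check failed. As explained above, it does not by itself prove that the tape was rewound.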
  • Hi guys, sorry for the late reply.

    @Ankit: We are currently using NetBackup version 7.1.4 and a 2008 Server.

    We don't have any VTL, but we use a Quantum i-scalar 600 library.

    So, Ankit, do you mean that installing the latest drive firmware and the latest HBA port firmware can resolve the issue?

    Also, is there any way I can retrieve the lost data from the media that encountered this problem?

  • Please have another look at Martin's post (it seems you ignored it completely?):

    https://www-secure.symantec.com/connect/forums/media-freezing#comment-8792431

    Your question:
    ... is there any way I can retrieve the lost data from the media that encountered this problem?

    Martin:
    This is easy to check:
    bpmedialist -m <media id> -mcontents

    You:
    .... they were NDMP backups

    Martin:
    2. If you have NDMP devices sharing the drives with media servers and the SCSI reservation type set in NBU differs from the one set on the NDMP devices, this issue can occur.
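For anyone who needs to run Martin's check against several frozen tapes, the command can be wrapped in a small script. This is only a sketch: the function names and the media ID are hypothetical, it assumes bpmedialist is on the PATH of the media server, and treating exit status 0 as "the tape could be read" is an assumption, so the command output should still be reviewed by hand.

```python
import subprocess


def build_contents_command(media_id, binary="bpmedialist"):
    """Build the command quoted in the thread for one media ID."""
    return [binary, "-m", media_id, "-mcontents"]


def run_contents_check(media_id, binary="bpmedialist"):
    """Run the command and return (succeeded, output).

    Assumption: a zero exit status means the tape contents were read.
    """
    result = subprocess.run(
        build_contents_command(media_id, binary),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr


if __name__ == "__main__":
    # "A00001" is a made-up media ID; only the command is printed here,
    # because running it needs a NetBackup media server.
    print(" ".join(build_contents_command("A00001")))
```

On a media server, run_contents_check would be called once per frozen media ID.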
