Backup is downing the drive

DrStrangelove
Level 2

For some reason that I have been unable to diagnose, the drive in my Imation tape library keeps going down during backups, with status codes 910, 912, 913, and 915. Most of the files are backed up, but the job shows as a red error and the drive is downed. I then have to manually restart the server or try to restart the drive.

1 ACCEPTED SOLUTION

Marianne
Moderator
Partner    VIP    Accredited Certified

This looks like the Application log, right?

Fatal open error on IBM.ULTRIUM-HH4.000 (device 0, \\.\Tape0): The device is not connected.  DOWN'ing it

Anything in the System log?

It seems as if the OS is losing connectivity to the tape drive.

To fix this, check the System log for errors and see whether they are reported by the HBA driver or the tape driver.
The issue can be caused by anything in the data path: the HBA (firmware, driver, faulty card, etc.), a loose cable, a switch port, and so on.

You will have to examine each piece of the data path to pinpoint the error.
Get the OS team and the library support team involved.

5 REPLIES

mph999
Level 6
Employee Accredited

What does the bptm log show for the time the drive goes down? The messages log or the system event log would be useful as well.

Marianne
Moderator
Partner    VIP    Accredited Certified

Please show us all the text in the Details tab of the failed jobs.

I cannot find status 910 in the Status Code manual, but 912, 913, and 915 all seem to be Resource Broker issues related to disk volumes.

So it is really strange that you are seeing these errors when writing to tape. Seeing the text in the Details tab may help us get a better understanding of what is going wrong here.

Please also check the Windows Event Viewer System and Application logs for errors. If there are any, please copy the text of the error messages and post it here.
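
If it is easier than clicking through Event Viewer, the same errors can be dumped as text from a script. The following is only a sketch, assuming Python is available on the media server and that the standard Windows `wevtutil` tool is on the path; the function names and the default of 50 events are my own choices, not anything NetBackup provides.

```python
import subprocess


def build_wevtutil_command(log_name, max_events=50):
    """Build a wevtutil query for the newest Critical/Error events in one log."""
    # In the Windows event schema, Level 1 is Critical and Level 2 is Error.
    xpath = "*[System[(Level=1 or Level=2)]]"
    return [
        "wevtutil", "qe", log_name,
        f"/q:{xpath}",        # XPath filter on the event level
        f"/c:{max_events}",   # maximum number of events to return
        "/rd:true",           # reverse direction: newest events first
        "/f:text",            # plain-text output that can be pasted into a post
    ]


def query_events(log_name, max_events=50):
    """Run the query (Windows only) and return the text output."""
    result = subprocess.run(
        build_wevtutil_command(log_name, max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `print(query_events("System"))` and `print(query_events("Application"))` from an elevated prompt should give text suitable for posting here.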

To enable logging from the NBU point of view, please create a bptm folder under <install-path>\Veritas\Netbackup\logs\. The bptm log files will contain NBU I/O actions and errors.
Also add a VERBOSE entry to <install-path>\Veritas\volmgr\vm.conf and restart the NetBackup Device Management service. This will log NBU Media Manager issues to Event Viewer.
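
For anyone who would rather script those two steps, here is a minimal sketch. It assumes Python on the media server; `enable_nbu_logging` is a hypothetical helper name, and `install_path` stands in for whatever <install-path> is on your system. It does not restart the NetBackup Device Management service, so that still has to be done afterwards.

```python
from pathlib import Path


def enable_nbu_logging(install_path):
    """Create the bptm log folder and make sure vm.conf has a VERBOSE entry.

    install_path is the directory referred to as <install-path> above.
    Returns the bptm folder and the vm.conf path.
    """
    root = Path(install_path) / "Veritas"

    # Step 1: the bptm folder under ...\Veritas\Netbackup\logs\
    bptm_dir = root / "Netbackup" / "logs" / "bptm"
    bptm_dir.mkdir(parents=True, exist_ok=True)

    # Step 2: a VERBOSE line in ...\Veritas\volmgr\vm.conf (added only once)
    vm_conf = root / "volmgr" / "vm.conf"
    vm_conf.parent.mkdir(parents=True, exist_ok=True)
    text = vm_conf.read_text() if vm_conf.exists() else ""
    if "VERBOSE" not in [line.strip() for line in text.splitlines()]:
        # Start on a fresh line if the file does not already end with one.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with vm_conf.open("a") as handle:
            handle.write(prefix + "VERBOSE\n")

    return bptm_dir, vm_conf
```

Running it twice is harmless: the folder creation is idempotent and the VERBOSE line is not duplicated.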

 

DrStrangelove
Level 2

Marianne,

Here is the Event Viewer information. I could not find the bptm.log.

Event Viewer Application Errors

Daemon terminating because ltid is no longer active

emmlib_UpdateDriveRuntime failed, status=258

Operator/EMM server has DOWN'ed drive IBM.ULTRIUM-HH4.000 (device 0)

Fatal open error on IBM.ULTRIUM-HH4.000 (device 0, \\.\Tape0): The device is not connected.  DOWN'ing it

TLD(0) [4984] timed out after waiting 842 seconds for ready, drive 1

 

DrStrangelove
Level 2

Found the bptm.log

05:57:44.291 [5752.6020] <2> bptm: INITIATING (VERBOSE = 0): -delete_all_expired
05:57:44.291 [5752.6020] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2820] name2 is empty 0 0x0
05:57:44.291 [5752.6020] <4> bptm: emmserver_name = ibr2ccofs003
05:57:44.291 [5752.6020] <4> bptm: emmserver_port = 1556
05:57:44.322 [5752.6020] <2> Orb::init: Enabling ORBNativeCharCodeSet UTF-8(Orb.cpp:594)
05:57:44.322 [5752.6020] <2> Orb::init: initializing ORB EMMlib_Orb with: bptm -ORBSvcConfDirective "-ORBDottedDecimalAddresses 0" -ORBSvcConfDirective "static Resource_Factory '-ORBNativeCharCodeSet UTF-8'" -ORBSvcConfDirective "static PBXIOP_Factory '-enable_keepalive'" -ORBSvcConfDirective "static EndpointSelectorFactory ''" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory PBXIOP_Factory'" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory IIOP_Factory'" -ORBDefaultInitRef '' -ORBSvcConfDirective "static PBXIOP_Evaluator_Factory '-orb EMMlib_Orb'" -ORBSvcConfDirective "static Resource_Factory '-ORBConnectionCacheMax 1024 '" -ORBSvcConf nul -ORBSvcConfDirective "static Server_Strategy_Factory '-ORBMaxRecvGIOPPayloadSize 268435456'"(Orb.cpp:825)
05:57:44.338 [5752.6020] <2> Orb::init: caching EndpointSelectorFactory(Orb.cpp:840)
05:57:44.338 [5752.6020] <2> Orb::setOrbConnectTimeout: timeout seconds: 60(Orb.cpp:1481)
05:57:44.338 [5752.6020] <2> Orb::setOrbRequestTimeout: timeout seconds: 1800(Orb.cpp:1490)
05:57:44.400 [5752.6020] <2> bptm: EXITING with status 0 <----------
06:02:44.451 [3628.3472] <2> bptm: INITIATING (VERBOSE = 0): -rptdrv -jobid -1405382642 -jm
06:02:44.451 [3628.3472] <2> drivename_open: Called with Create 0, file IBM.ULTRIUM-HH4.000
06:02:44.451 [3628.3472] <2> drivename_checklock: Called
06:02:44.451 [3628.3472] <2> drivename_checklock: File is locked
06:02:44.451 [3628.3472] <2> report_drives: DRIVE = IBM.ULTRIUM-HH4.000 LOCK = TRUE CURTIME = 1405429364
06:02:44.451 [3628.3472] <2> drivename_close: Called for file IBM.ULTRIUM-HH4.000
06:02:44.466 [3628.3472] <2> report_drives: MODE = 0
06:02:44.466 [3628.3472] <2> report_drives: TIME = 1405427538
06:02:44.466 [3628.3472] <2> report_drives: MASTER = ibr2ccofs003
06:02:44.466 [3628.3472] <2> report_drives: SR_KEY = 0 1
06:02:44.466 [3628.3472] <2> report_drives: PATH = {4,0,3,0}
06:02:44.466 [3628.3472] <2> report_drives: MEDIA = 0132L4
06:02:44.466 [3628.3472] <2> report_drives: REQID = -1405382638
06:02:44.466 [3628.3472] <2> report_drives: ALOCID = 221
06:02:44.466 [3628.3472] <2> report_drives: RBID = {6E8B533B-90DA-4F88-9950-B7EB6AD9E2E9}
06:02:44.466 [3628.3472] <2> report_drives: PID = 5268
06:02:44.466 [3628.3472] <2> report_drives: FILE = C:\Program Files\Veritas\NetBackup\db\media\tpreq\drive_IBM.ULTRIUM-HH4.000
06:02:44.466 [3628.3472] <2> main: Sending [EXIT STATUS 0] to NBJM
06:02:44.466 [3628.3472] <2> bptm: EXITING with status 0 <----------
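
When the bptm log gets long, it can help to filter it rather than read it line by line. The sketch below assumes every line follows the layout seen in the excerpt above (time, `[pid.tid]`, a numeric level in angle brackets, a function name, then the message); reading the bracketed pair as process and thread IDs is my assumption, and `BptmEntry`, `parse_bptm_line`, and `entries_at_or_above` are made-up names for illustration.

```python
import re
from typing import NamedTuple, Optional

# Layout assumed from the excerpt: HH:MM:SS.mmm [pid.tid] <level> function: message
_LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\.(\d+)\] <(\d+)> (\S+?): (.*)$"
)


class BptmEntry(NamedTuple):
    time: str
    pid: int
    tid: int
    severity: int
    function: str
    message: str


def parse_bptm_line(line: str) -> Optional[BptmEntry]:
    """Parse one bptm log line; return None if it does not match the layout."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    time, pid, tid, severity, function, message = match.groups()
    return BptmEntry(time, int(pid), int(tid), int(severity), function, message)


def entries_at_or_above(lines, min_severity):
    """Keep only parsed entries whose numeric level is at least min_severity."""
    parsed = (parse_bptm_line(line) for line in lines)
    return [e for e in parsed if e is not None and e.severity >= min_severity]
```

For example, `entries_at_or_above(open(path), 8)` on the excerpt above would keep only the `vnet_same_host_and_update` line, since every other line is logged at level 2 or 4.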
 
