Tape drives non-functional

NBU35
Level 6

Hi 

We have a master server on Solaris 10 in a VCS cluster, running NBU 7.7.3.

Media server: HP-UX B.11.31 U ia64, NBU 7.7.3

Tape library: SL3000, which has 4 LTO4 and 20 LTO5 tape drives.

My robot control host has only 18 LTO5 tape drives configured on it.
The 4 LTO4 drives are not shared and are configured only on a specific media server (these drives are working fine).

The other 2 LTO5 drives are configured on another media server and are not shared.

Out of these 18 tape drives, 8 are not working.

They are permanently in DOWN-TLD mode. If we bring them UP and run backups, the backups fail with a tape-robot error.

A further concern is that even robtest cannot dismount the tapes; it returns SCSI errors:

m d16 s262

Initiating MOVE_MEDIUM from address 1015 to 2261

move_medium failed

sense key = 0x5, asc = 0x3a, ascq = 0x0, MEDIUM NOT PRESENT

We have checked with the tape library vendor; according to them, there is no hardware issue with the tape drives or the library.

Also, all devices are in CLAIMED state and visible to the OS.

Please suggest what I need to look at to fix this issue.

Marianne
Moderator
Partner    VIP    Accredited Certified

The tape needs to be ejected from the drive before it can be moved back to its slot.

So, if the tape is still in the drive, then the 'MEDIUM NOT PRESENT' error is correct. 
There is no 'unloaded' tape to be picked up by the robot hand.
The media server that loaded the tape needs to unload the tape first. 
Robot control host cannot do this if the drive is still reserved by the media server. 
bptm on media server will send unload/eject to the tape drive when the job is done. 
Evidence can be seen in bptm log on the media server.
I suggest level 3 log for troubleshooting (level 5 only upon request from Veritas Support).

To see why drives are DOWN'ed, you need to add 
VERBOSE
to vm.conf on all affected media servers and restart NBU/ltid.

Try to UP the drives after this.
The next time a drive is DOWN'ed, the exact reason will be logged in the system log on the media server.
(e.g. /var/adm/syslog/syslog.log on HP-UX media server,  /var/adm/messages on Solaris media server.)
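
The vm.conf change described above can be sketched as a small idempotent helper. This is only a sketch: `enable_verbose` is a hypothetical function name, and the default path `/usr/openv/volmgr/vm.conf` is assumed to be the standard location on the media server.

```shell
#!/bin/sh
# Sketch: ensure vm.conf contains a VERBOSE entry, without adding it twice.
# enable_verbose is a hypothetical helper, not a NetBackup command.
enable_verbose() {
    conf="${1:-/usr/openv/volmgr/vm.conf}"   # assumed default location

    if grep -q '^VERBOSE[[:space:]]*$' "$conf" 2>/dev/null; then
        echo "VERBOSE already set in $conf"
        return 0
    fi

    # If the file exists and does not end with a newline, add one first
    # so that VERBOSE lands on a line of its own.
    if [ -s "$conf" ] && [ -n "$(tail -c 1 "$conf")" ]; then
        echo >> "$conf"
    fi

    echo "VERBOSE" >> "$conf"
    echo "VERBOSE added to $conf"
}
```

Run it as root on each affected media server (`enable_verbose` with no argument uses the assumed default path), then restart NBU/ltid as described above so the setting takes effect.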

 

I am getting the following error from the unload command:

 

unload d17
Opening /dev/rtape/tape73_BESTnb, on the local host, please wait...
Error - cannot open /dev/rtape/tape73_BESTnb (No such device or address)

However, this drive is there in tpconfig -d:

18 TLB5-KJC-DRIVE17 hcart2 TLD(2) DRIVE=17
/dev/rtape/tape73_BESTnb DOWN
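
To compare what NetBackup has configured against what the OS reports, it can help to pull the drive names and device paths out of the `tpconfig -d` listing. The following is a rough sketch that assumes the two-line layout shown above (a numbered drive line followed by a path/status line); `list_drive_paths` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read `tpconfig -d` style output on stdin and print
# "drive_name device_path status", one drive path per line.
# Assumes the layout quoted in this thread; other layouts may differ.
list_drive_paths() {
    awk '
        # Drive line: starts with a numeric index, second field is the name.
        $1 ~ /^[0-9]+$/ && NF >= 3 { name = $2; next }
        # Path line: starts with a device file, second field is the status.
        $1 ~ /^\/dev\//            { print name, $1, $2 }
    '
}
```

For example, `tpconfig -d | list_drive_paths | awk '$3 ~ /^DOWN/'` would list only the DOWN drives, and each path can then be looked up in the `ioscan` output.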

Marianne
Moderator
Partner    VIP    Accredited Certified

If the device name was present before, then it means that 'something' happened that caused the OS to lose connectivity to the device. 
NBU uses the OS for device access.

So, you need to troubleshoot at OS level. 
Check output of '/usr/openv/volmgr/bin/scan' and 'ioscan -f'.

Check syslog file for errors. 

Thanks, I have fixed the issue.

You guided me right towards the syslogs; there were some SCSI RESERVATION CONFLICTs:

 

Operator requested SCSI Release of Drive TLB5-KJC-DRIVE11 returned RESERVATION CONFLICT
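
A quick way to see which drives are affected is to pull the drive names out of the syslog lines that mention a reservation conflict. This sketch assumes the message format quoted above ("... of Drive NAME returned RESERVATION CONFLICT") and the HP-UX syslog path mentioned earlier in the thread; `conflict_drives` is a hypothetical helper name, and other message variants would need a different pattern.

```shell
#!/bin/sh
# Sketch: list unique drive names that logged a RESERVATION CONFLICT.
# Only matches lines containing "Drive <name> " as in the message above.
conflict_drives() {
    log="${1:-/var/adm/syslog/syslog.log}"   # HP-UX syslog location
    grep 'RESERVATION CONFLICT' "$log" |
        sed -n 's/.*Drive \([^ ]*\) .*/\1/p' |
        sort -u
}
```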

I cleared them using the st command.

The procedure was as follows:

Found the drive path using tpconfig -d.
Then found the equivalent sctl path using ioscan -m dsf | grep -i "path".

Then ran st -f drivepath -r and vmoprcmd -crawlreleasebyname.

Now everything is working fine.
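
The release steps can be sketched as a dry-run helper that only prints the commands for one drive, so they can be reviewed before anything is run. `print_release_cmds` is a hypothetical name, and the device path passed in is whatever path was identified in the earlier steps; nothing here is executed against the drive.

```shell
#!/bin/sh
# Sketch: print (do not run) the release commands for one drive.
#   $1 = NetBackup drive name as shown by tpconfig -d
#   $2 = device path identified via tpconfig -d / ioscan -m dsf
print_release_cmds() {
    drive="$1"
    path="$2"
    if [ -z "$drive" ] || [ -z "$path" ]; then
        echo "usage: print_release_cmds DRIVE_NAME DEVICE_PATH" >&2
        return 1
    fi
    # HP-UX st: -r on the given device file, as used in the thread.
    echo "st -f $path -r"
    # NetBackup: release the reservation for the drive by name.
    echo "vmoprcmd -crawlreleasebyname $drive"
}
```

Review the printed commands and run them by hand as root on the media server that owns the drive path.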