scan -changer not showing the robot on NetBackup Media Server
Hi All,
We have a Linux Master Server and a Linux Media Server.
Server details (Media Server):
uname -a
Linux <Server name> 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux
The following command does not show the robot, although all the tapes are visible.
Backups and restores are running fine.
scan -changer
Hi Tulika,
I think I answered all these questions when we took a closer look at the system.
We found the issue was intermittent, so whenever the robot became available, NBU was able to load/unload tapes etc.
You asked,
"When the SCSI command is successfull the Server detects the Robot in the SCAN Command ,else it fails."
This is simple. The commands
scan / tpautoconf -r / scsi_command etc. are all very similar. Sure, the output is different, but they all work by sending SCSI commands to the devices. So when one breaks they all break, and when one works they all work. We found that scsi_command -d <device file to robot> failed intermittently. For a given time period, say an hour, we found it was failing for 50+ minutes, but would work for just a few minutes.
You keep looking at NetBackup. Forget it: until scsi_command -d (and the other commands) work 100% of the time, you cannot expect NetBackup to work.
We found that even when scsi_command was not working for the robot, it always worked for a tape drive ( scsi_command -d <tape drive device file> ). This shows that the issue is related to something between the OS and the robot.
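To put numbers on how intermittent the failure is, the same check can be repeated on a schedule for both the robot and a tape drive. The following is only a rough sketch, not a NetBackup tool: the function names are made up for illustration, and it assumes a probe counts as failed when the command exits non-zero, times out, or prints the word "failed" (as in the "scsi inquiry command failed" message). Verify that assumption against your own output before relying on it.

```python
import subprocess
import time


def probe(cmd, timeout=30.0):
    """Run one command; return True only if it looks like it succeeded.

    Assumption: failure means a non-zero exit status, a timeout, or the
    word "failed" anywhere in the combined output.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "failed" not in result.stdout.lower()


def success_rates(commands, attempts, interval, run=probe, sleep=time.sleep):
    """Probe every command `attempts` times, `interval` seconds apart.

    `commands` maps a label to an argument list. Returns a dict mapping
    each label to the fraction of probes that succeeded.
    """
    counts = {name: 0 for name in commands}
    for attempt in range(attempts):
        for name, cmd in commands.items():
            if run(cmd):
                counts[name] += 1
        if attempt < attempts - 1:
            sleep(interval)
    return {name: counts[name] / attempts for name in commands}
```

For example, success_rates({"robot": ["scsi_command", "-d", "/dev/sg5"], "drive": ["scsi_command", "-d", "/dev/nst0"]}, attempts=60, interval=60) would sample both devices once a minute for an hour. The device files here are placeholders and scsi_command is assumed to be on the PATH. A drive rate near 1.0 alongside a much lower robot rate would match what we saw.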
I found that the meaning of this:

scsi inquiry command failed
status 2h, key bh, ASC 4eh, ASCQ 0h
sense 0x0b, asc 0x4e, ascq 0x00 occurred

is "OVERLAPPED COMMANDS ATTEMPTED".

Searching the NBU database I found multiple previous calls ...

(1)
SOLUTION:
Refer to hardware vendor to address SCSI errors.
TROUBLESHOOTING STEPS:
Sep 17 11:54:50 scsi: [ID 107833 kern.notice] ASC: 0x4e (overlapped commands attempted), ASCQ: 0x0, FRU: 0xf5
Sep 17 23:01:27 last message repeated 2 times
Sep 18 04:42:33 scsi: [ID 107833 kern.warning] WARNING: /pci@8,600000/fibre-channel@1/st@7,0 (st23):
Sep 18 04:42:33 SCSI transport failed: reason 'tran_err': giving up

(2)
Sep 8 11:41:48 bptm[7620]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
Sep 8 11:41:48 scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/fibre-channel@2/st@4,2 (st270):
Sep 8 11:41:48 Error for Command: rezero/rewind Error Level: Fatal
Sep 8 11:41:48 scsi: [ID 107833 kern.notice] Requested Block: 0 Error Block: 0
Sep 8 11:41:48 scsi: [ID 107833 kern.notice] Vendor: QUANTUM Serial Number: W $ i
Sep 8 11:41:48 scsi: [ID 107833 kern.notice] Sense Key: Aborted Command
Sep 8 11:41:48 scsi: [ID 107833 kern.notice] ASC: 0x4e (overlapped commands attempted), ASCQ: 0x0, FRU: 0x0
SOLUTION: Customer will have their SAN group take a look at the storage area hardware.

(3)
Command overlap: The library detected another command from an initiator while one was already in process. This is more often caused by third party monitoring software polling the peripheral devices.
SOLUTION: We found that customer installed 'Sun Management Center Agent' on this server. Agent was disabled and no errors occurred during monitoring period.

(4)
"Key = 0xb, asc = 0x4e, ascq = 0x0, OVERLAPPED COMMANDS ATTEMPTED"
inquiry error: cannot determine robot type
slot read_element_status error
followed by move medium failed errors.
Advised customer since this was working before and is not a new library, there is nothing in NBU which will cause this issue. Issue lies either with the library or the communication path between library and server.

(5)
From Technote S:TECH183410
Error
/var/log/syslog
OVERLAPPED COMMANDS ATTEMPTED. Disconnect during command processing. SCSI Error - ASC[0x4E], ASCQ[0x00]
Environment
Solaris Master
Solaris Media server
StorageTek SL Library
Cause
Vendor hardware related SCSI error with cartridge access port (CAP).
Solution
Library Hardware vendor required fix

None of the previous times we have seen this issue have had NBU as a cause.
We know the NBU config is correct, because it does work sometimes; if the config was wrong it would never work.
The issue is outside of NetBackup.

Martin
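For anyone who wants to decode the "status / key / ASC / ASCQ" line from the reply without looking up the tables by hand, here is a minimal sketch. The lookup tables are deliberately tiny: they hold only a few standard SCSI status and sense key values plus the one ASC/ASCQ pair discussed in this thread. The function name and the regular expression are illustrative assumptions about the line format shown above, not part of any NetBackup utility.

```python
import re

# Small subsets of the standard SCSI tables; anything else is reported as unknown.
STATUS = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY"}
SENSE_KEY = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}
ASC_ASCQ = {(0x4E, 0x00): "OVERLAPPED COMMANDS ATTEMPTED"}

# Matches lines such as "status 2h, key bh, ASC 4eh, ASCQ 0h" (hex with 'h' suffix).
LINE = re.compile(
    r"status\s+([0-9a-f]+)h,\s*key\s+([0-9a-f]+)h,"
    r"\s*asc\s+([0-9a-f]+)h,\s*ascq\s+([0-9a-f]+)h",
    re.IGNORECASE,
)


def decode_sense(line):
    """Translate one status/key/ASC/ASCQ line into readable names."""
    match = LINE.search(line)
    if match is None:
        raise ValueError("no status/key/ASC/ASCQ fields found")
    status, key, asc, ascq = (int(group, 16) for group in match.groups())
    return {
        "status": STATUS.get(status, "unknown (0x%02x)" % status),
        "sense_key": SENSE_KEY.get(key, "unknown (0x%x)" % key),
        "additional": ASC_ASCQ.get(
            (asc, ascq), "unknown (0x%02x/0x%02x)" % (asc, ascq)
        ),
    }
```

Passing the line "status 2h, key bh, ASC 4eh, ASCQ 0h" to decode_sense gives CHECK CONDITION, ABORTED COMMAND and OVERLAPPED COMMANDS ATTEMPTED, which is the same reading as in the reply above.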