NetBackup 7.0.1 robot displaying down-tld
NetBackup 7.0.1. Robot displaying down-tld; it does not respond to a reset of the drive or a cleaning of the drive. Restarted the daemons, no difference. Powered the tape library off and on, no difference. An initial check of the logs did not show any clear issues. If anyone has some ideas I would appreciate it. Thank you in advance.
All the advice above is excellent.
Generally in NBU, if the drive(s) have been working correctly and nothing has been changed, it is very unlikely that NBU is the cause. The main reason is that NBU does not write to the drives itself; that is all done by the OS.
Just occasionally, I find that completely removing and reconfiguring the drive brings it back to life. By that I mean remove it from the OS and NBU completely, then put it back.
If that does not bring it back to life, then no amount of prodding, poking or tickling it under the chin is likely to make a difference, and it probably needs to go off to the 'tape drive hospital' for some treatment.
NBU has minimal contact with tape drives. The only thing it does is send a few SCSI commands and, apart from on some versions of Unix/Linux, even these go via the OS. Even then, these SCSI commands are only used so NBU knows when the tape is in the drive. After that point, it's all OS (NBU just passes the data to the OS, which then writes it onto tape).
It is for this reason, as pointed out by Marianne, that investigation of tape drive issues should start at the OS.
We can look in the file /usr/openv/netbackup/db/media/errors (or on Windows <install>\veritas\netbackup\db\media\errors) and get some idea of whether there is any pattern to the errors on the drive or media.
If you have access to Solaris, you can download tperr.sh and run it against the media/errors file; full instructions and download here:
https://www-secure.symantec.com/connect/downloads/tperrsh-script-solaris-only
(the errors file is on each media server).
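If you do not have a Solaris box handy for tperr.sh, a rough tally can be done with a few lines of Python. This is only a minimal sketch and not a replacement for tperr.sh. It assumes each line of the errors file is whitespace-separated as date, time, media ID, drive index, error type, drive name; that layout is my assumption, so check a few lines of your own file and adjust the field positions if they differ. The function names are made up for illustration.

```python
from collections import Counter

# ASSUMPTION: each line of the errors file looks like
#   <date> <time> <media_id> <drive_index> <error_type> <drive_name>
# Adjust these positions if your file is laid out differently.
MEDIA_FIELD = 2
DRIVE_FIELD = 3
ERROR_FIELD = 4


def tally_errors(lines):
    """Count errors per media ID, per drive index and per (drive, error type)."""
    by_media = Counter()
    by_drive = Counter()
    by_drive_error = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) <= ERROR_FIELD:
            continue  # skip blank or short lines
        by_media[fields[MEDIA_FIELD]] += 1
        by_drive[fields[DRIVE_FIELD]] += 1
        by_drive_error[(fields[DRIVE_FIELD], fields[ERROR_FIELD])] += 1
    return by_media, by_drive, by_drive_error


def report(path="/usr/openv/netbackup/db/media/errors"):
    """Print the busiest drives and media from the errors file at `path`."""
    with open(path, errors="replace") as handle:
        by_media, by_drive, _ = tally_errors(handle)
    print("Errors per drive index:")
    for drive, count in by_drive.most_common():
        print(f"  {drive}: {count}")
    print("Errors per media ID (top 10):")
    for media, count in by_media.most_common(10):
        print(f"  {media}: {count}")
```

Run report() on each media server. As a rule of thumb, many errors on one drive spread across different tapes points at the drive, while one tape failing in several drives points at the media.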
The system log should show some detail (follow Marianne's instructions). If you see anything that mentions io_ioctl, ASC/ASCQ or TapeAlert, it is almost 100% certain you have a faulty drive.
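To pull those lines out of a large system log, something like the following could be used. Again, a minimal sketch: the keywords are the ones mentioned above, the function names are hypothetical, and the log path is an assumption you pass in yourself (typically /var/adm/messages on Solaris or /var/log/messages on Linux).

```python
import re

# Keywords from the post above, matched case-insensitively.
# \bASCQ?\b matches "ASC" or "ASCQ" as whole words only, so ordinary
# words that merely contain "asc" are not picked up.
PATTERN = re.compile(r"io_ioctl|\bASCQ?\b|tape\s*alert", re.IGNORECASE)


def suspicious_lines(lines):
    """Return the lines that mention io_ioctl, ASC/ASCQ or TapeAlert."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan(path):
    """Print every suspicious line found in the log file at `path`."""
    with open(path, errors="replace") as handle:
        for line in suspicious_lines(handle):
            print(line)
```

For example, scan("/var/adm/messages") on a Solaris media server. An empty result does not prove the drive is healthy; it only means none of these keywords appear in that particular log.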
Martin