02-02-2016 02:59 AM
Hi All,
On one of our master servers, we are unable to run a robot inventory from NetBackup. The same inventory works from the library web GUI, and robtest is working fine.
Has anyone faced the same issue? There are no errors in the library console or in robtest.
Thanks
02-02-2016 04:02 AM
What do these query commands show:
nbemmcmd -listhosts
tpconfig -d
02-02-2016 04:08 AM
[ccmn_na_p@meuubkpprd01 admincmd]$ sudo ./nbemmcmd -listhosts
NBEMMCMD, Version: 7.7
The following hosts were found:
server meuubkpprd01.monsanto.com
master meuubkpprd01.monsanto.com
foreign_media la6232bkp01
virtual_machine meuwvcprd01.la.ds.monsanto.com
virtual_machine la2589esx02.monsanto.com
virtual_machine la2589esx03.monsanto.com
ndmp meuubkpprd01.monsanto.com
Command completed successfully.
[ccmn_na_p@meuubkpprd01 admincmd]$ cd ../../../..
[ccmn_na_p@meuubkpprd01 usr]$ cd openv/volmgr/bin
[ccmn_na_p@meuubkpprd01 bin]$ ./tpconfig -d
Access to perform the operation was denied
[ccmn_na_p@meuubkpprd01 bin]$ sudo ./tpconfig -d
Id DriveName Type Residence
Drive Path Status
****************************************************************************
0 IBM.ULT3580-HH4.000 hcart3 TLD(0) DRIVE=2
/dev/nst1 UP
1 IBM.ULT3580-HH4.001 hcart3 TLD(0) DRIVE=1
/dev/nst0 UP
Currently defined robotics are:
TLD(0) robotic path = /dev/sg4
EMM Server = meuubkpprd01.monsanto.com
[ccmn_na_p@meuubkpprd01 bin]$
02-02-2016 04:12 AM
Is the path to the library definitely correct?
On the robot control host, find the path to the library from tpconfig -d (near the bottom)
Then run
echo 'mode' |tldtest -r /path/to/library (this might work)
and
/usr/openv/volmgr/bin/scsi_command -f <path/to/library>
It looks like the robot is failing to respond to the SCSI commands that NetBackup sends to run an inventory. This is completely separate from the library console inventory; the fact that the console inventory works is of course good, but it says nothing about whether the NBU inventory will work.
All that happens in an NBU inventory is that SCSI commands are sent to the library, and the library responds and tells NBU where the tapes are. That is why this is very, very unlikely to be a NetBackup issue.
As a starting point (apart from the commands above and sdo's comments), I'd suggest rebooting the library via a proper power off/on and then trying again, and also rebooting the media server that is the robot control host.
If the path is correct and the above still fails after the reboots, you'll probably need to call the hardware vendor.
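The "is the path correct" check above can be sketched as a small shell helper. This is only an illustration: robot_paths and check_robot_path are made-up function names, and the parsing assumes the "robotic path = ..." line format shown in the tpconfig -d output earlier in this thread.

```shell
#!/bin/sh
# Sketch only: robot_paths and check_robot_path are hypothetical helper names.

# Print every robotic path found in `tpconfig -d` style text on stdin,
# e.g. "TLD(0) robotic path = /dev/sg4" -> "/dev/sg4".
robot_paths() {
    sed -n 's/.*robotic path = \([^ ]*\).*/\1/p'
}

# Report whether the given path exists as a character device on this host.
check_robot_path() {
    if [ -c "$1" ]; then
        echo "$1: character device present"
    else
        echo "$1: missing or not a character device"
    fi
}

# Example use on the robot control host (needs NetBackup and root,
# so it is left commented out here):
# sudo /usr/openv/volmgr/bin/tpconfig -d | robot_paths | while read -r p; do
#     check_robot_path "$p"
# done
```

A device node that exists only proves the OS has created the path; the robot still has to answer SCSI commands on it, which is what the tldtest and scsi_command checks above are for.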
02-02-2016 04:15 AM
Ok - looks like one master/media server.
What do these query commands show:
scan
tpautoconf -report_disc
.
BTW - do NOT run any of the other variants of "tpautoconf" - just yet. Only run the above query commands please.
02-02-2016 04:24 AM
Okay, here is the output of sdo's commands:
[ccmn_na_p@meuubkpprd01 bin]$ ./scan
************************************************************
*********************** SDT_TAPE ************************
*********************** SDT_CHANGER ************************
************************************************************
[ccmn_na_p@meuubkpprd01 bin]$ ./tpautoconf -report_disc
======================= Missing Device (Drive) =======================
Drive Name = IBM.ULT3580-HH4.000
Drive Path = /dev/nst1
Inquiry = "IBM ULT3580-HH4 89B1"
Serial Number = 1K10024956
TLD(0) definition Drive = 2
Hosts configured for this device:
Host = meuubkpprd01.monsanto.com
======================= Missing Device (Drive) =======================
Drive Name = IBM.ULT3580-HH4.001
Drive Path = /dev/nst0
Inquiry = "IBM ULT3580-HH4 C7QJ"
Serial Number = 1K10031782
TLD(0) definition Drive = 1
Hosts configured for this device:
Host = meuubkpprd01.monsanto.com
=========== Missing Device or no local control path (Robot) ===========
Defined as robotic TLD(0)
Inquiry = "IBM 3573-TL B.20"
Serial Number = 00X2U78K0627_LL0
Robot Path = /dev/sg4
Drive = 2, Drive Name = IBM.ULT3580-HH4.000, Serial Number = 1K10024956
Hosts configured for this device:
Host = meuubkpprd01.monsanto.com
======================================================================
To try mhp999's suggestion: I am getting permission denied for the tldtest command, so I need support from another team. Once that is done I will post the results.
02-02-2016 04:40 AM
So the paths to the drives and robot are missing.
This is at OS level
Try rebooting the server
02-02-2016 04:40 AM
I thought you said robtest is working fine?
robtest is calling tldtest.
There is no way robtest/tldtest can be working fine if 'scan' cannot see the robot and 'tpautoconf -report_disc' is reporting missing devices.
Please show output of 'cat /proc/scsi/scsi' as well.
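A quick way to read that file is to count the device types the OS reports. The sketch below is an assumption-laden illustration: scsi_type_count is a made-up helper name, and it relies on the usual "Type:" lines in /proc/scsi/scsi.

```shell
#!/bin/sh
# Sketch only: scsi_type_count is a hypothetical helper name.

# scsi_type_count TYPE [FILE]
# Count the "Type:" lines in FILE (default /proc/scsi/scsi) that mention TYPE.
scsi_type_count() {
    awk -v t="$1" 'index($0, "Type:") && index($0, t) { n++ } END { print n + 0 }' \
        "${2:-/proc/scsi/scsi}"
}

# Example use on the robot control host:
# scsi_type_count 'Medium Changer'      # robots visible to the OS
# scsi_type_count 'Sequential-Access'   # tape drives visible to the OS
```

For the configuration in this thread, one medium changer and two sequential-access devices would be expected if the OS can see everything.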
02-02-2016 04:41 AM
Scan shows nothing, meaning that the OS cannot see any medium changer or tape drives. This is not a NetBackup config error. This means the OS cannot see the devices, and therefore NetBackup cannot do anything at all.
The tpautoconf -report_disc command means: "Hey NetBackup, show me the discrepancies between what the OS reports and what NetBackup believes my configuration to be." In a normal working scenario this command reports nothing, i.e. no discrepancies. Because it does report discrepancies here, NetBackup can no longer find the devices listed.
So, scan is empty.
And NetBackup reports it cannot find the devices.
You need to figure out from an OS level what is wrong.
.
What changed recently?
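To make the discrepancy report easier to scan, the missing paths can be pulled out of it. This is a rough sketch: missing_paths is a made-up helper name, and the parsing assumes the "Missing Device" block layout shown in the tpautoconf -report_disc output earlier in this thread.

```shell
#!/bin/sh
# Sketch only: missing_paths is a hypothetical helper name.

# Read `tpautoconf -report_disc` style text on stdin and print one line per
# "Missing Device" block: the device kind followed by its configured path.
missing_paths() {
    awk '
        /Missing Device/            { in_block = 1; next }
        in_block && /Drive Path =/  { print "drive", $NF; in_block = 0 }
        in_block && /Robot Path =/  { print "robot", $NF; in_block = 0 }
    '
}

# Example use (needs NetBackup and root, so it is left commented out here):
# sudo /usr/openv/volmgr/bin/tpautoconf -report_disc | missing_paths
```

Empty output corresponds to the normal case described above, where NetBackup and the OS agree.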
02-02-2016 04:48 AM
Hi All,
I listed the SCSI devices; please check the output.
[0:2:0:0] disk DELL PERC H710 3.13 /dev/sda
[0:2:1:0] disk DELL PERC H710 3.13 /dev/sdb
[5:0:0:0] cd/dvd PLDS DVD-ROM DS-8D9SH JD51 /dev/sr0
[8:0:0:0] tape IBM ULT3580-HH4 C7QJ /dev/st0
[8:0:0:1] mediumx IBM 3573-TL B.20 /dev/sch0
[8:0:1:0] tape IBM ULT3580-HH4 89B1 /dev/st1
and
cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 02 Id: 00 Lun: 00
Vendor: DELL Model: PERC H710 Rev: 3.13
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 02 Id: 01 Lun: 00
Vendor: DELL Model: PERC H710 Rev: 3.13
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi5 Channel: 00 Id: 00 Lun: 00
Vendor: PLDS Model: DVD-ROM DS-8D9SH Rev: JD51
Type: CD-ROM ANSI SCSI revision: 05
Host: scsi8 Channel: 00 Id: 00 Lun: 00
Vendor: IBM Model: ULT3580-HH4 Rev: C7QJ
Type: Sequential-Access ANSI SCSI revision: 03
Host: scsi8 Channel: 00 Id: 00 Lun: 01
Vendor: IBM Model: 3573-TL Rev: B.20
Type: Medium Changer ANSI SCSI revision: 05
Host: scsi8 Channel: 00 Id: 01 Lun: 00
Vendor: IBM Model: ULT3580-HH4 Rev: 89B1
Type: Sequential-Access ANSI SCSI revision: 03
Thanks.
I can run robtest, though I don't know how it works when scan shows nothing. Here is the output.
[ccmn_na_p@meuubkpprd01 bin]$ sudo ./robtest
Configured robots with local control supporting test utilities:
TLD(0) robotic path = /dev/sg4
Robot Selection
---------------
1) TLD 0
2) none/quit
Enter choice: 1
Robot selected: TLD(0) robotic path = /dev/sg4
Invoking robotic test utility:
/usr/openv/volmgr/bin/tldtest -rn 0 -r /dev/sg4
Opening /dev/sg4
MODE_SENSE complete
Enter tld commands (? returns help information)
s d
drive 1 (addr 256) access = 1 Contains Cartridge = no
drive 2 (addr 257) access = 1 Contains Cartridge = no
READ_ELEMENT_STATUS complete
Invalid command
s s
slot 1 (addr 4096) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
slot 2 (addr 4097) contains Cartridge = yes
Source address = 4097
Barcode = 000132L4
slot 3 (addr 4098) contains Cartridge = yes
Source address = 4098
Barcode = 000148L4
slot 4 (addr 4099) contains Cartridge = yes
Source address = 4099
Barcode = 000174L4
slot 5 (addr 4100) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
slot 6 (addr 4101) contains Cartridge = yes
Source address = 4101
Barcode = 000123L4
slot 7 (addr 4102) contains Cartridge = yes
Source address = 4102
Barcode = 000142L4
slot 8 (addr 4103) contains Cartridge = yes
Source address = 4103
Barcode = 000246L4
<< Press return to continue, or q and return to stop >>
slot 9 (addr 4104) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
slot 10 (addr 4105) contains Cartridge = yes
Source address = 4105
Barcode = 000128L4
slot 11 (addr 4106) contains Cartridge = yes
Sense code = 0x30, Code qualifier = 0x3
Source address = 4106
Barcode =
slot 12 (addr 4107) contains Cartridge = yes
Source address = 4107
Barcode = 000826L4
slot 13 (addr 4108) contains Cartridge = no
slot 14 (addr 4109) contains Cartridge = yes
Sense code = 0x30, Code qualifier = 0x3
Source address = 4109
Barcode =
slot 15 (addr 4110) contains Cartridge = no
slot 16 (addr 4111) contains Cartridge = yes
Source address = 4111
Barcode = 000130L4
<< Press return to continue, or q and return to stop >>
slot 17 (addr 4112) contains Cartridge = no
slot 18 (addr 4113) contains Cartridge = yes
Source address = 4113
Barcode = 000121L4
slot 19 (addr 4114) contains Cartridge = yes
Source address = 4114
Barcode = 000825L4
slot 20 (addr 4115) contains Cartridge = no
slot 21 (addr 4116) contains Cartridge = no
slot 22 (addr 4117) contains Cartridge = no
slot 23 (addr 4118) contains Cartridge = yes
Source address = 4118
Barcode = 000820L4
slot 24 (addr 4119) contains Cartridge = no
READ_ELEMENT_STATUS complete
q
Robot Selection
---------------
1) TLD 0
2) none/quit
Enter choice: 2
[ccmn_na_p@meuubkpprd01 bin]$
02-02-2016 05:05 AM
So - earlier on the devices were 'missing'.
What did you do to get them back?
Reboot?
Have you tried to Inventory again?
02-02-2016 05:32 AM
Firstly, you have an ill library
slot 1 (addr 4096) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
You should not be getting Sense codes against the slots.
Robtest connected via /dev/sg4
scan shows sg4
=========== Missing Device or no local control path (Robot) ===========
Defined as robotic TLD(0)
Inquiry = "IBM 3573-TL B.20"
Serial Number = 00X2U78K0627_LL0
Robot Path = /dev/sg4
Drive = 2, Drive Name = IBM.ULT3580-HH4.000, Serial Number = 1K10024956
Hosts configured for this device:
Host = meuubkpprd01.monsanto.com
Either scan failed when it ran because the connection is intermittent, and robtest happened to run at a time when the path was working.
Another possibility is that the robot isn't responding to the SCSI commands sent by scan, but does respond to the commands sent by robtest.
We seem to have another device path floating around. I'm not 100% sure what this is, as on Linux, as far as I know, we use /dev/sg
[8:0:0:1] mediumx IBM 3573-TL B.20 /dev/sch0
I'd be tempted to reboot everything as a first port of call
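To see at a glance which slots are returning sense codes, the tldtest "s s" output can be filtered. This is only a sketch: slots_with_sense is a made-up helper name, and the parsing assumes the slot and "Sense code" line format shown in the robtest output earlier in this thread.

```shell
#!/bin/sh
# Sketch only: slots_with_sense is a hypothetical helper name.

# Read tldtest "s s" output on stdin and print one line for each slot
# that is followed by a "Sense code = ..." line.
slots_with_sense() {
    awk '
        /^ *slot [0-9]+ / { slot = $2 }
        /Sense code =/    {
            gsub(/,/, "", $4)   # strip the trailing comma from the sense code
            print "slot " slot ": sense " $4 " qualifier " $NF
        }
    '
}

# Example use: save the "s s" output from robtest to a file, then run
# slots_with_sense < robtest_slots.txt
```

The file name robtest_slots.txt is just a placeholder. What each sense code means is for the library vendor's documentation to say; the helper only lists the affected slots.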
02-02-2016 09:23 PM
02-03-2016 11:12 PM
Hi All,
Sorry for the delay, I was stuck with a production issue. The robtest output shared in my earlier comment was taken after the scan command output where we found the missing devices. But now we are not getting any robtest output. We are going to reboot the library and the server after the necessary approvals, which will take another two or three days.
Will keep you posted. Please share any suggestions which don't require a reboot.
Thanks
02-03-2016 11:43 PM
02-04-2016 12:52 AM
Hi Marianne,
Yes, we do have emergency change control. I plan to reboot the library and then the server, so we need approval because other applications may reside on the server.
As of now I am trying to get hold of the site contact. Once done, I will post the results.
02-04-2016 01:12 PM
Try rebooting your change control people ...
If the library isn't working, you're not going to make it any worse ...
02-04-2016 07:57 PM
02-04-2016 11:14 PM
Hi All,
We tried to reboot the library but it didn't come back up. We raised a case with the vendor, and they took it to their lab for further investigation.
02-04-2016 11:30 PM
Lesson learned:
NetBackup is merely reporting errors as far as hardware access is concerned.
Martin (who is the number 1 Veritas expert on device-related problems) said 3 days ago:
"Firstly, you have an ill library"