
Unable to inventory robot: Inventory failed: Unable to initialize robot (204)

Chidananda91
Level 4
Certified

Hi All,

 

On one of our master servers, we are unable to inventory the robot from NetBackup. However, I can run the inventory from the library web GUI, and robtest is working fine.

Has anyone faced the same issue? There are no errors in the library console or in robtest.

 

Thanks 


sdo
Moderator
Moderator
Partner    VIP    Certified

What do these query commands show:

nbemmcmd -listhosts

tpconfig -d

Chidananda91
Level 4
Certified

[ccmn_na_p@meuubkpprd01 admincmd]$ sudo ./nbemmcmd -listhosts
NBEMMCMD, Version: 7.7
The following hosts were found:
server           meuubkpprd01.monsanto.com
master           meuubkpprd01.monsanto.com
foreign_media    la6232bkp01
virtual_machine  meuwvcprd01.la.ds.monsanto.com
virtual_machine  la2589esx02.monsanto.com
virtual_machine  la2589esx03.monsanto.com
ndmp             meuubkpprd01.monsanto.com
Command completed successfully.
[ccmn_na_p@meuubkpprd01 admincmd]$ cd ../../../..
[ccmn_na_p@meuubkpprd01 usr]$ cd openv/volmgr/bin
[ccmn_na_p@meuubkpprd01 bin]$ ./tpconfig -d
Access to perform the operation was denied
[ccmn_na_p@meuubkpprd01 bin]$ sudo ./tpconfig -d
Id  DriveName           Type   Residence
      Drive Path                                                       Status
****************************************************************************
0   IBM.ULT3580-HH4.000  hcart3 TLD(0)  DRIVE=2
      /dev/nst1                                                        UP
1   IBM.ULT3580-HH4.001  hcart3 TLD(0)  DRIVE=1
      /dev/nst0                                                        UP

Currently defined robotics are:
  TLD(0)     robotic path = /dev/sg4

EMM Server = meuubkpprd01.monsanto.com

[ccmn_na_p@meuubkpprd01 bin]$
 

mph999
Level 6
Employee Accredited

Is the path to the library definitely correct?

On the robot control host, find the path to the library from tpconfig -d  (near the bottom)

Then run

echo 'mode' |tldtest -r /path/to/library  (this might work)

and

/usr/openv/volmgr/bin/scsi_command -f <path/to/library>

It looks like the robot is failing to respond to the SCSI commands that are sent to make it run an inventory. This is completely separate from the 'library console inventory'. The fact that the console inventory works is of course good, but it has no bearing on whether the NBU inventory works.

All that happens in an NBU inventory is that one or more SCSI commands are sent to the library, and the library responds and tells NBU where the tapes are. That is why this is very unlikely to be a NetBackup issue.

As a starting point (apart from the commands above and sdo's comments), I'd suggest rebooting the library via a proper power off/on and then trying again, and also rebooting the media server that is the robot control host.

If the path is correct and the above still fails after the reboots, you'll probably need to call the hardware vendor.
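
The "find the path from tpconfig -d" step can be scripted if you already have the command output saved as text. The following is only a sketch: the function names (`parse_robot_paths`, `missing_paths`) are made up for illustration, and the regular expression assumes the "robotic path = ..." line format shown in the tpconfig -d output earlier in this thread.

```python
import os
import re

# Excerpt of the `tpconfig -d` output posted in this thread.
SAMPLE_TPCONFIG = """\
0   IBM.ULT3580-HH4.000  hcart3 TLD(0)  DRIVE=2
      /dev/nst1                                                        UP

Currently defined robotics are:
  TLD(0)     robotic path = /dev/sg4
"""

# Assumed line format: "<ROBOT>(<n>)   robotic path = <device>"
ROBOT_LINE = re.compile(r"^\s*(\w+\(\d+\))\s+robotic path = (\S+)", re.MULTILINE)


def parse_robot_paths(text):
    """Return a list of (robot, path) tuples found in tpconfig -d output."""
    return ROBOT_LINE.findall(text)


def missing_paths(robots, exists=os.path.exists):
    """Return the configured robot paths that do not exist on this host.

    `exists` is injectable so the check can be exercised without real devices.
    """
    return [path for _robot, path in robots if not exists(path)]


if __name__ == "__main__":
    robots = parse_robot_paths(SAMPLE_TPCONFIG)
    print(robots)
    print(missing_paths(robots))
```

A path that exists as a device node can still point at a robot that does not answer, so this only rules out the simplest case before moving on to tldtest and scsi_command.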

sdo
Moderator
Moderator
Partner    VIP    Certified

Ok - looks like one master/media server.

What do these query commands show:

scan

tpautoconf -report_disc

.

BTW - do NOT run any of the other variants of "tpautoconf" - just yet.  Only run the above query commands please.

Chidananda91
Level 4
Certified

Okay, here is the output of sdo's commands:

 

[ccmn_na_p@meuubkpprd01 bin]$ ./scan
************************************************************
*********************** SDT_TAPE    ************************
*********************** SDT_CHANGER ************************
************************************************************
[ccmn_na_p@meuubkpprd01 bin]$ ./tpautoconf -report_disc
======================= Missing Device (Drive) =======================
 Drive Name = IBM.ULT3580-HH4.000
 Drive Path = /dev/nst1
 Inquiry = "IBM     ULT3580-HH4     89B1"
 Serial Number = 1K10024956
 TLD(0) definition Drive = 2
 Hosts configured for this device:
  Host = meuubkpprd01.monsanto.com
======================= Missing Device (Drive) =======================
 Drive Name = IBM.ULT3580-HH4.001
 Drive Path = /dev/nst0
 Inquiry = "IBM     ULT3580-HH4     C7QJ"
 Serial Number = 1K10031782
 TLD(0) definition Drive = 1
 Hosts configured for this device:
  Host = meuubkpprd01.monsanto.com
=========== Missing Device or no local control path (Robot) ===========
 Defined as robotic TLD(0)
 Inquiry = "IBM     3573-TL         B.20"
 Serial Number = 00X2U78K0627_LL0
 Robot Path = /dev/sg4
 Drive = 2, Drive Name = IBM.ULT3580-HH4.000, Serial Number = 1K10024956
 Hosts configured for this device:
  Host = meuubkpprd01.monsanto.com
======================================================================

To try mph999's suggestion: I am getting permission denied for the tldtest command, so I need support from another team. Once that is done, I will post the results.

 

 

mph999
Level 6
Employee Accredited

So the paths to the drives and robot are missing.

This is at OS level

Try rebooting the server

Marianne
Level 6
Partner    VIP    Accredited Certified

I thought you said robtest is working fine?

robtest is calling tldtest.

There is no way robtest/tldtest can be working fine  if 'scan' cannot see the robot and 'tpautoconf -report_disc' is reporting missing devices.

Please show output of 'cat /proc/scsi/scsi' as well.
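
For anyone checking the OS view in the same way, the text of /proc/scsi/scsi can be filtered down to just the tape drives and medium changers. This is a rough sketch, not a NetBackup tool: the function name `tape_and_changer_devices` is invented here, and the sample text is a shortened copy of output posted later in this thread (with the trailing colon that the kernel normally prints after "Attached devices").

```python
import re

SAMPLE_PROC_SCSI = """\
Attached devices:
Host: scsi0 Channel: 02 Id: 00 Lun: 00
  Vendor: DELL     Model: PERC H710        Rev: 3.13
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi8 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: ULT3580-HH4      Rev: C7QJ
  Type:   Sequential-Access                ANSI  SCSI revision: 03
Host: scsi8 Channel: 00 Id: 00 Lun: 01
  Vendor: IBM      Model: 3573-TL          Rev: B.20
  Type:   Medium Changer                   ANSI  SCSI revision: 05
"""

HOST = re.compile(r"^Host: (\S+) Channel: (\d+) Id: (\d+) Lun: (\d+)")
VENDOR = re.compile(r"^\s*Vendor: (.+?)\s+Model: (.+?)\s+Rev: (.+?)\s*$")
TYPE = re.compile(r"^\s*Type:\s+(.+?)\s+ANSI")


def tape_and_changer_devices(text):
    """Return the tape drives and medium changers listed in /proc/scsi/scsi text."""
    devices, current = [], None
    for line in text.splitlines():
        m = HOST.match(line)
        if m:
            # A "Host:" line starts a new device record.
            current = {"host": m.group(1), "id": int(m.group(3)), "lun": int(m.group(4))}
            devices.append(current)
            continue
        if current is None:
            continue
        m = VENDOR.match(line)
        if m:
            current["vendor"], current["model"] = m.group(1), m.group(2)
            continue
        m = TYPE.match(line)
        if m:
            current["type"] = m.group(1)
    wanted = {"Sequential-Access", "Medium Changer"}
    return [d for d in devices if d.get("type") in wanted]


if __name__ == "__main__":
    for dev in tape_and_changer_devices(SAMPLE_PROC_SCSI):
        print(dev["host"], dev["id"], dev["lun"], dev["model"], dev["type"])
```

If this list is empty on the robot control host, the problem is below NetBackup.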

sdo
Moderator
Moderator
Partner    VIP    Certified

Scan shows nothing, meaning that the OS cannot see any medium changer or tape drives.  This is not a NetBackup config error.  This means the OS cannot see the devices, and therefore NetBackup cannot do anything at all.

The tpautoconf -report_disc command means: "Hey NetBackup, show me the discrepancies between what the OS reports and what NetBackup believes my configuration to be." In normal working conditions this command reports nothing, i.e. no discrepancies. Because it does report discrepancies here, NetBackup can no longer find the devices listed.

So, scan is empty.

And NetBackup reports it cannot find the devices.

You need to figure out from an OS level what is wrong.

.

What changed recently?
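
To make the "discrepancies" idea concrete, here is a small sketch that turns saved tpautoconf -report_disc output into a list of sections, so that a script can tell at a glance which devices NetBackup considers missing. The function name `parse_report_disc` is hypothetical, and the parsing assumes only the "==== title ====" and "key = value" layout visible in the output posted above.

```python
import re

# Shortened copy of the `tpautoconf -report_disc` output from this thread.
SAMPLE_REPORT_DISC = """\
======================= Missing Device (Drive) =======================
 Drive Name = IBM.ULT3580-HH4.000
 Drive Path = /dev/nst1
 Serial Number = 1K10024956
=========== Missing Device or no local control path (Robot) ===========
 Defined as robotic TLD(0)
 Serial Number = 00X2U78K0627_LL0
 Robot Path = /dev/sg4
======================================================================
"""

# A section header has text between two runs of '='; the closing rule has none.
HEADER = re.compile(r"^=+\s+(.+?)\s+=+$")


def parse_report_disc(text):
    """Return one dict per discrepancy section: {'kind': str, 'fields': dict}."""
    sections, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        m = HEADER.match(line)
        if m:
            current = {"kind": m.group(1), "fields": {}}
            sections.append(current)
        elif current is not None and " = " in line:
            key, value = line.split(" = ", 1)
            current["fields"][key] = value
    return sections


if __name__ == "__main__":
    for section in parse_report_disc(SAMPLE_REPORT_DISC):
        path = section["fields"].get("Drive Path") or section["fields"].get("Robot Path")
        print(section["kind"], "->", path)
```

An empty result corresponds to the healthy case described above, where the command reports nothing.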

Chidananda91
Level 4
Certified

Hi All,

 

I listed the SCSI devices; please check the output.

 

[0:2:0:0]    disk    DELL     PERC H710        3.13  /dev/sda
[0:2:1:0]    disk    DELL     PERC H710        3.13  /dev/sdb
[5:0:0:0]    cd/dvd  PLDS     DVD-ROM DS-8D9SH JD51  /dev/sr0
[8:0:0:0]    tape    IBM      ULT3580-HH4      C7QJ  /dev/st0
[8:0:0:1]    mediumx IBM      3573-TL          B.20  /dev/sch0
[8:0:1:0]    tape    IBM      ULT3580-HH4      89B1  /dev/st1

and

cat /proc/scsi/scsi
Attached devices

Host: scsi0 Channel: 02 Id: 00 Lun: 00
  Vendor: DELL     Model: PERC H710        Rev: 3.13
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi0 Channel: 02 Id: 01 Lun: 00
  Vendor: DELL     Model: PERC H710        Rev: 3.13
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi5 Channel: 00 Id: 00 Lun: 00
  Vendor: PLDS     Model: DVD-ROM DS-8D9SH Rev: JD51
  Type:   CD-ROM                           ANSI  SCSI revision: 05
Host: scsi8 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: ULT3580-HH4      Rev: C7QJ
  Type:   Sequential-Access                ANSI  SCSI revision: 03
Host: scsi8 Channel: 00 Id: 00 Lun: 01
  Vendor: IBM      Model: 3573-TL          Rev: B.20
  Type:   Medium Changer                   ANSI  SCSI revision: 05
Host: scsi8 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: ULT3580-HH4      Rev: 89B1
  Type:   Sequential-Access                ANSI  SCSI revision: 03

Thanks.

I can run robtest (I don't know how); here is the output:

[ccmn_na_p@meuubkpprd01 bin]$ sudo ./robtest
Configured robots with local control supporting test utilities:
  TLD(0)     robotic path = /dev/sg4

Robot Selection
---------------
  1)  TLD 0
  2)  none/quit
Enter choice: 1

Robot selected: TLD(0)   robotic path = /dev/sg4

Invoking robotic test utility:
/usr/openv/volmgr/bin/tldtest -rn 0 -r /dev/sg4

Opening /dev/sg4
MODE_SENSE complete
Enter tld commands (? returns help information)
s d
drive 1 (addr 256) access = 1 Contains Cartridge = no
drive 2 (addr 257) access = 1 Contains Cartridge = no
READ_ELEMENT_STATUS complete

Invalid command
s s
slot 1 (addr 4096) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
slot 2 (addr 4097) contains Cartridge = yes
Source address = 4097
Barcode = 000132L4
slot 3 (addr 4098) contains Cartridge = yes
Source address = 4098
Barcode = 000148L4
slot 4 (addr 4099) contains Cartridge = yes
Source address = 4099
Barcode = 000174L4
slot 5 (addr 4100) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
slot 6 (addr 4101) contains Cartridge = yes
Source address = 4101
Barcode = 000123L4
slot 7 (addr 4102) contains Cartridge = yes
Source address = 4102
Barcode = 000142L4
slot 8 (addr 4103) contains Cartridge = yes
Source address = 4103
Barcode = 000246L4
<< Press return to continue, or q and return to stop >>

slot 9 (addr 4104) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
slot 10 (addr 4105) contains Cartridge = yes
Source address = 4105
Barcode = 000128L4
slot 11 (addr 4106) contains Cartridge = yes
Sense code = 0x30, Code qualifier = 0x3
Source address = 4106
Barcode =
slot 12 (addr 4107) contains Cartridge = yes
Source address = 4107
Barcode = 000826L4
slot 13 (addr 4108) contains Cartridge = no
slot 14 (addr 4109) contains Cartridge = yes
Sense code = 0x30, Code qualifier = 0x3
Source address = 4109
Barcode =
slot 15 (addr 4110) contains Cartridge = no
slot 16 (addr 4111) contains Cartridge = yes
Source address = 4111
Barcode = 000130L4
<< Press return to continue, or q and return to stop >>

slot 17 (addr 4112) contains Cartridge = no
slot 18 (addr 4113) contains Cartridge = yes
Source address = 4113
Barcode = 000121L4
slot 19 (addr 4114) contains Cartridge = yes
Source address = 4114
Barcode = 000825L4
slot 20 (addr 4115) contains Cartridge = no
slot 21 (addr 4116) contains Cartridge = no
slot 22 (addr 4117) contains Cartridge = no
slot 23 (addr 4118) contains Cartridge = yes
Source address = 4118
Barcode = 000820L4
slot 24 (addr 4119) contains Cartridge = no
READ_ELEMENT_STATUS complete
q

Robot Selection
---------------
  1)  TLD 0
  2)  none/quit
Enter choice: 2
[ccmn_na_p@meuubkpprd01 bin]$
 

 

Marianne
Level 6
Partner    VIP    Accredited Certified

So - earlier on the devices were 'missing'.
What did you do to get them back?

Reboot? 

Have you tried to Inventory again?

mph999
Level 6
Employee Accredited

Firstly, you have an ill library

slot 1 (addr 4096) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0

You should not be getting Sense codes against the slots.

Robtest connected via /dev/sg4


scan shows sg4 

 

=========== Missing Device or no local control path (Robot) ===========
 Defined as robotic TLD(0)
 Inquiry = "IBM     3573-TL         B.20"
 Serial Number = 00X2U78K0627_LL0
 Robot Path = /dev/sg4
 Drive = 2, Drive Name = IBM.ULT3580-HH4.000, Serial Number = 1K10024956
 Hosts configured for this device:
 Host = meuubkpprd01.monsanto.com

One possibility is that scan failed when it ran because the connection is intermittent, and you happened to run robtest at a time when the path was working.

Another possibility is that the robot isn't responding to the SCSI commands sent by scan, but does respond to the commands sent by robtest.

We also seem to have another device path floating around. I'm not 100% sure what this is because, as far as I know, on Linux we use /dev/sg

[8:0:0:1]    mediumx IBM      3573-TL          B.20  /dev/sch0

I'd be tempted to reboot everything as a first port of call
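
The "sense codes against the slots" check can be automated on saved tldtest "s s" output. The sketch below only lists which slots carry a sense code; it makes no attempt to interpret the codes. The function name `slots_with_sense` is invented for illustration, and the patterns assume the line format in the robtest output posted earlier in this thread.

```python
import re

# Shortened copy of the robtest "s s" output from this thread.
SAMPLE_SLOTS = """\
slot 1 (addr 4096) contains Cartridge = no
Sense code = 0x83, Code qualifier = 0x0
slot 2 (addr 4097) contains Cartridge = yes
Source address = 4097
Barcode = 000132L4
slot 11 (addr 4106) contains Cartridge = yes
Sense code = 0x30, Code qualifier = 0x3
Source address = 4106
Barcode =
slot 13 (addr 4108) contains Cartridge = no
"""

SLOT = re.compile(r"^slot (\d+) \(addr (\d+)\) contains Cartridge = (yes|no)")
SENSE = re.compile(r"^Sense code = (0x[0-9a-fA-F]+), Code qualifier = (0x[0-9a-fA-F]+)")


def slots_with_sense(text):
    """Return (slot, sense_code, qualifier) for every slot that reports a sense code."""
    flagged, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        m = SLOT.match(line)
        if m:
            current = int(m.group(1))  # remember which slot the following lines belong to
            continue
        m = SENSE.match(line)
        if m and current is not None:
            flagged.append((current, m.group(1), m.group(2)))
    return flagged


if __name__ == "__main__":
    print(slots_with_sense(SAMPLE_SLOTS))
    # -> [(1, '0x83', '0x0'), (11, '0x30', '0x3')]
```

A non-empty list is worth passing on to the hardware vendor along with the full robtest output.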

 

Marianne
Level 6
Partner    VIP    Accredited Certified
And then there was silence...

Chidananda91
Level 4
Certified

Hi All,

 

Sorry for the delay; I was stuck with a production issue. The robtest output shared in the earlier comments was taken after the scan command output in which we found the missing devices. Now, however, we are no longer getting any robtest output. We are going to reboot the library and the server after the necessary approvals, which will take another two or three days.

I will keep you posted. Please share any suggestions that don't require a reboot.

 

Thanks 

 

Marianne
Level 6
Partner    VIP    Accredited Certified
The robot is not functioning at the moment. Why do you need a lengthy approval process to reboot the robot? Don't you have something like 'emergency change control'? I cannot see anything else that will re-initialize the robot. Maybe you could ask your robot support partner for suggestions.

Chidananda91
Level 4
Certified

Hi Marianne,

 

Yes, we do have emergency change control. I was planning to reboot the library and then the server, which is why we need approval: other applications may reside on the server.

For now I am trying to get hold of the site contact. Once that is done, I will post the results.

 

 

mph999
Level 6
Employee Accredited

Try rebooting your change control people ...

If the library isn't working, you're not going to make it any worse ...

 

Marianne
Level 6
Partner    VIP    Accredited Certified
Chances are that only the robot needs a reboot. A Unix/Linux server hardly ever needs to be rebooted to reset device connectivity.

Chidananda91
Level 4
Certified

Hi All,

 

We tried to reboot the library, but it didn't come back up. We raised a case with the vendor, and they took it to their lab for further investigation.

 

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Lesson learned:

NetBackup is merely reporting errors as far as hardware access is concerned.

Martin (who is the number 1 Veritas expert on device-related problems) said 3 days ago:

"Firstly, you have an ill library"