
Robot not working

andrich
Level 4

Hello Everybody

My robot stopped working. I can see it here:

-bash-3.00# cfgadm -al -o show_FCP_dev
Ap_Id                          Type         Receptacle   Occupant     Condition
c1                             fc-private   connected    configured   unknown
c1::210000008740f0b9,0         disk         connected    configured   unknown
c1::500000e011c37d11,0         disk         connected    configured   unknown
c3                             fc           connected    unconfigured unknown
c5                             fc-fabric    connected    configured   unknown
c5::2002000e1111677d,0         tape         connected    configured   unknown
c5::2002000e1111677d,1         med-changer  connected    configured   unknown
c5::2008000e1111677d,0         tape         connected    configured   unknown
c5::500601613ce01c8e,0         disk         connected    configured   unknown
c5::500601683ce01c8e,0         disk         connected    configured   unknown
c6                             fc-fabric    connected    configured   unknown
c6::500104f000597437,0         med-changer  connected    configured   unknown
c6::500104f000597446,0         tape         connected    configured   unknown
c6::500104f00059744a,0         tape         connected    configured   unknown
c6::500104f00059744c,0         tape         connected    configured   unknown
c6::500104f00059744f,0         tape         connected    configured   unknown
c6::500104f000597455,0         tape         connected    configured   unknown
c6::500143801605d386,0         tape         connected    configured   unknown
c6::500143801605d386,1         med-changer  connected    configured   unknown
 
Here too:
 
-bash-3.00# /usr/openv/volmgr/bin/sgscan
/dev/sg/c0tw2002000e1111677dl0: Tape (/dev/rmt/7): "IBM     ULT3580-TD4"
/dev/sg/c0tw2002000e1111677dl1: Changer: "IBM     3573-TL"
/dev/sg/c0tw2008000e1111677dl0: Tape (/dev/rmt/1): "IBM     ULT3580-TD4"
/dev/sg/c0tw500104f000597437l0: Changer: "STK     L180"
/dev/sg/c0tw500104f000597446l0: Tape (/dev/rmt/3): "IBM     ULTRIUM-TD4"
/dev/sg/c0tw500104f00059744al0: Tape (/dev/rmt/8): "IBM     ULTRIUM-TD4"
/dev/sg/c0tw500104f00059744cl0: Tape (/dev/rmt/5): "HP      Ultrium 4-SCSI"
/dev/sg/c0tw500104f00059744fl0: Tape (/dev/rmt/0): "HP      Ultrium 4-SCSI"
/dev/sg/c0tw500104f000597455l0: Tape (/dev/rmt/2): "IBM     ULTRIUM-TD2"
/dev/sg/c1tw500143801605d386l0: Tape (/dev/rmt/4): "HP      Ultrium 4-SCSI"
 
And here:
 
-bash-3.00# cd /dev/sg
-bash-3.00# ls -la
total 56
drwxr-xr-x   2 root     root          19 Mar 10 16:19 .
drwxr-xr-x  24 root     sys          324 Mar 10 16:53 ..
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw2002000e1111677dl0 -> ../../devices/pci@8,700000/SUNW,emlxs@3/fp@0,0/sg@w2002000e1111677d,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw2002000e1111677dl1 -> ../../devices/pci@8,700000/SUNW,emlxs@3/fp@0,0/sg@w2002000e1111677d,1:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw2008000e1111677dl0 -> ../../devices/pci@8,700000/SUNW,emlxs@3/fp@0,0/sg@w2008000e1111677d,0:raw
lrwxrwxrwx   1 root     root          71 Mar 10 16:19 c0tw210000008740f0b9l0 -> ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0/sg@w210000008740f0b9,0:raw
lrwxrwxrwx   1 root     root          71 Mar 10 16:19 c0tw500000e011c37d11l0 -> ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0/sg@w500000e011c37d11,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500104f000597437l0 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500104f000597437,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500104f000597446l0 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500104f000597446,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500104f00059744al0 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500104f00059744a,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500104f00059744cl0 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500104f00059744c,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500104f00059744fl0 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500104f00059744f,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500104f000597455l0 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500104f000597455,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500143801605d386l0 -> ../../devices/pci@8,700000/SUNW,emlxs@3/fp@0,0/sg@w500143801605d386,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500143801605d386l1 -> ../../devices/pci@8,700000/SUNW,emlxs@3/fp@0,0/sg@w500143801605d386,1:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500601613ce01c8el0 -> ../../devices/pci@8,700000/SUNW,emlxs@3/fp@0,0/sg@w500601613ce01c8e,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c0tw500601683ce01c8el0 -> ../../devices/pci@8,700000/SUNW,emlxs@3/fp@0,0/sg@w500601683ce01c8e,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c1tw500143801605d386l0 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500143801605d386,0:raw
lrwxrwxrwx   1 root     root          73 Mar 10 16:19 c1tw500143801605d386l1 -> ../../devices/pci@8,700000/SUNW,emlxs@5/fp@0,0/sg@w500143801605d386,1:raw
 
But when I try to Add New Robot through the GUI, the /dev/sg path is not shown and I have to add the path manually.
 
I can add it, but the robot does not work.
 
Mar 10 17:15:31 svbackup tldd[3147]: [ID 610949 daemon.error] TLD(3) unavailable: initialization failed: Unable to initialize robot
 
 
Any ideas?
 
Thanks
Alexandre Andrich
 
 
 
 
 
 
 
1 ACCEPTED SOLUTION

Accepted Solutions

Marianne
Moderator
Partner    VIP    Accredited Certified

/dev/sg/c1tw500143801605d386l0: Tape (/dev/rmt/4): "HP      Ultrium 4-SCSI"

The above is NOT a robot - it is a tape drive.

This is a robot:

/dev/sg/c0tw2002000e1111677dl1: Changer: "IBM     3573-TL"

and this one too:

/dev/sg/c0tw500104f000597437l0: Changer: "STK     L180"

You never told us your Solaris version?
Or NBU version?

Do you have the latest device mappings installed? 

I find it strange that cfgadm and sgscan output seems to be totally different.

Do you have persistent binding in place?

My suggestion is to clean up device entries with devfsadm -C -c tape and ensure that persistent binding is in place.

Then perform a reconfiguration reboot and confirm the device entries - compare cfgadm and sgscan output.
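
The cfgadm / sgscan comparison can be done mechanically rather than by eye. The following is only a rough sketch: it assumes the two outputs have been saved to files and that they look like the output pasted in this thread. The function names are made up for illustration and are not part of NetBackup or Solaris.

```shell
#!/bin/bash
# Sketch: list robots (med-changer) that the OS sees but sgscan does not.
# Input formats are assumed to match the output pasted in this thread.

# Print "wwn,lun" for every med-changer line of `cfgadm -al -o show_FCP_dev` output.
cfgadm_changers() {
    awk '$2 == "med-changer" { sub(/^.*::/, "", $1); print $1 }' | sort -u
}

# Print "wwn,lun" for every "Changer" line of sgscan output.
sgscan_changers() {
    sed -n 's|^/dev/sg/c[0-9]*tw\([0-9a-fA-F]*\)l\([0-9]*\): Changer:.*|\1,\2|p' | sort -u
}

# $1 = saved cfgadm output, $2 = saved sgscan output.
# Prints the changers present in cfgadm but absent from sgscan.
missing_from_sgscan() {
    cfgadm_changers < "$1" > /tmp/cfg_changers.$$
    sgscan_changers < "$2" > /tmp/sg_changers.$$
    comm -23 /tmp/cfg_changers.$$ /tmp/sg_changers.$$
    rm -f /tmp/cfg_changers.$$ /tmp/sg_changers.$$
}
```

To use it, save both outputs (for example cfgadm -al -o show_FCP_dev > /tmp/cfgadm.out and /usr/openv/volmgr/bin/sgscan > /tmp/sgscan.out) and run missing_from_sgscan /tmp/cfgadm.out /tmp/sgscan.out. Fed the output posted at the top of this thread, it prints 500143801605d386,1, which is the changer that cfgadm lists but sgscan does not.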

Let us know how it goes....


4 REPLIES 4

SymTerry
Level 6
Employee Accredited

Did you just upgrade firmware?

If everything was working before, you might want to try rebooting the robot and restarting ltid to see if it syncs back up. If not, remove the devices and re-add them with the wizard. If all else fails, you will want to rebuild your SG drivers (TECH63095). It would not hurt to even do that first.

 


mph999
Level 6
Employee Accredited

From cfgadm I see 3 changers

c5::2002000e1111677d,1         med-changer  connected    configured   unknown
c6::500104f000597437,0         med-changer  connected    configured   unknown
c6::500143801605d386,1         med-changer  connected    configured   unknown

The changer in question is (from cfgadm):

 
c6::500143801605d386,1         med-changer  connected    configured   unknown
 
It has the same WWN as one of the tape drives, but the changer is one LUN higher - typical of IBM libraries (and correct).
 
From sgscan:
 
Only two libraries appear :
 
/dev/sg/c0tw2002000e1111677dl1: Changer: "IBM     3573-TL"
/dev/sg/c0tw500104f000597437l0: Changer: "STK     L180"
 
I'm guessing it's an IBM library, as in cfgadm it appears one LUN higher than one of the tape drives, which it is using as its control path.
 
/dev/sg/c1tw500143801605d386l0: Tape (/dev/rmt/4): "HP      Ultrium 4-SCSI"
 
So, I presume this will fail :
 
/usr/openv/volmgr/bin/scsi_command -r /dev/sg/c1tw500143801605d386l1
 
(It should return the inquiry string from the library, eg. IBM xxx changer ...)
 
Now this part is really important:
 
Look in /usr/openv/volmgr/bin/driver at the files sg.links and sg.conf.
Each file should have 2 lines that contain the WWN 500143801605d386.
The only difference between the two lines (in each file) will be the LUN number (0 and 1).
 
If this is the case, then the NBU config should be correct - I'm just playing safe and making this check.
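
That check can be scripted. This is a minimal sketch that only counts the lines mentioning a WWN; it does not parse the files, so it makes no assumption about their exact format. DRIVER_DIR and check_wwn_entries are names invented for this example, and the file names are passed as arguments so you can use whichever files apply on your system.

```shell
#!/bin/bash
# Sketch: count the lines that mention a WWN in each sg driver file.
# DRIVER_DIR defaults to the directory named in this thread; it is a
# variable of this sketch only, so the check can be pointed elsewhere.
DRIVER_DIR=${DRIVER_DIR:-/usr/openv/volmgr/bin/driver}

# Usage: check_wwn_entries <wwn> <expected_count> <file>...
# Prints "<file>: <count>" per file; returns non-zero if any count differs.
check_wwn_entries() {
    wwn=$1; expected=$2; shift 2
    rc=0
    for f in "$@"; do
        if [ -r "$DRIVER_DIR/$f" ]; then
            # Case-insensitive, in case the WWN is stored in upper case
            n=$(grep -ci "$wwn" "$DRIVER_DIR/$f")
        else
            n=0
        fi
        echo "$f: $n"
        [ "$n" -eq "$expected" ] || rc=1
    done
    return $rc
}
```

For example, check_wwn_entries 500143801605d386 2 sg.links sg.conf should report a count of 2 for each file if both LUNs (0 and 1) are present. You still need to look at the matching lines yourself to confirm they differ only in the LUN.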
 
I suspect you have a robot fault (if the sg driver is correct, then this is about the only other possibility).  It is perfectly possible that it appears in cfgadm but cannot respond to scan / sgscan.
 
It would be interesting to see what :
 
/usr/openv/volmgr/bin/scsi_command -r /dev/sg/c1tw500143801605d386l1   shows.
 
Even if these files were wrong, it shouldn't matter provided you do not rebuild the sg driver; the two files are only used when sg.install is run.
 
You do not say that the sg driver has been rebuilt, so I don't think this is the issue.
 
It can be rebuilt, but this can add complications if it doesn't work properly. If the root cause of the issue is something else, you risk ending up with a 'now incorrect' sg driver, which could stop the changer appearing in addition to the original cause. That is a right mess that is difficult to sort out: until the original issue is fixed, scan / sgscan will never work, so you can't confirm the sg driver is correct.
 
Certainly from above, it looks correct, and as I mentioned, you haven't said it has been rebuilt, so there is no reason for it to be wrong, it cannot 'change' on its own.
 
It would be interesting to see what scsi_command above shows.  You could also try it using the path to this robot as shown in tpconfig -d on the robot control host.
 
scsi_command / scan / sgscan / tpautoconf / device wizard all rely on the device responding correctly to SCSI commands; no 'NetBackup' commands are sent, which is why this won't be an NBU issue (provided the sg driver is correct and, as explained, I don't recommend touching it if there is no evidence it has been changed).
 
If the scsi command above doesn't respond (try it against the path for the other libraries to see what kind of output you should get) then there is almost certainly a robot issue.
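
To run the same inquiry against the suspect path and the working libraries in one go, something like the sketch below could be used. Treat it as a convenience wrapper only: SCSI_CMD and probe_paths are names made up for this example, and it assumes scsi_command exits non-zero when the device does not answer, which I have not verified. The raw output is printed either way so you can compare it yourself.

```shell
#!/bin/bash
# Sketch: run the inquiry command from this thread against several sg paths.
# SCSI_CMD defaults to the scsi_command path used above; the variable exists
# only in this sketch so the loop can be tried with a stand-in command.
SCSI_CMD=${SCSI_CMD:-/usr/openv/volmgr/bin/scsi_command}

# Usage: probe_paths <sg_path>...
# Prints "OK <path>" or "FAIL <path>" followed by whatever the command printed.
# ASSUMPTION: a non-zero exit status means the device did not respond.
# Returns the number of paths that failed.
probe_paths() {
    failed=0
    for p in "$@"; do
        if out=$("$SCSI_CMD" -r "$p" 2>&1); then
            echo "OK   $p"
        else
            echo "FAIL $p"
            failed=$((failed + 1))
        fi
        # Always show the raw output so it can be compared between libraries
        echo "$out"
    done
    return $failed
}
```

For this thread that would be probe_paths /dev/sg/c1tw500143801605d386l1 /dev/sg/c0tw2002000e1111677dl1 /dev/sg/c0tw500104f000597437l0, i.e. the suspect changer path followed by the two changers that sgscan does report.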
 
Also, what does this show:
 
1.  Get the robot path from tpconfig -d for this library on the robot control host; we'll call this lib_path.
2.  Run:
 
echo 'mode' |/usr/openv/volmgr/bin/tldtest -r <lib_path>
 
Example:
 
root@nbmaster00 tmp $ echo 'mode' |tldtest -r /dev/sg2
Opening /dev/sg2
MODE_SENSE complete
Enter tld commands (? returns help information)
First transport addr = 256, Number transport elements = 1
First storage addr = 1024, Number storage elements = 30
First media access port addr = 512, Number media access port elements = 4
First drive addr = 1, Number drive elements = 4
Library does have a barcode reader
MODE_SENSE complete
 
 
Martin

 

mph999
Level 6
Employee Accredited

See my point in my post about rebuilding the sg driver.

If there is no reason to believe it has changed (and it shouldn't have), then on the off-chance that a mistake is made when rebuilding it, it won't work. Now, if the sg driver wasn't the cause of the problem in the first place, you have two faults that give the same symptoms.

That is then very, very difficult to fix, as there is no way of confirming the sg driver is correct because of the original fault.

I think we should first confirm what (if anything) has been changed.

If the sg.links and sg.conf files are correct (in /usr/openv/volmgr/bin/driver) then the sg driver can be rebuilt with just these commands:

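#!/usr/bin/ksh
# Unload the currently loaded sg module (module ID looked up via modinfo)
modunload -i $(modinfo | grep "sg (SCSA" | awk '{print $1}')
# Keep a copy of the active sg.conf before reinstalling
mv /kernel/drv/sg.conf /kernel/drv/sg.conf.old
# Reinstall the sg driver; this is the step that uses sg.links and sg.conf
/usr/openv/volmgr/bin/driver/sg.install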
 
If the firmware has changed, then yes, perhaps the device mappings will help.
 
One point I did forget about, but Marianne mentioned, is to remove any stale entries shown in cfgadm by running devfsadm with the -C option.  That might not resolve the issue, but it might make the entry for the library disappear from the OS, showing without doubt that the issue is outside NBU.