
Behaviour Of Media Server

Possible
Level 6
Accredited Certified

Hi All,

I have an awkward scenario:

I have a media server (B: not a robot control host) which shows all the drives as MISSING PATH in tpconfig output.

Now I go to the Device Configuration Wizard and I can see the drives (shared) on the screen. Here, A is the robot control host, which can see all drives with proper paths.

Now I go to server B and use tpconfig -multiple_delete -drive to delete all the drives that show MISSING PATH, and it succeeds (the drives are deleted from the media server only and not from the whole drive database, as they are still seen by robot control host A).

I go back to the Device Configuration Wizard to scan the drives in order to reconfigure them, but I am not able to see any drives there now (server A can see them in the wizard).

My question is: where have those drives gone? I do not have a single drive for that particular robot now. How can I bring them back?

It looks abnormal to me....please help

 

27 REPLIES

Possible
Level 6
Accredited Certified

correction......

I have a media server (B: not a robot control host) which shows all the drives as MISSING PATH in tpconfig output for a particular TLD (other robots can see their drives with proper paths).

mph999
Level 6
Employee Accredited

If volmgr/bin/scan is not seeing the devices (scan is effectively what the device wizard runs) and the OS is NOT Solaris, then almost certainly it is a problem between the operating system and the devices, that is, the OS cannot see them.

If the OS is Solaris and scan isn't working, it can be the same as above (OS cannot see the devices), but it can also be that the sg driver needs rebuilding.

However, first prove 100% that the OS is seeing the correct drives AND that all the paths work (e.g. use mt -f /dev/rmt/xxx stat).

The majority of cases like this have nothing to do with NBU (well, yes, it might need reconfiguring), as the root cause is the OS missing the drives or the paths changing. NBU doesn't suddenly lose drives; something has happened at the OS level.

So remember: forget NBU first, and concentrate on the OS.

For Solaris, to rebuild the sg driver you first need to work out whether the drives are configured via WWN or target/LUN (or a combination of both).

Example

sgscan shows just x4 tape drives, there should be x6 ...

 

 
/dev/sg/c0tw2101000d772420c8l0: Changer: "ATL     P1000"
/dev/sg/c0tw2101000d772420c8l1: Tape (/dev/rmt/0): "QUANTUM DLT8000"
/dev/sg/c0tw2101000d772420c8l2: Tape (/dev/rmt/1): "QUANTUM DLT8000"
/dev/sg/c0tw2101000d772420c8l3: Tape (/dev/rmt/4): "QUANTUM DLT8000"
/dev/sg/c0tw2101000d772420c8l4: Tape (/dev/rmt/5): "QUANTUM DLT8000"
 
 
Let's look in cfgadm ...
 
rdgv240sol22 # cfgadm -al -o show_FCP_dev |grep -i tape
c3::2101000d772420c8,1         tape         connected    configured   unknown
c3::2101000d772420c8,2         tape         connected    configured   unknown
c3::2101000d772420c8,3         tape         connected    configured   unknown
c3::2101000d772420c8,4         tape         connected    configured   unknown
c3::2103000d77640cc8,1         tape         connected    configured   unknown
c3::2103000d77640cc8,2         tape         connected    configured   unknown
 
 
... and we see that yes, there are x6 drives
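
The comparison above (cfgadm sees 6 tapes, sgscan only 4) can be scripted. This is only a rough sketch: the function name missing_from_sgscan is made up for illustration, and it assumes the two outputs have been saved to files and look exactly like the samples in this post (cfgadm lines of the form c3::WWN,LUN and sgscan names of the form c0twWWNlLUN).

```shell
#!/bin/sh
# missing_from_sgscan CFGADM_FILE SGSCAN_FILE   (hypothetical helper)
# Prints "WWN,LUN" for every tape that cfgadm lists but sgscan does not.
missing_from_sgscan() {
    cfgadm_file=$1
    sgscan_file=$2
    # cfgadm lines look like: c3::2101000d772420c8,1  tape  connected ...
    awk '$2 == "tape" { sub(/^.*::/, "", $1); print $1 }' "$cfgadm_file" |
    while IFS=, read -r wwn lun; do
        # sgscan names the same device /dev/sg/c0tw<WWN>l<LUN>:
        if ! grep -qF "tw${wwn}l${lun}:" "$sgscan_file"; then
            echo "${wwn},${lun}"
        fi
    done
}
```

Run against the two listings in this post, it reports the two LUNs behind WWN 2103000d77640cc8, which are the drives sg does not know about.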
 
 
Check that we see /dev/rmt devices for the drives:
 
rdgv240sol22 # ls -al /dev/rmt/*cbn
lrwxrwxrwx   1 root     other         72 Sep  3  2010 /dev/rmt/0cbn -> ../../devices/pci@1d,700000/SUNW,qlc@1/fp@0,0/st@w2101000d772420c8,1:cbn
lrwxrwxrwx   1 root     other         72 Sep  3  2010 /dev/rmt/1cbn -> ../../devices/pci@1d,700000/SUNW,qlc@1/fp@0,0/st@w2101000d772420c8,2:cbn
lrwxrwxrwx   1 root     other         72 Sep  3  2010 /dev/rmt/2cbn -> ../../devices/pci@1d,700000/SUNW,qlc@1/fp@0,0/st@w2103000d77640cc8,1:cbn
lrwxrwxrwx   1 root     other         72 Sep  3  2010 /dev/rmt/3cbn -> ../../devices/pci@1d,700000/SUNW,qlc@1/fp@0,0/st@w2103000d77640cc8,2:cbn
lrwxrwxrwx   1 root     root          72 Sep  3  2010 /dev/rmt/4cbn -> ../../devices/pci@1d,700000/SUNW,qlc@1/fp@0,0/st@w2101000d772420c8,3:cbn
lrwxrwxrwx   1 root     root          72 Sep  3  2010 /dev/rmt/5cbn -> ../../devices/pci@1d,700000/SUNW,qlc@1/fp@0,0/st@w2101000d772420c8,4:cbn
 
 
We see x6 drives, and we can see that we configure these by WWN, not by target and lun
 
 
Just to show you, this is a tape drive on another system ...
 
From cfgadm -a
c3::rmt/0                      tape         connected    configured   unknown
 
ls -al /dev/rmt/*cbn
lrwxrwxrwx   1 root     root          73 Nov 12  2010 /dev/rmt/0cbn -> ../../devices/pci@1e,600000/pci@0/pci@9/pci@0,2/pci@1/scsi@2,1/st@5,0:cbn
 
You can see the difference here, at the end, we have st@5,0  - this means the device is target5, lun0  - nice and simple .  The cfgadm looks different also ..
 
So, it should be easy to work out from cfgadm and ls -al on /dev/rmt, if we are configuring devices using target and luns, or, wwns, or perhaps a combination of both
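
As a rough illustration of that check, the sketch below classifies a /dev/rmt link target by the shape of its trailing st@ component. The function name classify_tape_path is hypothetical, and the two patterns are taken only from the two examples shown in this post (st@w<WWN>,<lun> versus st@<target>,<lun>); other HBA drivers may name things differently.

```shell
#!/bin/sh
# classify_tape_path TEXT   (hypothetical helper)
# TEXT is a symlink target or a whole "ls -l" line for a /dev/rmt entry.
classify_tape_path() {
    case $1 in
        *st@w[0-9a-fA-F]*,*) echo wwn ;;         # st@w2101000d772420c8,1
        *st@[0-9a-fA-F]*,*)  echo target_lun ;;  # st@5,0
        *)                   echo unknown ;;
    esac
}
```

It could be fed one line at a time, for example: for l in /dev/rmt/*cbn; do classify_tape_path "$(ls -l "$l")"; done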
 
 
Unload the sg module
 
modunload -i $(echo $(modinfo |grep "sg (SCSA" |awk '{print $1}'))
 
Check it has gone ..
 
modinfo |grep -i "sg (SCSA"
 
rm the /dev/sg devices and sg/ sg.conf in kernel/drv (on customer system, just move these)
rm /kernel/drv/sg.conf
 
To rebuild the sg driver ...
 
cd /usr/openv/volmgr/bin/driver
 
../sg.build all -mt 5 -ml 5 (this would create sg.conf /links files for sg devices for target 0 - 5 , each with lun 0 - 5 )  **
sg.install
 
 
** If you have identified that the drives are configured via WWN, not target/LUN, then just use ../sg.build all
 
 
Martin

Possible
Level 6
Accredited Certified

Hi,

The OS is able to see drives but Netbackup sees them with MISSING PATH for that TLD.

Please consider the sequence of actions i have gone through.

 

Thanks,
Giri

Yogesh9881
Level 6
Accredited

Can you please post the output of the commands below?

run on master --- /usr/openv/netbackup/bin/admincmd/nbemmcmd -getemmserver

run on server B --- /usr/openv/volmgr/bin/scan

mph999
Level 6
Employee Accredited

OK, if you run /usr/openv/volmgr/bin/scan, on the media server, do you see the drives ?

If you do, delete the drives in NBU, and re-run the wizard.

If not, report back here.

Do the drives work if you use mt -f /dev/rmt/xxxcbn stat  (load a tape into the drive using robtest first)

Martin

Marianne
Level 6
Partner    VIP    Accredited Certified

The OS is able to see drives

Please tell us WHICH OS and what makes you confident that the OS can access and use the devices.

Please remember that most OS'es do not remove device files when access is lost.

The test is to remove OS devices, rescan at OS level and see if devices are re-added.

Possible
Level 6
Accredited Certified

@mph999

You mentioned....

OK, if you run /usr/openv/volmgr/bin/scan, on the media server, do you see the drives ?

If you do, delete the drives in NBU, and re-run the wizard.

That is exactly what I did, as I mentioned earlier....see below...

 

Now I go to server B and use tpconfig -multiple_delete -drive to delete all the drives that show MISSING PATH, and it succeeds (the drives are deleted from the media server only and not from the whole drive database, as they are still seen by robot control host A).

I go back to the Device Configuration Wizard to scan the drives in order to reconfigure them, but I am not able to see any drives there now (server A can see them in the wizard).

Thanks.

 

Marianne
Level 6
Partner    VIP    Accredited Certified

We STILL don't know which OS and how/where you see the devices in the OS and how you are sure that they are USABLE at OS level.

As per my previous post:

Please remember that most OS'es do not remove device files when access is lost.

The test is to remove OS devices, rescan at OS level and see if devices are re-added.

NBU can only discover devices that the OS presents to it.

mph999
Level 6
Employee Accredited

You need to follow Marianne's advice in the post above.

Now, when you run the wizard, it effectively runs scan -all.

If the device wizard doesn't show the drives on the media server, this usually means that scan run on the media server doesn't show the devices either, which is why I am confused when you say scan works but the wizard doesn't.

We also need to know the os type of the servers.

Martin

Possible
Level 6
Accredited Certified

Maria,

It's Linux.

When I saw the drives as MISSING PATH in NBU, I was able to see them in scan.

But when I deleted them from the server with the tpconfig command, they were no longer found in the config wizard either (before deleting, I was able to see them in the wizard as shared drives).

Thanks.

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Please confirm that devices are accessible/usable at OS level.

Check device names in /dev/rmt.
Next, issue 'mt -f /dev/nstX status' for each of the device names seen in /dev/rmt.
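
A small loop can run that mt check across every drive. This is only a sketch: check_tape_devices is a made-up name, and it assumes the Linux st driver's usual /dev/nstN naming for the non-rewinding devices (the a/l/m mode variants such as nst0a are skipped). A drive with no tape loaded may also report a failure, so load tapes first if you want a clean result.

```shell
#!/bin/sh
# check_tape_devices [DEV_DIR]   (hypothetical helper, DEV_DIR defaults to /dev)
# Runs "mt -f <device> status" on each nstN device and prints OK or FAIL.
# Returns non-zero if any device failed.
check_tape_devices() {
    dev_dir=${1:-/dev}
    rc=0
    for dev in "$dev_dir"/nst[0-9]*; do
        [ -e "$dev" ] || continue        # glob matched nothing
        case ${dev##*/nst} in
            *[!0-9]*) continue ;;        # skip nst0a, nst0l, nst0m variants
        esac
        if mt -f "$dev" status >/dev/null 2>&1; then
            echo "OK   $dev"
        else
            echo "FAIL $dev"
            rc=1
        fi
    done
    return $rc
}
```

Any device reported as FAIL is one to chase at OS level (cabling, zoning, /var/log/messages) before looking at NBU.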

You can also follow the steps in the Linux chapter of NBU Device Configuration Guide http://www.symantec.com/docs/DOC3656 to check/verify devices at OS level.

Extract from p. 76:

If the operating system detects the SCSI devices, NetBackup can discover them.

 

PS:  MISSING PATH status is seen when OS has lost connectivity to devices.
tpconfig command only removed devices from NBU config, which was necessary because of the MISSING PATH status.

Possible
Level 6
Accredited Certified

OK...I got confused between what tpconfig can see and what the OS can see.

Now one thing is clear: "If the OS detects SCSI devices, NetBackup can discover them".

That means if MISSING PATH appears in future, the OS should not be able to see the attached devices.

OK, I got it...but how can I get them back now? (Is there any way for the robot control host to share these lost drives so that media server A can see them?) Or do I have to reconfigure from the OS only (if yes, then how...please tell)?

And is there any situation where the OS sees the drives and NBU is unable to see them?

Thanks.

Marianne
Level 6
Partner    VIP    Accredited Certified

Have you tried the Device Config Guide as suggested?

Try to reboot the media server.
Check physical connection.
Check zoning.
Check /var/log/messages for errors.
Verifying OS device configuration: cat /proc/scsi/scsi

 

And is there any situation where OS sees the drives and NBU unable to see?

Only seen this on Solaris where sg driver needs to be rebuilt (as explained by Martin above) but this does not happen 'out of the blue' - only needed as initial step.

mph999
Level 6
Employee Accredited

 

(1)

OK...I got confused between what tpconfig can see and what OS can see.

No problem ...

(2)

tpconfig is a really bad command for troubleshooting why things are missing.

tpconfig shows what is configured, NOT what can be seen.

For example, I can configure a drive on my server that does not exist; tpconfig will show it, because it is in the configuration. That does not mean it will work.

(3)

Now one thing is clear: "If the OS detects SCSI devices, NetBackup can discover them"

Usually yes.

(4)

That means if MISSING PATH appears in future, the OS should not be able to see the attached devices.

Yes

(5)

OK, I got it...but how can I get them back now? (Is there any way for the robot control host to share these lost drives so that media server A can see them?) Or do I have to reconfigure from the OS only (if yes, then how...please tell)?

The robot control host does not prevent other servers from seeing the drives.

(6)

And is there any situation where the OS sees the drives and NBU is unable to see them?

Yep, but not usually on Linux. In fact, I've only seen it on Solaris, when you first need to rebuild the sg driver.

I have also seen a case where the OS could see the device (it was solaris as it happens) over the san, but was unable to communicate correctly - due to a SAN issue.

Next steps. Back to basics: forget NBU.

As per Marianne's post:

 

Check device names in /dev/rmt.
Next, issue 'mt -f /dev/nstX status' for each of the device names seen in /dev/rmt.

You can also follow the steps in the Linux chapter of NBU Device Configuration Guide http://www.symantec.com/docs/DOC3656 to check/verify devices at OS level.

 

For each drive, prove that the OS can see and use it using the mt command (it is a Linux command; you may have to install it as a package if it is not available).

Once you have proved the OS can see the devices, try running scan - if this works, the wizard should then work.

If scan does not work, try this ...

scsi_command -d <device file>

 

martin

 

 

Possible
Level 6
Accredited Certified

Can you help me with the below now?

Drive with serial no 1304927336 is visible at OS.

[root@AAA ~]#  /usr/openv/volmgr/bin/scan |grep 1304927336
Serial Number: "1304927336"
Device Identifier: "IBM     ULT3580-TD4     1304927336"
Drive 16 Serial Number      : "1304927336"

But NBU is not showing a path for the same serial number:

[root@AAA ~]# vmoprcmd -h AAA-autoconfig -t |grep 1304927336
TPAC60 IBM     ULT3580-TD4     4C17 1304927336 -1 -1 -1 -1 - - -
[root@PG997 ~]#
 

Another drive shows its path fine:

TPAC60 IBM     ULTRIUM-TD3     8711 G353880066 -1 -1 -1 -1 /dev/nst14 - -
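
To pull out every drive in that listing that has no path, a filter along these lines might help. It is a sketch only: drives_without_path is a made-up name, and the field positions (serial number eighth from the end, device path third from the end, "-" meaning no path) are guessed from the two sample lines above, not from any documented format.

```shell
#!/bin/sh
# drives_without_path   (hypothetical helper)
# Reads "vmoprcmd -h <host>-autoconfig -t" output on stdin and prints the
# serial number of each drive whose path column is "-".
drives_without_path() {
    # Counting from the end avoids trouble with spaces in the inquiry string.
    awk 'NF >= 8 && $(NF-2) == "-" { print $(NF-7) }'
}
```

Usage would be something like: vmoprcmd -h AAA-autoconfig -t | drives_without_path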
 

Thanks,

 


 

Possible
Level 6
Accredited Certified

to be more specific...

tpconfig -d output:

98  Drive572             hcart  TLD(8)  DRIVE=16
      MISSING_PATH:3:0:5:16:1304927336                                 DOWN
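
For a server with many drives, the MISSING_PATH entries can be listed in one go. The sketch below is an illustration only: list_missing_path_drives is a made-up name, and it assumes tpconfig -d output shaped like the two lines above, that is, a drive line starting with an index and the drive name, followed by a path line whose MISSING_PATH string ends in the drive serial number.

```shell
#!/bin/sh
# list_missing_path_drives   (hypothetical helper)
# Reads "tpconfig -d" output on stdin and prints "<drive name> <serial>"
# for every drive whose path line starts with MISSING_PATH.
list_missing_path_drives() {
    awk '
        # Drive line: "98  Drive572  hcart  TLD(8)  DRIVE=16"
        /^[ \t]*[0-9]+[ \t]/ { name = $2 }
        # Path line: "MISSING_PATH:3:0:5:16:1304927336  DOWN"
        $1 ~ /^MISSING_PATH:/ {
            n = split($1, parts, ":")
            print name, parts[n]
        }
    '
}
```

Usage would be something like: tpconfig -d | list_missing_path_drives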


Thanks

Giri

Marianne
Level 6
Partner    VIP    Accredited Certified

Please show us FULL output of scan.

Yogesh9881
Level 6
Accredited

As marianne asked ....

I am also waiting for output of scan command. (i already requested to you)

Marianne
Level 6
Partner    VIP    Accredited Certified

Giri

The reason why we need FULL scan output:

If the media server can see the robot, the robot itself will report all the drive serial numbers.
This only proves that the robot can see the drives via the control path. So the robot knows about the drives and their positions, so that it can mount and dismount tapes.

That does not prove that the OS st driver can see and use the tape drives.
We call this the data path. We need to see if scan reports the data paths, e.g.
Device name: /dev/rmt/xxxxx
and
Passthru Name: /dev/sg/xxxxx.
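
To make the control path / data path distinction concrete, here is a rough filter that prints the device name scan reports for a given serial number, and prints nothing when the serial shows up only in the robot's "Drive N Serial Number" list. The name data_path_for_serial is made up, and the sketch assumes that each device stanza in scan output has its "Device Name" line before its own "Serial Number:" line; check that against your real scan output before relying on it.

```shell
#!/bin/sh
# data_path_for_serial SERIAL   (hypothetical helper)
# Reads scan output on stdin and prints the device name of the stanza whose
# own "Serial Number:" line matches SERIAL. Robot-reported lines of the form
# "Drive 16 Serial Number : ..." are ignored on purpose.
data_path_for_serial() {
    awk -v serial="$1" '
        tolower($0) ~ /^device name/ {
            dev = $0
            sub(/^[^:]*:[ \t]*/, "", dev)   # drop the "Device Name :" label
            gsub(/"/, "", dev)              # drop surrounding quotes
        }
        /^Serial Number/ && index($0, "\"" serial "\"") > 0 {
            if (dev != "") print dev
        }
    '
}
```

Usage would be something like: /usr/openv/volmgr/bin/scan | data_path_for_serial 1304927336. No output means scan has no data path for that serial on this host.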