04-03-2014 11:43 PM
Hi there,
I want to keep this as simple as possible. I have a master and a media server that use the same robot; the master is the robot control host. For the last 3-4 days both have been struggling to communicate with the robot. In NBU the paths are all up and show no signs of issues or errors. The master seems to have far more trouble communicating with the robot than the media server does. I did all the basic checks and troubleshooting. Most of the jobs sit in a "mounting" state and never start writing. In the logs the tape gets re-requested the whole time; I can see the tape is assigned and is in the drive, yet the job stays in mounting and never starts.
Yesterday, as an example, the master suddenly started writing, but only one job.
I am very confused, and because jobs are failing this is causing me a fairly big amount of problems with the business. Assistance please, guys, if you can.
NBU 7.5.0.5 Server 2008 R2 Standard
DELL ML-6000 library FC connected
04-04-2014 12:15 AM
Do you see the robot going to a down state, or is it a tape-drive-only issue?
04-04-2014 12:21 AM
Have you tried to test tape mounts outside of NBU with robtest?
See this TN:
Robtest commands that can be used to test the SCSI functionality of a robot
http://www.symantec.com/docs/TECH83129
Once a tape is loaded, open another cmd window and issue the 'vmoprcmd -d' command.
(Command is in same path as robtest - <install-path>\veritas\volmgr\bin)
Does the output reflect the media-id in the correct drive?
From your description, it sounds as if Persistent Binding is not in place. Without it, there is a big possibility that device paths will change during a reboot.
Check the Support site for the HBAs in your master and media servers for Persistent Binding instructions.
For further troubleshooting, ensure all of the following logging is enabled on master and all media servers:
bptm folder (under netbackup\logs)
VERBOSE entry in volmgr\vm.conf (restart Device management service after adding this entry).
NBU tape management activity will be logged in bptm logs, while Media Manager activity will be logged in Event Viewer System and Application logs.
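As a rough sketch only (not an official tool), the two logging steps above could be scripted like this. It assumes you pass in the <install-path>\veritas directory, and the function name enable_logging is made up for illustration. The Device management service restart still has to be done by hand.

```python
from pathlib import Path


def enable_logging(veritas_dir):
    """Create the bptm log folder and add VERBOSE to vm.conf if it is missing.

    veritas_dir is assumed to be <install-path>\\veritas (an assumption,
    adjust to your own layout). Returns (bptm_folder, verbose_was_added).
    """
    root = Path(veritas_dir)

    # bptm only logs when its folder exists under netbackup\logs.
    bptm_dir = root / "netbackup" / "logs" / "bptm"
    bptm_dir.mkdir(parents=True, exist_ok=True)

    # vm.conf lives in volmgr; on a real install this folder already exists,
    # so the mkdir below only matters when trying this in a scratch directory.
    vm_conf = root / "volmgr" / "vm.conf"
    vm_conf.parent.mkdir(parents=True, exist_ok=True)
    lines = vm_conf.read_text().splitlines() if vm_conf.exists() else []

    added = False
    if not any(line.strip().upper() == "VERBOSE" for line in lines):
        lines.append("VERBOSE")
        vm_conf.write_text("\n".join(lines) + "\n")
        added = True  # restart the Device management service afterwards

    return bptm_dir, added
```

Running it a second time leaves vm.conf untouched, so it is safe to repeat on the master and each media server.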
04-04-2014 01:05 AM
Follow-up on Marianne's fine suggestion: does the Windows event log show any issues regarding SCSI connectivity?
04-04-2014 01:39 AM
Hi there,
All logging is enabled and ready to receive.
To respond to some of the suggestions:
At no time did NBU ever show a path/drive/robot in a down state; all are green and showing healthy. I can see tapes mount, write and dismount (in some cases). Like I said, I even saw it happen once or twice for the 2 media servers in question without issues. However, the master, which controls the robot and also writes to it, is the one that mostly does not communicate properly; the other media server that shares the robot communicates better and almost consistently.
Marianne - I ran that command while media were loaded and writing. I can confirm they match, are in the right drives, and are receiving data and writing. Note that both of the working drives belong to the media server that does not have as many issues.
Running the robtest command returns a message saying that the device is in use or locked, so it cannot run the test.
Regarding persistent binding, I believe there might be truth to that and it could very well be what we are seeing here. However, this is the very first time I am seeing this scenario, and no changes were made to anything. There are the occasional server reboot days, but if reboots have never affected the drives/robot in all this time, what would have changed to break it now?
04-04-2014 02:02 AM
See event viewer entry below:
Netbackup event error - EMM interface initialization failed, status = 334
Event ID – 13853 Unknown robot number 1, from host xxx(hostname)
Event ID – 5126 TLD(0) [1940] Could not get DOS path from PnP path \\?\scsi#changer&ven_adic&prod_scalar_i500#5&467c9ca&0&000201#{53f56310-b6bf-11d0-94f2-00a0c91efb8b}, using existing path
Event ID – 14032 TLD(0) [1940] cannot open \\.\Changer0, SCSI coordinates {4,0,2,1}: The requested resource is in use.
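In case it helps with matching these up against timestamps or tpconfig output, here is a minimal sketch that pulls the event ID, device path and SCSI coordinates out of lines like the ones above. The function name parse_events is hypothetical, and it only understands the "Event ID – NNNN ..." layout as pasted here, not a real Event Viewer export.

```python
import re

# Matches "Event ID – 14032 <message>" (en dash or plain hyphen).
EVENT_RE = re.compile(r"Event ID\s*[–-]\s*(\d+)\s+(.*)")
# Matches the "cannot open <device>, SCSI coordinates {a,b,c,d}" message.
COORD_RE = re.compile(
    r"cannot open (\S+), SCSI coordinates \{(\d+),(\d+),(\d+),(\d+)\}"
)


def parse_events(text):
    """Return one dict per 'Event ID' line found in the pasted text."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.search(line)
        if not match:
            continue  # e.g. the free-text "EMM interface" line has no ID
        event = {"id": int(match.group(1)), "message": match.group(2).strip()}
        coords = COORD_RE.search(line)
        if coords:
            event["device"] = coords.group(1)
            # Assumed to be port, bus, target, LUN on Windows.
            event["scsi"] = tuple(int(n) for n in coords.group(2, 3, 4, 5))
        events.append(event)
    return events


if __name__ == "__main__":
    sample = (
        r"Event ID – 14032 TLD(0) [1940] cannot open \\.\Changer0, "
        r"SCSI coordinates {4,0,2,1}: The requested resource is in use."
    )
    for event in parse_events(sample):
        print(event["id"], event.get("device"), event.get("scsi"))
```

The coordinates can then be compared with what tpconfig reports for the robot on each server.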
04-04-2014 03:47 AM
Does the time match when you ran robtest? That entry may very well be from you using robtest, so check the timestamp.
Under normal conditions a changer should never be "in use". Can you run "tpconfig -d" and attach the output to a post, as a file?
04-04-2014 04:28 AM
Nicolai is right - robtest should only be used when no backups are active.
See this note in the robtest TN:
04-04-2014 07:37 AM
Does the robot also have an admin GUI/console? It might be worth checking the hardware as well.
Jim
06-01-2014 10:16 PM
It seems that the woes with this library are not over yet: