New Tape Robot/Drive - Cannot Backup

jkois
Level 2
Greetings,
Due to a hardware failure, I recently replaced our Dell PowerVault 122T (LTO2) with a PowerVault 124T (LTO3). I then removed all instances of the LTO2 drive/robot from NetBackup and configured the new robot/drive.

So, here are the details regarding the current configuration:
RHEL4 (Update 8)
Dell PowerVault 124T Autoloader (LTO3) at /dev/sg1
Quantum Ultrium 3 Drive (Internal to robot) at /dev/nst0
NetBackup 6.0 MP7

I have everything configured to the point that I can successfully carry out most operations (namely inventory, inquiry, label, drive and robot diagnostics). However, when I attempt to run a backup process, everything goes sour. In particular, the following happens:

    02/11/2010 15:41:20 - requesting resource storage-unit-hcart3
    02/11/2010 15:41:20 - requesting resource hemi.NBU_CLIENT.MAXJOBS.cygnus
    02/11/2010 15:41:20 - requesting resource hemi.NBU_POLICY.MAXJOBS.http_mail
    02/11/2010 15:41:21 - granted resource hemi.NBU_CLIENT.MAXJOBS.cygnus
    02/11/2010 15:41:21 - granted resource hemi.NBU_POLICY.MAXJOBS.http_mail
    02/11/2010 15:41:21 - granted resource 0002L3
    02/11/2010 15:41:21 - granted resource QUANTUM.ULTRIUM3.000
    02/11/2010 15:41:21 - granted resource storage-unit-hcart3
    02/11/2010 15:41:22 - started process bpbrm (pid=2466)
    02/11/2010 15:41:22 - connecting
    02/11/2010 15:41:22 - connected; connect time: 0:00:00
    02/11/2010 15:41:26 - mounting 0002L3
    02/11/2010 15:43:00 - mounted 0002L3; mount time: 0:01:34
    02/11/2010 15:43:00 - positioning 0002L3 to file 1
    02/11/2010 15:43:09 - positioned 0002L3; position time: 0:00:09
    02/11/2010 15:43:09 - begin writing
    02/11/2010 16:13:14 - Error bptm (pid=2467) media manager terminated by parent process
    02/11/2010 16:21:24 - Error bptm (pid=2467) ioctl (MTBSF) failed on media id 0002L3, drive index 0, Input/output error (bptm.c.8229)
    02/11/2010 16:21:24 - end writing; write time: 0:38:15
    network connection timed out (41)
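
For reference, the gaps between the timestamps in the job details above can be worked out with a small sketch like the one below. The helper name `elapsed_seconds` is made up for illustration and assumes both timestamps fall on the same day. The first bptm error arrives about 30 minutes after "begin writing", which may be worth comparing against any timeout values configured on the server; that connection is a guess, not something the log confirms.

```shell
#!/bin/sh
# Hypothetical helper: seconds elapsed between two HH:MM:SS timestamps
# taken from the same day of a job log.
elapsed_seconds() {
    # $1 = start time, $2 = end time
    echo "$1 $2" | awk '{
        split($1, a, ":"); split($2, b, ":")
        print (b[1]*3600 + b[2]*60 + b[3]) - (a[1]*3600 + a[2]*60 + a[3])
    }'
}

# Timestamps copied from the job details above.
elapsed_seconds 15:43:09 16:13:14   # begin writing -> first bptm error: 1805 (30m05s)
elapsed_seconds 15:43:09 16:21:24   # begin writing -> end writing: 2295 (38m15s)
```

The second figure agrees with the "write time: 0:38:15" reported by the job itself.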


Following this, the devices produce the following:
    The Robot gets listed as "Enabled - No" and diagnostics produces "Failed - Communication failure with robot control daemon." across the board.
    The Drive is listed as "Enabled - Yes" with a Drive Path of "MISSING_DRIVE:<serial>" and diagnostics fail with "Drive already in use, aborting test."

To recover from this, it seems I have to remove and re-add the drive, and power cycling the robot seems to restore communication with it.

I have been Googling to no end with very little success. If anyone has any insight, I would be most appreciative. Please do not hesitate to let me know if more information is required.

Thanks!

3 REPLIES

Anonymous
Not applicable
I would run the nbsu support tool and go through each of the output files, as you may still have some leftover references to the old devices.

For the latest version, go here, run it on your master server first, and review the output: http://seer.entsupport.symantec.com/docs/338509.htm

This can also help you understand how NetBackup commands can be used to get information, rather than me giving you a long list of commands to try. But I will anyway!

Are the library and drives shared by media servers other than the master?

Here are some of the obvious commands to verify device discovery:
/usr/openv/volmgr/bin/tpautoconf -r
/usr/openv/volmgr/bin/tpautoconf -t
/usr/openv/volmgr/bin/scan

To output the devices NetBackup has in its device tables:

# /usr/openv/volmgr/bin/ltid -tables
# cat -s /usr/openv/volmgr/misc/ltitables

A better command for listing all device data in the NetBackup environment, run from the master:
# tpconfig -emm_dev_list [-noverbose]
The -noverbose option displays a condensed view of the information.

Another thought: have you updated the device mappings file for Linux?
http://seer.entsupport.symantec.com/docs/339826.htm

Does the media type defined in the storage unit match your media volumes, e.g. hcart3 vs. hcart?
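
Since the device state changes after the failed backup (the drive path becomes MISSING_DRIVE), it may help to capture the discovery output before and after a failure and compare the two. The following is only a sketch: the function name `snapshot_devices`, the `VOLMGR_BIN` variable, and the output file names are made up for illustration. The commands themselves are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: save device-discovery output into a directory so that
# two snapshots can be compared with diff.
# VOLMGR_BIN is an illustrative variable; the default is the path used above.
VOLMGR_BIN="${VOLMGR_BIN:-/usr/openv/volmgr/bin}"

snapshot_devices() {
    # $1 = output directory (created if missing)
    outdir="$1"
    mkdir -p "$outdir" || return 1
    "$VOLMGR_BIN/tpautoconf" -r > "$outdir/tpautoconf_r.txt" 2>&1
    "$VOLMGR_BIN/tpautoconf" -t > "$outdir/tpautoconf_t.txt" 2>&1
    "$VOLMGR_BIN/scan"          > "$outdir/scan.txt" 2>&1
}

# Example usage (run as root on the media server):
#   snapshot_devices /tmp/nbu_before
#   ... run the failing backup ...
#   snapshot_devices /tmp/nbu_after
#   diff -r /tmp/nbu_before /tmp/nbu_after
```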


tordoc
Not applicable

My tape robot will not function after adding on a disk array.

We added a new disk array to the server last week and I noticed all the backup jobs were failing. When I attempted to inventory the robot I received the error "Inventory failed: Unable to open robotic path (201)."

I rebooted the library.
Verified I could see it in the Windows Device Manager.
Logged into the tape library GUI and asked the GUI tool to run an inventory, which completed successfully, but NetBackup cannot seem to use the resource.

I followed support document http://support.veritas.com/docs/269409 and was able to run through all of the steps without any problems.

When I ran the command tpautoconf -r, it saw my device without a problem. The help document advises that if something changed with the server or the devices were rebooted, the path to the robot may have changed. It says specifically: "The OS is plug and play and can adapt, but Netbackup is not.  Re-browsing for the robot object will update NetBackup's Device Databases with the new SCSI path to the robot object"

I removed the robot and the drives from the NetBackup Administration Console.
Did bpdown,
then bpup.
I then ran through the wizard in NetBackup to discover the robot and the drives, which it did successfully. However, I still cannot get anywhere. The inventory fails with the same error message.
I have also tried to "Synchronize Enterprise Media Manager Database", but it will not even give me the option to do so.
From the GUI I was able to run a test on the robot, and I received the failure "communication failure with robot control daemon".
In the properties of the robot it states "Robot Path: MISSING_PATH>>>>"

Any advice would be appreciated.

jim_dalton
Level 6

Sounds like you have a device mapping conflict/issue.
If you remove NetBackup from the picture, that might simplify things: try writing some arbitrary data (several GB if you can, so it runs a while) to each of your drives in turn. Don't use NetBackup; use tar on the command line and either mount media manually or use the built-in robotics movement options, then report back.
You could also try using bits of NetBackup, such as robtest, to just do tape movements and see what that leads to.
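
The standalone write test suggested above could look roughly like the sketch below. It assumes the drive is /dev/nst0 (as in the original post), that a scratch tape is already loaded, and that mt, dd and tar are installed. The names TAPE, SIZE_MB, TESTFILE and DRY_RUN are illustrative only. Note that a real run overwrites whatever is on the tape, so the sketch defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch of a tape write test that bypasses NetBackup entirely.
# WARNING: with DRY_RUN=0 this overwrites the loaded tape.
TAPE="${TAPE:-/dev/nst0}"                   # non-rewinding device (assumed)
SIZE_MB="${SIZE_MB:-4096}"                  # several GB so the write runs a while
TESTFILE="${TESTFILE:-/tmp/tape_test.bin}"  # scratch file (illustrative path)
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands, 0 = run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

tape_write_test() {
    run mt -f "$TAPE" status &&     # confirm the drive responds
    run mt -f "$TAPE" rewind &&
    run dd if=/dev/urandom of="$TESTFILE" bs=1M count="$SIZE_MB" &&
    run tar -cvf "$TAPE" "$TESTFILE" &&   # write the test data to tape
    run mt -f "$TAPE" rewind &&
    run tar -tvf "$TAPE" &&               # read the archive listing back
    run mt -f "$TAPE" offline             # rewind and unload
}

tape_write_test
```

If the write or the read-back fails here with an I/O error as well, the problem is likely below NetBackup (driver, HBA, cabling, or the drive itself). If it succeeds, the NetBackup device configuration becomes the more likely suspect.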