05-03-2012 11:30 AM
Hi!
I hope all is well with you.
To give a bit of background to my problem, our setup is the following: two Red Hat servers connected to a tape library (IBM TS3200, I believe) via SCSI. At the same time, they are both connected to a RAID array via fiber. It's an active/standby setup, so only one of the servers has access to the tape/RAID at a time. We are running NetBackup 6.5.5. When I started working with this system (2 days ago), it seemed like the archive functionality was working fine: there's a script that runs daily and archives stuff to tape. The script was reporting success, and indeed NetBackup also reported success, and you could see the filenames in the library.
Now, when restoring on one server (server1), it seemed as if everything was working fine: you could hear the robot going, and the log would show that the correct tape was being mounted. However, it got stuck on the "begin reading" message, and then it would time out after about 15 minutes.
If I switched to server2, the problem was different. It would say that it's not able to locate the desired media, and that administrative action may be required (but later in the same message it would say that no action was required). I tried moving the tape around to see if I could somehow help it, but that didn't work either.
Another thing that I was seeing was that in the "Drives" section, both drives said something about "MISSING PATH" and "/dev/nst0" or "/dev/nst0:89378947" (I made up the numbers, but they seemed random to me).
I'm at the point where I would try anything to get it to work. This system was working back when I left it a couple of years ago, but since then the Red Hat version and the NetBackup version (6.0 to 6.5.5) have been upgraded, and the state is how I described. As a matter of fact, I made things worse by deleting and re-adding the drives and robots, which fixed the "missing path" message, but now it won't restore OR archive stuff to tape. :(
What are your thoughts? How could I proceed? I can explain exactly how I installed this the last time I worked on it, if that would help with your diagnosis.
05-03-2012 11:40 AM
05-03-2012 12:00 PM
Awesome, thanks! I'll go down to the lab and try that asap.
05-03-2012 12:25 PM
Ok, so I tried that, and I got the same message as I was getting on server2 (if I look at the positive side, at least both servers are being consistent with their error):
<timestamp>: awaiting resource A00003
A pending request has been generated for this resource request.
Operator action may be required. Pending action: No action.
Media ID: A00003, Barcode: A00002L3, Density: hcart3, Access Mode: Read,
Action Drive Name: N/A, Action Media Server: N/A, Robot Type(Number): 0(N/A),
Volume Group: 000_00000_TLD, Action Acs: N/A, Action Lsm: N/A
There's a lot of N/As there :\ Any thoughts?
Thanks again,
Pedro
05-03-2012 02:25 PM
05-03-2012 03:43 PM
Yep, they're hcart3. The results of that command showed me this:
Id: 0
Drive Name: IBM-ULT3580-TD3.000
Drive Path: /dev/nst0
Type: hcart3
Residence: TLD(0) DRIVE=2
Status: UP
Currently defined robotics are: TLD(0) robotic path = /dev/sg5
05-03-2012 11:44 PM
It doesn't seem to be getting a drive resource assigned, so the request just sits there.
I would recommend you raise a case with Symantec to investigate this.
05-04-2012 12:11 AM
OK, back to basics ...
Note - this may not help that much, but let's give it a go ...
bpreq -m <media id> -f /tmp/TAPE
This 'should' load a tape into the drive.
If you wait a minute and then run vmoprcmd, do we see the tape in one of the drives?
If so, unload like this -
tpunmount -f /tmp/TAPE
>>> This just confirms whether we can or cannot load tapes.
Now, if it does not load, can you do this whilst actually physically watching the library - what happens?
If it does load, great, report back here and we'll continue.
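For anyone repeating this test, the steps above could be wrapped roughly as follows. This is only a sketch: it uses tpreq (the command name is corrected from bpreq later in the thread), the function names and the DRY_RUN and WAIT_SECS variables are made up for illustration, and the -d density argument is added on the assumption that tpreq would otherwise not pick an hcart3 drive, so check it against the tpreq man page for your version.

```shell
#!/bin/sh
# Sketch of the manual mount test. Assumes tpreq, vmoprcmd and tpunmount
# are on PATH (on a default install they live under /usr/openv/volmgr/bin).

# Hypothetical helper: with DRY_RUN=1, print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Request a mount of one tape, wait, then show drive status.
tape_load_test() {
    media_id=$1                 # e.g. A00003
    density=$2                  # e.g. hcart3 (assumption: needed to select the drive type)
    req_file=${3:-/tmp/TAPE}    # request file name, as in the post above

    run tpreq -m "$media_id" -d "$density" -f "$req_file" || return 1
    run sleep "${WAIT_SECS:-60}"    # "wait a minute"
    run vmoprcmd -d                 # is the tape shown in one of the drives?
}

# Unload the tape again once the mount has been confirmed.
tape_unload() {
    run tpunmount -f "${1:-/tmp/TAPE}"
}
```

Running it with DRY_RUN=1 first, e.g. `DRY_RUN=1; tape_load_test A00003 hcart3`, prints the commands without touching the library. If the mount request pends the same way the restore does, tpreq may not return, so be ready to interrupt it.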
The reconfig of the drives was the correct thing to do, by the way - I would have suggested that ...
Martin
05-04-2012 07:20 AM
Tape labels and tape recognition spring to mind: do they align? I notice you have an L3 suffix on the barcode, which is why I mention it. 999mph mentions a command line to isolate just the tape mount logic; that's the place to start. Equally, you could 'organise' a tape by using the robotics UI, if you have one, or NetBackup's robtest when the tape gets requested. If the robot can't recognise the correct media, then of course it follows that both backup and restore won't work.
We had use of IBM robotics once... it did the same thing: we had to configure the robot to ignore the last 2 characters of the barcode (the media type identifier).
Then it worked perfect.
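To show why the suffix matters, here is a small sketch of how a media ID might be derived from a barcode. It assumes the usual NetBackup default of taking the rightmost six characters of whatever barcode the robot reports; the function name and the "strip" argument are made up for illustration, and a site's real media ID generation rules may differ.

```shell
#!/bin/sh
# Hypothetical helper: derive a six-character media ID from a barcode.
# Pass "strip" as the second argument to mimic a robot that is configured
# not to report the two-character media type identifier (e.g. L3).
media_id_from_barcode() {
    bc=$1
    if [ "${2:-}" = "strip" ]; then
        bc=${bc%??}                 # drop the last two characters
    fi
    if [ "${#bc}" -le 6 ]; then
        printf '%s\n' "$bc"         # already six characters or fewer
    else
        prefix=${bc%??????}         # everything except the last six characters
        printf '%s\n' "${bc#"$prefix"}"
    fi
}
```

With the barcode from the pending request, `media_id_from_barcode A00002L3` gives 0002L3, while `media_id_from_barcode A00002L3 strip` gives A00002. Neither is the A00003 that the request asks for, so it may be worth checking which barcode NetBackup has recorded against that media ID.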
Jim
05-04-2012 07:30 AM
Correction: where I wrote bpreq above, I meant tpreq :)