08-03-2011 12:51 AM
Been getting a lot of the following errors recently:
timed out waiting for media manager to mount volume(52)
I'm currently running NetBackup 6.5.4 on a Windows 2008 (clustered) master server with a mixture of Windows 2008 and AIX 6.1 media servers. The tape library is an IBM TS3500 with 11 3592 drives. IBM have checked the library and drives and can't see any issues there. I've had the fibre switches checked and they aren't showing any errors either. I don't want to log a case with Symantec support unless I really have to. Which logs would be best to look at to try and see what the issue could be?
08-03-2011 01:10 AM
Let us know how many media servers you have and their OS.
Are you facing this issue for all STUs (media servers) or only for one master/media server?
Are you facing this issue for all tape drives or just for 3-4 drives?
Has there been any activity on the master server in the last 3-4 days?
08-03-2011 01:51 AM
Ensure you have a bptm log directory on all media servers as well as a VERBOSE entry in vm.conf on all of them (restart NBU for it to take effect).
You will need these logs to troubleshoot: bptm as well as the system logs (syslog on Unix servers, and the Event Viewer System and Application logs on Windows media servers).
As Nicolai explained, incorrect device mapping can cause this: the robot mounts the tape in drive 1. If this drive's device path is wrong in the NBU config, the mount confirmation will not be received via the OS device path. Ensure persistent binding is in place, especially on the Windows media servers.
08-03-2011 02:56 AM
Verify that the drive mapping is valid. If NetBackup thinks a drive is located in the first slot of the library but in reality it is located in the second, you may receive the error you are facing.
Upon mount, the tape is loaded into one drive (e.g. the one in slot 2) while NetBackup waits for the drive it believes is in slot 1 to become ready. Because the tape is not in the drive NetBackup is watching, this will never happen.
The best way to test is to use the NetBackup tpreq command and test each location. On Unix you can use the "mt -f /dev/rmt/??? status" command to check whether a tape is mounted and ready.
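If you have several drives to go through, a small loop saves some typing. This is only a rough sketch: the device paths are whatever your OS presents (pass them as arguments), and the MT variable is something I made up so you can point it at a different status tool if your Unix does not ship mt under that name. The exit status of mt also varies between platforms, so read the printed output rather than trusting the pass/fail line alone.

```shell
#!/bin/sh
# Run "<mt> -f <device> status" for every device path given as an argument.
# MT is a hypothetical override (not a NetBackup setting); it defaults to mt.
check_drives() {
    mt_cmd=${MT:-mt}
    failed=0
    for dev in "$@"; do
        echo "== $dev =="
        # Show the tool's own output and note when it reports a failure.
        if ! "$mt_cmd" -f "$dev" status; then
            echo "mt status failed for $dev"
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every device answered the status request.
    [ "$failed" -eq 0 ]
}
```

Mount a tape in one location (for example with tpreq), run check_drives against all of your drive device paths, and see whether the drive that reports a loaded tape is the one NetBackup expected.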
08-03-2011 03:00 AM
I've set up the bptm logs on the media servers and checked the HBA config. I've also attached a screenshot of the HBA config.
08-03-2011 03:38 AM
"Automapping: Enabled "
Change to persistent binding. Automapping will cause device paths to change when servers are rebooted.
PLEASE also remember to add the VERBOSE entries to all vm.conf files on all servers. This will log device/media manager errors.
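On the Unix media servers, something like the following adds the entry without duplicating it if it is already there. Treat it as a sketch: the function name is mine, and you pass in the path to vm.conf yourself (/usr/openv/volmgr/vm.conf on a default Unix install). NetBackup still needs a restart afterwards.

```shell
#!/bin/sh
# Append a VERBOSE line to the given vm.conf unless one is already present.
# ensure_vm_verbose is a hypothetical helper name, not a NetBackup command.
ensure_vm_verbose() {
    conf=$1
    # Create the file if it does not exist yet.
    [ -f "$conf" ] || : > "$conf"
    if grep -qx 'VERBOSE' "$conf"; then
        echo "VERBOSE already set in $conf"
        return 0
    fi
    # If the file is non-empty and lacks a trailing newline, add one first
    # so the new entry does not get glued onto the last line.
    if [ -s "$conf" ] && [ -n "$(tail -c 1 "$conf")" ]; then
        echo >> "$conf"
    fi
    echo 'VERBOSE' >> "$conf"
    echo "VERBOSE added to $conf"
}
```

Usage would be along the lines of: ensure_vm_verbose /usr/openv/volmgr/vm.conf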
08-03-2011 04:47 AM
Hi
Have you recently applied any patch updates or made any changes on the master or media server side? If so, check the compatibility lists on both sides, IBM and Symantec. If not, please check the density type of the drives.
Are you facing this issue for all STUs or only for one master/media server?
How long have you been getting this issue? Or did it start in an otherwise stable, running environment?
08-04-2011 03:48 AM
I've finally managed to get the bptm logs from the media server that seems to be having problems. I keep seeing the following in the log:
process_tapealert: TapeAlert returned 0x00000000 0x00000000 (from io_terminate_tape)
04:27:27.147 [27918426] <2> io_terminate_tape: writing empty backup header, drive index 8, copy 2
04:27:27.147 [27918426] <2> io_terminate_tape: reposition to previous tapemark and rewrite header
04:27:27.147 [27918426] <2> io_ioctl: command (12)MTBSF 1 from (bptm.c.8661) on drive index 8
Is it anything to worry about?
08-05-2011 12:32 AM
All of those messages are informational only (<2>). You need to look for messages with <8>, <16> or <32>.
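A quick way to pull just those lines out of a bptm log on a Unix media server is a grep along these lines. It is a sketch that assumes the severity tag always follows the "[pid]" field, as in the lines quoted earlier in the thread; the function name is made up and you supply the log file path.

```shell
#!/bin/sh
# Print only the log lines whose severity tag is <8>, <16> or <32>.
# Assumes the tag sits directly after the "[pid] " field, e.g.
#   04:27:27.147 [27918426] <2> io_terminate_tape: ...
bptm_problems() {
    grep -E '\] <(8|16|32)> ' "$1"
}
```

Run it as bptm_problems followed by the path to the day's log file in the bptm log directory. No output means no lines at those severities were found in that file.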
10-14-2011 04:41 AM
Also, please check your media physically. Look at the write-protect tab: the media should not be write protected, otherwise it will give such errors.
10-14-2011 05:05 AM
Make sure you are using persistent binding everywhere, and also that your Windows media servers are not doing constant tape checks, which can reset the tape drives.
This is done by adding or editing a registry value: a DWORD named AutoRun with a decimal value of 0 under the tape driver's key.
See this tech note (it says Windows 2003 but applies to 2008 as well, I believe):
http://support.microsoft.com/kb/842411
It needs a reboot once done. Then re-check it, as some drivers will re-apply a setting of 1 after a reboot, and you may need to re-install the driver from a command line to stop this happening.
10-14-2011 08:07 AM
I recently saw some issues with LTO4 drives that hang on SCSI commands, leaving you stuck with the SSO reservations. Maybe you have the same issue. Check that you have the latest firmware on all your drives and also confirm that the backplane firmware is at the right level too.
Enable all the logging on your media server:
(1) Add VERBOSE to the /usr/openv/volmgr/vm.conf file. If this file does not exist, just create it.
(2) mkdir /usr/openv/volmgr/debug/robots
(3) mkdir /usr/openv/volmgr/debug/daemon
(4) mkdir /usr/openv/volmgr/debug/ltid
(5) mkdir /usr/openv/volmgr/debug/oprd
(6) mkdir /usr/openv/volmgr/debug/reqlib
(7) mkdir /usr/openv/volmgr/debug/tpcommand
(8) mkdir /usr/openv/netbackup/logs/bptm (and add VERBOSE = 5 to bp.conf)
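If you have a number of Unix media servers to do, the directory steps above can be wrapped up like this. It is only a sketch: OPENV is a variable I invented so the script can be tried against a scratch directory first, and it defaults to /usr/openv. It does not touch vm.conf or bp.conf.

```shell
#!/bin/sh
# Create the Media Manager debug directories and the bptm log directory.
# OPENV is a hypothetical override for the install root (default /usr/openv).
make_debug_dirs() {
    openv=${OPENV:-/usr/openv}
    for d in robots daemon ltid oprd reqlib tpcommand; do
        # -p makes this safe to re-run if a directory already exists.
        mkdir -p "$openv/volmgr/debug/$d" || return 1
    done
    mkdir -p "$openv/netbackup/logs/bptm" || return 1
}
```

Call make_debug_dirs as root on each media server, then restart NetBackup so the daemons start writing to the new directories.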
You could also add these files ...
touch /usr/openv/volmgr/DEBUG
touch /usr/openv/volmgr/ROBOT_DEBUG
touch /usr/openv/volmgr/AVRD_DEBUG
touch /usr/openv/volmgr/MH_DEBUG
touch /usr/openv/volmgr/database/DEBUG
touch /usr/openv/volmgr/VM_MQ_DEBUG
touch /usr/openv/volmgr/DRIVE_DEBUG
Check the messages file for any hardware problems.