03-26-2015 05:18 AM
Hi,
We are having an issue with our NetBackup NDMP backup job. The job seems to time out at the mounting stage. Below is an example of the error:
25/03/2015 14:55:02 - Info bptm(pid=308) Waiting for mount of media id HJ4279 (copy 1) on server wok-backup01.emea.local.
25/03/2015 14:55:02 - mounting HJ4279
25/03/2015 15:09:48 - Error bptm(pid=308) error requesting media, TpErrno = Robot operation failed
25/03/2015 15:09:51 - Warning bptm(pid=308) media id HJ4279 load operation reported an error
25/03/2015 15:09:51 - current media HJ4279 complete, requesting next resource Any
25/03/2015 15:33:40 - Error bptm(pid=308) NBJM returned an extended error status: All compatible drive paths are down but media is available (2009)
25/03/2015 15:33:40 - Error ndmpagent(pid=7948) NDMP backup failed, path = UNKNOWN
25/03/2015 15:33:45 - Info bptm(pid=308) EXITING with status 252 <----------
25/03/2015 15:33:45 - Info ndmpagent(pid=0) done. status: 252: extended error status has been encountered, check logs
25/03/2015 15:33:45 - end writing
extended error status has been encountered, check logs(252)
When this error occurs, the NDMP path shows as Down in Device Monitor.
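For anyone reading the job details, it can help to see how long each stage waited before failing. The following is a minimal Python sketch that takes three lines copied from the log above and prints the gap between them. The function names are made up for illustration, and it assumes the day/month/year timestamp layout shown in the log.

```python
from datetime import datetime

# Three lines copied from the job details above (day/month/year timestamps).
LOG = """\
25/03/2015 14:55:02 - mounting HJ4279
25/03/2015 15:09:48 - Error bptm(pid=308) error requesting media, TpErrno = Robot operation failed
25/03/2015 15:33:40 - Error bptm(pid=308) NBJM returned an extended error status: All compatible drive paths are down but media is available (2009)
"""


def parse_line(line):
    """Split one job-detail line into (timestamp, message)."""
    stamp, _, message = line.partition(" - ")
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), message


def elapsed_between(log_text):
    """Return (seconds since the previous line, message) for every line after the first."""
    events = [parse_line(line) for line in log_text.splitlines() if line.strip()]
    return [
        ((cur[0] - prev[0]).total_seconds(), cur[1])
        for prev, cur in zip(events, events[1:])
    ]


gaps = elapsed_between(LOG)
for seconds, message in gaps:
    print(f"{seconds / 60:5.1f} min  {message}")
```

On these lines the mount request waits just under 15 minutes before the robot error, and the job then waits roughly another 24 minutes before giving up with status 2009.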
Now for some background. Our NetBackup system was initially located in a different datacentre and has now been physically moved to another location. At the previous location the setup was as follows:
1 x Windows 2008R2 Media server with NetBackup v7.60.0.1 installed
1 x Dell ML6030 tape library with 6 LTO6 drives
1 x NetApp Filer
All were connected via Fibre Channel through a Cisco Nexus 5000 switch with zoning configured.
As part of the physical move we purchased a new Cisco Nexus 5000 switch. The above components (media server, NetApp Filer, tape library) were all moved to the new location and physically cabled up via FC to the new Nexus 5000 switch.
The zoning is configured as follows:-
device-alias mode enhanced
device-alias database
device-alias name TAPEDRIVE_1 pwwn 50:03:08:c3:a1:cf:50:01
device-alias name TAPEDRIVE_2 pwwn 50:03:08:c3:a1:cf:50:05
device-alias name TAPEDRIVE_3 pwwn 50:03:08:c3:a1:cf:50:91
device-alias name TAPEDRIVE_4 pwwn 50:03:08:c3:a1:cf:50:95
device-alias name TAPEDRIVE_5 pwwn 50:03:08:c3:a1:cf:50:99
device-alias name TAPEDRIVE_6 pwwn 50:03:08:c3:a1:cf:50:9d
device-alias name WOKINGBKP_L0 pwwn 10:00:00:90:fa:3b:9e:f5
device-alias name WOKINGBKP_L1 pwwn 10:00:00:90:fa:3b:9b:27
device-alias name WOKINGBKP_R0 pwwn 10:00:00:90:fa:3b:9b:26
device-alias name WOKINGBKP_R1 pwwn 10:00:00:90:fa:3b:9e:f4
device-alias name BLK-NETAPP3220_0c pwwn 50:0a:09:81:01:c1:27:a0
device-alias name BLK-NETAPP3220_0d pwwn 50:0a:09:80:01:c1:27:a0
device-alias commit
fcdomain fcid database
vsan 1 wwn 10:00:00:90:fa:3b:9b:27 fcid 0x7d0000 dynamic
! [WOKINGBKP_L1]
vsan 1 wwn 10:00:00:90:fa:3b:9b:26 fcid 0x7d0020 dynamic
! [WOKINGBKP_R0]
vsan 1 wwn 10:00:00:90:fa:3b:9e:f4 fcid 0x7d0040 dynamic
! [WOKINGBKP_R1]
vsan 1 wwn 10:00:00:90:fa:3b:9e:f5 fcid 0x7d0060 dynamic
! [WOKINGBKP_L0]
vsan 1 wwn 50:03:08:c3:a1:cf:50:91 fcid 0x7d0080 dynamic
! [TAPEDRIVE_3]
vsan 1 wwn 50:0a:09:80:01:c1:27:a0 fcid 0x7d0081 dynamic
! [BLK-NETAPP3220_0d]
vsan 1 wwn 50:03:08:c3:a1:cf:50:95 fcid 0x7d0082 dynamic
! [TAPEDRIVE_4]
vsan 1 wwn 50:03:08:c3:a1:cf:50:9d fcid 0x7d0083 dynamic
! [TAPEDRIVE_6]
vsan 1 wwn 50:03:08:c3:a1:cf:50:99 fcid 0x7d0084 dynamic
! [TAPEDRIVE_5]
vsan 1 wwn 50:03:08:c3:a1:cf:50:05 fcid 0x7d0085 dynamic
! [TAPEDRIVE_2]
vsan 1 wwn 50:03:08:c3:a1:cf:50:01 fcid 0x7d0086 dynamic
! [TAPEDRIVE_1]
vsan 1 wwn 50:0a:09:81:01:c1:27:a0 fcid 0x7d00a0 dynamic
! [BLK-NETAPP3220_0c]
zone mode enhanced vsan 1
!Active Zone Database Section for vsan 1
zone name Z_WOKINGBKP_TAPEDRIVE vsan 1
member device-alias WOKINGBKP_L0
member device-alias TAPEDRIVE_1
member device-alias TAPEDRIVE_2
member device-alias TAPEDRIVE_3
member device-alias TAPEDRIVE_4
member device-alias TAPEDRIVE_5
member device-alias TAPEDRIVE_6
zone name Z_NETAPP_TAPEDRIVE vsan 1
member device-alias TAPEDRIVE_1
member device-alias TAPEDRIVE_2
member device-alias TAPEDRIVE_3
member device-alias TAPEDRIVE_4
member device-alias TAPEDRIVE_5
member device-alias TAPEDRIVE_6
member device-alias BLK-NETAPP3220_0c
member device-alias BLK-NETAPP3220_0d
zoneset name TAPEDRIVE vsan 1
member Z_WOKINGBKP_TAPEDRIVE
member Z_NETAPP_TAPEDRIVE
zoneset activate name TAPEDRIVE vsan 1
do clear zone database vsan 1
!Full Zone Database Section for vsan 1
zone name Z_WOKINGBKP_TAPEDRIVE vsan 1
member device-alias WOKINGBKP_L0
member device-alias TAPEDRIVE_1
member device-alias TAPEDRIVE_2
member device-alias TAPEDRIVE_3
member device-alias TAPEDRIVE_4
member device-alias TAPEDRIVE_5
member device-alias TAPEDRIVE_6
zone name Z_NETAPP_TAPEDRIVE vsan 1
member device-alias TAPEDRIVE_1
member device-alias TAPEDRIVE_2
member device-alias TAPEDRIVE_3
member device-alias TAPEDRIVE_4
member device-alias TAPEDRIVE_5
member device-alias TAPEDRIVE_6
member device-alias BLK-NETAPP3220_0c
member device-alias BLK-NETAPP3220_0d
zoneset name TAPEDRIVE vsan 1
member Z_WOKINGBKP_TAPEDRIVE
member Z_NETAPP_TAPEDRIVE
zone commit vsan 1
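One quick sanity check on a zoning listing like the one above is to compare the device aliases that are defined against the aliases that are actually members of a zone. The sketch below is only illustrative: the function names are hypothetical, the `CONFIG` string is an abbreviated excerpt of the listing (one tape drive and one NetApp port instead of all of them), and the regular expressions assume the exact `device-alias name ... pwwn ...` and `member device-alias ...` line layout shown in this post.

```python
import re

# Abbreviated excerpt of the switch configuration posted above.
CONFIG = """\
device-alias name TAPEDRIVE_1 pwwn 50:03:08:c3:a1:cf:50:01
device-alias name WOKINGBKP_L0 pwwn 10:00:00:90:fa:3b:9e:f5
device-alias name WOKINGBKP_L1 pwwn 10:00:00:90:fa:3b:9b:27
device-alias name WOKINGBKP_R0 pwwn 10:00:00:90:fa:3b:9b:26
device-alias name WOKINGBKP_R1 pwwn 10:00:00:90:fa:3b:9e:f4
device-alias name BLK-NETAPP3220_0c pwwn 50:0a:09:81:01:c1:27:a0
zone name Z_WOKINGBKP_TAPEDRIVE vsan 1
member device-alias WOKINGBKP_L0
member device-alias TAPEDRIVE_1
zone name Z_NETAPP_TAPEDRIVE vsan 1
member device-alias TAPEDRIVE_1
member device-alias BLK-NETAPP3220_0c
"""


def defined_aliases(text):
    """Alias names from 'device-alias name X pwwn ...' lines, in file order."""
    return re.findall(r"^[ \t]*device-alias name (\S+) pwwn \S+", text, flags=re.M)


def zoned_aliases(text):
    """Alias names that appear as a member of at least one zone."""
    return set(re.findall(r"^[ \t]*member device-alias (\S+)", text, flags=re.M))


def unzoned(text):
    """Defined aliases that are not a member of any zone."""
    zoned = zoned_aliases(text)
    return [alias for alias in defined_aliases(text) if alias not in zoned]


print(unzoned(CONFIG))
```

On this excerpt it reports WOKINGBKP_L1, WOKINGBKP_R0 and WOKINGBKP_R1, which is consistent with the full listing, where WOKINGBKP_L0 is the only media server port in a zone. The post does not say which HBA ports are expected to reach the tape drives, so this is something to check rather than a diagnosis.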
NetBackup is configured as follows:
1) Within NetBackup we have the robot configured with the "Robot Control is attached to an NDMP host" option, and the robot type is set to "TLD - Tape Library DLT". Screenshot 1 shows the robot library configuration.
2) The tape drives are all configured with 2 paths and the drive type is HCART3. Screenshot 2 shows the configuration of the 1st drive.
3) The storage unit type is set as NDMP, and the properties show "tld(0) - hcart3". The correct NDMP host is selected. Screenshot 3 shows the config of the storage unit.
4) I have created a new test job with NDMP as the type, and the correct storage unit is selected. Screenshot 4 shows the details of the backup policy.
Some other information: when I run tpautoconf -probe blk-ntap3220-01a everything seems to return correctly. All the tape drives are seen.
From the NetApp side it looks like the tape library and tape drives can be seen:
storage show mc
Medium Changer: UK-BLN-PH-S-NX-1:1-23.187L1
Description: ADIC Scalar i500
Serial Number: ADICA0C0023916_LLA
WWNN: 5:003:08c3a1:cf5000
WWPN: 5:003:08c3a1:cf5001
Alias Name(s): mc0
Device State: available
storage show tape
Tape Drive: UK-BLN-PH-S-NX-1:1-25.56
Description: IBM LTO-6 ULTRIUM
Serial Number: F3A1CF5090
WWNN: 5:003:08c3a1:cf5090
WWPN: 5:003:08c3a1:cf5091
Alias Name(s): st5
Device State: available
Tape Drive: UK-BLN-PH-S-NX-1:1-26.54
Description: IBM LTO-6 ULTRIUM
Serial Number: F3A1CF5094
WWNN: 5:003:08c3a1:cf5094
WWPN: 5:003:08c3a1:cf5095
Alias Name(s): st2
Device State: available
Tape Drive: UK-BLN-PH-S-NX-1:1-28.185
Description: IBM LTO-6 ULTRIUM
Serial Number: F3A1CF509C
WWNN: 5:003:08c3a1:cf509c
WWPN: 5:003:08c3a1:cf509d
Alias Name(s): st1
Device State: available
Tape Drive: UK-BLN-PH-S-NX-1:1-27.53
Description: IBM LTO-6 ULTRIUM
Serial Number: F3A1CF5098
WWNN: 5:003:08c3a1:cf5098
WWPN: 5:003:08c3a1:cf5099
Alias Name(s): st3
Device State: available
Tape Drive: UK-BLN-PH-S-NX-1:1-24.186
Description: IBM LTO-6 ULTRIUM
Serial Number: F3A1CF5004
WWNN: 5:003:08c3a1:cf5004
WWPN: 5:003:08c3a1:cf5005
Alias Name(s): st0
Device State: available
Tape Drive: UK-BLN-PH-S-NX-1:1-23.187
Description: IBM LTO-6 ULTRIUM
Serial Number: F3A1CF5000
WWNN: 5:003:08c3a1:cf5000
WWPN: 5:003:08c3a1:cf5001
Alias Name(s): st4
Device State: available
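The filer prints WWPNs in a different grouping (5:003:08c3a1:cf5001) from the switch (50:03:08:c3:a1:cf:50:01), which makes it awkward to compare the two listings by eye. As a rough sketch, the Python below strips the separators so both forms can be matched, then maps each filer tape alias to the switch device-alias with the same WWPN. The helper names are hypothetical. The values are copied from the two listings in this post.

```python
def normalize_wwn(wwn):
    """Reduce a WWN in either notation to 16 lowercase hex digits."""
    digits = wwn.replace(":", "").lower()
    if len(digits) != 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 64-bit WWN: {wwn!r}")
    return digits


# pwwn values from the switch device-alias database above.
SWITCH_ALIASES = {
    "TAPEDRIVE_1": "50:03:08:c3:a1:cf:50:01",
    "TAPEDRIVE_2": "50:03:08:c3:a1:cf:50:05",
    "TAPEDRIVE_3": "50:03:08:c3:a1:cf:50:91",
    "TAPEDRIVE_4": "50:03:08:c3:a1:cf:50:95",
    "TAPEDRIVE_5": "50:03:08:c3:a1:cf:50:99",
    "TAPEDRIVE_6": "50:03:08:c3:a1:cf:50:9d",
}

# WWPN values from the 'storage show tape' output above.
FILER_DEVICES = {
    "st5": "5:003:08c3a1:cf5091",
    "st2": "5:003:08c3a1:cf5095",
    "st1": "5:003:08c3a1:cf509d",
    "st3": "5:003:08c3a1:cf5099",
    "st0": "5:003:08c3a1:cf5005",
    "st4": "5:003:08c3a1:cf5001",
}


def map_filer_to_switch(filer, switch):
    """Map each filer device alias to the switch alias with the same WWPN (or None)."""
    by_wwn = {normalize_wwn(wwn): alias for alias, wwn in switch.items()}
    return {dev: by_wwn.get(normalize_wwn(wwn)) for dev, wwn in filer.items()}


mapping = map_filer_to_switch(FILER_DEVICES, SWITCH_ALIASES)
for dev in sorted(mapping):
    print(dev, "->", mapping[dev])
```

All six filer tape devices resolve to a switch alias, and the filer numbering does not follow the switch numbering (st4 is TAPEDRIVE_1, st0 is TAPEDRIVE_2). Note also that the medium changer mc0 is listed with the same WWPN as st4.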
However, when I run robtest I can see a tape in the drive:
Enter tld commands (? returns help information)
s d
drive 1 (addr 256) access = 0 Contains Cartridge = yes
Source address = 4154 (slot 59)
Barcode = HJ4279
drive 2 (addr 257) access = 1 Contains Cartridge = no
drive 3 (addr 258) access = 1 Contains Cartridge = no
drive 4 (addr 259) access = 1 Contains Cartridge = no
drive 5 (addr 260) access = 1 Contains Cartridge = no
drive 6 (addr 261) access = 1 Contains Cartridge = no
READ_ELEMENT_STATUS complete
But when I run the unload d1 command and select the NDMP host, the operation eventually times out with an IO failed error, which looks similar to the backup job failing.
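If the drive status needs to be checked repeatedly while testing, the "s d" output can be reduced to just the drives that hold a cartridge. This is a small sketch only: it assumes the output has been saved as text in exactly the layout shown above, and the function name and regular expressions are my own, not part of robtest.

```python
import re

# First lines of the robtest "s d" output shown above.
ROBTEST_OUTPUT = """\
drive 1 (addr 256) access = 0 Contains Cartridge = yes
Source address = 4154 (slot 59)
Barcode = HJ4279
drive 2 (addr 257) access = 1 Contains Cartridge = no
drive 3 (addr 258) access = 1 Contains Cartridge = no
"""

DRIVE_RE = re.compile(
    r"drive (\d+) \(addr (\d+)\) access = (\d) Contains Cartridge = (yes|no)"
)
BARCODE_RE = re.compile(r"Barcode = (\S+)")


def loaded_drives(text):
    """Return {drive number: barcode or None} for drives reporting a cartridge."""
    result = {}
    current = None  # drive whose detail lines are currently being read
    for line in text.splitlines():
        drive = DRIVE_RE.search(line)
        if drive:
            current = int(drive.group(1)) if drive.group(4) == "yes" else None
            if current is not None:
                result[current] = None
            continue
        barcode = BARCODE_RE.search(line)
        if barcode and current is not None:
            result[current] = barcode.group(1)
    return result


print(loaded_drives(ROBTEST_OUTPUT))
```

For the output above this reports drive 1 holding HJ4279, the same media ID that the failed job was waiting on.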
We don't think the hardware is at fault as the robotic library successfully initialises without error and I am able to inventory the library within NetBackup. I am also able to move tapes around in the library using the web interface to the library.
So my question is: where is the issue? We are stuck and cannot work out what is causing it. Is our zoning correct? Is NetBackup configured correctly?
Any help/advice appreciated.
thanks
03-26-2015 06:43 AM
03-26-2015 07:19 AM
Hi Marianne,
Yes, I have removed the robot, drives and storage unit from NetBackup and restarted the Device Manager when prompted in NetBackup.
Was there anything else I needed to do to remove the previous device config?
Thanks