SAN media server

Yiwen
Level 3
Hi All

Please, can someone explain this issue to me?

I'm using SSO with SAN media servers.

///////////////////////////////
08/29/2009 03:13:40 - requesting resource Minsatcs01-hcart-robot-tld-0
08/29/2009 03:13:40 - requesting resource netbackup.NBU_CLIENT.MAXJOBS.Minsatcs01
08/29/2009 03:13:40 - requesting resource netbackup.NBU_POLICY.MAXJOBS.test
08/29/2009 03:14:30 - Error bptm (pid=24922) error requesting media, TpErrno = Robot operation failed
08/29/2009 03:14:30 - Warning bptm (pid=24922) media id O401L1 load operation reported an error
08/29/2009 03:13:41 - Waiting for scan drive stop HP.ULTRIUM4-SCSI.000, Media server: Minsatcs01
08/29/2009 03:13:41 - granted resource  netbackup.NBU_CLIENT.MAXJOBS.Minsatcs01
08/29/2009 03:13:41 - granted resource  netbackup.NBU_POLICY.MAXJOBS.test
08/29/2009 03:13:41 - granted resource  O401L1
08/29/2009 03:13:41 - granted resource  HP.ULTRIUM4-SCSI.000
08/29/2009 03:13:41 - granted resource  Minsatcs01-hcart-robot-tld-0
08/29/2009 03:13:41 - estimated 0 kbytes needed
08/29/2009 03:13:42 - started process bpbrm (pid=24916)
08/29/2009 03:13:42 - connecting
08/29/2009 03:13:42 - connected; connect time: 0:00:00
08/29/2009 03:13:45 - mounting O401L1
08/29/2009 03:13:47 - current media O401L1 complete, requesting next media Any
08/29/2009 03:14:53 - Error bptm (pid=24922) NBJM returned an extended error status: resource request failed (800)
08/29/2009 03:14:54 - Error bpbrm (pid=24916) from client Minsatcs01: ERR - bpbkar exiting because backup is aborting
08/29/2009 03:14:11 - end writing
An extended error status has been encountered, check detailed status (252)
//////////////////////////////////////
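
The job details above are not in chronological order (the 03:14:30 bptm errors are listed before the 03:13:41 resource grants), which makes the sequence hard to follow. A minimal Python sketch for re-sorting such a listing, assuming every event line starts with a `MM/DD/YYYY HH:MM:SS` timestamp as in this log; the function name `sort_job_details` is hypothetical and not part of NetBackup:

```python
from datetime import datetime

def sort_job_details(lines):
    """Return timestamped job-detail lines in chronological order.

    Lines without a leading timestamp (such as the final status line)
    are kept at the end in their original order.
    """
    stamped, other = [], []
    for line in lines:
        try:
            when = datetime.strptime(line[:19], "%m/%d/%Y %H:%M:%S")
        except ValueError:
            other.append(line)
            continue
        stamped.append((when, line))
    # list.sort() is stable, so lines with equal timestamps keep their order.
    stamped.sort(key=lambda pair: pair[0])
    return [line for _, line in stamped] + other

# A few lines taken from the log above.
sample = [
    "08/29/2009 03:14:30 - Error bptm (pid=24922) error requesting media, TpErrno = Robot operation failed",
    "08/29/2009 03:13:41 - granted resource  O401L1",
    "08/29/2009 03:13:45 - mounting O401L1",
    "An extended error status has been encountered, check detailed status (252)",
]
for line in sort_job_details(sample):
    print(line)
```

Sorted this way, the grant and the mount of O401L1 come first and the robot error follows them.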


Should the robot be controlled by the master server or by each media server?
[attached screenshot]


Thanks a lot.

Mouse
Moderator
Partner    VIP    Accredited Certified
08/29/2009 03:14:30 - Warning bptm (pid=24922) media id O401L1 load operation reported an error

This load error comes from your robotic device. Status code 252 only says that an extended error was encountered, so you have to look in the system log to find out what kind of robotic error happened.
You may have a physical problem with the robot and/or the media.

The problem is outside the scope of NBU.

Anonymous
Not applicable
Ideally the master server should be the robot control host, that is, if it's a physical host.

Make sure you can inventory your robot successfully.

Log on to the web UI of the robotic device, if it has one, and check its logs, diagnostics, etc. for problems.
Do you have a tape stuck?

Run some of the robtest commands from the master server; see below:

GENERAL ERROR: How to troubleshoot robot communication issues in Windows
http://seer.entsupport.symantec.com/docs/276893.htm



Robtest Commands issued on the master server

  • Starting robtest
    • robtest
    • 1  --> to select TLD 0
  • Getting help
    • ?
  • Looking at contents of the tape drives
    • s d
  • Looking at the contents of the library
    • s s
  • Moving a tape from a drive to a library slot
    • s d  --> to identify drive number that has tape (Contains Cartridge = yes, Barcode=XXXXXX)
    • s s  --> to identify an empty slot in the tape library (NetBackup will need to be re-inventoried)
    • m d# s#  --> move from drive # to slot #
    • s d  --> verify the tape drive is empty
    • s s --> verify the library slot has the tape

Yiwen
Level 3

Thanks.

I ran the tests on the master server and they all passed.

However, I installed the master server myself; it runs Solaris 10 and is patched. The 3 media servers run Solaris 9, and I think the SAN software is not installed on them, because I get errors like these from the media servers:

 tldd[9003]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 6
or

Fatal open error on HP.ULTRIUM4-SCSI.000 (device 0, /devices/pci@1e,600000/SUNW,qlc@2/fp@0,0/sg@w500110a000946d5a,0:raw),
 errno = 19 (No such device), DOWN'ing it


I ran backups on the three servers, but when I removed the tapes to label them, the problem appeared again.

Thanks a lot.


xiazhen
Level 4
Hi guys,

You can use the following commands to check it:

1. mt status: what is the output?

2. cfgadm -la: what is the output?

Best regards
    

Yiwen
Level 3
Hi,

Thanks.

I don't think the problem is in the link, because if you run

cfgadm -al you see this:

c9                             fc-fabric    connected    configured   unknown
c9::500110a000946d5a           tape         connected    configured   unknown


But I get these errors during the backup:

Error bptm (pid=8992) incorrect media found in drive index 0, expected A00006, found A00001, FREEZING A00006

Even when I tried to erase the media, I got this:

08/31/2009 09:51:48 - begin Erase
08/31/2009 09:51:48 - started process bplabel (pid=28209)
08/31/2009 09:51:48 - requesting resource vol-minsat02:A00003
08/31/2009 09:51:48 - granted resource  A00003
08/31/2009 09:51:48 - granted resource  HP.ULTRIUM4-SCSI.000
08/31/2009 09:51:52 - mounting A00003
08/31/2009 09:52:45 - mounted A00003; mount time: 0:00:53
08/31/2009 09:52:46 - expected media A00003; found media A00000
08/31/2009 09:52:46 - end Erase; elapsed time 0:00:58
media manager found wrong tape in drive (93)

What does "expected media X; found media Y" mean?


Thanks a lot.

Mouse
Moderator
Partner    VIP    Accredited Certified
It seems your library contents contradict the NetBackup database.

NetBackup expected to find A00006 in the drive but found A00001, which really looks like an inventory problem.

Yiwen
Level 3
Thanks, Mouse.

But I don't think it comes from an inventory problem, because every time I run the inventory it is OK (see below), and I update it every time.

Just one question about this error:

Fatal open error on HP.ULTRIUM4-SCSI.000 (device 0, /devices/pci@1e,600000/SUNW,qlc@2/fp@0,0/sg@w500110a000946d5a,0:raw),
 errno = 19 (No such device), DOWN'ing it

Could it be because the SAN software is not installed on Solaris 9?

The HBA documentation says nothing about a patch.



--------------------------------------------------------------------
08/31/2009 11:03:16
Robot: TLD(0) on netbackup
Operation: Compare Robot Contents
EMM Server: netbackup
--------------------------------------------------------------------

    Robot Contents         Volume Configuration

Slot    Tape  Barcode          Media ID Barcode          Mismatch Detected
====    ====  =============    ======== =============    =================
   1      Yes  -none-           A00000   -none-      
   2       No
   ...
  14      Yes  -none-           A00001   -none-
  15      Yes  -none-           A00002   -none-      
  16       No
  17      Yes  -none-           A00003   -none-      
  18      Yes  -none-           A00004   -none-      
  19       No
  20       No
  21      Yes  -none-           A00005   -none-      
  22      Yes  -none-           A00006   -none-      
  23       No
 
--------------------------------------------------------------------

Thanks.

Anonymous
Not applicable
Seems like you have some media with barcode labels and some without?

You reference A0000n and also O401L1

An A00nnn media ID is assigned when there is no barcode label or no label in the tape header. Either way, this sounds quite messy.

If this is the case, my first recommendation is to purchase enough LTO barcode labels for all the media in the library.
Second recommendation: do not mix tape media types in your library (e.g. LTO1 and LTO4), as this can also give you extra work.
Make sure you only have 1 type. (You can have mixed media types, but that is an advanced setup.)

I notice you appear to have HP Ultrium LTO4 drives.
In my HP MSL8096 library, I configured (via the web management interface) the library to send only the first 6 characters, reading left to right, of the barcode on the tape media to the backup software. That way you do not need any initial Barcode Generation rules in NetBackup to strip the L1 or L? suffix.
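
To illustrate what that library setting does to a barcode, here is a minimal Python sketch. It only mimics the truncation; it is not library firmware or a NetBackup API, and the function name and the 8-character barcode `ABC123L4` are made up for the example:

```python
def media_id_from_barcode(barcode, length=6):
    """Keep only the first `length` characters of a barcode, left to right."""
    return barcode.strip()[:length]

# Hypothetical 8-character barcode whose last two characters are the media-type suffix.
print(media_id_from_barcode("ABC123L4"))

# A label that is already 6 characters long passes through unchanged.
print(media_id_from_barcode("O401L1"))
```

With a 6-character limit, a barcode is reported without its suffix only if the suffix falls beyond the sixth character.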

Address these possible issues and regularly inventory your library from within NetBackup.


SAN
Verify your zoning for the tape libraries/drives.

On all media servers, run the NetBackup command
/usr/openv/volmgr/bin/scan

Check the output.
It shows exactly what the OS can see at the hardware level.

More info:
How to do device discovery from command line for NetBackup device configuration.
http://seer.entsupport.symantec.com/docs/291443.htm

Yiwen
Level 3
Hi,

- For the labeling: I used O401L1 at first, but when I ran into problems I removed all the labels.

- For the zoning: I put each media server in the same zone as the tape library (TL). The TL has 2 WWPNs, one for the robot and one for the drive; should I put the drive's or the robot's in the zoning configuration?

- For scan: I got the same output from all media servers:
************************************************************
*********************** SDT_TAPE    ************************
*********************** SDT_CHANGER ************************
*********************** SDT_OPTICAL ************************
************************************************************
------------------------------------------------------------
Device Name  : "/dev/rmt/1cbn"
Passthru Name: "/dev/sg/c0tw500110a000946d5al0"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry    : "HP      Ultrium 4-SCSI  H44W"
Vendor ID  : "HP      "
Product ID : "Ultrium 4-SCSI  "
Product Rev: "H44W"
Serial Number: "HU183634L8"
WWN          : ""
WWN Id Type  : 0
Device Identifier: ""
Device Type    : SDT_TAPE
NetBackup Drive Type: 3
Removable      : Yes
Device Supports: SCSI-5
Flags : 0x0
Reason: 0x0
------------------------------------------------------------
Device Name  : "/dev/sg/c0tw500110a000946d5al1"
Passthru Name: "/dev/sg/c0tw500110a000946d5al1"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry    : "HP      MSL G3 Series   D.00"
Vendor ID  : "HP      "
Product ID : "MSL G3 Series   "
Product Rev: "D.00"
Serial Number: "0838BR0022"
WWN          : ""
WWN Id Type  : 0
Device Identifier: "HP      MSL G3 Series   0838BR0022"
Device Type    : SDT_CHANGER
NetBackup Robot Type: 8
Removable      : Yes
Device Supports: SCSI-5
Number of Drives : 1
Number of Slots  : 24
Number of Media Access Ports: 0
Drive 1 Serial Number      : "HU183634L8"
Flags : 0x0
Reason: 0x0
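
One quick consistency check on output like this is whether the drive serial number that the changer reports also shows up as the serial number of a tape device the host can see. A rough Python sketch, assuming scan prints `Field : value` lines separated by dashed rules as in the output above; `parse_scan` and `unmatched_drive_serials` are hypothetical helper names, not NetBackup commands:

```python
import re

def parse_scan(text):
    """Split scan output into one dict per device, keyed by field name."""
    devices = []
    for block in re.split(r"^-{10,}[ \t]*$", text, flags=re.MULTILINE):
        fields = {}
        for line in block.splitlines():
            # The "Port: ..; Bus: .." line packs several fields; skip it here.
            if ";" in line or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip('"').strip()
        if "Device Type" in fields:
            devices.append(fields)
    return devices

def unmatched_drive_serials(devices):
    """Return drive serials listed by a changer that no tape device reports."""
    tape_serials = {d.get("Serial Number") for d in devices
                    if d["Device Type"] == "SDT_TAPE"}
    missing = []
    for d in devices:
        if d["Device Type"] != "SDT_CHANGER":
            continue
        for key, value in d.items():
            if re.fullmatch(r"Drive \d+ Serial Number", key) and value not in tape_serials:
                missing.append(value)
    return missing

# Abbreviated copy of the scan output above.
sample = '''\
------------------------------------------------------------
Device Name  : "/dev/rmt/1cbn"
Serial Number: "HU183634L8"
Device Type    : SDT_TAPE
------------------------------------------------------------
Device Name  : "/dev/sg/c0tw500110a000946d5al1"
Serial Number: "0838BR0022"
Device Type    : SDT_CHANGER
Number of Drives : 1
Drive 1 Serial Number      : "HU183634L8"
'''
devices = parse_scan(sample)
print([d["Device Type"] for d in devices])
print(unmatched_drive_serials(devices))
```

An empty result means every drive the changer lists is also visible to the host as a tape device, which is the case for the output posted here.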

When I run a backup with any policy, the media freeze one after another, and then NetBackup says no media is available:

08/31/2009 11:48:44 - requesting resource Minsatcs02-hcart-robot-tld-0
08/31/2009 11:48:44 - requesting resource netbackup.NBU_CLIENT.MAXJOBS.Minsatcs02
08/31/2009 11:48:44 - requesting resource netbackup.NBU_POLICY.MAXJOBS.minsat02-OS
08/31/2009 11:48:44 - Waiting for scan drive stop HP.ULTRIUM4-SCSI.000, Media server: Minsatcs02
08/31/2009 11:49:16 - granted resource  netbackup.NBU_CLIENT.MAXJOBS.Minsatcs02
08/31/2009 11:49:16 - granted resource  netbackup.NBU_POLICY.MAXJOBS.minsat02-OS
08/31/2009 11:49:16 - granted resource  A00006
08/31/2009 11:49:16 - granted resource  HP.ULTRIUM4-SCSI.000
08/31/2009 11:49:16 - granted resource  Minsatcs02-hcart-robot-tld-0
08/31/2009 11:49:16 - estimated 0 kbytes needed
08/31/2009 11:49:17 - started process bpbrm (pid=27760)
08/31/2009 11:49:17 - connecting
08/31/2009 11:49:18 - connected; connect time: 0:00:00
08/31/2009 11:49:21 - mounting A00006
08/31/2009 11:51:05 - Error bptm (pid=27997) incorrect media found in drive index 0, expected A00006, found A00002, FREEZING A00006
08/31/2009 11:50:21 - mounted A00006; mount time: 0:01:00
08/31/2009 11:50:21 - current media A00006 complete, requesting next media Any
08/31/2009 11:51:17 - Waiting for scan drive stop HP.ULTRIUM4-SCSI.000, Media server: Minsatcs02
08/31/2009 11:51:17 - granted resource  A00007
08/31/2009 11:51:17 - granted resource  HP.ULTRIUM4-SCSI.000
08/31/2009 11:51:17 - granted resource  Minsatcs02-hcart-robot-tld-0
08/31/2009 11:51:17 - mounting A00007
08/31/2009 11:53:01 - Error bptm (pid=27997) incorrect media found in drive index 0, expected A00007, found A00000, FREEZING A00007
08/31/2009 11:52:17 - mounted A00007; mount time: 0:01:00
08/31/2009 11:52:17 - current media A00007 complete, requesting next media Any
08/31/2009 11:53:12 - Waiting for scan drive stop HP.ULTRIUM4-SCSI.000, Media server: Minsatcs02
08/31/2009 11:53:13 - granted resource  A00008
08/31/2009 11:53:13 - granted resource  HP.ULTRIUM4-SCSI.000
08/31/2009 11:53:13 - granted resource  Minsatcs02-hcart-robot-tld-0
08/31/2009 11:53:15 - mounting A00008
08/31/2009 11:56:15 - Error bptm (pid=27997) incorrect media found in drive index 0, expected A00008, found A00001, FREEZING A00008
08/31/2009 11:55:30 - mounted A00008; mount time: 0:02:15
08/31/2009 11:55:31 - current media A00008 complete, requesting next media Any
08/31/2009 11:56:26 - Waiting for scan drive stop HP.ULTRIUM4-SCSI.000, Media server: Minsatcs02
08/31/2009 11:56:27 - granted resource  A00000
08/31/2009 11:56:27 - granted resource  HP.ULTRIUM4-SCSI.000
08/31/2009 11:56:27 - granted resource  Minsatcs02-hcart-robot-tld-0
08/31/2009 11:56:29 - mounting A00000
08/31/2009 11:58:13 - Error bptm (pid=27997) incorrect media found in drive index 0, expected A00000, found A00001, FREEZING A00000
08/31/2009 11:57:29 - mounted A00000; mount time: 0:01:00
08/31/2009 11:57:29 - current media A00000 complete, requesting next media Any
08/31/2009 11:59:13 - Error bpbrm (pid=27760) from client Minsatcs02: ERR - bpbkar exiting because backup is aborting
08/31/2009 11:58:29 - end writing
unable to allocate new media for backup, storage unit has none available (96)

Thanks.

Anonymous
Not applicable
Here is some technical information about what NetBackup does when it mounts a tape.

NetBackup will check the label on the tape before attempting to write the backup image.  If the recorded label on the tape is different than the external barcode label (or the NetBackup Media ID), this error will appear.
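
To see which recorded labels were actually read, the expected/found pairs can be pulled out of the job details. A small Python sketch, assuming the two message wordings quoted earlier in this thread ("expected A00006, found A00002" from bptm and "expected media A00003; found media A00000" from the erase job); the helper name `label_mismatches` is hypothetical:

```python
import re

# Matches both "expected X, found Y" and "expected media X; found media Y".
MISMATCH = re.compile(r"expected (?:media )?(\w+)[,;] found (?:media )?(\w+)")

def label_mismatches(lines):
    """Return (expected_media_id, found_label) pairs from job-detail lines."""
    pairs = []
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs

# Lines copied from the logs posted in this thread.
sample = [
    "08/31/2009 11:51:05 - Error bptm (pid=27997) incorrect media found in drive index 0, expected A00006, found A00002, FREEZING A00006",
    "08/31/2009 11:50:21 - mounted A00006; mount time: 0:01:00",
    "08/31/2009 09:52:46 - expected media A00003; found media A00000",
]
for expected, found in label_mismatches(sample):
    print(f"expected {expected}, tape header says {found}")
```

Each pair shows the media ID NetBackup asked for next to the label it read from the tape that was loaded.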

Follow the troubleshooting steps in this technote to verify that the recorded label and the NetBackup media ID match.
GENERAL ERROR: Tapes are being frozen due to "<16> write_backup: incorrect media found in drive index" errors.
http://seer.entsupport.symantec.com/docs/276993.htm

DOCUMENTATION: How to troubleshoot frozen media on UNIX and Windows
http://seer.entsupport.symantec.com/docs/249632.htm


Review the firmware of your library and drive products
Latest Library Tapes and Tools August 2009 bundle with release notes for all Storageworks products
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=12169&prodSeriesId=3936307&swItem=co-38809-28&prodNameId=3799665&swEnvOID=2078&swLang=13&taskId=135&mode=4&idx=0

Firmware
Your drive is at H44W, there are newer revisions available. Also found this:

ADVISORY: HP StorageWorks Ultrium 4 drives do not communicate with the host on HP StorageWorks 1/8 G2 Autoloader or HP StorageWorks MSL2024, MSL4048, or MSL8098 libraries when the Encryption Kit is enabled

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c01487509

Yiwen
Level 3
Hi

Something strange here! When I disconnect the cables from two of the media servers, the backup runs normally on the remaining media server. I tried this for each of the three servers and it works.

Can someone please explain this to me?
Thanks.

Anonymous
Not applicable
Let me summarize from your details so far.

You have 1 master server and 3 SAN media servers (SSO) sharing 1 tape library and 1 tape drive, as detailed in the scan output. However, your blurry snapshot above seems to show 2 robots?

So at least 4 media servers (including the master) sharing 2 drives across 2 robots.
Perhaps SCSI reservations are involved?

You have too many problems to pinpoint what is going on.
All I can advise is that you consult or reread the NetBackup guides, as something is fundamentally incorrect about your configuration.

System Administrator's Guide for UNIX and Linux, Volume 1 (http://seer.entsupport.symantec.com/docs/290201.htm)
System Administrator's Guide for UNIX and Linux, Volume 2 (http://seer.entsupport.symantec.com/docs/290202.htm)
Device Configuration Guide for UNIX, Linux, and Windows (http://seer.entsupport.symantec.com/docs/290200.htm)
Commands for UNIX and Linux (http://seer.entsupport.symantec.com/docs/290234.htm)
Troubleshooting Guide for UNIX, Linux, and Windows (http://seer.entsupport.symantec.com/docs/290230.htm)


Furthermore, verify hardware and OS compatibility using the matrices at the bottom of this page:

Listing of Veritas NetBackup (tm) 6.5 and 6.5.x Manuals and links to the respective TechNotes
http://seer.entsupport.symantec.com/docs/290282.htm

Further advice: install the latest Maintenance Pack or Update for your version of NetBackup on all servers in your environment.