H_Sharma
Level 6
10 years ago

Scan Command

Hi Experts,

We have one Windows 2008 master server and 8 drives.

All 8 drives are visible at the OS level and working fine.

We are experiencing an issue where 1 or 2 drives take too long to mount tapes. It is intermittent; sometimes all 8 drives work perfectly. It is not tied to a particular drive: when 7 drives are working, the remaining one does not mount its tape, so backups sit in the queue. After about half an hour it mounts the tape and all drives start working.

The scan -changer command on the master server shows all 8 drives with serial numbers.

The plain scan command also lists all 8 drives with serial numbers, but the per-drive details that follow cover only 5 drives. Is this a problem?

Or is it OK as long as scan -changer shows all 8 drives? If so, why does scan show individual details for only 5 drives after listing all 8?

  • ^^^ What she said!!!!

    And when you look at which drives are seen by each NBU server, COMPARE THE SERIAL #'S & WWNs, not the "Tape0", etc.  One media server may call a drive Tape0, and the other called it Tape5 when it was added.  Ignore those names.

    This will help to track down which drives are the ones causing problems.

    Once you can see all the drives from all the servers, make sure you have persistent binding enabled on your HBAs (done through a utility for your HBAs, not NetBackup) so that the paths between the tape drives and the ports on your tape drives will stay the same after a server reboot.  This has caused all kinds of problems for us in the past, with tape drives disappearing from one media server but working fine on the others.
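
The serial-number comparison suggested above can be sketched in a few lines of Python. This is only an illustration, assuming each server's scan output has been saved as text and that serial numbers appear on lines of the form `Serial Number: "..."`, as in the scan output posted later in this thread. The function names and the sample serials in the usage lines are made up for the example.

```python
import re

# Matches lines such as:  Serial Number: "HU74D01591"
SERIAL_RE = re.compile(r'^\s*Serial Number\s*:\s*"([^"]*)"', re.MULTILINE)


def serials(scan_text):
    """Return the set of non-empty serial numbers found in one scan output."""
    return {s.strip() for s in SERIAL_RE.findall(scan_text) if s.strip()}


def missing_by_server(outputs):
    """For each server, list serials seen on some other server but not on it.

    `outputs` maps a server name to the text of its saved scan output.
    """
    seen = {name: serials(text) for name, text in outputs.items()}
    all_serials = set().union(*seen.values()) if seen else set()
    return {name: sorted(all_serials - found) for name, found in seen.items()}


if __name__ == "__main__":
    # Hypothetical data: the serials below are placeholders, not real drives.
    outputs = {
        "master": 'Serial Number: "DRIVE-A"\n',
        "media1": 'Serial Number: "DRIVE-A"\nSerial Number: "DRIVE-B"\n',
    }
    for server, missing in missing_by_server(outputs).items():
        print(server, "is missing:", missing)
```

The comparison deliberately ignores device names such as Tape0, since those can differ from server to server. Any device that reports a serial number (including the robot) ends up in the sets, so filter further if only tape drives matter.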

  • Hi Marianne,

    I checked the scan output on the master; it shows drives coming up and disappearing. I also checked the scan output on the media servers: one shows all 8 drives, and the other shows 4, with the rest disappearing.

    OK, this is what I have seen.

    Interestingly, 4 days ago we had our drive firmware upgraded. For that we had to gracefully restart the NetBackup services on the master, and we power cycled our library.

    The drive issue has disappeared and all 8 drives are working perfectly.

    However, the scan output is still the same after the firmware upgrade, the power cycle, and the NetBackup restart.

    We captured the output of the scan command in a .txt file, as suggested by the TSE.

    Master: scan still shows 4-5 drives, with the rest disappearing.

    Media 1: scan still shows all 8 drives.

    Media 2: scan still shows 4-5 drives, with the rest disappearing.

    One more interesting point that I missed earlier: we have a cluster server that has no such drive issue. All its drives were running and its scan output is correct for all of them.

    So we can rule out the firmware upgrade. It has nothing to do with firmware.

    So I firmly believe that restarting the NetBackup services resolved the issue. The interesting thing is that the scan output is still not correct on the problematic servers, yet the drives have started working perfectly.

    Please share your take on this one :)

     

  • NBU doesn't use scan as such for job operations, so it is possible that if the drives are configured, scan can give incorrect output, but jobs still run ok.  I've seen this once on Solaris in the past.

    scan also works independently of NBU, let me show you:

    Here I have stopped NBU on my master server:

    bpps -x shows no processes.

     

    NB Processes
    ------------


    MM Processes
    ------------


    Shared Symantec Processes
    -------------------------
        root 14964     1   1   Mar 05 ?           4:40 /opt/VRTSpbx/bin/pbx_exchange

    However, scan still works:

    root@womble $ scan


    ************************************************************
    *********************** SDT_TAPE    ************************
    *********************** SDT_CHANGER ************************
    ************************************************************
    ------------------------------------------------------------
    Device Name  : "/dev/rmt/0cbn"
    Passthru Name: "/dev/sg/c1t3l0"
    Volume Header: ""
    Port: -1; Bus: -1; Target: -1; LUN: -1
    Inquiry    : "HP      Ultrium 1-SCSI  E38W"
    Vendor ID  : "HP      "
    Product ID : "Ultrium 1-SCSI  "
    Product Rev: "E38W"
    Serial Number: "HU74D01591"
    WWN          : ""
    WWN Id Type  : 0
    Device Identifier: "HP      Ultrium 1-SCSI  HU74D01591"
    Device Type    : SDT_TAPE
    NetBackup Drive Type: 3
    Removable      : Yes
    Device Supports: SCSI-3
    Flags : 0x0
    Reason: 0x0
    root@womble DataCollect $

     

    So, provided a job doesn't need the drive to do whatever scan makes it do, it'll work.  If you deleted the drives, however, and tried to use the wizard to re-add them, you would probably have an issue, as the device wizard does something similar to scan -all.
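
For anyone who wants to count how many drives a saved scan output actually describes in detail, here is a minimal Python sketch. It assumes the layout shown in the output above: a dashed separator line before each device, followed by `Field: "value"` lines. The function name `parse_scan` is made up for this example, and output from other platforms or NetBackup versions may need adjustments.

```python
def parse_scan(text):
    """Split scan output into one dict per device, keyed by field name."""
    devices, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("-----"):
            # A dashed line starts a new device record.
            if current:
                devices.append(current)
            current = {}
        elif ":" in line and not line.startswith("*"):
            # Split on the first colon only; the rest of the line is the value.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip().strip('"').strip()
    if current:
        devices.append(current)
    # Keep only records that look like devices.
    return [d for d in devices if "Device Name" in d]


if __name__ == "__main__":
    # A few fields copied from the scan output posted above.
    sample = """\
------------------------------------------------------------
Device Name  : "/dev/rmt/0cbn"
Serial Number: "HU74D01591"
Device Type    : SDT_TAPE
"""
    tapes = [d for d in parse_scan(sample) if d.get("Device Type") == "SDT_TAPE"]
    print(len(tapes), "tape drive(s):", [d["Serial Number"] for d in tapes])
```

Because each line is split on its first colon, a combined line such as `Port: -1; Bus: -1; ...` is stored under the single key `Port`. That is enough for counting drives and reading serial numbers, but not for inspecting bus or target values.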

     

     

  • We have a cluster server in which there is no such issue of drive and all the drives were running and scan output is correct for all.

     

    I agree - issue is most likely not with drive firmware.

    If you look at my post over here, you will see that I have suggested other possibilities:

    HBA, HBA driver, cable, GBIC on both sides, switch port, switch config (zoning), persistent binding between OS and HBA, etc.

    Have you checked hardware config on both master server nodes?
    Are the HBAs the same make/model? 
    Are the same drivers and firmware used for HBAs?
    Are disk and tape zoned to different HBAs?

    Have you checked that correct HBA settings are used for tape?
    Old TN for Emulex as an example: 
    http://www.symantec.com/docs/TECH22464 

    Same OS patches and hotfixes?
    (I have in the past seen that outdated Microsoft Storport drivers caused similar issues.)

    Have you confirmed that all devices are zoned to both nodes and media servers?

    Have you checked Windows Event Viewer logs for errors when the devices are losing connectivity?