Forum Discussion

koribilli12
9 years ago
Solved

Drives in Netbackup going down frequently

Team, I am seeing multiple drive-down issues in our environment, and they happen very frequently. We are not sure whether we made a mistake in configuring the drives. I am attaching the command output for vmglob -listall and tpconfig -emm_dev_list. I am also getting a missing drive in the tpautoconf -t output. Please see the attached sheets and suggest a solution.

  • Unfortunately, this is the equivalent of saying "my car is making a funny noise": it is too general and nonspecific.

    Drives go down within NetBackup for several reasons.

    Drives go down in a robot for different reasons.

    Drives go away from the OS for other reasons.

    BUT, essentially, if a drive is down at the OS, it will be down within NetBackup.

    You should get the environment into an idle or steady state and check the OS first, then the robot, then the drives, then NetBackup.

    There are several really good posts on drive troubleshooting.

     

  • Given that you have missing devices, I suspect this is the cause of your issues. The post above points out that there could be other causes, and there can be, lots of them, but most of these are OS or hardware related. The bptm log is a good place to start, as is Activity Monitor. More complex issues may require the volmgr logs and other more involved troubleshooting. Step one for you is to determine whether, when a drive goes down, it is still visible to the OS at the same path.

    Also, when a drive is down, what do you do to get it back up?

    NBU tracks drives by serial number. If, for example, a drive with SN 1234 disappears from the OS (or its path changes), then NBU will list it as a missing device.

    The way to stop this is to configure persistent binding, which is done via the HBA utility (SANsurfer or HBAnyware, depending on the brand of the HBA).

    If you really are losing the paths to the drives, then you have a SAN issue between the OS and the drive.

    So, if you have 10 drives, then one goes down and the OS still sees 10 drives in total, then it is likely the path changed.  tpautoconf -report_disc would then show the missing device (because it can't find the drive with serial number xxx at drive path yyy).  If report_disc also shows a 'new device' with serial number xxx but at a new path zzz, then the path changed and persistent binding should resolve the issue.
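The serial-number reasoning above can be sketched in a few lines of Python. This is only an illustration of the logic, not how NetBackup implements it: the function name, the drive names, serial numbers and device paths are all made up for the example, and in practice the two inputs would come from the NBU configuration and from whatever the OS currently reports.

```python
def classify_drives(configured, discovered):
    """Compare what NBU has configured with what the OS sees now.

    configured: drive name -> (serial number, path) as stored in NBU (hypothetical data)
    discovered: serial number -> path as currently seen by the OS (hypothetical data)
    Returns drive name -> ("ok" | "path_changed" | "missing", current path or None).
    """
    report = {}
    for name, (serial, path) in configured.items():
        current = discovered.get(serial)
        if current is None:
            # Serial number not visible at all: connectivity (SAN/OS) problem.
            report[name] = ("missing", None)
        elif current != path:
            # Same serial number at a different path: persistent binding candidate.
            report[name] = ("path_changed", current)
        else:
            report[name] = ("ok", current)
    return report


# Made-up example: one drive moved, one drive gone, one drive unchanged.
configured = {
    "drive_abc": ("SN1234", "/dev/nst0"),
    "drive_def": ("SN5678", "/dev/nst1"),
    "drive_ghi": ("SN9012", "/dev/nst2"),
}
discovered = {
    "SN1234": "/dev/nst3",
    "SN9012": "/dev/nst2",
}

report = classify_drives(configured, discovered)
for name, (state, path) in sorted(report.items()):
    print(name, state, path)
```

A drive that comes back as "path_changed" matches the missing device plus new device pair described above, while one that comes back as "missing" points at a real loss of connectivity.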

    In summary, given what I see, you don't have a NBU problem, you have a SAN problem which in turn is affecting NBU.

    Once all the drives are visible to the OS, you could use the tpautoconf -replace_drive option to tell NBU that the drive with drive name abc is now at path xxx. I think this should clear the drives reported as missing. The other option is to delete everything (nbemmcmd -deletealldevices -allrecords; this will delete ALL robots and drives from every server), make sure the OS can see everything, set up persistent binding and then reconfigure.

    • Marianne
      Level 6

      koribilli12, the first thing I noticed is that you did not tell us which OS you are running.

      As the excellent posts above already state, Missing Drive means that the OS has lost connectivity to the drive and that you need to troubleshoot at the OS level.

      Start with the OS system log: on Windows, the Event Viewer System log; on Unix/Linux, messages or syslog.
      Look for events related to HBAs and tape drives.
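As a rough way to narrow down a large system log, the search described above could be sketched like this in Python. The keyword list and the sample log lines are assumptions made up for the example; real HBA and tape driver messages differ by OS, driver and vendor, so the keywords would need adjusting.

```python
# Hypothetical keywords; real driver messages vary by OS and HBA vendor.
KEYWORDS = ("hba", "tape", "scsi")


def find_device_events(lines, keywords=KEYWORDS):
    """Return the log lines that mention any keyword, case-insensitively."""
    return [
        line for line in lines
        if any(keyword in line.lower() for keyword in keywords)
    ]


# Made-up sample lines, only to show the filtering.
sample_log = [
    "host kernel: hba0: link down",
    "host sshd: session opened for user root",
    "host kernel: Tape drive st3 offline",
]

events = find_device_events(sample_log)
for line in events:
    print(line)
```

Comparing the timestamps of the matching lines with the times the drives went down in NBU is usually the useful part.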

      Only come back to NBU to reconfigure once you have found and fixed the issue.

      One more thing: it looks like there are far too many tape drives attached to this media server. Even the most powerful server will battle to stream more than 3 or 4 drives simultaneously.

      • koribilli12
        Level 4

        Marianne:

        The issue is now resolved. We removed the teaming from the two NICs and kept only one NIC card. Backup throughput is fine and drives are no longer going down.

        All network depends upon having good backups.

        Regards

        Kishore