Netbackup 7.7.1 - Drives Down

Albatross_
Level 5

Hello All,

We are using NetBackup 7.7.1 on a Linux box with SSO. For the last three days I have observed that most of the drives are not working as expected.

We have a total of 16 drives, and only 2 of them are functional right now. We duplicate images from the primary storage to the tape library (TL), which means the drives are used only for duplications.

Since the drives are not working properly, we are running low on free disk space on the primary storage.

The drives are showing Active and Ready as NO. I checked with the nbrbutil command and found some MDS allocations where the drives show as "Drives= ", so I released those MDS allocations with the help of the nbrbutil command.

I still don't see the drives coming up and working normally. They don't pick up tapes automatically.

Initially I thought a tape might be stuck in a drive, so I logged into the library and manually moved the tapes to empty slots. After a couple of minutes a new tape gets loaded into the drive, but that doesn't show up in the NBU console or in the vmoprcmd output.

I really need this to be fixed. Suggestions would be greatly appreciated.

Thanks

 

 

 

6 REPLIES

cruisen
Level 6
Partner Accredited

Hello Albatross,

Do you run cleaning sessions on the drives on a regular basis? Is this done by the hardware or by NetBackup itself?

It sounds to me that the drives are not ready because a cleaning alert was sent from the robot.

Please tell me more about how you clean all these drives.

And have you checked whether the cleaning tapes have passed their cleaning limit, so that they no longer work correctly?

Best regards,

Cruisen

Marianne
Level 6
Partner    VIP    Accredited Certified

Tapes loaded in drives and then not getting used sounds like incorrect Device Mapping.
This is normally a result of no Persistent Binding of devices at HBA-level.

This can only be resolved by deleting all tape drives on all media servers and then re-running the Device Config wizard.

To find the exact reason for drives being DOWN'ed, add a VERBOSE entry to vm.conf on all media servers and restart NBU.
The reason for a drive being DOWN'ed will be logged in /var/log/messages on the media server where the problem is experienced.
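
Before restarting, it can help to confirm which media servers already have the entry. The following is only a minimal Python sketch: the path /usr/openv/volmgr/vm.conf is an assumption (the usual location on UNIX/Linux media servers, not stated in this thread), and has_verbose is a hypothetical helper name.

```python
from pathlib import Path

# Assumed default vm.conf location on a UNIX/Linux media server; adjust if yours differs.
VM_CONF = Path("/usr/openv/volmgr/vm.conf")

def has_verbose(conf_text):
    """Return True if some line of the file is exactly the VERBOSE keyword."""
    for raw in conf_text.splitlines():
        # A commented-out entry such as "# VERBOSE" does not count.
        if raw.strip() == "VERBOSE":
            return True
    return False

if VM_CONF.exists():
    print("VERBOSE enabled:", has_verbose(VM_CONF.read_text()))
else:
    print(VM_CONF, "not found on this host")
```

Run it on each media server; where it reports that VERBOSE is not enabled, add the entry and restart NBU as described above.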

Albatross_
Level 5

Hello, yes, you are right. The cleaning tapes have passed the limit, and I have extended the limit to 30 cleanings.

When I try to clean the drive manually, it throws error 39.

But there seem to be no connectivity issues.

 

Albatross_
Level 5

Hello Marianne,

Please find below the logs from one of the media servers:

 

Apr 15 08:54:03 nyc-media03 tldcd[3774]: Processing UNMOUNT, TLD(0) drive 12, slot -1, barcode  <NONE>         , vsn ******
Apr 15 08:54:03 nyc-media03 tldcd[17920]: TLD(0) opening robotic path /dev/sg5
Apr 15 08:54:05 nyc-media03 ltid[3676]: LTID - Sent ROBOTIC request, Type=3, Param2=1
Apr 15 08:54:05 nyc-media03 tldd[3761]: TLD(0) DismountTape ****** from drive 15
Apr 15 08:54:06 nyc-media03 tldd[17967]: TLD(0) unload==TRUE, but no unload, drive 15 (device 1)
Apr 15 08:54:06 nyc-media03 tldcd[3774]: ../tldcd.c.3034, process_request(), received command=3, from         peername=nyc-media03.mycompany.loc, version 50
Apr 15 08:54:06 nyc-media03 tldcd[3774]: Processing UNMOUNT, TLD(0) drive 15, slot -1, barcode  <NONE>         , vsn ******
Apr 15 08:54:06 nyc-media03 tldcd[17997]: TLD(0) opening robotic path /dev/sg5
Apr 15 08:54:08 nyc-media03 tldcd[3774]: ../tldcd.c.3034, process_request(), received command=1, from         peername=nyc-media02.mycompany.loc, version 50
Apr 15 08:54:08 nyc-media03 tldcd[3774]: Processing MOUNT, TLD(0) drive 13, slot 31, barcode S24154L6                , vsn S24154
Apr 15 08:54:08 nyc-media03 tldcd[18065]: TLD(0) opening robotic path /dev/sg5
Apr 15 08:54:09 nyc-media03 ltid[3676]: LTID - Sent ROBOTIC request, Type=1, Param2=0
Apr 15 08:54:09 nyc-media03 tldd[3761]: TLD(0) MountTape S23810 on drive 10, from slot 57
Apr 15 08:54:10 nyc-media03 tldd[18094]: TLD(0) unload==TRUE, but no unload, drive 10 (device 11)
Apr 15 08:54:10 nyc-media03 tldcd[3774]: ../tldcd.c.3034, process_request(), received command=1, from peername=nyc-media03.mycompany.loc, version 50
Apr 15 08:54:10 nyc-media03 tldcd[3774]: Processing MOUNT, TLD(0) drive 10, slot 57, barcode S23810L6        , vsn S23810
Apr 15 08:54:10 nyc-media03 tldcd[18095]: TLD(0) opening robotic path /dev/sg5
Apr 15 08:54:11 nyc-media03 tldd[17665]: TLD(0) unload==TRUE, but no unload, drive 6 (device 0)
Apr 15 08:54:12 nyc-media03 tldcd[17918]: TLD(0) closing/unlocking robotic path
Apr 15 08:54:12 nyc-media03 tldcd[17997]: inquiry() function processing library HP       ESL G3 Series    680H:
Apr 15 08:54:12 nyc-media03 tldcd[17997]: TLD(0) cannot dismount drive 15, slot 38 already is full
Apr 15 08:54:12 nyc-media03 tldcd[17997]: TLD(0) closing/unlocking robotic path
Apr 15 08:54:12 nyc-media03 tldcd[18065]: inquiry() function processing library HP       ESL G3 Series    680H:
Apr 15 08:54:12 nyc-media03 tldd[3761]: DecodeDismount: TLD(0) drive 15, Actual status: Robotic dismount failure
Apr 15 08:54:12 nyc-media03 tldcd[18065]: TLD(0) expected barcode (S24154L6        ) in slot 31, found barcode (S24002L6        )
Apr 15 08:54:12 nyc-media03 tldcd[18065]: TLD(0) closing/unlocking robotic path
Apr 15 08:54:12 nyc-media03 tldcd[17920]: inquiry() function processing library HP       ESL G3 Series    680H:
Apr 15 08:54:13 nyc-media03 tldcd[3774]: ../tldcd.c.3034, process_request(), received command=3, from peername=nyc-media03.mycompany.loc, version 50
Apr 15 08:54:13 nyc-media03 tldcd[3774]: Processing UNMOUNT, TLD(0) drive 6, slot -1, barcode  <NONE> , vsn ******
Apr 15 08:54:13 nyc-media03 tldcd[18096]: TLD(0) opening robotic path /dev/sg5

 

Do you see anything in them?

 

Thanks

Marianne
Level 6
Partner    VIP    Accredited Certified

Someone has manually loaded or moved tapes in the library:

 

TLD(0) cannot dismount drive 15, slot 38 already is full

TLD(0) expected barcode (S24154L6        ) in slot 31, found barcode (S24002L6        )

 

You will need to at least remove the tapes from slots 31 and 38 and run an inventory.
Look for errors reported for other slots as well.
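
To check the other slots, the two message types quoted above can be pulled out of the log with a small script. This is a rough Python sketch, not a NetBackup tool: the patterns are copied from the tldcd lines posted in this thread, and find_slot_errors is a hypothetical helper name.

```python
import re

# Patterns taken from the tldcd messages quoted in this thread.
SLOT_FULL = re.compile(r"cannot dismount drive (\d+), slot (\d+) already is full")
BARCODE_MISMATCH = re.compile(
    r"expected barcode \(([^\s)]+)\s*\) in slot (\d+), found barcode \(([^\s)]+)\s*\)"
)

def find_slot_errors(lines):
    """Return (kind, slot, detail) tuples for each slot-related error line."""
    errors = []
    for line in lines:
        m = SLOT_FULL.search(line)
        if m:
            errors.append(("slot_full", int(m.group(2)), f"drive {m.group(1)}"))
            continue
        m = BARCODE_MISMATCH.search(line)
        if m:
            errors.append(("barcode_mismatch", int(m.group(2)),
                           f"expected {m.group(1)}, found {m.group(3)}"))
    return errors

sample = [
    "Apr 15 08:54:12 nyc-media03 tldcd[17997]: TLD(0) cannot dismount drive 15, slot 38 already is full",
    "Apr 15 08:54:12 nyc-media03 tldcd[18065]: TLD(0) expected barcode (S24154L6        ) in slot 31, found barcode (S24002L6        )",
]
for kind, slot, detail in find_slot_errors(sample):
    print(kind, slot, detail)
```

To scan a real log, pass an open file instead of the sample list, for example `find_slot_errors(open("/var/log/messages"))` on the affected media server. Every slot that shows up needs to be checked in the library before running the inventory.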

It is NEVER a good idea to open the robot and manually insert tapes in slots.
The operator does not know if an empty slot belongs to a tape that is currently in a drive.

The tape library's CAP/MAP should be used to enter tapes. 
The robot will then insert tapes in slots that are really empty.

cruisen
Level 6
Partner Accredited

Hello Albatross,

I don't think you can extend the limit of cleaning tapes. The lifetime is about 50 cleanings and not more.

But anyway, did you check this technote:

https://www.veritas.com/support/en_US/article.TECH64769

Perhaps the cleaning is mixing up the slots. Have you tried cleaning the drives using the robot interface?

I mean from the hardware side. Can you also try to unload / move the tapes with the robot interface first, and afterwards run an inventory job inside NBU to get the actual inventory correct?

You should then be able to reuse the tapes, but first get some new cleaning tapes :-)

Best regards,

Cruisen