07-18-2018 11:33 PM
Hi! Need help
We have 2 drives in NetBackup shared by 4 media servers. Several days ago all media servers were rebooted. Since then 1 drive (HP.ULTRIUM6-SCSI.000) always goes to PEND-TLD on all 4 media servers. The other drive (HP.ULTRIUM6-SCSI.001) is OK. I restarted the NetBackup software several times, but no luck.
/dev/rmt/2cbn - HP.ULTRIUM6-SCSI.001
/dev/rmt/3cbn - HP.ULTRIUM6-SCSI.000
I'm not a NetBackup specialist, but I gathered some information:
root@host1# /opt/openv/volmgr/bin/vmoprcmd -crawlreleasebyname HP.ULTRIUM6-SCSI.000
Host host1 returned: SPC-2 RESERVED
Host host2 returned: SPC-2 RESERVED
Host host3 returned: SPC-2 RESERVED
Host host4 returned: SPC-2 RESERVED
Host host1 returned: SPC-2 RESERVED
root@host1# /opt/openv/volmgr/bin/vmoprcmd -crawlreleasebyname HP.ULTRIUM6-SCSI.001
Host host1 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
Host host2 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
Host host3 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
Host host4 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
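With more media servers the crawl output gets long, so a small filter can list which hosts report the reservation. This is only a sketch: the function name reserved_hosts is made up for illustration, and it assumes the output lines look exactly like the ones above.

```shell
# reserved_hosts (hypothetical helper): read `vmoprcmd -crawlreleasebyname`
# output on stdin and print each host that reported "SPC-2 RESERVED",
# once per host. Assumes the "Host <name> returned: ..." line format
# shown in this thread.
reserved_hosts() {
    sed -n 's/^Host \([^ ]*\) returned: SPC-2 RESERVED.*/\1/p' | sort -u
}
```

For example: /opt/openv/volmgr/bin/vmoprcmd -crawlreleasebyname HP.ULTRIUM6-SCSI.000 | reserved_hosts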
In the Solaris logs:
Jul 19 08:15:16 host1 tldd[29666]: [ID 702911 daemon.notice] TLD(0) going to UP state
Jul 19 08:15:22 host1 avrd[29691]: [ID 702911 daemon.notice] st.conf configuration for HP.ULTRIUM6-SCSI.001 (device 0), name [HP Ultrium 6-SCSI ], vid [HP Ultrium 6-SCSI ], type 0x3b, block size 0, options 0x18619 (see st(7D) man page)
Jul 19 08:15:23 host1 avrd[29691]: [ID 702911 daemon.notice] st.conf configuration for HP.ULTRIUM6-SCSI.000 (device 1), name [], vid [], type 0x0, block size 0, options 0x440 (see st(7D) man page)
Jul 19 08:23:58 host1 ltid[1732]: [ID 702911 daemon.notice] Operator requested SCSI Release of Drive HP.ULTRIUM6-SCSI.000 returned RESERVATION CONFLICT
Jul 19 08:24:03 host1 ltid[1851]: [ID 702911 daemon.notice] Operator requested SCSI Release of Drive HP.ULTRIUM6-SCSI.000 returned RESERVATION CONFLICT
It seems strange to me that HP.ULTRIUM6-SCSI.001 and HP.ULTRIUM6-SCSI.000 show different st driver configurations.
I also tried this and got the same result:
root@host1# mt -f /dev/rmt/2cbn config
"HP Ultrium 6-SCSI", "HP Ultrium 6-SCSI ", "CFGHPULTRIUM6SCSI";
CFGHPULTRIUM6SCSI = 2,0x3B,0,0x18619,4,0x58,0x58,0x5A,0x5A,3,60,1200,600,1200,600,600,18000;
root@host1# mt -f /dev/rmt/3cbn config
"", "", "CFG";
CFG = 2,0x0,0,0x440,4,0x00,0x00,0x00,0x00,0,120,120,3600,3600,3600,3600,3600;
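To compare the two drives without reading the whole property line, the drive-type field can be pulled out of the mt config output. A rough sketch, assuming the version-2 st.conf layout described in st(7D), where the second value on the property line is the type; st_drive_type is a made-up name.

```shell
# st_drive_type (hypothetical helper): read `mt -f <device> config` output
# on stdin and print the second value of the "NAME = v1,v2,..." property
# line, which is the drive type in the st.conf layout assumed here.
# The quoted vendor/product line does not match and is skipped.
st_drive_type() {
    sed -n 's/^[A-Za-z0-9_]* *= *[^,]*,\([^,]*\),.*/\1/p'
}
```

Run against the two outputs above, this should print 0x3B for /dev/rmt/2cbn and 0x0 for /dev/rmt/3cbn, the same values avrd logs as "type".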
Not sure where the problem is.
Thanks
07-18-2018 11:42 PM
If crawlreleasebyname doesn't release it, try power cycling the drive. This needs to be a real power cycle, not some fluffy soft 'power cycle' from the library console ....
07-18-2018 11:50 PM - edited 07-19-2018 01:46 AM
Thanks, but I may only get an opportunity to power cycle in 2 weeks. What do you think: could commands like these help if run from all media servers, or are they identical to crawlreleasebyname?
To break an SPC-2 reservation on Solaris
See the mt(1) man page for more information.
07-19-2018 03:24 PM
You can certainly try those.
For SPC-2 reservations, only the HBA that made the reservation can release it. So if the reservation was made, for example, by a media server that is now down, removed, or unwell, you cannot release it via commands.
I recommend using persistent reservation. It uses a numeric key held in the drive firmware/memory, so we know whether we made the reservation, and any HBA can release it. If we do not recognize the key, we won't release it (for example, if it was made by some other application ... which would be bad, but ...)
The golden rule:
All devices (media servers, NDMP hosts, etc.) that share a drive MUST use the same type of reservation. If you have some using SPC-2 and others persistent, you WILL get data loss at some point.
Persistent reservation is more intelligent, and it also offers slightly more ways to troubleshoot (using, for example, the third-party sg3_utils commands).
07-19-2018 11:58 PM
I tried:
root@host1# mt -f /dev/rmt/3cbn forcereserve
root@host1# mt -f /dev/rmt/3cbn status
And now everything is OK, the drive is working. Thanks for the help!
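For anyone who wants to repeat the two steps on another drive path, they can be wrapped in a small function. This is a sketch only: break_spc2_reservation and the MT override are made-up names, and it assumes the Solaris mt(1) forcereserve subcommand used above, which takes the reservation over for the host that runs it. Make sure no backup is actively writing to the drive before trying it.

```shell
# break_spc2_reservation (hypothetical helper): run the two mt steps that
# worked in this thread against the given tape device path.
# MT is an illustrative override so the sequence can be dry-run
# (MT=echo prints the arguments instead of touching the drive).
break_spc2_reservation() {
    dev=$1
    # Take the reservation over from whichever initiator holds it.
    "${MT:-mt}" -f "$dev" forcereserve || return 1
    # Confirm the drive now answers from this host.
    "${MT:-mt}" -f "$dev" status
}
```

A dry run would be: MT=echo; break_spc2_reservation /dev/rmt/3cbn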