One drive always in PEND-TLD on all media servers after reboot
Hi! Need help
We have 2 drives in NetBackup shared by 4 media servers. Several days ago all media servers were rebooted. Since then one drive (HP.ULTRIUM6-SCSI.000) always goes to PEND-TLD on all 4 media servers, while the other drive (HP.ULTRIUM6-SCSI.001) is OK. We restarted the NetBackup software several times, but no luck.
/dev/rmt/2cbn - HP.ULTRIUM6-SCSI.001
/dev/rmt/3cbn - HP.ULTRIUM6-SCSI.000
I'm not a NetBackup specialist, but I collected some information:
root@host1# /opt/openv/volmgr/bin/vmoprcmd -crawlreleasebyname HP.ULTRIUM6-SCSI.000
Host host1 returned: SPC-2 RESERVED
Host host2 returned: SPC-2 RESERVED
Host host3 returned: SPC-2 RESERVED
Host host4 returned: SPC-2 RESERVED
Host host1 returned: SPC-2 RESERVED
root@host1# /opt/openv/volmgr/bin/vmoprcmd -crawlreleasebyname HP.ULTRIUM6-SCSI.001
Host host1 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
Host host2 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
Host host3 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
Host host4 returned: No PERSISTENT REGISTRATIONS or SPC-2 RESERVATION found
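With more hosts the crawl output gets long, so a small filter can help. This is only a sketch: the function name reserved_hosts is made up, and it assumes the output keeps the "Host <name> returned: <status>" shape pasted above. On Solaris you may need nawk or /usr/xpg4/bin/awk instead of plain awk.

```shell
# Print each host that reported an SPC-2 reservation, once per host.
# Assumed input format (taken from the output above):
#   Host <name> returned: <status text>
reserved_hosts() {
    awk '$1 == "Host" && $3 == "returned:" && /SPC-2 RESERVED/ && !seen[$2]++ { print $2 }'
}

# Possible usage (same command as in the post, piped through the filter):
#   /opt/openv/volmgr/bin/vmoprcmd -crawlreleasebyname HP.ULTRIUM6-SCSI.000 | reserved_hosts
```

If every media server shows up in the list, as here, the reservation is visible from all of them and none of them was able to release it.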
In the Solaris logs:
Jul 19 08:15:16 host1 tldd[29666]: [ID 702911 daemon.notice] TLD(0) going to UP state
Jul 19 08:15:22 host1 avrd[29691]: [ID 702911 daemon.notice] st.conf configuration for HP.ULTRIUM6-SCSI.001 (device 0), name [HP Ultrium 6-SCSI ], vid [HP Ultrium 6-SCSI ], type 0x3b, block size 0, options 0x18619 (see st(7D) man page)
Jul 19 08:15:23 host1 avrd[29691]: [ID 702911 daemon.notice] st.conf configuration for HP.ULTRIUM6-SCSI.000 (device 1), name [], vid [], type 0x0, block size 0, options 0x440 (see st(7D) man page)
Jul 19 08:23:58 host1 ltid[1732]: [ID 702911 daemon.notice] Operator requested SCSI Release of Drive HP.ULTRIUM6-SCSI.000 returned RESERVATION CONFLICT
Jul 19 08:24:03 host1 ltid[1851]: [ID 702911 daemon.notice] Operator requested SCSI Release of Drive HP.ULTRIUM6-SCSI.000 returned RESERVATION CONFLICT
It seems strange to me that HP.ULTRIUM6-SCSI.001 and HP.ULTRIUM6-SCSI.000 report different st.conf (driver?) configurations.
I also checked with mt and got the same result:
root@host1# mt -f /dev/rmt/2cbn config
"HP Ultrium 6-SCSI", "HP Ultrium 6-SCSI ", "CFGHPULTRIUM6SCSI";
CFGHPULTRIUM6SCSI = 2,0x3B,0,0x18619,4,0x58,0x58,0x5A,0x5A,3,60,1200,600,1200,600,600,18000;
root@host1# mt -f /dev/rmt/3cbn config
"", "", "CFG";
CFG = 2,0x0,0,0x440,4,0x00,0x00,0x00,0x00,0,120,120,3600,3600,3600,3600,3600;
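The difference between the two drives is the empty name and vendor/product strings on the first line of the 3cbn output. A quick way to spot that on several devices is sketched below. The function name has_empty_inquiry_id is made up, and it assumes `mt config` prints the quoted triple on its first line as shown above. Whether the empty strings are a cause or just a symptom of the reservation is not established here.

```shell
# Exit 0 when the first line of `mt ... config` output starts with an
# empty quoted string (as for /dev/rmt/3cbn above), 1 otherwise.
has_empty_inquiry_id() {
    awk 'NR == 1 { empty = ($0 ~ /^ *"" *,/) } END { exit(empty ? 0 : 1) }'
}

# Possible usage:
#   mt -f /dev/rmt/3cbn config | has_empty_inquiry_id && echo "3cbn: drive not identified"
```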
Not sure where the problem is.
Thanks
You can certainly try those.
For SPC-2 reservations, only the HBA that made the reservation can release it. So if the reservation was made by, for example, a media server that is now down, removed, or unwell, you cannot release it via commands.
I recommend using persistent reservation. This uses a numeric key held in the drive's firmware/memory, so we know whether we made the reservation, and any HBA can release it. If we do not recognize the key, we won't release it (for example, if it was made by some other application ... which would be bad, but ...)
The golden rule:
All devices (media servers, NDMP hosts, etc.) that share a drive MUST use the same type of reservation. If some use SPC-2 and others use persistent reservation, you WILL get data loss at some point.
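If you keep a simple list of which host uses which reservation type, the rule can be checked mechanically. This is only a sketch: the "<host> <type>" input format and the function name reservation_types_consistent are made up for illustration, and how you collect the setting from each host is up to you.

```shell
# stdin: one "<host> <type>" pair per line, e.g. "host1 spc2"
# (hypothetical inventory format, not something NetBackup produces).
# Exit 0 when every host uses the same type, 1 when types are mixed
# or no pairs were read.
reservation_types_consistent() {
    awk 'NF >= 2 { types[$2] = 1 }
         END { n = 0; for (t in types) n++; exit(n == 1 ? 0 : 1) }'
}

# Possible usage:
#   reservation_types_consistent < reservation_inventory.txt || echo "mixed reservation types!"
```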
Persistent reservation is more intelligent, and it also offers slightly more ways to troubleshoot (for example, using the third-party sg3_utils commands).
Tried
root@host1# mt -f /dev/rmt/3cbn forcereserve
root@host1# mt -f /dev/rmt/3cbn status
And now all is OK, the drive is working. Thanks for the help!
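For anyone landing here later, the two commands above can be wrapped roughly like this. It is a sketch, not a NetBackup tool: the function name take_over_reservation and the MT override variable are made up, the latter only so the steps can be previewed with MT=echo. Note that forcereserve breaks a reservation held by another host, so make sure no other host is actually using the drive first.

```shell
# Break a stale reservation on a Solaris tape device and show its status.
# MT is a hypothetical override for previewing the commands (MT=echo).
take_over_reservation() {
    dev=$1
    [ -n "$dev" ] || { echo "usage: take_over_reservation /dev/rmt/<N>cbn" >&2; return 2; }
    ${MT:-mt} -f "$dev" forcereserve || return 1
    ${MT:-mt} -f "$dev" status
}

# Preview only:   MT=echo; take_over_reservation /dev/rmt/3cbn
# Real run:       take_over_reservation /dev/rmt/3cbn
```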