05-23-2013 04:27 PM
Hi,
I have a critical problem. We had a NetBackup environment that ran stably for many years, but this weekend we rebooted the master server.
After the reboot, the scan and sgscan commands hang and do not work at all. They run for more than 48 hours without returning any results.
# scan
************************************************************
*********************** SDT_TAPE ************************
*********************** SDT_CHANGER ************************
************************************************************
SGSCAN doesn't go beyond this .
root@jMaster:/usr/openv/volmgr/bin/driver# sgscan
When I run robtest I get the following message:
Robtest
My environment details:
Master server: NetBackup version 6.5 (SunOS joshua 5.8 Generic_117350-44 sun4u sparc SUNW,Sun-Fire-880, Solaris 8)
One media server: NetBackup version 6.5 (RHEL6)
My master server is itself a media server, and it has two libraries attached: an HP ESL 712e library and a Sun SL700 library.
The Sun SL700 (5 LTO1 tape drives) is attached directly to the server without any switch. The HP ESL 712e is Fibre Channel connected and has 9 LTO4 tape drives.
When I do the following, I get the following results. I tested all the device files and all of them work fine. I tried running sg.install and sg.build, but nothing works.
We checked that the HBAs are fine. The HP tape library drives are logging into the switch and, as I said earlier, the Sun SL700 library is directly connected to the server (this direct connection from library to server worked for more than 5 years).
But now both libraries are not showing up on the master (these libraries are not only connected to one server only). The problem started with the reboot of the server. We rebooted the libraries after the problem appeared, but nothing changed and the problem remains.
05-23-2013 07:57 PM
First, NetBackup 6.5 is EOL.
From the cfgadm -al output, I'm not sure all the devices were found.
If the devices are OK at the OS level, you can try to rebuild the sg devices.
05-23-2013 08:28 PM
Actually, your OS as well as your NBU version is EOSL.
Even your HBAs are obsolete.
I have horror memories of Solaris 8 unloading st and sg drivers from memory and forceload entries had to be added to /etc/system.
There were also specific st patches that had to be installed to get NBU and st drivers to work properly.
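For reference, forceload entries of the kind described above are usually written as follows in /etc/system. This is only a sketch based on the st and sg driver names used in this thread; check the NetBackup device configuration guide for your release before editing the file, and note that /etc/system changes only take effect after a reboot.

```
* Keep the tape (st) and NetBackup SCSI pass-through (sg) drivers loaded
forceload: drv/st
forceload: drv/sg
```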
My memory goes back more than 10 years ago. I have not seen any customer still using Solaris 8 since then (even down here in 'Dark Africa'!).
You cannot expect any Symantec engineer to know how to troubleshoot NBU on Solaris 8.
99.9% of device issues are OS-related.
PS: I have just found this in my 'Archive' pst in an email dated June 2001:
I have spoken to my 'friendly' tech support guy about scsi reserve/release.
It seems that there is a bug in Solaris 8. This is why we had to disable the scsi reserve/release. There is a Sun patch available for the st driver patch number (I think is) 108725-05 22 released March 2001 (best to check with Sun). There is going to be a Tech note on the VERITAS web site soon.
You NEED to upgrade your environment.
REALLY.
05-23-2013 09:27 PM
I understand that I need to upgrade the environment, but it is going to take at least 6 months to upgrade.
Do you want me to engage our Solaris admin to install the above-mentioned patch and give it a try? Please confirm the patch name/number and which settings I need to change for now. We have more than 20 environments, which are running on 7.5.
But I need to get this fixed, as I have plenty of data to restore. I am expecting about 100 TB to be restored shortly.
Please help.
Regards
05-23-2013 10:18 PM
Best to check for the newest Solaris 8 st patch that you can find.
05-23-2013 10:28 PM
After installing the patch, do I need to run sg.install and sg.build, and do I also need to reboot the server?
Do I also need to make any configuration changes in /kernel/drv/sg.conf and /kernel/drv/st.conf, or in /usr/openv/volmgr/bin/driver/st.conf, /usr/openv/volmgr/bin/driver/sg.conf, or sg.links?
05-23-2013 10:59 PM
Also get your SA's help to take the unidentified device entries out of sg.conf and sg.links.
05-23-2013 11:01 PM
05-23-2013 11:13 PM
Do I still need to install this patch, 108725-05 22?
Please confirm.
Regards
05-23-2013 11:15 PM
This is what I see on my system
05-23-2013 11:20 PM
I don't think installing the patch is a good idea. You need to find out why sgscan failed.
1 From the device layout, maybe some devices are faulty?
2 Try to confirm that the OS can see the devices.
3 Try to build new sg devices, and reboot.
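The rebuild in step 3 can be sketched as a small script. This is an outline only, assuming the NetBackup driver directory mentioned earlier in the thread; the MAX_TARGET and MAX_LUN values passed to sg.build are placeholder assumptions and must match your hardware. By default it performs a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: rebuild the NetBackup sg driver configuration on Solaris.
# DRY_RUN=1 (the default) only prints each command;
# set DRY_RUN=0 and run as root to actually execute them.
DRIVER_DIR=/usr/openv/volmgr/bin/driver
MAX_TARGET="${MAX_TARGET:-15}"  # assumption: highest SCSI target ID to scan
MAX_LUN="${MAX_LUN:-1}"         # assumption: highest LUN to scan

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_sg() {
    # Work from the driver directory, where the generated files are expected
    run cd "$DRIVER_DIR" || return 1
    # Regenerate st.conf, sg.conf and sg.links
    run "$DRIVER_DIR/sg.build" all -mt "$MAX_TARGET" -ml "$MAX_LUN" || return 1
    # Remove the old driver config so sg.install puts the fresh one in place
    run rm -f /kernel/drv/sg.conf || return 1
    # Install the sg driver with the new configuration
    run "$DRIVER_DIR/sg.install" || return 1
}

rebuild_sg
```

Review the printed commands first, then rerun with DRY_RUN=0. A reconfiguration reboot (for example `reboot -- -r`) is commonly done afterwards so the device tree is rebuilt.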
05-23-2013 11:24 PM
I have rebuilt the sg devices more than 5 times and also rebooted more than 5 times, but unfortunately still no luck :(
Regards
05-23-2013 11:50 PM
Do you see the devices in cfgadm -al -o show_FCP_dev?
I think the ideas about the versions / patches are valid. However, this was working previously and stopped after a reboot, which doesn't really sound like a patch issue to me. But then again, it could be a case where some change was made and was only 'picked up' after the reboot.
Sometimes devices don't show in cfgadm -al, hence the command I suggested above.
Martin
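To make the cfgadm check above quicker to read, the listing can be filtered for tape drives and changers that the OS sees but has not configured. This is a minimal sketch: the function name is hypothetical, and it assumes the usual cfgadm column order (Ap_Id, Type, Receptacle, Occupant, Condition) with the types shown as "tape" and "med-changer".

```shell
#!/bin/sh
# Sketch: read a cfgadm listing on stdin and print the Ap_Ids of tape drives
# and medium changers whose Occupant column is "unconfigured".
# Assumes the column order: Ap_Id Type Receptacle Occupant Condition.
list_unconfigured_tape() {
    awk '($2 == "tape" || $2 == "med-changer") && $4 == "unconfigured" { print $1 }'
}

# Usage on the master server (as root):
#   cfgadm -al -o show_FCP_dev | list_unconfigured_tape
```

An empty result means either everything is configured or the drives are not visible to the HBA at all, so compare it against the full listing.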
05-24-2013 12:59 AM
Patch 108725-05 was the latest back in 2001. That was 12 years ago.
Please look for the latest st patch for Solaris 8 that you can find. (If still available on SUN/Oracle web site!)
05-24-2013 03:02 AM
cfgadm -al -o show_FCP_dev is a useful command.
I have forgotten most of the Solaris commands.
05-24-2013 06:18 AM
I still don't see the drives. I have more than 10 drives, as I mentioned earlier.
05-24-2013 07:27 AM
The changer shows as unconfigured. I'm not sure that is correct?
05-24-2013 07:29 AM
Try cfgadm -c configure c8 (I think)
05-24-2013 07:50 AM
05-24-2013 08:01 AM
The tape drives and robot are still not functioning. Is there anything else I can do to fix this?