
NetBackup SAN Client slow scanning /dev/sg devices

ianG
Level 5

Hi folks

Is there any way to tune the SAN client, specifically its initial scanning of /dev/sg devices? Can we specify which devices to scan, or which device types to ignore? Our initial scan takes ~2.5 hrs (~1100 /dev/sg devices, 99% of which it retries 10 times...)
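For reference, the peripheral device type of every sg node can be read straight out of sysfs, which is a quick way to see what the ~1100 nodes actually are (a sketch; assumes the usual /sys/class/scsi_generic layout):

# Count sg nodes per SCSI peripheral device type
# (0=disk, 1=tape, 8=medium changer, 13=enclosure)
for d in /sys/class/scsi_generic/sg*; do
  cat "$d/device/type"
done | sort -n | uniq -c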

The log fills with messages like these:

0,51216,200,200,1463,1536330116729,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg621 2 retries left,13:DeviceInquiry,1
0,51216,200,200,1464,1536330117730,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg621 1 retries left,13:DeviceInquiry,1
0,51216,200,200,1465,1536330118730,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg621 0 retries left,13:DeviceInquiry,1
0,51216,200,200,1466,1536330118731,71722,139769537492800,0:,80:Device type 0 found on /dev/sg621 Vendor/Product: "IBM     2810XIV         000 ",16:StdDeviceInquiry,1
0,51216,200,200,1467,1536330118731,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg620 9 retries left,13:DeviceInquiry,1
0,51216,200,200,1468,1536330119731,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg620 8 retries left,13:DeviceInquiry,1
0,51216,200,200,1469,1536330120732,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg620 7 retries left,13:DeviceInquiry,1
0,51216,200,200,1470,1536330121733,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg620 6 retries left,13:DeviceInquiry,1
0,51216,200,200,1471,1536330122733,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg620 5 retries left,13:DeviceInquiry,1
0,51216,200,200,1472,1536330123734,71722,139769537492800,0:,83:called from DiscoverDevices EVPD Page 0x83 "" device name /dev/sg620 4 retries left,13:DeviceInquiry,1
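The millisecond timestamps above show the retries spaced almost exactly one second apart. To check whether a device like /dev/sg621 is genuinely slow to answer an EVPD page 0x83 inquiry, or whether the client is simply pacing its retries, the same inquiry can be issued by hand with sg_vpd from sg3_utils (a sketch; "di" is the device identification page, 0x83):

# Time a single device-identification VPD inquiry against one device
time sg_vpd --page=di /dev/sg621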

The scan starts at the highest sg number and doesn't permit any FT backups to proceed until it has reached sg1, about 2.5 hrs later - which tracks: ~10 retries at ~1 s each across ~1100 devices is on the order of three hours.

It seems very strange, as I can immediately query the tape devices, e.g.:

lsscsi  | grep tape
[3:0:6:0]    tape    ARCHIVE  Python           V000  /dev/st0
[3:0:6:1]    tape    ARCHIVE  Python           V000  /dev/st1
[4:0:6:0]    tape    ARCHIVE  Python           V000  /dev/st2
[4:0:6:1]    tape    ARCHIVE  Python           V000  /dev/st3

or:

ls -l /dev/sg* | grep tape
crw-rw----. 1 root tape  21, 1049 Sep  7 13:15 /dev/sg1049
crw-rw----. 1 root tape  21, 1050 Sep  7 13:15 /dev/sg1050
crw-rw----. 1 root tape  21,  887 Sep  7 13:15 /dev/sg887
crw-rw----. 1 root tape  21,  888 Sep  7 13:15 /dev/sg888
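Sweeping every node the same way, with a bound on each inquiry, should show which of the ~1100 devices actually stall (a sketch; assumes sg3_utils and coreutils' timeout):

# Print any sg node that errors out or takes >2s to answer page 0x83
for sg in /dev/sg*; do
  timeout 2 sg_vpd --page=di "$sg" >/dev/null 2>&1 || echo "slow/failed: $sg"
done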

I found this post with similar symptoms, marked as solved, but since we're already on NetBackup 8...

https://vox.veritas.com/t5/NetBackup/SAN-Client-service-takes-a-long-time-to-start/td-p/650362

Any help appreciated!

Many thanks,

Ian

2 REPLIES

sdo
Moderator
Partner    VIP    Certified

Here's a complete left-field guess, and only because I once saw a different product do something similar. In that case the root cause was: we had some disk storage zoned to a media server, and the disk array was configured with an empty "storage group/policy/set" presented to the media server, but no actual LUNs within the array's "group" yet. So what the media server saw at the FC/SCSI layer was a target with no LUNs, and this confused the LUN scanning such that it would loop around from 0 to 999 trying to find a LUN, any LUN, but never could.
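One way to check for that condition from the client side is to compare the FC remote ports the HBA is zoned to against the SCSI targets that actually bound LUNs (a sketch; sysfs paths and lsscsi output as on RHEL-era Linux):

# Remote ports the HBA can see at the FC layer
grep -H . /sys/class/fc_remote_ports/rport-*/port_name

# LUNs per SCSI target that actually bound - a zoned-but-empty target
# appears in the first list but never shows up here
lsscsi | awk -F'[][ ]+' '{split($2,a,":"); print a[1]":"a[2]":"a[3]}' | sort | uniq -c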

The solution was any of these:

1) present at least one LUN in the (so far) empty group at the storage array

...or...

2) leave it zoned, but don't create an empty storage group on the storage array

...or...

3) de-zone until one is ready to actually present some LUNs (for whatever purpose - i.e. not necessarily for SAN Client)

...i.e. just don't leave empty storage sets/groups/pools presented out from storage arrays.

ianG
Level 5

Thanks for the comment. Sadly, I believe all of them are 'real' sg devices - there are a lot of disks presented to the server, all multipathed (w/ 12 paths each).
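A rough cross-check of that count: lsscsi -i appends each device's udev WWID as the last column, so grouping on it should show the same identifier once per path - about twelve times per disk LUN here, i.e. ~1100 nodes / 12 paths is ~90-odd LUNs (a sketch; needs lsscsi built with udev scsi_id support):

# Paths per WWID - each disk LUN should show ~12 entries
lsscsi -i | awk '{print $NF}' | sort | uniq -c | sort -rn | head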