stan56
16 years ago | Level 4
drives serial numbers mismatch
I'm using NetBackup 5.1 with the SSO option (planning to upgrade to 6.5 very soon). Solaris master and Linux media servers, FC tape library.
Recently I've been getting a bizarre issue with a couple of tape drives not being usable on the master, while they work fine on the media servers. I've been more or less ignoring them since the master server is not running any tape backups anyway, but now I have another drive that all of a sudden doesn't work on the media servers, either. I poked around a bit and found some pretty bad mismatches in drive serial numbers between what "scan" sees and what "tpautoconf -t" reports. Oddly enough, one of the drives is listed twice in the tpautoconf output. In the "scan" output, I have 2 drives that are listed twice (same rmt number, but different sg names). Now I'm starting to wonder if I should rebuild the sg devices along with the globDB. Am I thinking of the right solution?
Also, I want to make sure I understand how "scan" and "tpautoconf" work. If they report drastically different results, which one should I assume is correct? I'm thinking "scan" would be the one to trust, but I'm not sure. The reason for my doubt is that the rmt device links reported by "scan" for the 3 drives that are not usable on the master server still don't work (I get an error when issuing an mt command to them), while I know the same drives do work on the media servers.
- Thanks for all the help. I finally fixed the problem with the serials mismatch by using the drive replacement procedure that several people suggested in the thread. I also found an updated version of the procedure, which is much easier to follow than the rather confusing original: http://seer.entsupport.symantec.com/docs/271366.htm
Now I'm wondering how those serials got changed. The drives may have been replaced: every now and then we get a broken drive and IBM comes out to replace it. Operations takes care of it, and they don't always inform us when it happens since there's no downtime. I was always under the impression that the serial numbers don't change, just like WWNs don't change, but I may be wrong. We did have a few situations where some drives would suddenly become unusable and the fix was basically to rebuild the globDB. Perhaps this was caused by a serial number change. Next time I run into this situation I'll be sure to check for a serials mismatch first.
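For anyone who wants to script that check, here is a rough sketch of the comparison step in Python. It assumes you have already pulled (device path, serial number) pairs out of the "scan" output and the "tpautoconf -t" output by hand or with your own parsing; the parsing itself is not shown because the exact output layout varies. The function names, device paths, and serial numbers below are all made up for illustration and are not part of NetBackup.

```python
from collections import Counter


def find_duplicates(entries):
    """Return device paths that appear more than once in a list of (path, serial) pairs."""
    counts = Counter(path for path, _serial in entries)
    return sorted(path for path, n in counts.items() if n > 1)


def compare_serials(scan_entries, config_entries):
    """Compare serials seen by the hardware scan with the configured ones.

    Both arguments are lists of (device_path, serial) tuples. If a path is
    listed twice, the last serial seen for it is used in the comparison and
    the path is also reported as a duplicate.
    """
    scan = dict(scan_entries)
    config = dict(config_entries)
    mismatched = {
        path: (scan[path], config[path])
        for path in scan.keys() & config.keys()
        if scan[path] != config[path]
    }
    return {
        "mismatched": mismatched,
        "only_in_scan": sorted(scan.keys() - config.keys()),
        "only_in_config": sorted(config.keys() - scan.keys()),
        "duplicated_in_scan": find_duplicates(scan_entries),
        "duplicated_in_config": find_duplicates(config_entries),
    }


if __name__ == "__main__":
    # Hypothetical data: paths and serials are invented for this example.
    scan_entries = [
        ("/dev/rmt/0cbn", "SERIAL-A"),
        ("/dev/rmt/1cbn", "SERIAL-B"),
        ("/dev/rmt/1cbn", "SERIAL-B"),  # same rmt device listed twice
    ]
    config_entries = [
        ("/dev/rmt/0cbn", "SERIAL-A"),
        ("/dev/rmt/1cbn", "SERIAL-OLD"),  # serial no longer matches the hardware
    ]
    for key, value in compare_serials(scan_entries, config_entries).items():
        print(key, value)
```

Any path that shows up under "mismatched" would be a candidate for the drive replacement procedure linked above, and the duplicate lists correspond to the "listed twice" symptom from the original post.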