Forum Discussion

ashok_veritas
12 years ago

Drives disappearing from fabric

Hi All,

We are also facing some drive issues in our environment. Full backups of some very large databases run over the weekend, and on those days all drives go down on the master server as well as on the media servers. When I checked /var/adm/messages I noticed invalid drive errors, and also found that some drives are disappearing and reappearing from the fabric.

Please find the output of the commands below:

bash-3.00$ cat /var/adm/messages |grep -i disappear


Sep 10 04:12:41 dm4cfi fctl: [ID 517869 kern.warning] WARNING: fp(5)::N_x Port with D_ID=180cc, PWWN=500507630f50f601 disappeared from fabric
Sep 10 16:51:18 dm4cfi fctl: [ID 517869 kern.warning] WARNING: fp(5)::N_x Port with D_ID=180cc, PWWN=500507630f50f601 disappeared from fabric


cat /var/adm/messages |grep -i reappeared


Sep 10 04:12:51 dm4cfi fctl: [ID 517869 kern.warning] WARNING: fp(5)::N_x Port with D_ID=180cc, PWWN=500507630f50f601 reappeared in fabric
Sep 10 16:51:28 dm4cfi fctl: [ID 517869 kern.warning] WARNING: fp(5)::N_x Port with D_ID=180cc, PWWN=500507630f50f601 reappeared in fabric
------------------------------------------------------------------------------------------------------------------------------
The output of cat /var/adm/messages | grep -i drive has been attached below.
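For reference, both sets of events together with their timestamps can be pulled in one pass (a quick sketch against the same log file; the PWWN is the one from the messages above):

bash-3.00$ egrep -i 'disappeared from fabric|reappeared in fabric' /var/adm/messages
bash-3.00$ grep 500507630f50f601 /var/adm/messages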

Any help on this would be much appreciated.

Master server: Solaris 5.10

NetBackup version: 7.0

Tape library: IBM TS3580

Please let me know in case more information is required.

 

 


5 Replies

  • Thanks Marianne for moving this to a new post.

     

    Ashok,

    You have a SAN issue, not a NetBackup issue.  You should speak to your SAN team to troubleshoot.

    It could well be a faulty HBA or switch.

     

    You have many errors ...

    1. OS-level (fabric) errors:

    Sep 10 04:12:41 dm4cfi fctl: [ID 517869 kern.warning] WARNING: fp(5)::N_x Port with D_ID=180cc, PWWN=500507630f50f601 disappeared from fabric

    2. NetBackup (tldd/tldcd) errors:

    Sep 10 03:03:41 dm4cfi tldd[24118]: [ID 316020 daemon.notice] TLD(0) unload==TRUE, but no unload, drive 11 (device 19)
    Sep 10 03:04:00 dm4cfi tldd[24118]: [ID 921149 daemon.notice] TLD(0) [24118] waited 8 seconds for ready, drive 11
    Sep 10 04:12:54 dm4cfi tldcd[19981]: [ID 718619 daemon.error] TLD(0) drive does not exist in robot, drive = 16
    Sep 10 04:12:54 dm4cfi tldcd[19990]: [ID 587315 daemon.error] TLD(0) drive does not exist in robot, drive = 8

     

    NBU will never work properly if there are issues at the OS level. The errors in 1 (above) are the OS-level errors, and these MUST be fixed before the errors shown in 2.

    In fact, when 1 is fixed, I suspect 2 will disappear.

    So, the correct way to look at this is to completely forget about NBU, because you will ALWAYS have errors in NBU until you fix the OS-level stuff.

    The errors you see in 1 are caused because the drives are becoming disconnected from the SAN.
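    Once your SAN team has had a look, a quick way to confirm the drives are back at the OS level is something like the following (a rough sketch using standard Solaris 10 commands; controller and device names will differ on your hosts):

    fcinfo hba-port               # link state, driver and firmware of each HBA port
    luxadm -e port                # each FC port reported as CONNECTED / NOT CONNECTED
    cfgadm -al | grep -i tape     # tape attachment points and whether they are configured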

    The cause can be anything on the SAN, for example:

    HBA fault

    Firmware issue

    Switch fault

    GBIC fault

    You will need to speak with your SAN team and investigate with them.  There is a high possibility you will have to swap parts to see whether they are faulty, for example, by inserting a new HBA into the server.
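    Once the fabric is stable, the NetBackup view of the drives can be re-checked as well (a minimal sketch using the standard volmgr commands, assuming the default install path):

    /usr/openv/volmgr/bin/sgscan          # what the sg driver can see on this host
    /usr/openv/volmgr/bin/tpconfig -d     # configured drives and their device paths
    /usr/openv/volmgr/bin/vmoprcmd -d     # drive status (UP/DOWN) per host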

    Martin

  • Hi Martin,

    Thanks for your information.

    Could you please confirm which additional logs are required, so that I can collect them for further investigation.

     

     

    Regards,

    Ashok

  • None from NBU

    I am not aware of any SAN logs on the server; the only one I can think of is the messages/system log, but this will only show the devices disappearing/reappearing, so apart from showing the time this happened and the fact that 'it did happen', they are not going to help much.
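    If it helps when talking to the SAN team, the full timeline of those events can be pulled out of the messages file in one go (a quick sketch against the standard log path):

    grep 'fctl:' /var/adm/messages | egrep -i 'disappeared|reappeared'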

    This is not a problem you are going to fix with logs; it is a case of changing parts and maybe drivers/firmware (HBA and switch). I think the chance of this being a driver issue is very low - in fact, thinking about it, I doubt this could be caused by a driver. The chance of it being firmware is also very low. Most likely something is faulty, and as to what is faulty, it could be anything from the HBA to the drive.

    You will have to work with either your SAN team or your hardware vendor, I think.

    Martin

    This TechNote shows an example where an issue with an HBA caused devices to disappear.

    http://www.symantec.com/docs/TECH33767

    Martin

  • Hi Martin,

    Thanks for your prompt response.

    I will check with the SAN team for investigation and will get back to you with some more information.

     

    Ashok