Forum Discussion

symsonu
7 years ago

Disk going to error state repeatedly

Hi friends,

 

Every time we switch the service group (i.e. the DiskGroup resource is switched, so the dg is deported and imported), we see the errors below during the upgrade, and one of the mirror disks goes into an error state.

 

1st incident

 

Sep 25 18:06:00 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 25 18:06:00 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 25 18:06:00 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 25 18:06:00 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900008
Sep 25 18:06:00 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900048
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900048
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 25 18:06:03 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900048
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 25 18:06:04 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-546 Disk disk3mirr in group appdg: Disk device not found
========================================

 

2nd incident

 

Sep 19 09:11:50 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900008
Sep 19 09:11:50 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900048
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900048
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-16751 import_start: disk 1530703976.348.jenarm002 (udid DGC%5FRAID%205%5FCKM00134500734%5F600601602AA036000B243B42CB02E411) not found, flags 0x900048
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 appdg: dg import with I/O fence enabled
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16765 Selecting configuration database copy from c2t500601603DE039DFd7s2 from disks: c2t500601603DE039DFd7s2
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16766 Trying to import the disk group appdg using configuration database copy from c2t500601603DE039DFd7s2
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-546 Disk disk3mirr in group appdg: Disk device not found
Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16254 Disk group import of appdg succeeded.
Sep 19 09:58:12 arm001 fctl: [ID 517869 kern.warning] WARNING: fp(4)::GPN_ID for D_ID=10d00 failed
Sep 19 09:58:12 arm001 fctl: [ID 517869 kern.warning] WARNING: fp(4)::N_x Port with D_ID=10d00, PWWN=5001438026ece4ac disappeared from fabric
Sep 19 09:58:12 arm001 fctl: [ID 517869 kern.warning] WARNING: fp(0)::GPN_ID for D_ID=10d00 failed
Sep 19 09:58:12 arm001 fctl: [ID 517869 kern.warning] WARNING: fp(0)::N_x Port with D_ID=10d00, PWWN=5001438026ee33ac disappeared from fabric
Sep 19 09:58:12 arm001 genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 0 (net2) node 1 in trouble
=======================================================

grep -i error dmpevents* | more
dmpevents.log:Fri Sep 28 12:22:42.446: SCSI error occurred on Path c2t500601693DE039DFd9s2: opcode=0x5f reported reservation conflict (status=0x18, key=0x0, asc=0x0, ascq=0x0)
dmpevents.log:Fri Sep 28 12:24:47.441: SCSI error occurred on Path c2t500601603DE039DFd8s2: opcode=0x5f reported reservation conflict (status=0x18, key=0x0, asc=0x0, ascq=0x0)
dmpevents.log:Fri Sep 28 12:24:47.807: SCSI error occurred on Path c2t500601693DE039DFd9s2: opcode=0x5f reported reservation conflict (status=0x18, key=0x0, asc=0x0, ascq=0x0)
dmpevents.log.0:Thu May  3 16:59:03.147: SCSI error occurred on Path c4t500601613DE039DFd8s2: opcode=0x28 reported device not ready (status=0x2, key=0x2, asc=0x4, ascq=0x3)
dmpevents.log.0:Thu May  3 16:59:09.356: SCSI error occurred on Pat

 

We need suggestions as to what could be the cause. We have initialised the disk, added it back into the disk group and mirrored it again.

However, we want to prevent this in future.

Can anyone advise what the cause could be?

 

 

Regards

S

 

  • I would look at the Fibre Channel layer, as you're seeing a lot of SCSI errors. It could be drivers, cables or firmware.

    Veritas support should confirm the same. 

    I don't think it's anything to do with your configuration.
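To see which paths the SCSI errors cluster on, the grep output from the question can be tallied per path, opcode and reason. The following is only a rough sketch: the function name `tally_scsi_errors` and the regular expression are my own, written against the line format pasted above, and have not been checked against other DMP releases. Splitting by opcode may help because, in the SCSI command set, 0x5f is PERSISTENT RESERVE OUT and 0x28 is READ(10), so the reservation conflicts and the "device not ready" error are different kinds of failure.

```python
import re
from collections import Counter

# Matches one dmpevents.log entry in the format shown in the question.
# The pattern stops at "(status=", so it also works on lines that the
# terminal wrapped part-way through the sense data.
SCSI_ERROR = re.compile(
    r"SCSI error occurred on Path (?P<path>\S+): "
    r"opcode=(?P<opcode>0x[0-9a-fA-F]+) reported (?P<reason>.+?) \(status="
)


def tally_scsi_errors(lines):
    """Count SCSI errors per (path, opcode, reason)."""
    counts = Counter()
    for line in lines:
        match = SCSI_ERROR.search(line)
        if match:
            counts[(match["path"], match["opcode"], match["reason"])] += 1
    return counts


# Sample input: the complete entries from the question.
sample = [
    "dmpevents.log:Fri Sep 28 12:22:42.446: SCSI error occurred on Path c2t500601693DE039DFd9s2: opcode=0x5f reported reservation conflict (status=0x18, key=0x0, asc=0x0, ascq=0x0)",
    "dmpevents.log:Fri Sep 28 12:24:47.441: SCSI error occurred on Path c2t500601603DE039DFd8s2: opcode=0x5f reported reservation conflict (status=0x18, key=0x0, asc=0x0, ascq=0x0)",
    "dmpevents.log:Fri Sep 28 12:24:47.807: SCSI error occurred on Path c2t500601693DE039DFd9s2: opcode=0x5f reported reservation conflict (status=0x18, key=0x0, asc=0x0, ascq=0x0)",
    "dmpevents.log.0:Thu May  3 16:59:03.147: SCSI error occurred on Path c4t500601613DE039DFd8s2: opcode=0x28 reported device not ready (status=0x2, key=0x2, asc=0x4, ascq=0x3)",
]

if __name__ == "__main__":
    for (path, opcode, reason), n in sorted(tally_scsi_errors(sample).items()):
        print(f"{n:3d}  {path}  opcode={opcode}  {reason}")
```

To run it against the real logs, replace `sample` with the lines read from the dmpevents files on each node.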

  • Can you please run the commands below on both systems and post the output?

    1. uname -a

    2. vxdisk -o alldgs list

    3. vxprint -ht |  egrep -i "err|fail" | tail -5

    4. grep -i scsi /etc/vx/dmpevents.log | egrep -i "fail|err" | wc -l

    5. gabconfig -a

    6. hastatus -sum

    7. haclus -display | grep -i vers

    plus

    8. the details of how the disk groups are switched
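Alongside those outputs, it might be useful to summarise each switch attempt from the vxconfigd messages in syslog: how many imports failed with error 183 before one succeeded, and which disk media name was reported missing. This is a minimal sketch under assumptions: `summarize_imports` is a hypothetical helper, and it relies only on the three message IDs visible in the question (V-5-1-16253, V-5-1-16254, V-5-1-546) and the wording shown there.

```python
import re
from collections import Counter

# vxconfigd syslog entry: "... vxvm:vxconfigd: [ID n daemon.level] V-a-b-c text"
VX_MESSAGE = re.compile(
    r"vxvm:vxconfigd: \[ID \d+ daemon\.\w+\] (?P<code>V-\d+-\d+-\d+) (?P<text>.*)"
)


def summarize_imports(lines, diskgroup):
    """Return (failed_attempts, succeeded, missing_disks) for one disk group."""
    failed = 0
    succeeded = False
    missing = Counter()
    for line in lines:
        match = VX_MESSAGE.search(line)
        if not match:
            continue
        code, text = match["code"], match["text"]
        if code == "V-5-1-16253" and f"import of {diskgroup} failed" in text:
            failed += 1
        elif code == "V-5-1-16254" and f"import of {diskgroup} succeeded" in text:
            succeeded = True
        elif code == "V-5-1-546" and f"in group {diskgroup}:" in text:
            # Text looks like "Disk disk3mirr in group appdg: Disk device not found"
            missing[text.split()[1]] += 1
    return failed, succeeded, missing


# Sample input: four lines copied from the 2nd incident in the question.
sample = [
    "Sep 19 09:11:50 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found",
    "Sep 19 09:11:53 arm001 vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-16253 Disk group import of appdg failed with error 183 - Disk for disk group not found",
    "Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-546 Disk disk3mirr in group appdg: Disk device not found",
    "Sep 19 09:11:54 arm001 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-16254 Disk group import of appdg succeeded.",
]

if __name__ == "__main__":
    failed, succeeded, missing = summarize_imports(sample, "appdg")
    print(f"failed attempts: {failed}, succeeded: {succeeded}")
    for disk, n in missing.items():
        print(f"missing disk: {disk} (reported {n} time(s))")
```

Feeding it the full syslog from both nodes, one incident at a time, would show whether the same disk is missing on every switch.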