VCS resource status abnormal
I am facing a problem: I found that a resource is online but its monitor has timed out, and I don't know how to fix it. Could anybody give me some suggestions? Thanks.
#hastatus
..............
group resource system message
--------------- -------------------- -------------------- --------------------
cfsmount1 CCCG09BER ONLINE
cfsmount1 CCCG10BER ONLINE
cfsmount2 CCCG09BER ONLINE
cfsmount2 CCCG10BER ONLINE
-------------------------------------------------------------------------
cvmvoldg1 CCCG09BER ONLINE
cvmvoldg1 CCCG10BER ONLINE
group resource system message
--------------- -------------------- --------------- --------------------
cvmvoldg1 CCCG09BER |MONITOR TIMEDOUT|UNABLE TO OFFLINE|
cvmvoldg2 CCCG09BER ONLINE
cvmvoldg2 CCCG10BER ONLINE
cvmvoldg2 CCCG09BER |MONITOR TIMEDOUT|UNABLE TO OFFLINE|
---------------------------------------------------------------------------------------------------------------------------------------------
# hares -state cvmvoldg1
#Resource Attribute System Value
cvmvoldg1 State CCCG09BER ONLINE|MONITOR TIMEDOUT
cvmvoldg1 State CCCG10BER ONLINE
#
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 auto:SVM - - SVM
c0t1d0s2 auto:SVM - - SVM
c0t2d0s2 auto:none - - online invalid
c0t3d0s2 auto:none - - online invalid
c3t0d0 auto:cdsdisk - - online
c3t0d1 auto:sliced mmdatadgemcpower4 mmdatadg online shared
c3t0d2 auto:cdsdisk - - online
c3t0d3 auto:cdsdisk - - online
c3t0d4 auto:sliced mmdbdgemcpower2 mmdbdg online shared
c3t0d5s2 auto:sliced mmdatadgc3t0d5 mmdatadg online shared
c3t0d6s2 auto:sliced mmdatadgc3t0d6 mmdatadg online shared
c3t0d7s2 auto:sliced mmdatadgc3t0d7 mmdatadg online shared
c3t0d8s2 auto:sliced mmdatadgc3t0d8 mmdatadg online shared
c3t0d9s2 auto:sliced mmdatadgc3t0d9 mmdatadg online shared
c3t0d10s2 auto:sliced mmdatadgc3t0d10 mmdatadg online shared
# vxprint
Disk group: mmdbdg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dg mmdbdg mmdbdg - - - - - -
dm mmdbdgemcpower2 c3t0d4 - 419348384 - - - -
v vol01 fsgen ENABLED 419346432 - ACTIVE - -
pl vol01-01 vol01 ENABLED 419346432 - ACTIVE - -
sd mmdbdgemcpower2-01 vol01-01 ENABLED 419346432 0 - - -
Disk group: mmdatadg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dg mmdatadg mmdatadg - - - - - -
dm mmdatadgc3t0d5 c3t0d5s2 - 1467866928 - - - -
dm mmdatadgc3t0d6 c3t0d6s2 - 1887246288 - - - -
dm mmdatadgc3t0d7 c3t0d7s2 - 1836919616 - - - -
dm mmdatadgc3t0d8 c3t0d8s2 - 1887246288 - - - -
dm mmdatadgc3t0d9 c3t0d9s2 - 1887246288 - - - -
dm mmdatadgc3t0d10 c3t0d10s2 - 1417511040 - - - -
dm mmdatadgemcpower4 c3t0d1 - 1887354784 - - - -
v vol01 fsgen ENABLED 12271390720 - ACTIVE - -
pl vol01-01 vol01 ENABLED 12271390720 - ACTIVE - -
sd mmdatadgemcpower4-01 vol01-01 ENABLED 1887354784 0 - - -
sd mmdatadgc3t0d5-01 vol01-01 ENABLED 1467866928 1887354784 - - -
sd mmdatadgc3t0d6-01 vol01-01 ENABLED 1887246288 3355221712 - - -
sd mmdatadgc3t0d7-01 vol01-01 ENABLED 1836919616 5242468000 - - -
sd mmdatadgc3t0d8-01 vol01-01 ENABLED 1887246288 7079387616 - - -
sd mmdatadgc3t0d9-01 vol01-01 ENABLED 1887246288 8966633904 - - -
sd mmdatadgc3t0d10-01 vol01-01 ENABLED 1417510528 10853880192 - - -
messages in engine_A log:
2015/03/06 13:44:21 VCS ERROR V-16-2-13027 (CCCG09BER) Resource(cvmvoldg2) - monitor procedure did not complete within the expected time.
2015/03/06 13:44:22 VCS ERROR V-16-2-13027 (CCCG09BER) Resource(cvmvoldg1) - monitor procedure did not complete within the expected time.
2015/03/06 13:46:28 VCS INFO V-16-1-50086 CPU usage on CCCG09BER is 61%
2015/03/06 13:50:21 VCS ERROR V-16-2-13210 (CCCG09BER) Agent is calling clean for resource(cvmvoldg2) because 4 successive invocations of the monitor procedure did not complete within the expected time.
2015/03/06 13:50:21 VCS ERROR V-16-2-13210 (CCCG09BER) Agent is calling clean for resource(cvmvoldg1) because 4 successive invocations of the monitor procedure did not complete within the expected time.
2015/03/06 13:50:39 VCS INFO V-16-2-13068 (CCCG09BER) Resource(cvmvoldg2) - clean completed successfully.
2015/03/06 13:50:53 VCS INFO V-16-2-13068 (CCCG09BER) Resource(cvmvoldg1) - clean completed successfully.
2015/03/06 13:51:40 VCS ERROR V-16-2-13077 (CCCG09BER) Agent is unable to offline resource(cvmvoldg2). Administrative intervention may be required.
2015/03/06 13:51:55 VCS ERROR V-16-2-13077 (CCCG09BER) Agent is unable to offline resource(cvmvoldg1). Administrative intervention may be required.
- Hi,

From the VCS log we can see that VCS detected a monitor timeout on cvmvoldg1 and cvmvoldg2, tried to offline the resources, and failed. There are two problems to resolve.

1. The monitor timeouts on cvmvoldg1 and cvmvoldg2. This is commonly caused by high system load: when system resources are short, the CVMVolDg monitor program may not return within 60 seconds. Reducing the system load is the better fix. You can also try increasing the MonitorTimeout and MonitorInterval values:

# haconf -makerw
# hatype -modify CVMVolDg MonitorInterval 180
# hatype -modify CVMVolDg MonitorTimeout 180
# haconf -dump -makero

2. The current state of cvmvoldg1 and cvmvoldg2 on CCCG09BER: MONITOR TIMEDOUT|UNABLE TO OFFLINE. Try flushing the service group and then bringing the two resources online manually, to see whether they return to a normal state.
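
The flush-and-online step in point 2 could be sketched as a small shell helper. This is only a sketch under assumptions: the service group name does not appear in the hastatus output above, so `your_service_group` is a placeholder you must replace, and the `run` and `recover` function names are made up for illustration. By default the script only prints the commands; set RUN=1 to actually execute them on a cluster node.

```shell
#!/bin/sh
# Sketch: print (or run) the VCS commands for recovering resources stuck in
# MONITOR TIMEDOUT|UNABLE TO OFFLINE. Dry-run by default; RUN=1 executes.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"            # execute the command as given
    else
        echo "$*"       # dry run: only show what would be executed
    fi
}

# recover GROUP SYSTEM RESOURCE...
recover() {
    group=$1
    sys=$2
    shift 2
    # Flush the service group on this system to clear pending operations.
    run hagrp -flush "$group" -sys "$sys"
    # Manually bring each resource online on that system.
    for res in "$@"; do
        run hares -online "$res" -sys "$sys"
    done
    # Show the resulting state of each resource.
    for res in "$@"; do
        run hares -state "$res"
    done
}

# GROUP is a placeholder; the real group name is not shown in this thread.
recover "${GROUP:-your_service_group}" CCCG09BER cvmvoldg1 cvmvoldg2
```

Review the printed commands first, then rerun with RUN=1 and the real group name once they look right. If the resources time out again afterwards, the load problem from point 1 still needs attention.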