09-08-2013 10:23 PM
Hello Friends,
I need your help finding out what caused the service outage and how we can recover from this situation.
Logs from the messages file (/var/adm/messages):
==========================================
Sep 2 23:35:04 duadm2 genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 0 (oce1) node 0 in trouble
Sep 2 23:35:05 duadm2 genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 2 (oce0) node 0 in trouble
Sep 2 23:35:09 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 8 sec (2883648)
Sep 2 23:35:10 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 8 sec (6644221)
Sep 2 23:35:10 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 9 sec (2883648)
Sep 2 23:35:11 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 9 sec (6644221)
Sep 2 23:35:11 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 10 sec (2883648)
Sep 2 23:35:12 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 10 sec (6644221)
Sep 2 23:35:12 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 11 sec (2883648)
Sep 2 23:35:13 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 11 sec (6644221)
Sep 2 23:35:13 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 12 sec (2883648)
Sep 2 23:35:14 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 12 sec (6644221)
Sep 2 23:35:14 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 13 sec (2883648)
Sep 2 23:35:15 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 13 sec (6644221)
Sep 2 23:35:15 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 14 sec (2883648)
Sep 2 23:35:15 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 2 (oce0) node 0. 4 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 2 (oce0) node 0. 3 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 0 (oce1) node 0. 4 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 2 (oce0) node 0. 2 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 14 sec (6644221)
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 0 (oce1) node 0. 3 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 2 (oce0) node 0. 1 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 2 (oce0) node 0 inactive 15 sec (2883648)
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 0 (oce1) node 0. 2 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 2 (oce0) node 0. 0 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 0 (oce1) node 0. 1 more to go.
Sep 2 23:35:16 duadm2 genunix: [ID 205468 kern.notice] LLT INFO V-14-1-10509 link 2 (oce0) node 0 expired
Sep 2 23:35:17 duadm2 genunix: [ID 592107 kern.notice] LLT INFO V-14-1-10510 sent hbreq (NULL) on link 0 (oce1) node 0. 0 more to go.
Sep 2 23:35:17 duadm2 genunix: [ID 487101 kern.notice] LLT INFO V-14-1-10032 link 0 (oce1) node 0 inactive 15 sec (6644221)
Sep 2 23:35:17 duadm2 genunix: [ID 205468 kern.notice] LLT INFO V-14-1-10509 link 0 (oce1) node 0 expired
Sep 2 23:35:21 duadm2 genunix: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port h gen 288706 membership 01
Sep 2 23:35:21 duadm2 genunix: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port h gen 288706 jeopardy ;1
Sep 2 23:35:21 duadm2 genunix: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port a gen 288701 membership 01
Sep 2 23:35:21 duadm2 genunix: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port a gen 288701 jeopardy ;1
Sep 2 23:35:21 duadm2 genunix: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port b gen 288704 membership 01
Sep 2 23:35:21 duadm2 genunix: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port b gen 288704 jeopardy ;1
Sep 2 23:35:21 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS INFO V-16-1-10077 Received new cluster membership
Sep 2 23:35:21 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10111 System duadm2 (Node '1') is in Regular and Jeopardy Memberships - Membership: 0x3, Jeopardy: 0x2
Sep 2 23:56:50 duadm2 in.mpathd[6542]: [ID 585766 daemon.error] Cannot meet requested failure detection time of 10000 ms on (inet oce2) new failure detection time for group "stor_mnic" is 48426 ms
Sep 2 23:56:57 duadm2 in.mpathd[6542]: [ID 594170 daemon.error] NIC failure detected on oce0 of group pub_mnic
Sep 2 23:56:57 duadm2 in.mpathd[6542]: [ID 832587 daemon.error] Successfully failed over from NIC oce0 to NIC oce9
Sep 2 23:57:01 duadm2 in.mpathd[6542]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic
Sep 2 23:57:01 duadm2 in.mpathd[6542]: [ID 620804 daemon.error] Successfully failed back to NIC oce0
Sep 2 23:57:10 duadm2 in.mpathd[6542]: [ID 594170 daemon.error] NIC failure detected on oce0 of group pub_mnic
Sep 2 23:57:10 duadm2 in.mpathd[6542]: [ID 832587 daemon.error] Successfully failed over from NIC oce0 to NIC oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.027 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.030 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.027 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.030 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.027 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.030 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 567813 kern.warning] WARNING: oce9:2 has duplicate address 010.001.014.027 (claimed by 44:1e:a1:74:e5:46); disabled
Sep 2 23:57:10 duadm2 ip: [ID 567813 kern.warning] WARNING: oce9:1 has duplicate address 010.001.014.030 (claimed by 44:1e:a1:74:e5:46); disabled
Sep 2 23:57:11 duadm2 in.mpathd[6542]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic
Sep 2 23:57:11 duadm2 in.mpathd[6542]: [ID 620804 daemon.error] Successfully failed back to NIC oce0
Sep 2 23:57:20 duadm2 in.mpathd[6542]: [ID 168056 daemon.error] All Interfaces in group pub_mnic have failed
Sep 2 23:57:27 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys duadm2
Sep 2 23:57:28 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system duadm2
Sep 2 23:57:29 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys duadm1
Sep 2 23:57:30 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource stor_mnic (Owner: Unspecified, Group: StorLan) is FAULTED (timed out) on sys duadm1
Sep 2 23:57:31 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system duadm1
Sep 2 23:57:53 duadm2 in.mpathd[6542]: [ID 594170 daemon.error] NIC failure detected on oce11 of group stor_mnic
Sep 2 23:57:53 duadm2 in.mpathd[6542]: [ID 832587 daemon.error] Successfully failed over from NIC oce11 to NIC oce2
Sep 2 23:57:57 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys duadm1
Sep 2 23:57:57 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys duadm1
Sep 2 23:57:58 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (duadm1) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
Sep 2 23:57:59 duadm2 in.mpathd[6542]: [ID 168056 daemon.error] All Interfaces in group stor_mnic have failed
Sep 2 23:58:02 duadm2 AgentFramework[6332]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(8) Agent is calling clean for resource(syb1_ip) because the resource became OFFLINE unexpectedly, on its own.
Sep 2 23:58:02 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (duadm2) Agent is calling clean for resource(syb1_ip) because the resource became OFFLINE unexpectedly, on its own.
Sep 2 23:58:02 duadm2 AgentFramework[6332]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(8) Resource(syb1_ip) - clean completed successfully.
Sep 2 23:58:05 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys duadm2
Sep 2 23:58:05 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys duadm2
Sep 2 23:58:08 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource stor_mnic (Owner: Unspecified, Group: StorLan) is FAULTED (timed out) on sys duadm2
Sep 2 23:58:20 duadm2 ldap_cachemgr[516]: [ID 293258 daemon.warning] libsldap: Status: 91 Mesg: openConnection: simple bind failed - Can't connect to the LDAP server
Sep 2 23:58:21 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (duadm1) Agent is calling clean for resource(stor_p) because the resource became OFFLINE unexpectedly, on its own.
Sep 2 23:58:21 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13070 (duadm1) Resource(stor_p) - clean not implemented.
Sep 2 23:58:22 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group StorLan is faulted on system duadm1
Sep 2 23:58:24 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13027 (duadm1) Resource(cluster_maint) - monitor procedure did not complete within the expected time.
Sep 2 23:58:30 duadm2 AgentFramework[6345]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(3) Agent is calling clean for resource(stor_p) because the resource became OFFLINE unexpectedly, on its own.
Sep 2 23:58:30 duadm2 AgentFramework[6345]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13070 Thread(3) Resource(stor_p) - clean not implemented.
Sep 2 23:58:30 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (duadm2) Agent is calling clean for resource(stor_p) because the resource became OFFLINE unexpectedly, on its own.
Sep 2 23:58:30 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13070 (duadm2) Resource(stor_p) - clean not implemented.
Sep 2 23:58:31 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group StorLan is faulted on system duadm2
Sep 2 23:59:19 duadm2 ldap_cachemgr[516]: [ID 293258 daemon.warning] libsldap: Status: 91 Mesg: openConnection: simple bind failed - Can't connect to the LDAP server
Sep 2 23:59:19 duadm2 ldap_cachemgr[516]: [ID 545954 daemon.error] libsldap: makeConnection: failed to open connection to 10.1.14.35
Sep 2 23:59:19 duadm2 ldap_cachemgr[516]: [ID 687686 daemon.warning] libsldap: Falling back to anonymous, non-SSL mode for __ns_ldap_getRootDSE. openConnection: simple bind failed - Can't connect to the LDAP server
Sep 2 23:59:20 duadm2 ldap_cachemgr[516]: [ID 293258 daemon.warning] libsldap: Status: 91 Mesg: openConnection: simple bind failed - Can't connect to the LDAP server
Sep 2 23:59:20 duadm2 ldap_cachemgr[516]: [ID 292100 daemon.warning] libsldap: could not remove 10.1.14.35 from servers list
Sep 2 23:59:20 duadm2 ldap_cachemgr[516]: [ID 292100 daemon.warning] libsldap: could not remove 10.1.14.37 from servers list
Sep 2 23:59:20 duadm2 ldap_cachemgr[516]: [ID 293258 daemon.warning] libsldap: Status: 7 Mesg: Session error no available conn.
Sep 2 23:59:20 duadm2 ldap_cachemgr[516]: [ID 186574 daemon.error] Error: Unable to refresh profile:default: Session error no available conn.
Sep 3 00:00:08 duadm2 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_monitor/ddc:default: Method or service exit timed out. Killing contract 145156.
Sep 3 00:00:08 duadm2 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_monitor/ddc:default: Method "/opt/ERICddc/bin/ddc stop" failed due to signal KILL.
Sep 3 00:00:19 duadm2 ldap_cachemgr[516]: [ID 293258 daemon.warning] libsldap: Status: 91 Mesg: openConnection: simple bind failed - Can't connect to the LDAP server
Sep 3 00:00:19 duadm2 ldap_cachemgr[516]: [ID 545954 daemon.error] libsldap: makeConnection: failed to open connection to 10.1.14.35
Sep 3 00:00:19 duadm2 ldap_cachemgr[516]: [ID 293258 daemon.warning] libsldap: Status: 91 Mesg: openConnection: simple bind failed - Can't connect to the LDAP server
Sep 3 00:00:19 duadm2 ldap_cachemgr[516]: [ID 545954 daemon.error] libsldap: makeConnection: failed to open connection to 10.1.14.37
Sep 3 00:00:19 duadm2 ldap_cachemgr[516]: [ID 687686 daemon.warning] libsldap: Falling back to anonymous, non-SSL mode for __ns_ldap_getRootDSE. openConnection: simple bind failed - Can't connect to the LDAP server
Sep 3 00:00:20 duadm2 ldap_cachemgr[516]: [ID 293258 daemon.warning] libsldap: Status: 91 Mesg: openConnection: simple bind failed - Can't connect to the LDAP server
Sep 3 00:00:20 duadm2 last message repeated 1 time
Sep 3 00:00:20 duadm2 ldap_cachemgr[516]: [ID 545954 daemon.error] libsldap: makeConnection: failed to open connection to 10.1.14.35
Sep 3 00:00:20 duadm2 ldap_cachemgr[516]: [ID 545954 daemon.error] libsldap: makeConnection: failed to open connection to 10.1.14.37
Sep 3 00:00:20 duadm2 ldap_cachemgr[516]: [ID 687686 daemon.warning] libsldap: Falling back to anonymous, non-SSL mode for __ns_ldap_getRootDSE. openConnection: simple bind failed - Can't connect to the LDAP server
Sep 3 00:00:20 duadm2 last message repeated 1 time
============================================================
Engine logs:
2013/09/02 23:35:21 VCS INFO V-16-1-10077 Received new cluster membership
2013/09/02 23:35:21 VCS NOTICE V-16-1-10112 System (duadm1) - Membership: 0x3, DDNA: 0x2
2013/09/02 23:35:21 VCS ERROR V-16-1-10111 System duadm2 (Node '1') is in Regular and Jeopardy Memberships - Membership: 0x3, Jeopardy: 0x2
2013/09/02 23:35:25 VCS WARNING V-16-1-11141 LLT heartbeat link status changed. Previous status = oce1 UP oce8 UP oce0 UP; Current status = oce1 DOWN oce8 UP oce0 DOWN.
2013/09/02 23:57:27 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys duadm2
2013/09/02 23:57:27 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System duadm2
2013/09/02 23:57:27 VCS INFO V-16-6-0 (duadm2) resfault:(resfault) Invoked with arg0=duadm2, arg1=pub_mnic, arg2=ONLINE
2013/09/02 23:57:27 VCS INFO V-16-0 (duadm2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm2 ,arg2=pub_mnic
2013/09/02 23:57:27 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm2 pub_mnic ONLINE successfully
2013/09/02 23:57:28 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on duadm2 (VCS initiated)
2013/09/02 23:57:28 VCS ERROR V-16-1-10205 Group PubLan is faulted on system duadm2
2013/09/02 23:57:28 VCS NOTICE V-16-1-10446 Group PubLan is offline on system duadm2
2013/09/02 23:57:28 VCS INFO V-16-1-10493 Evaluating duadm1 as potential target node for group PubLan
2013/09/02 23:57:28 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system duadm1
2013/09/02 23:57:28 VCS INFO V-16-1-10493 Evaluating duadm2 as potential target node for group PubLan
2013/09/02 23:57:28 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system duadm2
2013/09/02 23:57:28 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/09/02 23:57:28 VCS INFO V-16-6-0 (duadm2) postoffline:(postoffline) Invoked with arg0=duadm2, arg1=PubLan
2013/09/02 23:57:28 VCS INFO V-16-2-13075 (duadm1) Resource(ossfs_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/09/02 23:57:28 VCS INFO V-16-6-0 (duadm2) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=duadm2, arg1=PubLan
2013/09/02 23:57:28 VCS INFO V-16-6-0 (duadm2) postoffline.sh:PubLan:Nothing done
2013/09/02 23:57:28 VCS INFO V-16-6-0 (duadm2) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/09/02 23:57:29 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline duadm2 PubLan successfully
2013/09/02 23:57:29 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys duadm1
2013/09/02 23:57:29 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System duadm1
2013/09/02 23:57:30 VCS INFO V-16-6-0 (duadm1) resfault:(resfault) Invoked with arg0=duadm1, arg1=pub_mnic, arg2=ONLINE
2013/09/02 23:57:30 VCS INFO V-16-0 (duadm1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm1 ,arg2=pub_mnic
2013/09/02 23:57:30 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm1 pub_mnic ONLINE successfully
2013/09/02 23:57:31 VCS ERROR V-16-1-10303 Resource stor_mnic (Owner: Unspecified, Group: StorLan) is FAULTED (timed out) on sys duadm1
2013/09/02 23:57:31 VCS WARNING V-16-1-13310 Offlining parent group DDCMon on system duadm1
2013/09/02 23:57:31 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ddc_app (Owner: Unspecified, Group: DDCMon) on System duadm1
2013/09/02 23:57:31 VCS NOTICE V-16-1-10300 Initiating Offline of Resource hyperic_app (Owner: Unspecified, Group: DDCMon) on System duadm1
2013/09/02 23:57:31 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on duadm1 (VCS initiated)
2013/09/02 23:57:31 VCS ERROR V-16-1-10205 Group PubLan is faulted on system duadm1
2013/09/02 23:57:31 VCS NOTICE V-16-1-10446 Group PubLan is offline on system duadm1
2013/09/02 23:57:31 VCS INFO V-16-1-10493 Evaluating duadm1 as potential target node for group PubLan
2013/09/02 23:57:31 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system duadm1
2013/09/02 23:57:31 VCS INFO V-16-1-10493 Evaluating duadm2 as potential target node for group PubLan
2013/09/02 23:57:31 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system duadm2
2013/09/02 23:57:31 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/09/02 23:57:31 VCS INFO V-16-6-0 (duadm1) resfault:(resfault) Invoked with arg0=duadm1, arg1=stor_mnic, arg2=ONLINE
2013/09/02 23:57:31 VCS INFO V-16-6-0 (duadm1) postoffline:(postoffline) Invoked with arg0=duadm1, arg1=PubLan
2013/09/02 23:57:31 VCS INFO V-16-0 (duadm1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm1 ,arg2=stor_mnic
2013/09/02 23:57:31 VCS INFO V-16-6-0 (duadm1) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=duadm1, arg1=PubLan
2013/09/02 23:57:31 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm1 stor_mnic ONLINE successfully
2013/09/02 23:57:31 VCS INFO V-16-6-0 (duadm1) postoffline.sh:PubLan:Nothing done
2013/09/02 23:57:31 VCS INFO V-16-6-0 (duadm1) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/09/02 23:57:31 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline duadm1 PubLan successfully
2013/09/02 23:57:32 VCS INFO V-16-2-13075 (duadm2) Resource(syb1_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/09/02 23:57:41 VCS INFO V-16-2-13075 (duadm1) Resource(pms_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/09/02 23:57:41 VCS INFO V-16-2-13075 (duadm1) Resource(snmp_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/09/02 23:57:41 VCS INFO V-16-2-13075 (duadm1) Resource(cms_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/09/02 23:57:57 VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys duadm1
2013/09/02 23:57:57 VCS NOTICE V-16-1-10300 Initiating Offline of Resource snmp_ip (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/02 23:57:57 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pms_ip (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/02 23:57:57 VCS NOTICE V-16-1-10300 Initiating Offline of Resource stop_oss (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/02 23:57:57 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cms_ip (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/02 23:57:57 VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys duadm1
2013/09/02 23:57:58 VCS INFO V-16-1-10305 Resource snmp_ip (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/02 23:57:58 VCS INFO V-16-1-10305 Resource pms_ip (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/02 23:57:58 VCS INFO V-16-1-10305 Resource cms_ip (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/02 23:57:58 VCS INFO V-16-6-0 (duadm1) resfault:(resfault) Invoked with arg0=duadm1, arg1=syb1_p1, arg2=ONLINE
2013/09/02 23:57:58 VCS INFO V-16-6-0 (duadm1) resfault:(resfault) Invoked with arg0=duadm1, arg1=ossfs_p1, arg2=ONLINE
2013/09/02 23:57:58 VCS INFO V-16-0 (duadm1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm1 ,arg2=syb1_p1
2013/09/02 23:57:58 VCS INFO V-16-0 (duadm1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm1 ,arg2=ossfs_p1
2013/09/02 23:57:58 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm1 syb1_p1 ONLINE successfully
2013/09/02 23:57:58 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm1 ossfs_p1 ONLINE successfully
2013/09/02 23:57:58 VCS ERROR V-16-2-13067 (duadm1) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
2013/09/02 23:57:58 VCS INFO V-16-2-13068 (duadm1) Resource(ossfs_ip) - clean completed successfully.
2013/09/02 23:57:59 VCS INFO V-16-1-10307 Resource ossfs_ip (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (Not initiated by VCS)
2013/09/02 23:57:59 VCS INFO V-16-6-0 (duadm1) resfault:(resfault) Invoked with arg0=duadm1, arg1=ossfs_ip, arg2=ONLINE
2013/09/02 23:57:59 VCS INFO V-16-0 (duadm1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm1 ,arg2=ossfs_ip
2013/09/02 23:57:59 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm1 ossfs_ip ONLINE successfully
2013/09/02 23:58:02 VCS ERROR V-16-2-13067 (duadm2) Agent is calling clean for resource(syb1_ip) because the resource became OFFLINE unexpectedly, on its own.
2013/09/02 23:58:02 VCS INFO V-16-2-13068 (duadm2) Resource(syb1_ip) - clean completed successfully.
2013/09/02 23:58:02 VCS INFO V-16-1-10307 Resource syb1_ip (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (Not initiated by VCS)
2013/09/02 23:58:02 VCS NOTICE V-16-1-10300 Initiating Offline of Resource stop_sybase (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/02 23:58:02 VCS INFO V-16-6-0 (duadm2) resfault:(resfault) Invoked with arg0=duadm2, arg1=syb1_ip, arg2=ONLINE
2013/09/02 23:58:02 VCS INFO V-16-10001-88 (duadm2) Application:stop_sybase:offline:Executed [/ericsson/core/cluster/scripts/stop_sybase.sh stop] successfully.
2013/09/02 23:58:02 VCS INFO V-16-0 (duadm2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm2 ,arg2=syb1_ip
2013/09/02 23:58:02 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm2 syb1_ip ONLINE successfully
2013/09/02 23:58:03 VCS INFO V-16-1-50135 User root fired command: hagrp -switch Oss duadm2 from localhost
2013/09/02 23:58:03 VCS NOTICE V-16-1-10208 Initiating switch of group Oss from system duadm1 to system duadm2
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource activemq (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource activemq_oss_loggingbroker (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource alex (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource apache (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cron (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmria (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource glassfish (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource imgr_tomcat (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource restart_mc (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb_log_mon (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb_proc_mon (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource trapdist (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource vrsnt_log_mon (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:58:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/02 23:58:03 VCS INFO V-16-10001-88 (duadm1) Application:cron:offline:Executed [/ericsson/core/cluster/scripts/cron.sh stop] successfully.
2013/09/02 23:58:03 VCS INFO V-16-10001-88 (duadm1) Application:alex:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_3pp/alex:default] successfully.
2013/09/02 23:58:03 VCS INFO V-16-10001-88 (duadm1) Application:apache:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /network/http:apache2] successfully.
2013/09/02 23:58:05 VCS INFO V-16-10001-88 (duadm1) Application:restart_mc:offline:Executed [/ericsson/core/cluster/scripts/restart_mc.sh stop] successfully.
2013/09/02 23:58:05 VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys duadm2
2013/09/02 23:58:05 VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys duadm2
2013/09/02 23:58:05 VCS INFO V-16-6-0 (duadm2) resfault:(resfault) Invoked with arg0=duadm2, arg1=syb1_p1, arg2=ONLINE
2013/09/02 23:58:05 VCS INFO V-16-6-0 (duadm2) resfault:(resfault) Invoked with arg0=duadm2, arg1=ossfs_p1, arg2=ONLINE
2013/09/02 23:58:05 VCS INFO V-16-0 (duadm2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm2 ,arg2=syb1_p1
2013/09/02 23:58:05 VCS INFO V-16-0 (duadm2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm2 ,arg2=ossfs_p1
2013/09/02 23:58:05 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm2 syb1_p1 ONLINE successfully
2013/09/02 23:58:05 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm2 ossfs_p1 ONLINE successfully
2013/09/02 23:58:05 VCS INFO V-16-1-10305 Resource stop_sybase (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/02 23:58:05 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/02 23:58:08 VCS ERROR V-16-1-10303 Resource stor_mnic (Owner: Unspecified, Group: StorLan) is FAULTED (timed out) on sys duadm2
2013/09/02 23:58:08 VCS WARNING V-16-1-13310 Offlining parent group DDCMon on system duadm2
2013/09/02 23:58:08 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ddc_app (Owner: Unspecified, Group: DDCMon) on System duadm2
2013/09/02 23:58:08 VCS NOTICE V-16-1-10300 Initiating Offline of Resource hyperic_app (Owner: Unspecified, Group: DDCMon) on System duadm2
2013/09/02 23:58:08 VCS INFO V-16-6-0 (duadm2) resfault:(resfault) Invoked with arg0=duadm2, arg1=stor_mnic, arg2=ONLINE
2013/09/02 23:58:08 VCS INFO V-16-0 (duadm2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm2 ,arg2=stor_mnic
2013/09/02 23:58:08 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm2 stor_mnic ONLINE successfully
2013/09/02 23:58:21 VCS ERROR V-16-2-13067 (duadm1) Agent is calling clean for resource(stor_p) because the resource became OFFLINE unexpectedly, on its own.
2013/09/02 23:58:21 VCS ERROR V-16-2-13070 (duadm1) Resource(stor_p) - clean not implemented.
2013/09/02 23:58:22 VCS INFO V-16-1-10307 Resource stor_p (Owner: Unspecified, Group: StorLan) is offline on duadm1 (Not initiated by VCS)
2013/09/02 23:58:22 VCS ERROR V-16-1-10205 Group StorLan is faulted on system duadm1
2013/09/02 23:58:22 VCS NOTICE V-16-1-10446 Group StorLan is offline on system duadm1
2013/09/02 23:58:22 VCS INFO V-16-1-10493 Evaluating duadm1 as potential target node for group StorLan
2013/09/02 23:58:22 VCS INFO V-16-1-50010 Group StorLan is online or faulted on system duadm1
2013/09/02 23:58:22 VCS INFO V-16-1-10493 Evaluating duadm2 as potential target node for group StorLan
2013/09/02 23:58:22 VCS INFO V-16-1-50010 Group StorLan is online or faulted on system duadm2
2013/09/02 23:58:22 VCS NOTICE V-16-1-10235 Restart is set for group StorLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/09/02 23:58:22 VCS INFO V-16-6-0 (duadm1) resfault:(resfault) Invoked with arg0=duadm1, arg1=stor_p, arg2=ONLINE
2013/09/02 23:58:22 VCS INFO V-16-6-0 (duadm1) postoffline:(postoffline) Invoked with arg0=duadm1, arg1=StorLan
2013/09/02 23:58:22 VCS INFO V-16-0 (duadm1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm1 ,arg2=stor_p
2013/09/02 23:58:22 VCS INFO V-16-6-0 (duadm1) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=duadm1, arg1=StorLan
2013/09/02 23:58:22 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm1 stor_p ONLINE successfully
2013/09/02 23:58:22 VCS INFO V-16-6-0 (duadm1) postoffline.sh:StorLan:Nothing done
2013/09/02 23:58:22 VCS INFO V-16-6-0 (duadm1) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group StorLan
2013/09/02 23:58:22 VCS INFO V-16-6-15002 (duadm1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline duadm1 StorLan successfully
2013/09/02 23:58:24 VCS ERROR V-16-2-13027 (duadm1) Resource(cluster_maint) - monitor procedure did not complete within the expected time.
2013/09/02 23:58:30 VCS ERROR V-16-2-13067 (duadm2) Agent is calling clean for resource(stor_p) because the resource became OFFLINE unexpectedly, on its own.
2013/09/02 23:58:30 VCS ERROR V-16-2-13070 (duadm2) Resource(stor_p) - clean not implemented.
2013/09/02 23:58:31 VCS INFO V-16-1-10307 Resource stor_p (Owner: Unspecified, Group: StorLan) is offline on duadm2 (Not initiated by VCS)
2013/09/02 23:58:31 VCS ERROR V-16-1-10205 Group StorLan is faulted on system duadm2
2013/09/02 23:58:31 VCS NOTICE V-16-1-10446 Group StorLan is offline on system duadm2
2013/09/02 23:58:31 VCS INFO V-16-1-10493 Evaluating duadm1 as potential target node for group StorLan
2013/09/02 23:58:31 VCS INFO V-16-1-50010 Group StorLan is online or faulted on system duadm1
2013/09/02 23:58:31 VCS INFO V-16-1-10493 Evaluating duadm2 as potential target node for group StorLan
2013/09/02 23:58:31 VCS INFO V-16-1-50010 Group StorLan is online or faulted on system duadm2
2013/09/02 23:58:31 VCS NOTICE V-16-1-10235 Restart is set for group StorLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/09/02 23:58:31 VCS INFO V-16-6-0 (duadm2) resfault:(resfault) Invoked with arg0=duadm2, arg1=stor_p, arg2=ONLINE
2013/09/02 23:58:31 VCS INFO V-16-6-0 (duadm2) postoffline:(postoffline) Invoked with arg0=duadm2, arg1=StorLan
2013/09/02 23:58:31 VCS INFO V-16-0 (duadm2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=duadm2 ,arg2=stor_p
2013/09/02 23:58:31 VCS INFO V-16-6-0 (duadm2) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=duadm2, arg1=StorLan
2013/09/02 23:58:31 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault duadm2 stor_p ONLINE successfully
2013/09/02 23:58:31 VCS INFO V-16-6-0 (duadm2) postoffline.sh:StorLan:Nothing done
2013/09/02 23:58:31 VCS INFO V-16-6-0 (duadm2) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group StorLan
2013/09/02 23:58:31 VCS INFO V-16-6-15002 (duadm2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline duadm2 StorLan successfully
2013/09/02 23:59:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/02 23:59:33 VCS INFO V-16-10001-88 (duadm1) Application:syb_log_mon:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_3pp/sybase_log_monitor:default] successfully.
2013/09/02 23:59:33 VCS INFO V-16-10001-88 (duadm1) Application:syb_proc_mon:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_3pp/sybase_process_monitor:default] successfully.
2013/09/02 23:59:34 VCS INFO V-16-10001-88 (duadm1) Application:fmria:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_ep/riadaemon:default] successfully.
2013/09/02 23:59:35 VCS INFO V-16-10001-88 (duadm1) Application:trapdist:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_ep/trapd:default] successfully.
2013/09/02 23:59:35 VCS INFO V-16-10001-88 (duadm1) Application:vrsnt_log_mon:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_3pp/versant_log_monitor:default] successfully.
2013/09/02 23:59:36 VCS INFO V-16-1-10305 Resource alex (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:37 VCS INFO V-16-10001-88 (duadm1) Application:imgr_tomcat:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_3pp/tomcat:default] successfully.
2013/09/02 23:59:37 VCS INFO V-16-1-10305 Resource apache (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:38 VCS INFO V-16-1-10305 Resource cron (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:40 VCS INFO V-16-2-13075 (duadm1) Resource(tomcat) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(2).
2013/09/02 23:59:50 VCS INFO V-16-1-10305 Resource restart_mc (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:50 VCS NOTICE V-16-1-10300 Initiating Offline of Resource smssr (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/02 23:59:50 VCS INFO V-16-1-10305 Resource syb_proc_mon (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:52 VCS INFO V-16-1-10305 Resource syb_log_mon (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:52 VCS INFO V-16-1-10305 Resource trapdist (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:52 VCS INFO V-16-1-10305 Resource fmria (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:54 VCS INFO V-16-1-10305 Resource vrsnt_log_mon (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/02 23:59:55 VCS INFO V-16-10001-88 (duadm1) Application:smssr:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/ossrc/ssr:default] successfully.
2013/09/02 23:59:57 VCS INFO V-16-1-10305 Resource imgr_tomcat (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:00:00 VCS INFO V-16-1-10305 Resource smssr (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:00:00 VCS NOTICE V-16-1-10300 Initiating Offline of Resource supervisor (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:00:03 VCS INFO V-16-10001-88 (duadm1) Application:supervisor:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/ossrc/ssrProcessSupervisor:default] successfully.
2013/09/03 00:00:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:00:06 VCS INFO V-16-1-10305 Resource supervisor (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:00:06 VCS NOTICE V-16-1-10300 Initiating Offline of Resource tbs (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:00:11 VCS INFO V-16-2-13075 (duadm1) Resource(tomcat) has reported unexpected OFFLINE 2 times, which is still within the ToleranceLimit(2).
2013/09/03 00:01:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:01:08 VCS ERROR V-16-2-13067 (duadm1) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
2013/09/03 00:01:08 VCS INFO V-16-2-13068 (duadm1) Resource(tomcat) - clean completed successfully.
2013/09/03 00:01:08 VCS ERROR V-16-2-13073 (duadm1) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
2013/09/03 00:01:08 VCS INFO V-16-10001-88 (duadm1) Application:tomcat:online:Executed [/ericsson/core/cluster/scripts/svc.sh start /ericsson/eric_3pp/tomcat:default] successfully.
2013/09/03 00:01:09 VCS INFO V-16-2-13716 (duadm1) Resource(tomcat): Output of the completed operation (online)
==============================================
svcadm: Instance "svc:/ericsson/eric_3pp/tomcat:default" is not in a maintenance or degraded state.
==============================================
2013/09/03 00:01:11 VCS NOTICE V-16-2-13076 (duadm1) Agent has successfully restarted resource(tomcat).
2013/09/03 00:01:11 VCS INFO V-16-1-55031 Resource tomcat in online state received recurring online message on system duadm1
2013/09/03 00:01:35 VCS INFO V-16-10001-88 (duadm1) Application:activemq_oss_loggingbroker:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_3pp/activemq_oss_loggingbroker:default] successfully.
2013/09/03 00:01:35 VCS INFO V-16-10001-88 (duadm1) Application:activemq:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_3pp/activemq:default] successfully.
2013/09/03 00:01:36 VCS INFO V-16-2-13716 (duadm1) Resource(activemq): Output of the completed operation (offline)
==============================================
svcadm: Instance "svc:/ericsson/eric_3pp/activemq:default" is in maintenance state.
==============================================
2013/09/03 00:01:36 VCS INFO V-16-2-13716 (duadm1) Resource(activemq_oss_loggingbroker): Output of the completed operation (offline)
==============================================
svcadm: Instance "svc:/ericsson/eric_3pp/activemq_oss_loggingbroker:default" is in maintenance state.
==============================================
2013/09/03 00:01:38 VCS INFO V-16-1-10305 Resource activemq (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:01:38 VCS INFO V-16-1-10305 Resource activemq_oss_loggingbroker (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:01:39 VCS ERROR V-16-20018-25 (duadm2) SybaseBk:masterdataservice_BACKUP:offline:Sybase Backup service masterdataservice_BACKUP cannot be stopped by isql - shutdown
2013/09/03 00:01:39 VCS ERROR V-16-20018-17 (duadm2) SybaseBk:masterdataservice_BACKUP:offline:clean scripts should do the process kill sequence
2013/09/03 00:01:39 VCS INFO V-16-2-13716 (duadm2) Resource(masterdataservice_BACKUP): Output of the completed operation (offline)
==============================================
Password:
CT-LIBRARY error:
ct_connect(): user api layer: internal Client Library error: Read from the server has timed out.
Password:
CT-LIBRARY error:
ct_connect(): user api layer: internal Client Library error: Read from the server has timed out.
==============================================
2013/09/03 00:01:40 VCS ERROR V-16-2-13064 (duadm2) Agent is calling clean for resource(masterdataservice_BACKUP) because the resource is up even after offline completed.
2013/09/03 00:01:40 VCS ERROR V-16-20018-20 (duadm2) SybaseBk:masterdataservice_BACKUP:clean:kill -15 of 16419
2013/09/03 00:02:01 VCS INFO V-16-2-13068 (duadm2) Resource(masterdataservice_BACKUP) - clean completed successfully.
2013/09/03 00:02:01 VCS WARNING V-16-20018-301 (duadm2) SybaseBk:masterdataservice_BACKUP:monitor:Open for backupserver failed, setting cookie to NULL
2013/09/03 00:02:01 VCS INFO V-16-1-10305 Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:02:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:02:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:02:32 VCS WARNING V-16-2-13011 (duadm1) Resource(hyperic_app): offline procedure did not complete within the expected time.
2013/09/03 00:02:32 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(hyperic_app) because offline did not complete within the expected time.
2013/09/03 00:02:32 VCS WARNING V-16-2-13011 (duadm1) Resource(ddc_app): offline procedure did not complete within the expected time.
2013/09/03 00:02:32 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(ddc_app) because offline did not complete within the expected time.
2013/09/03 00:02:32 VCS INFO V-16-2-13068 (duadm1) Resource(hyperic_app) - clean completed successfully.
2013/09/03 00:02:32 VCS INFO V-16-2-13068 (duadm1) Resource(ddc_app) - clean completed successfully.
2013/09/03 00:02:34 VCS INFO V-16-1-10305 Resource hyperic_app (Owner: Unspecified, Group: DDCMon) is offline on duadm1 (VCS initiated)
2013/09/03 00:02:34 VCS INFO V-16-1-10305 Resource ddc_app (Owner: Unspecified, Group: DDCMon) is offline on duadm1 (VCS initiated)
2013/09/03 00:02:34 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ddc_mount (Owner: Unspecified, Group: DDCMon) on System duadm1
2013/09/03 00:03:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:03:05 VCS WARNING V-16-2-13011 (duadm1) Resource(glassfish): offline procedure did not complete within the expected time.
2013/09/03 00:03:05 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
2013/09/03 00:03:07 VCS INFO V-16-10001-88 (duadm1) Application:tbs:offline:Executed [/ericsson/core/cluster/scripts/svc.sh stop /ericsson/eric_ep/TBS:default] successfully.
2013/09/03 00:03:08 VCS INFO V-16-2-13716 (duadm1) Resource(tbs): Output of the completed operation (offline)
==============================================
svcadm: Instance "svc:/ericsson/eric_ep/TBS:default" is in maintenance state.
==============================================
2013/09/03 00:03:08 VCS WARNING V-16-2-13011 (duadm2) Resource(hyperic_app): offline procedure did not complete within the expected time.
2013/09/03 00:03:08 VCS WARNING V-16-2-13011 (duadm2) Resource(ddc_app): offline procedure did not complete within the expected time.
2013/09/03 00:03:08 VCS ERROR V-16-2-13063 (duadm2) Agent is calling clean for resource(hyperic_app) because offline did not complete within the expected time.
2013/09/03 00:03:08 VCS ERROR V-16-2-13063 (duadm2) Agent is calling clean for resource(ddc_app) because offline did not complete within the expected time.
2013/09/03 00:03:08 VCS INFO V-16-2-13068 (duadm2) Resource(ddc_app) - clean completed successfully.
2013/09/03 00:03:08 VCS INFO V-16-2-13068 (duadm2) Resource(hyperic_app) - clean completed successfully.
2013/09/03 00:03:10 VCS INFO V-16-1-10305 Resource tbs (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ext_notif (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ext_nsa (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource log_service (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource notif (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource nsa (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource oad (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource opendj (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource osagent (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource rmi_reg (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource rmi_reg_ext (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sb_nsa (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sentinel (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource time_service (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource tomcat (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:03:10 VCS INFO V-16-1-10305 Resource ddc_app (Owner: Unspecified, Group: DDCMon) is offline on duadm2 (VCS initiated)
2013/09/03 00:03:10 VCS INFO V-16-1-10305 Resource hyperic_app (Owner: Unspecified, Group: DDCMon) is offline on duadm2 (VCS initiated)
2013/09/03 00:03:10 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ddc_mount (Owner: Unspecified, Group: DDCMon) on System duadm2
2013/09/03 00:04:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:04:06 VCS ERROR V-16-2-13006 (duadm1) Resource(glassfish): clean procedure did not complete within the expected time.
2013/09/03 00:05:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:06:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:07:01 VCS INFO V-16-2-13003 (duadm2) Resource(masterdataservice): Output of the timed out operation (offline)
Password:
CT-LIBRARY error:
ct_connect(): user api layer: internal Client Library error: Read from the server has timed out.
2013/09/03 00:07:01 VCS WARNING V-16-2-13011 (duadm2) Resource(masterdataservice): offline procedure did not complete within the expected time.
2013/09/03 00:07:01 VCS ERROR V-16-2-13063 (duadm2) Agent is calling clean for resource(masterdataservice) because offline did not complete within the expected time.
2013/09/03 00:07:02 VCS ERROR V-16-20018-20 (duadm2) Sybase:masterdataservice:clean:kill -15 of 16142
2013/09/03 00:07:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:07:22 VCS INFO V-16-20018-51 (duadm2) Sybase:masterdataservice:clean:Could not open directory [/var/tmp/sybase_shm/masterdataservice] for read [No such file or directory]
2013/09/03 00:07:22 VCS NOTICE V-16-20018-78 (duadm2) Sybase:masterdataservice:clean:Cannot remove Shared Memory.
2013/09/03 00:07:22 VCS INFO V-16-2-13068 (duadm2) Resource(masterdataservice) - clean completed successfully.
2013/09/03 00:07:23 VCS WARNING V-16-20018-301 (duadm2) Sybase:masterdataservice:monitor:Open for dataserver failed, setting cookie to NULL
2013/09/03 00:07:23 VCS INFO V-16-1-10305 Resource masterdataservice (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1bak_ip (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource dbdumps_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syblog_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:23 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) on System duadm2
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource dbdumps_mount (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource fmsyblog_mount (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource fmsybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource syblog_mount (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:25 VCS INFO V-16-1-10305 Resource syb1bak_ip (Owner: Unspecified, Group: Sybase1) is offline on duadm2 (VCS initiated)
2013/09/03 00:07:35 VCS WARNING V-16-2-13011 (duadm1) Resource(ddc_mount): offline procedure did not complete within the expected time.
2013/09/03 00:07:35 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(ddc_mount) because offline did not complete within the expected time.
2013/09/03 00:07:35 VCS INFO V-16-10001-5530 (duadm1) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:08:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:08:11 VCS WARNING V-16-2-13011 (duadm1) Resource(ext_nsa): offline procedure did not complete within the expected time.
2013/09/03 00:08:11 VCS WARNING V-16-2-13011 (duadm1) Resource(notif): offline procedure did not complete within the expected time.
2013/09/03 00:08:11 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(notif) because offline did not complete within the expected time.
2013/09/03 00:08:11 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(ext_nsa) because offline did not complete within the expected time.
2013/09/03 00:08:11 VCS WARNING V-16-2-13011 (duadm1) Resource(oad): offline procedure did not complete within the expected time.
2013/09/03 00:08:11 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(oad) because offline did not complete within the expected time.
2013/09/03 00:08:11 VCS INFO V-16-2-13068 (duadm1) Resource(oad) - clean completed successfully.
2013/09/03 00:08:11 VCS INFO V-16-2-13068 (duadm1) Resource(notif) - clean completed successfully.
2013/09/03 00:08:11 VCS WARNING V-16-2-13011 (duadm1) Resource(ext_notif): offline procedure did not complete within the expected time.
2013/09/03 00:08:11 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(ext_notif) because offline did not complete within the expected time.
2013/09/03 00:08:11 VCS WARNING V-16-2-13011 (duadm1) Resource(log_service): offline procedure did not complete within the expected time.
2013/09/03 00:08:11 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(log_service) because offline did not complete within the expected time.
2013/09/03 00:08:11 VCS WARNING V-16-2-13011 (duadm1) Resource(nsa): offline procedure did not complete within the expected time.
2013/09/03 00:08:11 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(nsa) because offline did not complete within the expected time.
2013/09/03 00:08:11 VCS INFO V-16-2-13068 (duadm1) Resource(nsa) - clean completed successfully.
2013/09/03 00:08:11 VCS INFO V-16-2-13068 (duadm1) Resource(ext_nsa) - clean completed successfully.
2013/09/03 00:08:12 VCS WARNING V-16-2-13011 (duadm2) Resource(ddc_mount): offline procedure did not complete within the expected time.
2013/09/03 00:08:12 VCS ERROR V-16-2-13063 (duadm2) Agent is calling clean for resource(ddc_mount) because offline did not complete within the expected time.
2013/09/03 00:08:12 VCS INFO V-16-10001-5530 (duadm2) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:08:12 VCS WARNING V-16-2-13011 (duadm1) Resource(osagent): offline procedure did not complete within the expected time.
2013/09/03 00:08:12 VCS WARNING V-16-2-13011 (duadm1) Resource(opendj): offline procedure did not complete within the expected time.
2013/09/03 00:08:12 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(osagent) because offline did not complete within the expected time.
2013/09/03 00:08:12 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(opendj) because offline did not complete within the expected time.
2013/09/03 00:08:12 VCS INFO V-16-2-13068 (duadm1) Resource(osagent) - clean completed successfully.
2013/09/03 00:08:36 VCS ERROR V-16-2-13006 (duadm1) Resource(ddc_mount): clean procedure did not complete within the expected time.
2013/09/03 00:09:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:09:07 VCS WARNING V-16-2-13011 (duadm1) Resource(rmi_reg_ext): offline procedure did not complete within the expected time.
2013/09/03 00:09:07 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(rmi_reg_ext) because offline did not complete within the expected time.
2013/09/03 00:09:07 VCS INFO V-16-2-13068 (duadm1) Resource(rmi_reg_ext) - clean completed successfully.
2013/09/03 00:09:11 VCS ERROR V-16-2-13006 (duadm1) Resource(ext_notif): clean procedure did not complete within the expected time.
2013/09/03 00:09:11 VCS ERROR V-16-2-13006 (duadm1) Resource(log_service): clean procedure did not complete within the expected time.
2013/09/03 00:09:13 VCS ERROR V-16-2-13006 (duadm2) Resource(ddc_mount): clean procedure did not complete within the expected time.
2013/09/03 00:09:13 VCS ERROR V-16-2-13006 (duadm1) Resource(opendj): clean procedure did not complete within the expected time.
2013/09/03 00:09:28 VCS INFO V-16-1-10305 Resource oad (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:09:28 VCS NOTICE V-16-1-10300 Initiating Offline of Resource gui_nsa (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:09:28 VCS INFO V-16-1-10305 Resource nsa (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:09:28 VCS INFO V-16-1-10305 Resource notif (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:09:30 VCS INFO V-16-1-10305 Resource rmi_reg_ext (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:09:30 VCS INFO V-16-1-10305 Resource osagent (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:09:30 VCS INFO V-16-1-10305 Resource ext_nsa (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:09:32 VCS INFO V-16-1-10305 Resource glassfish (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:09:36 VCS ERROR V-16-2-13077 (duadm1) Agent is unable to offline resource(ddc_mount). Administrative intervention may be required.
2013/09/03 00:09:36 VCS INFO V-16-10001-5530 (duadm1) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:10:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:10:12 VCS ERROR V-16-2-13210 (duadm1) Agent is calling clean for resource(cluster_maint) because 4 successive invocations of the monitor procedure did not complete within the expected time.
2013/09/03 00:10:13 VCS ERROR V-16-2-13077 (duadm2) Agent is unable to offline resource(ddc_mount). Administrative intervention may be required.
2013/09/03 00:10:13 VCS INFO V-16-10001-5530 (duadm2) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:10:13 VCS INFO V-16-1-10305 Resource ext_notif (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:10:13 VCS INFO V-16-1-10305 Resource log_service (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:10:16 VCS INFO V-16-1-10305 Resource opendj (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:11:03 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:11:13 VCS ERROR V-16-2-13006 (duadm1) Resource(cluster_maint): clean procedure did not complete within the expected time.
2013/09/03 00:11:37 VCS INFO V-16-10001-5530 (duadm1) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:12:04 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:12:14 VCS INFO V-16-10001-5530 (duadm2) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:12:16 VCS INFO V-16-2-13026 (duadm1) Resource(cluster_maint) - monitor procedure finished successfully after failing to complete within the expected time for (4) consecutive times.
2013/09/03 00:12:16 VCS INFO V-16-2-13075 (duadm1) Resource(cluster_maint) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(2).
2013/09/03 00:12:24 VCS WARNING V-16-2-13011 (duadm2) Resource(sybmaster_mount): offline procedure did not complete within the expected time.
2013/09/03 00:12:24 VCS ERROR V-16-2-13063 (duadm2) Agent is calling clean for resource(sybmaster_mount) because offline did not complete within the expected time.
2013/09/03 00:12:24 VCS INFO V-16-10001-5530 (duadm2) Mount:sybmaster_mount:clean:Checking for loopback mount.
2013/09/03 00:13:04 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:13:13 VCS WARNING V-16-2-13011 (duadm1) Resource(tomcat): offline procedure did not complete within the expected time.
2013/09/03 00:13:13 VCS WARNING V-16-2-13011 (duadm1) Resource(time_service): offline procedure did not complete within the expected time.
2013/09/03 00:13:13 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(tomcat) because offline did not complete within the expected time.
2013/09/03 00:13:13 VCS WARNING V-16-2-13011 (duadm1) Resource(rmi_reg): offline procedure did not complete within the expected time.
2013/09/03 00:13:13 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(time_service) because offline did not complete within the expected time.
2013/09/03 00:13:13 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(rmi_reg) because offline did not complete within the expected time.
2013/09/03 00:13:13 VCS WARNING V-16-2-13011 (duadm1) Resource(sb_nsa): offline procedure did not complete within the expected time.
2013/09/03 00:13:13 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(sb_nsa) because offline did not complete within the expected time.
2013/09/03 00:13:13 VCS INFO V-16-2-13068 (duadm1) Resource(tomcat) - clean completed successfully.
2013/09/03 00:13:13 VCS INFO V-16-2-13068 (duadm1) Resource(rmi_reg) - clean completed successfully.
2013/09/03 00:13:14 VCS WARNING V-16-2-13011 (duadm1) Resource(sentinel): offline procedure did not complete within the expected time.
2013/09/03 00:13:14 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(sentinel) because offline did not complete within the expected time.
2013/09/03 00:13:14 VCS INFO V-16-2-13068 (duadm1) Resource(sentinel) - clean completed successfully.
2013/09/03 00:13:15 VCS INFO V-16-1-10305 Resource rmi_reg (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:13:15 VCS INFO V-16-1-10305 Resource tomcat (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:13:16 VCS INFO V-16-1-10305 Resource sentinel (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:13:16 VCS INFO V-16-2-13075 (duadm1) Resource(cluster_maint) has reported unexpected OFFLINE 2 times, which is still within the ToleranceLimit(2).
2013/09/03 00:13:25 VCS ERROR V-16-2-13006 (duadm2) Resource(sybmaster_mount): clean procedure did not complete within the expected time.
2013/09/03 00:13:38 VCS INFO V-16-10001-5530 (duadm1) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:14:04 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:14:14 VCS ERROR V-16-2-13006 (duadm1) Resource(sb_nsa): clean procedure did not complete within the expected time.
2013/09/03 00:14:14 VCS ERROR V-16-2-13006 (duadm1) Resource(time_service): clean procedure did not complete within the expected time.
2013/09/03 00:14:15 VCS ERROR V-16-2-13067 (duadm1) Agent is calling clean for resource(cluster_maint) because the resource became OFFLINE unexpectedly, on its own.
2013/09/03 00:14:15 VCS INFO V-16-10001-5530 (duadm2) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:14:16 VCS INFO V-16-1-10305 Resource time_service (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:14:16 VCS INFO V-16-1-10305 Resource sb_nsa (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:14:25 VCS ERROR V-16-2-13077 (duadm2) Agent is unable to offline resource(sybmaster_mount). Administrative intervention may be required.
2013/09/03 00:14:26 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2013/09/03 00:14:26 VCS INFO V-16-10001-5530 (duadm2) Mount:sybmaster_mount:clean:Checking for loopback mount.
2013/09/03 00:14:39 VCS WARNING V-16-2-13011 (duadm1) Resource(gui_nsa): offline procedure did not complete within the expected time.
2013/09/03 00:14:39 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(gui_nsa) because offline did not complete within the expected time.
2013/09/03 00:14:39 VCS INFO V-16-2-13068 (duadm1) Resource(gui_nsa) - clean completed successfully.
2013/09/03 00:14:41 VCS INFO V-16-1-10305 Resource gui_nsa (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:14:41 VCS NOTICE V-16-1-10300 Initiating Offline of Resource change_versant (Owner: Unspecified, Group: Oss) on System duadm1
2013/09/03 00:15:04 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:15:16 VCS ERROR V-16-2-13006 (duadm1) Resource(cluster_maint): clean procedure did not complete within the expected time.
2013/09/03 00:15:40 VCS INFO V-16-10001-5530 (duadm1) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:16:04 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:16:16 VCS INFO V-16-10001-5530 (duadm2) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:16:27 VCS INFO V-16-10001-5530 (duadm2) Mount:sybmaster_mount:clean:Checking for loopback mount.
2013/09/03 00:17:04 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:17:41 VCS INFO V-16-10001-5530 (duadm1) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:17:58 VCS WARNING V-16-2-13011 (duadm1) Resource(stop_oss): offline procedure did not complete within the expected time.
2013/09/03 00:17:58 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(stop_oss) because offline did not complete within the expected time.
2013/09/03 00:18:04 VCS INFO V-16-1-50135 User root fired command: hagrp -switch Oss duadm2 from localhost
2013/09/03 00:18:04 VCS INFO V-16-10001-1 (duadm1) Application:stop_oss:stop_oss.sh:Waiting for Oss to go Offline
2013/09/03 00:18:18 VCS INFO V-16-10001-5530 (duadm2) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:18:28 VCS INFO V-16-10001-5530 (duadm2) Mount:sybmaster_mount:clean:Checking for loopback mount.
2013/09/03 00:18:59 VCS ERROR V-16-2-13006 (duadm1) Resource(stop_oss): clean procedure did not complete within the expected time.
2013/09/03 00:19:42 VCS INFO V-16-10001-5530 (duadm1) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:19:42 VCS WARNING V-16-2-13011 (duadm1) Resource(change_versant): offline procedure did not complete within the expected time.
2013/09/03 00:19:42 VCS ERROR V-16-2-13063 (duadm1) Agent is calling clean for resource(change_versant) because offline did not complete within the expected time.
2013/09/03 00:19:42 VCS INFO V-16-2-13068 (duadm1) Resource(change_versant) - clean completed successfully.
2013/09/03 00:19:44 VCS INFO V-16-1-10305 Resource change_versant (Owner: Unspecified, Group: Oss) is offline on duadm1 (VCS initiated)
2013/09/03 00:19:44 VCS NOTICE V-16-1-10446 Group Oss is offline on system duadm1
2013/09/03 00:19:44 VCS INFO V-16-6-15025 (duadm2) hatrigger:invoking nfs_preonline
2013/09/03 00:19:44 VCS INFO V-16-6-15076 (duadm2) hatrigger:invoking regular preonline trigger if it exists
2013/09/03 00:19:44 VCS INFO V-16-6-0 (duadm1) postoffline:(postoffline) Invoked with arg0=duadm1, arg1=Oss
2013/09/03 00:19:44 VCS INFO V-16-6-0 (duadm1) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=duadm1, arg1=Oss
2013/09/03 00:19:44 VCS INFO V-16-6-0 (duadm2) preonline:(preonline) Invoked with arg0=duadm2, arg1=Oss
2013/09/03 00:19:44 VCS INFO V-16-6-0 (duadm2) preonline:Executing /ericsson/core/cluster/scripts/preonline.sh with arg0=duadm2, arg1=Oss
2013/09/03 00:19:44 VCS INFO V-16-6-0 (duadm1) postoffline.sh:Oss:Stop old FM processes
2013/09/03 00:19:44 VCS INFO V-16-6-0 (duadm1) postoffline.sh:Oss:Stopping dummy SMF resources for Sybase
2013/09/03 00:19:52 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:00 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:01 VCS INFO V-16-1-10305 Resource stop_oss (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cluster_maint (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource smrs_nfs (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ossbak_ip (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource eba_ebsg_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource eba_ebss_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource eba_ebsw_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource eba_rede_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource eba_rtt_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource home_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource nms_cosm_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmstor_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource segment1_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sgwcg_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource versant_mount (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource a3pp_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ericsson_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource etc_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource mail_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource opt_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource upgrade_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource usr_local_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:01 VCS NOTICE V-16-1-10300 Initiating Offline of Resource var_share (Owner: Unspecified, Group: Ossfs) on System duadm1
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource a3pp_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource ericsson_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource etc_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource opt_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource usr_local_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource mail_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource var_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:02 VCS INFO V-16-1-10305 Resource upgrade_share (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:03 VCS INFO V-16-1-10305 Resource ossbak_ip (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:20:08 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:17 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:19 VCS INFO V-16-10001-5530 (duadm2) Mount:ddc_mount:clean:Checking for loopback mount.
2013/09/03 00:20:25 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:29 VCS INFO V-16-10001-5530 (duadm2) Mount:sybmaster_mount:clean:Checking for loopback mount.
2013/09/03 00:20:33 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:41 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:49 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:20:57 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource eba_ebsw_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource eba_rtt_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource eba_rede_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource pmstor_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource eba_ebss_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource nms_cosm_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource eba_ebsg_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:03 VCS INFO V-16-1-10305 Resource segment1_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:04 VCS INFO V-16-1-10305 Resource versant_mount (Owner: Unspecified, Group: Ossfs) is offline on duadm1 (VCS initiated)
2013/09/03 00:21:05 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:21:13 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:21:21 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
2013/09/03 00:21:29 VCS INFO V-16-6-0 (duadm2) preonline.sh:Oss:Waiting for Sybase1 to come online globally
09-09-2013 01:48 AM
Both your heartbeat links failed at the same time, so from a VCS perspective each node thinks the other one has gone down, as it has lost all connection to it. Each node will then try to take over the other's service groups, and when the network returns you have the same IP up on both nodes, so you get further errors.
Your heartbeats should be independent, so for VCS to incorrectly interpret a network failure as a node failure should mean that 2 pieces of network hardware failed at the same time, which is very rare. More likely, your heartbeats are not independent and you have a single point of failure (SPOF), where the failure of one component caused both heartbeats (and it looks like the public network too) to fail.
Therefore you need to look at the design of your heartbeats. Each heartbeat should:
Mike
09-10-2013 08:31 AM
I think the node is shown to be in jeopardy:
Sep 2 23:35:21 duadm2 genunix: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port h gen 288706 jeopardy ;1
Sep 2 23:35:21 duadm2 genunix: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port a gen 288701 membership 01
Sep 2 23:35:21 duadm2 genunix: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port a gen 288701 jeopardy ;1
Sep 2 23:35:21 duadm2 genunix: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port b gen 288704 membership 01
Sep 2 23:35:21 duadm2 genunix: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port b gen 288704 jeopardy ;1
Sep 2 23:35:21 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS INFO V-16-1-10077 Received new cluster membership
Sep 2 23:35:21 duadm2 Had[6267]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10111 System duadm2 (Node '1') is in Regular and Jeopardy Memberships - Membership: 0x3
If all the heartbeats had been lost, then I/O fencing should have kicked in and could have taken one node down.
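To read the "Membership: 0x3" value in the V-16-1-10111 message, here is a minimal sketch. It assumes the value is a bitmask in which bit N stands for node ID N; that is my reading of the message, not something the log itself states. The function name is made up for illustration.

```python
def nodes_in_membership(mask):
    """Return the node IDs whose bit is set in a membership bitmask.

    Assumption: bit N of the mask corresponds to cluster node ID N.
    """
    return [node for node in range(mask.bit_length()) if (mask >> node) & 1]


# 0x3 is the value from the V-16-1-10111 line above.
print(nodes_in_membership(0x3))
```

If that reading is right, both node 0 and node 1 were still in the membership at 23:35:21, which fits a jeopardy state rather than a full loss of heartbeats.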
09-10-2013 10:25 AM
Yes, you're right. I had only looked at the messages log and had assumed you had 2 heartbeats, but I have now noticed your links are named link0 and link2, implying you may have a 3rd link called link1, and the engine log confirms this as it says:
Previous status = oce1 UP oce8 UP oce0 UP; Current status = oce1 DOWN oce8 UP oce0 DOWN
However, this still means you probably have 2 heartbeats sharing a common component, as oce1 and oce0 went down at the same time, which probably means they are connected to the same switch. This means that (when all 3 heartbeats are working) if you lost NIC oce8, the cluster would NOT go into jeopardy. If, sometime later and before you have fixed oce8 (which could be several hours or days later), you lost the switch that oce1 and oce0 are connected to, you would lose the 2 remaining heartbeats at the same time and this would cause split-brain. So you should still look at the design of your heartbeats to ensure they are all independent.
In terms of finding out what has caused the service outage - it is evident that you had a network outage and this caused network resources in VCS to fault and fail service groups.
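As a rough way to spot heartbeat links that may share a component, the sketch below pulls the first LLT "in trouble" event per NIC out of the messages file and pairs up NICs that failed within a couple of seconds of each other. The two sample lines are copied from the original post. The function names and the 2 second window are my own choices for illustration, and the code assumes all events fall on the same day.

```python
import re

# Sample LLT lines copied from /var/adm/messages in the first post.
LOG = """\
Sep 2 23:35:04 duadm2 genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 0 (oce1) node 0 in trouble
Sep 2 23:35:05 duadm2 genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 2 (oce0) node 0 in trouble
"""

TROUBLE = re.compile(
    r"^\w+ +\d+ (\d\d):(\d\d):(\d\d) .*LLT INFO V-14-1-10205 "
    r"link \d+ \((\w+)\) node \d+ in trouble"
)


def first_trouble_times(log_text):
    """Map each NIC to the time (seconds since midnight) of its first 'in trouble' event."""
    seen = {}
    for line in log_text.splitlines():
        match = TROUBLE.match(line)
        if match:
            hours, minutes, seconds, nic = match.groups()
            stamp = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            seen.setdefault(nic, stamp)  # keep only the first event per NIC
    return seen


def correlated_links(times, window_seconds=2):
    """Return pairs of NICs whose first failures are at most window_seconds apart."""
    nics = sorted(times)
    return [
        (a, b)
        for i, a in enumerate(nics)
        for b in nics[i + 1:]
        if abs(times[a] - times[b]) <= window_seconds
    ]


times = first_trouble_times(LOG)
print(correlated_links(times))
```

A pair showing up here does not prove a shared switch, it only flags links worth tracing physically.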
Mike
09-11-2013 08:43 AM
Thanks Mike for helping me understand the situation.
You have helped me in the past and you are always there to rescue me :)
In your previous post you mentioned that both nodes tried to online the IP resource, and that is why there were
duplicate IPs as below.
If the node only went into jeopardy, then what could be the reason for the messages below?
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.027 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.030 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.027 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.030 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.027 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 876157 kern.warning] WARNING: node 44:1e:a1:74:e5:46 is using our IP address 010.001.014.030 on oce9
Sep 2 23:57:10 duadm2 ip: [ID 567813 kern.warning] WARNING: oce9:2 has duplicate address 010.001.014.027 (claimed by 44:1e:a1:74:e5:46); disabled
Sep 2 23:57:10 duadm2 ip: [ID 567813 kern.warning] WARNING: oce9:1 has duplicate address 010.001.014.030 (claimed by 44:1e:a1:74:e5:46); disabled
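To summarise which logical interfaces were disabled and which MAC address claimed them, a small parsing sketch follows. The two sample lines are copied from the messages above. It assumes the zero-padded octets (010.001.014.027) are plain decimal, which matches the 10.1.14.x NetworkHosts address elsewhere in this thread. The function names are made up for illustration.

```python
import re

# "duplicate address" lines copied from the messages above.
LOG = """\
Sep 2 23:57:10 duadm2 ip: [ID 567813 kern.warning] WARNING: oce9:2 has duplicate address 010.001.014.027 (claimed by 44:1e:a1:74:e5:46); disabled
Sep 2 23:57:10 duadm2 ip: [ID 567813 kern.warning] WARNING: oce9:1 has duplicate address 010.001.014.030 (claimed by 44:1e:a1:74:e5:46); disabled
"""

DUPLICATE = re.compile(
    r"WARNING: (\S+) has duplicate address (\S+) \(claimed by ([0-9a-f:]+)\); disabled"
)


def normalize(addr):
    """Strip zero padding from each octet, e.g. 010.001.014.027 -> 10.1.14.27.

    Assumption: the padded octets are decimal, not octal.
    """
    return ".".join(str(int(octet, 10)) for octet in addr.split("."))


def disabled_addresses(log_text):
    """Return (interface, address, claiming MAC) for each disabled duplicate."""
    return [
        (m.group(1), normalize(m.group(2)), m.group(3))
        for m in DUPLICATE.finditer(log_text)
    ]


for interface, address, mac in disabled_addresses(LOG):
    print(interface, address, mac)
```

Comparing the claiming MAC against the interfaces on the other node would show whether the peer node was the one holding the addresses.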
09-11-2013 03:27 PM
Here is my analysis:
At 23:35 interfaces oce0 and oce1 failed. These are both heartbeats, and oce0 also belongs to Mpathd group PubLan, where it is paired with oce9:
Sep 2 23:35:04 duadm2 genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 0 (oce1) node 0 in trouble
Sep 2 23:35:05 duadm2 genunix: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 2 (oce0) node 0 in trouble
For some reason, Solaris mpathd does not detect the failure of oce0 until about 20 minutes later, when it fails over to oce9:
Sep 2 23:56:57 duadm2 in.mpathd[6542]: [ID 594170 daemon.error] NIC failure detected on oce0 of group pub_mnic
Sep 2 23:56:57 duadm2 in.mpathd[6542]: [ID 832587 daemon.error] Successfully failed over from NIC oce0 to NIC oce9
Sep 2 23:57:01 duadm2 in.mpathd[6542]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic
Sep 2 23:57:01 duadm2 in.mpathd[6542]: [ID 620804 daemon.error] Successfully failed back to NIC oce0
Sep 2 23:57:20 duadm2 in.mpathd[6542]: [ID 168056 daemon.error] All Interfaces in group pub_mnic have failed
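To put a number on the gap between LLT and mpathd noticing the oce0 failure, here is a short worked calculation using the two timestamps from the log lines above. It assumes both events are on the same day. The helper name is made up for illustration.

```python
def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Timestamps taken from the log lines quoted above (both on Sep 2).
llt_trouble = "23:35:05"     # LLT reports link 2 (oce0) in trouble
mpathd_failure = "23:56:57"  # in.mpathd reports NIC failure on oce0

lag = to_seconds(mpathd_failure) - to_seconds(llt_trouble)
minutes, seconds = divmod(lag, 60)
print(f"mpathd lagged LLT by {minutes} min {seconds} s")
```

So the delay is a little under 22 minutes.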
Mike
09-11-2013 11:46 PM
There was a message from the mpathd daemon before it detected the failure:
Sep 2 23:56:50 duadm2 in.mpathd[6542]: [ID 585766 daemon.error] Cannot meet requested failure detection time of 10000 ms on (inet oce2) new failure detection time for group "stor_mnic" is 48426 ms
Sep 2 23:56:57 duadm2 in.mpathd[6542]: [ID 594170 daemon.error] NIC failure detected on oce0 of group pub_mnic
Also,
PubLan is a parallel service group. If it failed on one node, it should have kept working from the other node
and the service should have stayed available, shouldn't it?
B BkupLan duadm1 Y N ONLINE
B BkupLan duadm2 Y N ONLINE
B DDCMon duadm1 Y N ONLINE
B DDCMon duadm2 Y N ONLINE
B Oss duadm1 Y N ONLINE
B Oss duadm2 Y N OFFLINE
B Ossfs duadm1 Y N ONLINE
B Ossfs duadm2 Y N OFFLINE
B PrivLan duadm1 Y N ONLINE
B PrivLan duadm2 Y N ONLINE
B PubLan duadm1 Y N ONLINE
B PubLan duadm2 Y N ONLINE
B StorLan duadm1 Y N ONLINE
B StorLan duadm2 Y N ONLINE
B Sybase1 duadm1 Y N OFFLINE
B Sybase1 duadm2 Y N ONLINE
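To see at a glance which groups are online on both nodes and which on only one, here is a small sketch that parses the status lines above. It assumes each line has the six whitespace-separated fields shown (marker, group, system, two Y/N flags, state). The function name is made up for illustration.

```python
from collections import defaultdict

# Group state lines copied from the listing above.
STATUS = """\
B BkupLan duadm1 Y N ONLINE
B BkupLan duadm2 Y N ONLINE
B DDCMon duadm1 Y N ONLINE
B DDCMon duadm2 Y N ONLINE
B Oss duadm1 Y N ONLINE
B Oss duadm2 Y N OFFLINE
B Ossfs duadm1 Y N ONLINE
B Ossfs duadm2 Y N OFFLINE
B PrivLan duadm1 Y N ONLINE
B PrivLan duadm2 Y N ONLINE
B PubLan duadm1 Y N ONLINE
B PubLan duadm2 Y N ONLINE
B StorLan duadm1 Y N ONLINE
B StorLan duadm2 Y N ONLINE
B Sybase1 duadm1 Y N OFFLINE
B Sybase1 duadm2 Y N ONLINE
"""


def online_systems(status_text):
    """Map each group to the list of systems where it is ONLINE."""
    online = defaultdict(list)
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0] == "B":
            _, group, system, _flag1, _flag2, state = fields
            systems = online[group]  # registers the group even if it is offline everywhere
            if state == "ONLINE":
                systems.append(system)
    return dict(online)


groups = online_systems(STATUS)
on_both = sorted(g for g, systems in groups.items() if len(systems) == 2)
on_one = {g: systems[0] for g, systems in groups.items() if len(systems) == 1}
print("online on both nodes:", on_both)
print("online on one node:", on_one)
```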
One more question: would increasing the ToleranceLimit of the MultiNICB resources help here?
09-12-2013 01:19 AM
PubLan was just the first resource to fault and this didn't cause any other resources to go offline, but after this you also had:
09-12-2013 01:21 AM
hares -display pub_mnic
#Resource Attribute System Value
pub_mnic Group global PubLan
pub_mnic Type global MultiNICB
pub_mnic AutoStart global 1
pub_mnic Critical global 1
pub_mnic Enabled global 1
pub_mnic LastOnline global duadm1
pub_mnic MonitorOnly global 0
pub_mnic ResourceOwner global
pub_mnic TriggerEvent global 0
pub_mnic ArgListValues duadm1 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce0 0 oce9 1 NetworkHosts 1 10.1.14.1 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
pub_mnic ArgListValues duadm2 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce0 0 oce9 1 NetworkHosts 1 10.1.14.1 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
pub_mnic ConfidenceLevel duadm1 0
pub_mnic ConfidenceLevel duadm2 0
pub_mnic ConfidenceMsg duadm1
pub_mnic ConfidenceMsg duadm2
pub_mnic Flags duadm1
pub_mnic Flags duadm2
pub_mnic IState duadm1 not waiting
pub_mnic IState duadm2 not waiting
pub_mnic MonitorMethod duadm1 Traditional
pub_mnic MonitorMethod duadm2 Traditional
pub_mnic Probed duadm1 1
pub_mnic Probed duadm2 1
pub_mnic Start duadm1 0
pub_mnic Start duadm2 0
pub_mnic State duadm1 ONLINE
pub_mnic State duadm2 ONLINE
pub_mnic ComputeStats global 0
pub_mnic ConfigCheck global 1
pub_mnic DefaultRouter global 0.0.0.0
pub_mnic Failback global 0
pub_mnic GroupName global
pub_mnic IgnoreLinkStatus global 1
pub_mnic LinkTestRatio global 1
pub_mnic MpathdCommand global /usr/lib/inet/in.mpathd
pub_mnic MpathdRestart global 1
pub_mnic NetworkHosts global 10.1.14.1
pub_mnic NetworkTimeout global 100
pub_mnic NoBroadcast global 0
pub_mnic OfflineTestRepeatCount global 3
pub_mnic OnlineTestRepeatCount global 3
pub_mnic Protocol global IPv4
pub_mnic TriggerResStateChange global 0
pub_mnic UseMpathd global 1
pub_mnic ContainerInfo duadm1 Type Name Enabled
pub_mnic ContainerInfo duadm2 Type Name Enabled
pub_mnic Device duadm1 oce0 0 oce9 1
pub_mnic Device duadm2 oce0 0 oce9 1
pub_mnic MonitorTimeStats duadm1 Avg 0 TS
pub_mnic MonitorTimeStats duadm2 Avg 0 TS
pub_mnic ResourceInfo duadm1 State Valid Msg TS
09-12-2013 01:28 AM
stor_mnic is also a MultiNICB resource:
hares -display stor_mnic
#Resource Attribute System Value
stor_mnic Group global StorLan
stor_mnic Type global MultiNICB
stor_mnic AutoStart global 1
stor_mnic Critical global 1
stor_mnic Enabled global 1
stor_mnic LastOnline global duadm1
stor_mnic MonitorOnly global 0
stor_mnic ResourceOwner global
stor_mnic TriggerEvent global 0
stor_mnic ArgListValues duadm1 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce2 0 oce11 1 NetworkHosts 1 10.0.0.29 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
stor_mnic ArgListValues duadm2 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce2 0 oce11 1 NetworkHosts 1 10.0.0.29 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
stor_mnic ConfidenceLevel duadm1 0
stor_mnic ConfidenceLevel duadm2 0
stor_mnic ConfidenceMsg duadm1
stor_mnic ConfidenceMsg duadm2
stor_mnic Flags duadm1
stor_mnic Flags duadm2
stor_mnic IState duadm1 not waiting
stor_mnic IState duadm2 not waiting
stor_mnic MonitorMethod duadm1 Traditional
stor_mnic MonitorMethod duadm2 Traditional
stor_mnic Probed duadm1 1
stor_mnic Probed duadm2 1
stor_mnic Start duadm1 0
stor_mnic Start duadm2 0
stor_mnic State duadm1 ONLINE
stor_mnic State duadm2 ONLINE
stor_mnic ComputeStats global 0
stor_mnic ConfigCheck global 1
stor_mnic DefaultRouter global 0.0.0.0
stor_mnic Failback global 0
stor_mnic GroupName global
stor_mnic IgnoreLinkStatus global 1
stor_mnic LinkTestRatio global 1
stor_mnic MpathdCommand global /usr/lib/inet/in.mpathd
stor_mnic MpathdRestart global 1
stor_mnic NetworkHosts global 10.0.0.29
stor_mnic NetworkTimeout global 100
stor_mnic NoBroadcast global 0
stor_mnic OfflineTestRepeatCount global 3
stor_mnic OnlineTestRepeatCount global 3
stor_mnic Protocol global IPv4
stor_mnic TriggerResStateChange global 0
stor_mnic UseMpathd global 1
stor_mnic ContainerInfo duadm1 Type Name Enabled
stor_mnic ContainerInfo duadm2 Type Name Enabled
stor_mnic Device duadm1 oce2 0 oce11 1
stor_mnic Device duadm2 oce2 0 oce11 1
stor_mnic MonitorTimeStats duadm1 Avg 0 TS
stor_mnic MonitorTimeStats duadm2 Avg 0 TS
stor_mnic ResourceInfo duadm1 State Valid Msg TS
stor_mnic ResourceInfo duadm2 State Valid Msg TS
09-12-2013 02:34 AM
ossfs_p1 is a Proxy resource, so I think it failed because pub_mnic failed.
hares -display ossfs_p1
#Resource Attribute System Value
ossfs_p1 Group global Ossfs
ossfs_p1 Type global Proxy
ossfs_p1 AutoStart global 1
ossfs_p1 Critical global 1
ossfs_p1 Enabled global 1
ossfs_p1 LastOnline global duadm1
ossfs_p1 MonitorOnly global 0
ossfs_p1 ResourceOwner global
ossfs_p1 TriggerEvent global 0
ossfs_p1 ArgListValues duadm1 TargetResName 1 pub_mnic TargetSysName 1 "" TargetResName:Probed 1 1 TargetResName:State 1 2
ossfs_p1 ArgListValues duadm2 TargetResName 1 pub_mnic TargetSysName 1 "" TargetResName:Probed 1 1 TargetResName:State 1 2
ossfs_p1 ConfidenceLevel duadm1 0
ossfs_p1 ConfidenceLevel duadm2 0
ossfs_p1 ConfidenceMsg duadm1
ossfs_p1 ConfidenceMsg duadm2
ossfs_p1 Flags duadm1
ossfs_p1 Flags duadm2
ossfs_p1 IState duadm1 not waiting
ossfs_p1 IState duadm2 not waiting
ossfs_p1 MonitorMethod duadm1 Traditional
ossfs_p1 MonitorMethod duadm2 Traditional
ossfs_p1 Probed duadm1 1
ossfs_p1 Probed duadm2 1
ossfs_p1 Start duadm1 0
ossfs_p1 Start duadm2 0
ossfs_p1 State duadm1 ONLINE
ossfs_p1 State duadm2 ONLINE
ossfs_p1 ComputeStats global 0
ossfs_p1 ResourceInfo global State Valid Msg TS
ossfs_p1 TargetResName global pub_mnic
ossfs_p1 TargetSysName global
ossfs_p1 TriggerResStateChange global 0
ossfs_p1 ContainerInfo duadm1 Type Name Enabled
ossfs_p1 ContainerInfo duadm2 Type Name Enabled
ossfs_p1 MonitorTimeStats duadm1 Avg 0 TS
ossfs_p1 MonitorTimeStats duadm2 Avg 0 TS
root@duadm1>
09-12-2013 04:09 AM
As you correctly have proxies, setting ToleranceLimit on the MultiNICB type may help a little, as it would delay faulting the MultiNICB resources, which in turn MAY delay the faulting of the Proxies. However, the default MonitorInterval for MultiNICB is only 10 seconds, so even setting ToleranceLimit to 2 would only give an extra 20 seconds for MultiNICB. In the case above it would have faulted at 23:57:47 rather than 23:57:27, but as the Proxy ran its monitor at 23:57:57, this wouldn't have made any difference. So you could set ToleranceLimit on the Proxy type instead.
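The arithmetic behind those times can be sketched as below. This is only the simple model used in this post: the fault is delayed by ToleranceLimit times the monitor interval. The 10 second interval and the three timestamps are the figures quoted above; the function names are made up for illustration.

```python
def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total):
    """Convert seconds since midnight back to an HH:MM:SS string."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def delayed_fault_time(fault_time, monitor_interval, tolerance_limit):
    """Fault time once tolerance_limit extra monitor cycles are tolerated."""
    return to_hms(to_seconds(fault_time) + tolerance_limit * monitor_interval)


# MultiNICB faulted at 23:57:27; 10 s interval and ToleranceLimit 2 as discussed above.
multinicb_fault = delayed_fault_time("23:57:27", monitor_interval=10, tolerance_limit=2)
proxy_monitor = "23:57:57"

# The Proxy still sees a faulted target if the delayed fault lands before its monitor runs.
proxy_sees_fault = to_seconds(multinicb_fault) <= to_seconds(proxy_monitor)
print(multinicb_fault, proxy_sees_fault)
```

With these figures the delayed fault still lands 10 seconds before the Proxy monitor runs, which is why the tolerance would need to sit on the Proxy type to change the outcome.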
Mike
09-25-2013 03:16 AM
Hello Mike,
As we know, due to the network outage the MultiNICB resources and then the Proxy resources faulted,
which caused the failure of other dependent resources.
I need your expert advice on the recovery procedure here.
Say the network gets corrected: will all the resources come online on their own,
or is administrative action required?
In our case, they did not come up automatically after the network came back.