
Veritas Cluster Filesystem was down

Claudio_Alejand
Level 2
Certified

Hi Everyone,

 

I had an issue last Sunday: both cluster nodes reported the status "state: out of cluster" and node 1 was rebooted. The most important log entries are below.

 

2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group cvm is faulted on system rsgisap1
2013/11/16 05:37:47 VCS ERROR V-16-1-10205 Group cvm is faulted on system rsgisap2


2013/11/16 05:34:00 VCS NOTICE V-16-1-10080 System (rsgisap2) - Membership: 0x2, Jeopardy: 0x1
2013/11/16 05:34:00 VCS ERROR V-16-1-10091 System rsgisap1 (Node '0') is in Jeopardy Membership - Membership: 0x2, Visible: 0x0
2013/11/16 05:36:16 VCS NOTICE V-16-1-10080 System (rsgisap2) - Membership: 0x2, Jeopardy: 0x0
2013/11/16 05:50:55 VCS NOTICE V-16-1-10080 System (rsgisap2) - Membership: 0x2, Jeopardy: 0x1
2013/11/16 05:50:55 VCS ERROR V-16-1-10091 System rsgisap1 (Node '0') is in Jeopardy Membership - Membership: 0x2, Visible: 0x0
2013/11/16 05:51:12 VCS NOTICE V-16-1-10080 System (rsgisap2) - Membership: 0x3, Jeopardy: 0x0


2013/02/23 03:14:44 VCS ERROR V-16-1-10303 Resource cvm_vxconfigd (Owner: unknown, Group: cvm) is FAULTED (timed out) on sys rsgisap2
2013/11/16 05:34:00 VCS ERROR V-16-1-10322 System rsgisap1 (Node '0') changed state from RUNNING to FAULTED
2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group vrts_vea_cfs_int_cfsmount1 is faulted on system rsgisap1
2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group vrts_vea_cfs_int_cfsmount2 is faulted on system rsgisap1
2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group vrts_vea_cfs_int_cfsmount3 is faulted on system rsgisap1
2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group vrts_vea_cfs_int_cfsmount4 is faulted on system rsgisap1
2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group cvm is faulted on system rsgisap1
2013/11/16 05:37:47 VCS ERROR V-16-1-10205 Group cvm is faulted on system rsgisap2
2013/11/16 05:51:12 VCS NOTICE V-16-1-10322 System rsgisap1 (Node '0') changed state from FAULTED to INITING


2013/11/16 05:34:01 VCS NOTICE V-16-1-10446 Group vrts_vea_cfs_int_cfsmount1 is offline on system rsgisap1
2013/11/16 05:34:01 VCS NOTICE V-16-1-10446 Group vrts_vea_cfs_int_cfsmount2 is offline on system rsgisap1
2013/11/16 05:34:01 VCS NOTICE V-16-1-10446 Group vrts_vea_cfs_int_cfsmount3 is offline on system rsgisap1
2013/11/16 05:34:01 VCS NOTICE V-16-1-10446 Group vrts_vea_cfs_int_cfsmount4 is offline on system rsgisap1


2013/11/16 05:37:47 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cvmvoldg4 (Owner: unknown, Group: vrts_vea_cfs_int_cfsmount4) on System rsgisap2
2013/11/16 05:37:47 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cvmvoldg3 (Owner: unknown, Group: vrts_vea_cfs_int_cfsmount3) on System rsgisap2
2013/11/16 05:37:47 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cvmvoldg2 (Owner: unknown, Group: vrts_vea_cfs_int_cfsmount2) on System rsgisap2
2013/11/16 05:37:47 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cvmvoldg1 (Owner: unknown, Group: vrts_vea_cfs_int_cfsmount1) on System rsgisap2

2013/11/16 05:37:49 VCS INFO V-16-6-15004 (rsgisap2) hatrigger:Failed to send trigger for postoffline; script doesn't exist
2013/11/16 05:51:43 VCS INFO V-16-6-15004 (rsgisap1) hatrigger:Failed to send trigger for postonline; script doesn't exist

2013/11/16 05:51:45 VCS INFO V-16-10031-1046 (rsgisap1) CVMVolDg:cvmvoldg1:online:resource cvmvoldg1 is online
2013/11/16 05:51:45 VCS INFO V-16-10031-1046 (rsgisap1) CVMVolDg:cvmvoldg3:online:resource cvmvoldg3 is online
2013/11/16 05:51:45 VCS INFO V-16-10031-1046 (rsgisap1) CVMVolDg:cvmvoldg2:online:resource cvmvoldg2 is online
2013/11/16 05:51:45 VCS INFO V-16-10031-1046 (rsgisap1) CVMVolDg:cvmvoldg4:online:resource cvmvoldg4 is online


2013/11/16 05:51:12 VCS NOTICE V-16-1-10453 Node: 0 changed name from: 'rsgisap1' to: 'rsgisap1'
2013/11/16 05:51:12 VCS NOTICE V-16-1-10322 System rsgisap1 (Node '0') changed state from FAULTED to INITING
2013/11/16 05:51:12 VCS NOTICE V-16-1-10322 System rsgisap1 (Node '0') changed state from INITING to CURRENT_DISCOVER_WAIT
2013/11/16 05:51:12 VCS NOTICE V-16-1-10322 System rsgisap1 (Node '0') changed state from CURRENT_DISCOVER_WAIT to REMOTE_BUILD
2013/11/16 05:51:12 VCS INFO V-16-1-10463 Sending snapshot to node: 0
2013/11/16 05:51:13 VCS NOTICE V-16-1-10322 System rsgisap1 (Node '0') changed state from REMOTE_BUILD to RUNNING

2013/11/16 05:51:14 VCS ERROR V-16-10031-1005 (rsgisap1) CVMCluster:???:monitor:node - state: out of cluster
2013/11/16 05:52:47 VCS ERROR V-16-10031-1005 (rsgisap2) CVMCluster:???:monitor:node - state: out of cluster
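
As a reading aid for the Membership and Jeopardy values in the entries above: they appear to be bitmasks of node IDs, where bit n set means node n is included. This is an interpretation, but it is consistent with the log itself (Jeopardy: 0x1 is logged alongside "rsgisap1 (Node '0') is in Jeopardy Membership", and Membership returns to 0x3 when rsgisap1 rejoins). A minimal sketch of the decoding follows; the helper name is hypothetical, and the mapping of node 1 to rsgisap2 is an assumption, since the log only confirms that rsgisap1 is node 0.

```python
def nodes_in_mask(mask):
    """Return the node IDs whose bits are set in a membership bitmask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Node 0 = rsgisap1 comes from the log; node 1 = rsgisap2 is an assumption.
NODE_NAMES = {0: "rsgisap1", 1: "rsgisap2"}

# Values copied from the V-16-1-10080 entries above.
for label, value in [("Membership", 0x2), ("Jeopardy", 0x1), ("Membership", 0x3)]:
    names = [NODE_NAMES.get(n, f"node {n}") for n in nodes_in_mask(value)]
    print(f"{label} {value:#x}: {names}")
```

Read this way, the 05:34:00 entry says only node 1 was a regular member while node 0 was in jeopardy, and the 05:51:12 entry says both nodes were members again.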
 

The OS logs show nothing to indicate whether there was a network problem or a storage problem. The cluster nodes run Red Hat Linux 4 and the Veritas version is 4.1.

An upgrade of everything is probably needed, but these servers are in production. I need to know what the root cause is.

 

I really appreciate your help.

 

Claudio

 

3 REPLIES

Shaf
Level 6

Is the cluster filesystem still down?

Claudio_Alejand
Level 2
Certified

Hi

The CFS mounts are up and running. When the master node was rebooted, all service groups came back online on that node, but on the second node I had to start all the service groups manually.

Thanks

Claudio

 

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Please post more of the engine_A log.

Show us all entries that lead up to this point (the first line from the snippet that you posted above):

2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group cvm is faulted on system rsgisap1
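
If it helps to pull those entries out, here is a minimal sketch in Python that keeps only the lines inside a time window and sorts them oldest first, so that older entries (such as the 2013/02/23 one in the snippet) and out-of-order pastes do not get mixed in. The function names and the window start are hypothetical, and it assumes every entry begins with a timestamp in the `YYYY/MM/DD HH:MM:SS` form shown above.

```python
from datetime import datetime

STAMP_FORMAT = "%Y/%m/%d %H:%M:%S"


def parse_stamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:19], STAMP_FORMAT)
    except ValueError:
        return None


def entries_between(lines, start, end):
    """Return timestamped lines with start <= time <= end, oldest first."""
    stamped = [(parse_stamp(line), line.rstrip("\n")) for line in lines]
    kept = [(t, text) for t, text in stamped if t is not None and start <= t <= end]
    # sorted() is stable, so lines sharing a timestamp keep their file order.
    return [text for t, text in sorted(kept, key=lambda pair: pair[0])]


# Three entries copied from the snippet above, deliberately out of order.
sample = [
    "2013/11/16 05:36:16 VCS ERROR V-16-1-10205 Group cvm is faulted on system rsgisap1",
    "2013/11/16 05:34:00 VCS ERROR V-16-1-10322 System rsgisap1 (Node '0') changed state from RUNNING to FAULTED",
    "2013/02/23 03:14:44 VCS ERROR V-16-1-10303 Resource cvm_vxconfigd (Owner: unknown, Group: cvm) is FAULTED (timed out) on sys rsgisap2",
]

window_start = datetime(2013, 11, 16, 5, 0, 0)   # hypothetical start of the window
window_end = datetime(2013, 11, 16, 5, 36, 16)   # the "Group cvm is faulted" entry

for entry in entries_between(sample, window_start, window_end):
    print(entry)
```

To run it against the real file, replace `sample` with the opened log, for example `open("/var/VRTSvcs/log/engine_A.log")`; that path is the usual default location and is an assumption for these servers.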