Ayan1987
6 years ago

Veritas 7.4 cluster resources go offline automatically

2019/01/29 22:14:17 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(cfsmount5) because the resource became OFFLINE unexpectedly, on its own.

2019/01/29 22:14:18 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(cfsmount3) because the resource became OFFLINE unexpectedly, on its own.

2019/01/29 22:14:20 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(cfsmount4) because the resource became OFFLINE unexpectedly, on its own.

2019/01/29 22:14:22 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(fm_db_Server5) because the resource became OFFLINE unexpectedly, on its own.

2019/01/29 22:14:22 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(fm_db_ServerERBS) because the resource became OFFLINE unexpectedly, on its own.

2019/01/29 22:14:25 VCS INFO V-16-2-13068 (LSVPRDMMFE4A) Resource(cfsmount4) - clean completed successfully.

2019/01/29 22:14:25 VCS INFO V-16-1-10307 Resource cfsmount4 (Owner: Unspecified, Group: Mediation3_DG) is offline on LSVPRDMMFE4A (Not initiated by VCS)

2019/01/29 22:14:25 VCS INFO V-16-1-50149 VCS shall not initiate failover for service group Mediation3_DG because the faulted resource is not critical and no critical parent resource is affected by the fault

2019/01/29 22:14:25 VCS INFO V-16-6-15015 (LSVPRDMMFE4A) hatrigger:/opt/VRTSvcs/bin/triggers/resfault is not a trigger scripts directory or can not be executed

2019/01/29 22:14:25 VCS INFO V-16-2-13068 (LSVPRDMMFE4A) Resource(cfsmount3) - clean completed successfully.

2019/01/29 22:14:25 VCS INFO V-16-1-10307 Resource cfsmount3 (Owner: Unspecified, Group: Mediation2_DG) is offline on LSVPRDMMFE4A (Not initiated by VCS)

2019/01/29 22:14:25 VCS INFO V-16-1-50149 VCS shall not initiate failover for service group Mediation2_DG because the faulted resource is not critical and no critical parent resource is affected by the fault

2019/01/29 22:14:25 VCS INFO V-16-6-15015 (LSVPRDMMFE4A) hatrigger:/opt/VRTSvcs/bin/triggers/resfault is not a trigger scripts directory or can not be executed

2019/01/29 22:14:25 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(fm_db_ServerTest1) because the resource became OFFLINE unexpectedly, on its own.

2019/01/29 22:14:25 VCS INFO V-16-2-13068 (LSVPRDMMFE4A) Resource(cfsmount5) - clean completed successfully.

2019/01/29 22:14:25 VCS INFO V-16-1-10307 Resource cfsmount5 (Owner: Unspecified, Group: Mediation4_DG) is offline on LSVPRDMMFE4A (Not initiated by VCS)

2019/01/29 22:14:25 VCS INFO V-16-1-50149 VCS shall not initiate failover for service group Mediation4_DG because the faulted resource is not critical and no critical parent resource is affected by the fault

2019/01/29 22:14:26 VCS INFO V-16-6-15015 (LSVPRDMMFE4A) hatrigger:/opt/VRTSvcs/bin/triggers/resfault is not a trigger scripts directory or can not be executed

2019/01/29 22:14:26 VCS INFO V-16-2-13068 (LSVPRDMMFE4A) Resource(fm_db_Server5) - clean completed successfully.

2019/01/29 22:14:26 VCS INFO V-16-2-13068 (LSVPRDMMFE4A) Resource(fm_db_ServerERBS) - clean completed successfully.

2019/01/29 22:14:26 VCS ERROR V-16-2-13073 (LSVPRDMMFE4A) Resource(fm_db_Server5) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 1) the resource.

2019/01/29 22:14:26 VCS ERROR V-16-2-13073 (LSVPRDMMFE4A) Resource(fm_db_ServerERBS) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 1) the resource.
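
The excerpt above interleaves events for several resources, which makes it hard to follow what happened to any single one. A minimal Python sketch that groups the lines per resource might help here. It assumes the log lines keep exactly the layout shown in this post. The function name `events_by_resource` and the `SAMPLE` text (four lines copied from the excerpt) are illustrative only and not part of VCS.

```python
import re
from collections import defaultdict

# "<date> <time> VCS <severity> <message id> <text>", as seen in the excerpt.
LINE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS (\w+) (V-\d+-\d+-\d+) (.*)$"
)
# Matches "resource(NAME)", "Resource(NAME)" and "Resource NAME (Owner: ...".
# Deliberately does not match the lowercase prose "resource is not critical".
RESOURCE_RE = re.compile(r"(?:resource\(|Resource[ (])(\w+)")


def events_by_resource(log_text):
    """Return {resource: [(timestamp, severity, message_id), ...]} in log order."""
    events = defaultdict(list)
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # blank line or a layout this sketch does not know
        ts, severity, msg_id, text = m.groups()
        res = RESOURCE_RE.search(text)
        if res:
            events[res.group(1)].append((ts, severity, msg_id))
    return dict(events)


SAMPLE = """\
2019/01/29 22:14:20 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(cfsmount4) because the resource became OFFLINE unexpectedly, on its own.
2019/01/29 22:14:25 VCS INFO V-16-2-13068 (LSVPRDMMFE4A) Resource(cfsmount4) - clean completed successfully.
2019/01/29 22:14:25 VCS INFO V-16-1-10307 Resource cfsmount4 (Owner: Unspecified, Group: Mediation3_DG) is offline on LSVPRDMMFE4A (Not initiated by VCS)
2019/01/29 22:14:25 VCS INFO V-16-1-50149 VCS shall not initiate failover for service group Mediation3_DG because the faulted resource is not critical and no critical parent resource is affected by the fault
"""

if __name__ == "__main__":
    for name, evs in events_by_resource(SAMPLE).items():
        print(name)
        for ts, severity, msg_id in evs:
            print(f"  {ts} {severity:5} {msg_id}")
```

Lines that name only a service group (such as V-16-1-50149) or the trigger script are skipped, so each resource ends up with its own short timeline.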

  • The most common causes of errors like the one below

     

    2019/01/29 22:14:17 VCS ERROR V-16-2-13067 (LSVPRDMMFE4A) Agent is calling clean for resource(cfsmount5) because the resource became OFFLINE unexpectedly, on its own.

     

    are:

    1. A storage-related issue that took the cfsmount resources offline

    2. Admin error (someone accidentally unmounted the CFS mounts)

    3. Server load was too high

    Since the issue seems to have occurred on one node only, check whether the same cfsmounts are OK on the other node(s) and whether the OS on the problematic node still sees the storage used by the cfsmounts. If the OS cannot access that storage, check the SAN and the storage array to make sure the node can reach the LUNs for the cfsmounts. If everything seems OK but the cfsmounts are down, run vxdg list to see whether the disk groups are still imported. If they are, just run hares -online to bring the cfsmounts back online.

    You can also run the command below and post the output.

     

    hastatus -sum
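
For the last recovery step in the reply above, here is a small Python sketch that reads the V-16-1-10307 lines from the engine log and prints one `hares -online <resource> -sys <node>` command per affected resource. It only builds the command strings and runs nothing. It assumes the log layout shown in this thread. The function name `online_commands` and the `SAMPLE` text (two lines copied from the log) are illustrative.

```python
import re

# "Resource NAME (Owner: ..., Group: GROUP) is offline on NODE (Not initiated by VCS)"
OFFLINE_RE = re.compile(
    r"V-16-1-10307 Resource (\w+) \(Owner: [^,]*, Group: (\w+)\) "
    r"is offline on (\w+) \(Not initiated by VCS\)"
)


def online_commands(log_text):
    """Build one 'hares -online' command per resource/node pair, in log order."""
    commands = []
    for resource, _group, node in OFFLINE_RE.findall(log_text):
        cmd = f"hares -online {resource} -sys {node}"
        if cmd not in commands:  # the same resource may be logged more than once
            commands.append(cmd)
    return commands


SAMPLE = """\
2019/01/29 22:14:25 VCS INFO V-16-1-10307 Resource cfsmount4 (Owner: Unspecified, Group: Mediation3_DG) is offline on LSVPRDMMFE4A (Not initiated by VCS)
2019/01/29 22:14:25 VCS INFO V-16-1-10307 Resource cfsmount3 (Owner: Unspecified, Group: Mediation2_DG) is offline on LSVPRDMMFE4A (Not initiated by VCS)
"""

if __name__ == "__main__":
    for cmd in online_commands(SAMPLE):
        print(cmd)
```

Review the printed commands before running any of them, and only after confirming that the storage is visible and the disk groups are imported. If hastatus -sum lists a resource as faulted, the fault will likely need to be cleared (for example with hares -clear) before the resource can be brought online.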

     

     

     
