
Disk group needs to be imported manually after powering off the server

Home_224
Level 6

Hi Guys,

I had an incident that required shutting down the DM cluster server immediately. I used the operations menu to shut down the server; checking the script, it does not run hastop, but instead uses init to shut the server down. I have a DB server that was shut down from the same menu, and it mounted its disk group automatically once the server started up. But the DM server was not able to mount its disk group after the reboot; I tried hastop and rebooting several times, with the same result. In this situation I used vxdg -s import diskgroup to mount the disk group node by node, and then the service resumed. May I know whether there is a problem that forces me to vxdg import the disk group manually?

The DM cluster server and the DB server run the same version on Solaris 9.
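For reference, the manual workaround described above, sketched with a placeholder disk group name:

vxdg -s import mydatadg    # mydatadg is a placeholder; imports the DG as a shared (CVM) disk group
vxdg list                  # confirm the disk group now shows as imported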

 

 

1 ACCEPTED SOLUTION

frankgfan
Moderator

Thanks for re-sending the log.

Here is my log analysis and some comments.

The system *02 (node ID 1) was restarted at the time below:

2018/09/25 09:43:15 VCS NOTICE V-16-1-10322 System imduaedms02 (Node '1') changed state from RUNNING to LEAVING

2018/09/25 09:43:17 VCS INFO V-16-1-10494 System imduaedms02 not in RUNNING state

The system *01 was also restarted:

2018/09/25 09:43:43 VCS NOTICE V-16-1-10322 System imduaedms01 (Node '0') changed state from RUNNING to LEAVING

So HAD (the VCS engine) was stopped on both nodes!

Node *01 came up and HAD started from local build:

2018/09/25 11:58:07 VCS NOTICE V-16-1-10322 System imduaedms01 (Node '0') changed state from LOCAL_BUILD to RUNNING

Node *02 joined by remote build 2 seconds later:

2018/09/25 11:58:09 VCS NOTICE V-16-1-10322 System imduaedms02 (Node '1') changed state from REMOTE_BUILD to RUNNING
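Both nodes were back in RUNNING state at this point. For reference, membership after such a restart can be confirmed with the standard GAB/VCS commands (these are not taken from the original logs):

gabconfig -a    # GAB port memberships; ports a (GAB), h (HAD), and v/w (CVM) should list both nodes
hastatus -sum   # summary of system states and service groups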

It took a while for vxconfigd to enter cluster mode, so the following errors were logged:

2018/09/25 11:58:10 VCS ERROR V-16-10001-1005 (imduaedms01) CVMCluster:???:monitor:node - state: out of cluster

2018/09/25 11:58:10 VCS INFO V-16-1-10304 Resource cvm_clus (Owner: unknown, Group: cvm) is offline on imduaedms01 (First probe)

2018/09/25 11:58:10 VCS INFO V-16-1-10304 Resource qlogckd (Owner: unknown, Group: cvm) is offline on imduaedms01 (First probe)
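A quick way to watch vxconfigd's progress into cluster mode from either node (generic checks, not from these logs):

vxdctl mode      # state of the vxconfigd daemon itself
vxdctl -c mode   # cluster role; reports 'cluster active - MASTER' (or SLAVE) once vxconfigd has joined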

 

Errors like the one below were logged because CVM was still down:

2018/09/25 11:58:11 VCS ERROR V-16-10001-1005 (imduaedms02) CVMCluster:???:monitor:node - state: out of cluster

The disk group import error:

2018/09/25 11:58:38 VCS ERROR V-16-10001-1009 (imduaedms01) CVMVolDg:CDG_dms2_datadg:online:could not find diskgroup dms2_datadg imported. If it was previously deported, it will have to be manually imported

2018/09/25 11:58:39 VCS INFO V-16-2-13001 (imduaedms01) Resource(CDG_dms2_datadg): Output of the completed operation (online)

VxVM vxprint ERROR V-5-1-582 Disk group dms2_datadg: No such disk group

For the CVM DG import issue logged above, please refer to this technote: https://www.veritas.com/support/en_US/article.000029451

Now CVM is up and system *01 is the new CVM master:

2018/09/25 12:30:11 VCS INFO V-16-10001-1003 (imduaedms01) CVMCluster:cvm_clus:online:CVMCluster role is - mode: enabled: cluster active - MASTER

master: imduaedms01

Once the CVM master node is up and the slave node has joined, you should be able to manually import the shared DGs on both nodes.
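As a minimal sketch of that manual import, using the DG name from the logs above:

vxdctl -c mode              # confirm the node's CVM role (MASTER or SLAVE)
vxdg -s import dms2_datadg  # import the disk group as a shared disk group
vxdg list                   # dms2_datadg should now be listed with the shared flag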

Your VCS is very old: 4.1 on Solaris 9.

 

2018/09/25 12:51:54 VCS NOTICE V-16-1-11022 VCS engine (had) started

2018/09/25 12:51:54 VCS NOTICE V-16-1-11050 VCS engine version=4.1

2018/09/25 12:51:54 VCS NOTICE V-16-1-11051 VCS engine join version=4.1007

2018/09/25 12:51:54 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 11/10/06-22:09:00

2018/09/25 12:51:54 VCS NOTICE V-16-1-10114 Opening GAB library

2018/09/25 12:51:54 VCS NOTICE V-16-1-10619 'HAD' starting on: imduaedms01

 

And Solaris is version 9:

2018/09/25 09:43:18 VCS INFO V-16-2-13001 (imduaedms02) Resource(JavaMethodServer): Output of the completed operation (offline)

Sun Microsystems Inc.        SunOS 5.9              Generic  May 2002

Both the OS and VCS are no longer supported versions!

 

In summary, you need to check and make sure the following are all OK (see the command sketch after this list):

1. VCS resource dependencies

2. Storage connectivity from both cluster nodes

3. CVMTimeout is set to a value long enough for CVM to be onlined - https://sort.veritas.com/public/documents/sf/5.0/hpux11iv3/html/sfrac_install/ap_sfrac_install_cvmcf...
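A sketch of commands for those three checks (the resource name cvm_clus comes from the logs above; 400 below is only an example timeout value):

hares -dep                           # 1. review VCS resource dependencies
hagrp -dep cvm                       #    including the cvm group's dependencies
vxdisk -o alldgs list                # 2. run on both nodes; the shared disks and DGs should be visible
hares -value cvm_clus CVMTimeout     # 3. show the current CVMTimeout
haconf -makerw                       #    to raise it, open the config,
hares -modify cvm_clus CVMTimeout 400
haconf -dump -makero                 #    then save and close it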

 

Hopefully the above log analysis and the technote help.

Cheers,

Frank


12 REPLIES

RiaanBadenhorst
Moderator

Hello

Please post your engine_A.log and disk group logs so we can check what happened.

Hello,

Thank you.

Let me check with the end users whether they can get the logs for me, and I will get back to you ASAP.

 

frankgfan
Moderator

Since the DG could be imported manually by running the vxdg -s import command, there was most likely no storage or file system related issue. It could be a resource online "timing" issue.

If you copy and paste the section of engine_A.log covering the period when the server was rebooted and the DG import failed, I should be able to find the cause of the failed DG import by VCS.

 

The incident happened on 25 September.

 

frankgfan
Moderator

Thanks for sending the logs; however, they are too large (over 17 MB). Can you please truncate them and send just the logs for Sep 25, or from Sep 25 onwards?
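One way to cut the log down to that day (the path below is the default VCS log location):

grep '^2018/09/25' /var/VRTSvcs/log/engine_A.log > engine_A_sep25.log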

Cheers,

Frank

 

Yes, I truncated the log and uploaded it as an attachment.

 

Thank you very much


Hi Frank,

 

Thank you for your log analysis!

I know that the current production version is EOL. We tried to upgrade to the latest version, but the application did not work after the VCS upgrade, and changing the OS, application, and VCS is a big undertaking; this government project has been in operation for over 10 years and will run until the project ends.

I need some time to work through your answer. As far as I know, the cluster's SAN and heartbeats were working fine before the server was shut down, and I also checked that PowerPath was in a good state before shutting it down.

Thank you for the information, Frank!

 

frankgfan
Moderator

Hello,

Do you need further assistance on this one? If you do, please post the details of your question; if not, please close the ticket.

Cheers,

Frank