10-01-2018 09:51 PM
Hi Guys,
I had an incident where I needed to shut down the DM cluster server immediately. I used the operations menu to shut down the server; checking the script, it does not run hastop, instead it uses init to shut down the server. I also have a DB server that was shut down via the same menu, and that server mounts its disk groups automatically on startup. But the DM server was not able to mount its disk groups after the reboot; I tried hastop and rebooting several times, still the same issue. In this situation I ran vxdg -s import diskgroup to import the disk groups one by one on each node, and then the service resumed. May I know whether there is any problem that makes the manual vxdg import necessary?
The DM cluster server and the DB server run the same version on Solaris 9.
Solved! Go to Solution.
10-01-2018 10:11 PM
Hello
Please post your engine_A and disk group logs so we can check what happened.
10-01-2018 10:38 PM
Hello,
Thank you
Let me check with end users if they are available to get the log for me and get back to you asap
10-01-2018 10:47 PM
Since the dg could be imported manually by running the vxdg -s import command, it was likely not a storage or file-system related issue. It could be a resource online "timing" issue.
If you just copy and paste the section of the engine_A.log covering the period when the server was rebooted and the dg import failed, I should be able to find the cause of the failed dg import by VCS.
10-02-2018 12:47 AM
The incident happened on 25 September.
10-02-2018 12:49 AM
10-02-2018 04:32 AM
Thanks for sending the logs; however, they are too large, over 17MB. Can you please truncate the logs and just send the entries for Sep 25 or from Sep 25 onwards?
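One quick way to cut the log down, assuming engine_A.log lines start with a YYYY/MM/DD timestamp as in the excerpts in this thread, is a one-line awk filter; lexicographic comparison on that date format matches chronological order. A three-line sample stands in for the real log here:

```shell
# Keep only engine_A.log entries from 25 Sep onwards. Lines start with a
# YYYY/MM/DD timestamp, so a plain string comparison on the first field
# matches chronological order. Sample data stands in for the real log:
printf '%s\n' \
  '2018/09/24 10:00:00 VCS NOTICE an older entry' \
  '2018/09/25 09:43:15 VCS NOTICE V-16-1-10322 System imduaedms02 changed state' \
  '2018/09/26 01:00:00 VCS INFO a later entry' > engine_A.log
awk '$1 >= "2018/09/25"' engine_A.log > engine_A_sep25.log
wc -l engine_A_sep25.log    # the 2018/09/24 entry is filtered out
```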
Cheers,
Frank
10-02-2018 06:58 PM
10-02-2018 06:59 PM
Yes, I truncated the log and uploaded it as an attachment.
Thank you very much
10-02-2018 11:32 PM
thanks for re-sending the log.
here is my log analysis and some comments.
the system *02 (node ID 1) was restarted at the time below:
2018/09/25 09:43:15 VCS NOTICE V-16-1-10322 System imduaedms02 (Node '1') changed state from RUNNING to LEAVING
2018/09/25 09:43:17 VCS INFO V-16-1-10494 System imduaedms02 not in RUNNING state
the system *01 was also restarted:
2018/09/25 09:43:43 VCS NOTICE V-16-1-10322 System imduaedms01 (Node '0') changed state from RUNNING to LEAVING
so HAD was stopped on both nodes!
node *01 came up and had started from local build:
2018/09/25 11:58:07 VCS NOTICE V-16-1-10322 System imduaedms01 (Node '0') changed state from LOCAL_BUILD to RUNNING
the node *02 joined by remote build, after 2 seconds:
2018/09/25 11:58:09 VCS NOTICE V-16-1-10322 System imduaedms02 (Node '1') changed state from REMOTE_BUILD to RUNNING
it took a while for vxconfigd to enter cluster mode so the following errors were logged:
2018/09/25 11:58:10 VCS ERROR V-16-10001-1005 (imduaedms01) CVMCluster:???:monitor:node - state: out of cluster
2018/09/25 11:58:10 VCS INFO V-16-1-10304 Resource cvm_clus (Owner: unknown, Group: cvm) is offline on imduaedms01 (First probe)
2018/09/25 11:58:10 VCS INFO V-16-1-10304 Resource qlogckd (Owner: unknown, Group: cvm) is offline on imduaedms01 (First probe)
errors like the one below were logged because CVM was down:
2018/09/25 11:58:11 VCS ERROR V-16-10001-1005 (imduaedms02) CVMCluster:???:monitor:node - state: out of cluster
dg import error:
2018/09/25 11:58:38 VCS ERROR V-16-10001-1009 (imduaedms01) CVMVolDg:CDG_dms2_datadg:online:could not find diskgroup dms2_datadg imported. If it was previously deported, it will have to be manually imported
2018/09/25 11:58:39 VCS INFO V-16-2-13001 (imduaedms01) Resource(CDG_dms2_datadg): Output of the completed operation (online)
VxVM vxprint ERROR V-5-1-582 Disk group dms2_datadg: No such disk group
for the cvm dg import issue logged above, please refer to this technote: https://www.veritas.com/support/en_US/article.000029451
now the cvm master is up and system *01 is the new cvm master:
2018/09/25 12:30:11 VCS INFO V-16-10001-1003 (imduaedms01) CVMCluster:cvm_clus:online:CVMCluster role is - mode: enabled: cluster active - MASTER
master: imduaedms01
Once the cvm master node is up and the slave node joined, you should be able to manually import the shared dgs on both nodes.
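To make that concrete, here is a rough sketch of the manual recovery. The names dms2_datadg, CDG_dms2_datadg and imduaedms01 are taken from the logs above; the guard just makes the sketch print a note on machines without the VxVM/VCS tools installed:

```shell
# Sketch: manually import a shared disk group once the CVM master is up.
# Names (dms2_datadg, CDG_dms2_datadg, imduaedms01) come from the logs above.
if command -v vxdctl >/dev/null 2>&1; then
    vxdctl -c mode                     # confirm this node is the CVM MASTER
    vxdg -s import dms2_datadg         # shared (-s) import, run on the master
    vxdg list                          # the dg should now show as imported
    hares -online CDG_dms2_datadg -sys imduaedms01   # re-online the VCS resource
else
    echo 'VxVM/VCS tools not found; run this on a cluster node'
fi
```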
Your VCS is very old: 4.1 on Sol 9
2018/09/25 12:51:54 VCS NOTICE V-16-1-11022 VCS engine (had) started
2018/09/25 12:51:54 VCS NOTICE V-16-1-11050 VCS engine version=4.1
2018/09/25 12:51:54 VCS NOTICE V-16-1-11051 VCS engine join version=4.1007
2018/09/25 12:51:54 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 11/10/06-22:09:00
2018/09/25 12:51:54 VCS NOTICE V-16-1-10114 Opening GAB library
2018/09/25 12:51:54 VCS NOTICE V-16-1-10619 'HAD' starting on: imduaedms01
and Sol is 9:
2018/09/25 09:43:18 VCS INFO V-16-2-13001 (imduaedms02) Resource(JavaMethodServer): Output of the completed operation (offline)
Sun Microsystems Inc. SunOS 5.9 Generic May 2002
Both the OS and VCS are no longer supported versions!
In summary, you need to check and make sure the following are all OK:
1. VCS resource dependency
2. storage connectivity from both the cluster nodes
3. CVMTimeout is set to a value long enough for CVM to be onlined - https://sort.veritas.com/public/documents/sf/5.0/hpux11iv3/html/sfrac_install/ap_sfrac_install_cvmcf...
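For point 3, a sketch of checking and raising CVMTimeout (an attribute of the cvm_clus CVMCluster resource). The value 400 below is only illustrative; size it to however long your storage takes to come up:

```shell
# Sketch: inspect and raise CVMTimeout on the cvm_clus resource.
# The value 400 is illustrative only, not a recommendation.
if command -v hares >/dev/null 2>&1; then
    hares -value cvm_clus CVMTimeout   # show the current value
    haconf -makerw                     # open the cluster config for writing
    hares -modify cvm_clus CVMTimeout 400
    haconf -dump -makero               # save and close the config
else
    echo 'VCS tools not found; run this on a cluster node'
fi
```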
Hopefully the above log analysis and the technote help.
Cheers,
Frank
10-03-2018 12:37 AM
Hi Frank,
Thank you for your analysis and information!
I know the current production version is EOL. We tried to upgrade to the latest version, but the application stopped working after the VCS upgrade, and changing the OS, application and VCS together is a big undertaking; this government project has been in operation for over 10 years and will run until the project ends.
I need time to work through your answer. As far as I know, the cluster's SAN and heartbeat were working fine before the shutdown, and I also checked that PowerPath was healthy before shutting down the server.
Thank you for the information, Frank!
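For reference, a sketch of the kind of PowerPath path check meant above (assuming EMC's powermt CLI is installed; the guard makes it a no-op elsewhere):

```shell
# Sketch: count dead PowerPath paths before taking a node down (want 0).
if command -v powermt >/dev/null 2>&1; then
    powermt display dev=all | grep -ci dead   # number of dead paths reported
else
    echo 'PowerPath not installed on this host'
fi
```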
10-05-2018 04:09 AM
Hello,
Do you need further assistance on this one? If you do, please post the details of your question; if not, please close the ticket.
Cheers,
Frank