01-28-2013 03:08 AM
Environment
RHEL = 5.3
SFHA/DR = 5.0 MP3RP3
Primary Site = Two-Node Cluster
DR Site = One-Node Cluster
Query
We are planning to update our existing 5.0 installation to the last available 5.0 patch level. (Longer term we plan to move to the latest version, probably 6.0, but that also requires an OS upgrade, so as a short-term plan we want to bring SFHA 5.0 MP3 RP3 up to the last available patch for SFHA 5.0.)
- Our understanding is that we can apply the rolling patch SFHA 5.0 MP4RP1 directly on top of SFHA 5.0 MP3 RP3.
Any quick update will be highly appreciated.
01-28-2013 03:26 AM
Rather upgrade to SF/HA 5.1 SP1 (with subsequent patches).
RHEL 5.3 is still supported plus 5.1 allows for rolling upgrades when you eventually upgrade the OS.
01-28-2013 03:34 AM
The 5.0MP4RP1 release notes say:
The Veritas Storage Foundation and High Availability (SFHA) 5.0 Maintenance Pack (MP) 4 Rolling Patch (RP) 1 release is cumulative with and based on the Veritas Storage Foundation 5.0 MP3 release.
01-28-2013 04:38 AM
Thanks to both of you for your replies.
(Marking a thumbs-up for now)
@Marianne Do you mean we can install SFHA 5.1 SP1 directly on top of the SFHA 5.0 MP3RP3 product?
01-28-2013 03:01 PM
I applied the patch using the procedure below:
- Took the Service Group offline, which also deported the DiskGroup, so no VxFS filesystem was mounted.
- Stopped HAD on one cluster node:
# hastop -local
- Stopped the services below, as per the Release Notes:
# /etc/init.d/vxodm stop (this daemon is not installed, so there was nothing to stop)
# /etc/init.d/vxgms stop (this daemon is not installed, so there was nothing to stop)
# /etc/init.d/vxglm stop (this daemon is not installed, so there was nothing to stop)
# /etc/init.d/vxfen stop
# /etc/init.d/gab stop
# /etc/init.d/llt stop
- Applied the patch as follows:
# cd /patches/StorageFoundation/rpms
# rpm -Uvh *.rpm
(Support said to run rpm -Uvh *.rpm only in the "/patches/StorageFoundation/rpms" folder and that there was no need to run it in the "/patches/Veritas_Cluster/rpms" folder, so that is what we did.)
- Verified whether the MP4 patch was installed:
# rpm -qa | grep VRTSvcs
VRTSvcssy-5.0.41.000-MP4RP1_RHEL5
VRTSvcsvr-5.0.30.00-MP3_GENERIC
VRTSvcs-5.0.30.20-MP3RP2_RHEL5
VRTSvcsdb-5.0.41.000-MP4RP1_GENERIC
VRTSvcsmg-5.0.30.00-MP3_GENERIC
VRTSvcsmn-5.0.30.00-MP3_GENERIC
VRTSvcsor-5.0.41.000-MP4RP1_RHEL5
VRTSvcsdr-5.0.41.000-MP4RP1_RHEL5
VRTSvcsag-5.0.30.20-MP3RP2_RHEL5
- Rebooted the system
After the node came up, the HAD service would not start:
# hastart
# hastatus
attempting to connect....
VCS ERROR V-16-1-10600 Cannot connect to VCS engine attempting to connect....not available; will retry attempting to connect....retrying
=======================================
I discussed this with Symantec Support and they suggested the following cause:
Check /etc/llttab, /etc/llthosts, /etc/VRTSvcs/conf/sysname and /etc/VRTSvcs/conf/config/main.cf for any inconsistencies in naming conventions across the nodes in the cluster. This incident related to use of a fully qualified domain name being used in all files except the main.cf file.
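Support's check can be scripted. The sketch below (the `check_names` function is an illustrative helper, not a VCS tool; the file paths are the standard locations from this thread) prints the node name recorded in each file so that any mismatch, such as an FQDN in one file and a short name in another, is obvious at a glance.

```shell
# check_names DIR: print the node-name reference from each config file
# under DIR. Pass /etc on a real cluster node.
check_names() {
    d=$1
    awk '/^set-node/ {print "llttab   -> " $2}' "$d/llttab"
    awk '{print "llthosts -> " $2}'             "$d/llthosts"
    awk '{print "sysname  -> " $1}'             "$d/VRTSvcs/conf/sysname"
    awk '/^system / {print "main.cf  -> " $2}'  "$d/VRTSvcs/conf/config/main.cf"
}
# On a cluster node: check_names /etc
```

Every name printed must match exactly across all four files, and across all nodes in the cluster.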
I made the node name identical in all of the files mentioned above, but only on the cluster node where I applied the Rolling Patch. (Although I made the change, a deeper concern remains: if the name matters, why was it working in the earlier version?) Below is for reference:
(The previous names were PRI and SEC; the new names are PRIMARY_NODE and SEC_NODE on the node where I applied the patch.)
# cat /etc/VRTSvcs/conf/config/main.cf |grep PRIMARY_NODE
system PRIMARY_NODE (
SystemList = { PRIMARY_NODE = 0, sec = 1, secphoenix = 2 }
AutoStartList = { PRIMARY_NODE, secphoenix }
SystemList = { PRIMARY_NODE = 0, secphoenix = 1 }
AutoStartList = { PRIMARY_NODE, secphoenix }
SystemList = { PRIMARY_NODE = 0, secphoenix = 1 }
AutoStartList = { PRIMARY_NODE, secphoenix }
# cat /etc/VRTSvcs/conf/config/main.cf |grep SEC_NODE
system SEC_NODE (
system SEC_NODE (
SystemList = { PRIMARY_NODE = 0, sec = 1, SEC_NODE = 2 }
AutoStartList = { PRIMARY_NODE, SEC_NODE }
SystemList = { PRIMARY_NODE = 0, SEC_NODE = 1 }
AutoStartList = { PRIMARY_NODE, SEC_NODE }
SystemList = { PRIMARY_NODE = 0, SEC_NODE = 1 }
AutoStartList = { PRIMARY_NODE, SEC_NODE }
# cat /etc/llttab
set-node SEC_NODE
set-cluster 789
link eth1 eth-00:15:17:95:36:35 - ether - -
link eth2 eth-00:24:e8:2e:e1:ec - ether - -
link-lowpri eth7 eth-00:10:18:2e:8b:4d - ether - -
# cat /etc/llthosts
0 PRIMARY_NODE
1 SEC_NODE
# cat /etc/VRTSvcs/conf/sysname
SEC_NODE
But HAD still would not start.
=================
The engine log details below are for reference:
2013/01/29 03:13:15 VCS NOTICE V-16-1-11022 VCS engine (had) started
2013/01/29 03:13:15 VCS NOTICE V-16-1-11050 VCS engine version=5.0
2013/01/29 03:13:15 VCS NOTICE V-16-1-11051 VCS engine join version=5.0.30.2
2013/01/29 03:13:15 VCS NOTICE V-16-1-11052 VCS engine pstamp=Veritas-5.0MP3RP2-01/22/09-15:17:00
2013/01/29 03:13:15 VCS NOTICE V-16-1-10114 Opening GAB library
2013/01/29 03:13:15 VCS INFO V-16-1-10196 Cluster logger started
2013/01/29 03:13:15 VCS NOTICE V-16-1-10619 'HAD' starting on: SEC_NODE
2013/01/29 03:13:15 VCS ERROR V-16-1-10621 Local cluster configuration error
2013/01/29 03:13:15 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms
2013/01/29 03:13:20 VCS INFO V-16-1-10077 Received new cluster membership
2013/01/29 03:13:20 VCS NOTICE V-16-1-10112 System (SEC_NODE) - Membership: 0x3, DDNA: 0x0
2013/01/29 03:13:20 VCS NOTICE V-16-1-10322 System (Node '0') changed state from UNKNOWN to INITING
2013/01/29 03:13:20 VCS NOTICE V-16-1-10086 System (Node '0') is in Regular Membership - Membership: 0x3
2013/01/29 03:13:20 VCS NOTICE V-16-1-10086 System secphoenix (Node '1') is in Regular Membership - Membership: 0x3
2013/01/29 03:13:20 VCS WARNING V-16-1-50129 Operation 'haclus -modify' rejected as the node is in STALE_DISCOVER_WAIT state
2013/01/29 03:13:20 VCS WARNING V-16-1-50129 Operation 'haclus -modify' rejected as the node is in STALE_DISCOVER_WAIT state
2013/01/29 03:13:20 VCS NOTICE V-16-1-10453 Node: 0 changed name from: '' to: 'pri'
2013/01/29 03:13:20 VCS NOTICE V-16-1-10322 System pri (Node '0') changed state from INITING to RUNNING
2013/01/29 03:13:20 VCS NOTICE V-16-1-10322 System SEC_NODE (Node '1') changed state from STALE_DISCOVER_WAIT to REMOTE_BUILD
2013/01/29 03:13:20 VCS NOTICE V-16-1-10464 Requesting snapshot from node: 0
2013/01/29 03:13:20 VCS NOTICE V-16-1-10465 Getting snapshot. snapped_membership: 0x3 current_membership: 0x3 current_jeopardy_membership: 0x0
2013/01/29 03:13:20 VCS ERROR V-16-1-10391 System 'PRI' listed in main.cf but absent in LLT config file
2013/01/29 03:13:20 VCS ERROR V-16-1-10068 System PRI - VCS engine will exit to prevent any inconsistency. Please review names of systems listed in VCS configuration and make sure they are present and node ids are consistent in LLT configuration files on all systems in the cluster
I cannot understand why the old name PRI still shows up even though I changed it to PRIMARY_NODE, as shown above.
=================================
Any help will be highly appreciated.
01-29-2013 06:51 AM
Have a look in /etc/VRTSvcs/conf/sysname
Mike
01-30-2013 09:49 PM
System 'PRI' listed in main.cf but absent in LLT config file.
Have you tried to find a reference to PRI in main.cf?
You have only shown us the output of grep PRIMARY_NODE.
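A word-boundary search is useful here, since a plain `grep PRI` also matches PRIMARY_NODE and the old name can hide in the noise. A minimal sketch, assuming the standard main.cf path from this thread:

```shell
# -w matches PRI as a whole word only, so PRIMARY_NODE does not match;
# -n prints line numbers so the stale entries are easy to locate.
f=/etc/VRTSvcs/conf/config/main.cf
if [ -f "$f" ]; then
    grep -nw PRI "$f"
fi
```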
01-31-2013 03:26 AM
Zahid,
Were your node names PRI and SEC_NODE when you had 5.0 MP3RP3 installed, and did you change PRI to PRIMARY_NODE as part of the process of upgrading to 5.0 MP4RP1?
If so, I think your issue is that you only stopped VCS on one node, so the main.cf is still in the memory of the PRI node. When you start VCS on SEC_NODE, SEC_NODE will ignore the local main.cf (which I guess you edited to change the name from PRI to PRIMARY_NODE) and will get the config from the memory of PRI. The logs verify this, as you see the following lines:
2013/01/29 03:13:20 VCS NOTICE V-16-1-10322 System SEC_NODE (Node '1') changed state from STALE_DISCOVER_WAIT to REMOTE_BUILD
2013/01/29 03:13:20 VCS NOTICE V-16-1-10464 Requesting snapshot from node: 0
2013/01/29 03:13:20 VCS NOTICE V-16-1-10465 Getting snapshot. snapped_membership: 0x3 current_membership: 0x3 current_jeopardy_membership: 0x0
So this shows that VCS did a REMOTE_BUILD (got the in-memory main.cf from another node), not a LOCAL_BUILD (read the local on-disk main.cf). You will also see, if you check main.cf on SEC_NODE, that it has been overwritten by the copy from PRI.
If you want to change the name, then you need to stop VCS on both nodes (you can use hastop -all -force if you want to leave applications running), and you may need to restart LLT and GAB on both nodes as well. Then restart everything.
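The sequence above can be sketched as below. This is an assumed outline, not verified on a live cluster; `run` and `DRY_RUN` are illustrative helpers (not VCS features) so the plan can be printed before anything is actually executed, and the init-script paths are the ones used earlier in this thread.

```shell
# With DRY_RUN=1 (the default here) commands are only printed, not executed.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "would run: $*"; else "$@"; fi; }

run hastop -all -force          # stop HAD on all nodes; apps keep running
for svc in vxfen gab llt; do    # then, on every node, stop the stack
    run /etc/init.d/$svc stop   # that still caches the old node name
done
# ...edit /etc/llttab, /etc/llthosts, /etc/VRTSvcs/conf/sysname and
# /etc/VRTSvcs/conf/config/main.cf identically on every node...
for svc in llt gab vxfen; do    # restart in the reverse order
    run /etc/init.d/$svc start
done
run hastart
```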
You do need to change the entry in /etc/VRTSvcs/conf/sysname, which I assume you have done, but you only gave output from SEC_NODE, not PRIMARY_NODE.
As an aside, it is not a good idea to make two unrelated changes at the same time: if things go wrong, it is not always clear which change caused the issue.
Mike