11-16-2009 02:41 AM
Hi all,
I have installed VCS 5.0 MP3 on two Red Hat EL 5.3 64-bit nodes. The servers share four SAN disks, each server is connected to the SAN by two Fibre Channel links, and the multiple paths to the SAN are managed by the Linux multipath daemon.
I have configured two types of resources in one service group: volume group and file system. At boot time the cluster should import the volume group (activating the logical volume it contains) and mount the file system associated with that logical volume. These steps work without problems and the volume group and file system resources start correctly, but after some minutes the resources switch over to the second node by themselves, for no apparent reason.
In engine.log I can see the following at the time of the switch:
The first message is about the resource "oradbvg" (one of my shared volume groups) coming online on the second node, but I did not start the resource on the second node.
2009/11/13 14:38:49 VCS INFO V-16-1-10299 Resource oradbvg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (Not initiated by VCS)
2009/11/13 14:38:49 VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group oracle
2009/11/13 14:38:49 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group oracle on all nodes
2009/11/13 14:38:49 VCS INFO V-16-1-10299 Resource sysvg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (Not initiated by VCS)
2009/11/13 14:38:49 VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group oracle
2009/11/13 14:38:49 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group oracle on all nodes
2009/11/13 14:38:49 VCS INFO V-16-1-10299 Resource redovg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (Not initiated by VCS)
2009/11/13 14:38:49 VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group oracle
2009/11/13 14:38:49 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group oracle on all nodes
2009/11/13 14:38:49 VCS WARNING V-16-6-15034 (rh5masiemdb2) violation:Offlining group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS WARNING V-16-6-15034 (rh5masiemdb2) violation:Offlining group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS NOTICE V-16-1-10167 Initiating manual offline of group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS NOTICE V-16-1-10300 Initiating Offline of Resource oradbvg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:49 VCS NOTICE V-16-1-10300 Initiating Offline of Resource redovg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:49 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sysvg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:49 VCS NOTICE V-16-1-10167 Initiating manual offline of group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS WARNING V-16-6-15034 (rh5masiemdb2) violation:Offlining group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS INFO V-16-6-15002 (rh5masiemdb2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/violation rh5masiemdb2 oracle successfully
2009/11/13 14:38:49 VCS INFO V-16-6-15002 (rh5masiemdb2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/violation rh5masiemdb2 oracle successfully
2009/11/13 14:38:49 VCS NOTICE V-16-1-10167 Initiating manual offline of group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS INFO V-16-6-15002 (rh5masiemdb2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/violation rh5masiemdb2 oracle successfully
2009/11/13 14:38:49 VCS INFO V-16-1-10299 Resource archivevg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (Not initiated by VCS)
2009/11/13 14:38:49 VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group oracle
2009/11/13 14:38:49 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group oracle on all nodes
2009/11/13 14:38:49 VCS WARNING V-16-6-15034 (rh5masiemdb2) violation:Offlining group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS NOTICE V-16-1-10167 Initiating manual offline of group oracle on system rh5masiemdb2
2009/11/13 14:38:49 VCS NOTICE V-16-1-10300 Initiating Offline of Resource archivevg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:49 VCS INFO V-16-6-15002 (rh5masiemdb2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/violation rh5masiemdb2 oracle successfully
2009/11/13 14:38:51 VCS ERROR V-16-2-13067 (rh5masiemdb1) Agent is calling clean for resource(archivevg) because the resource became OFFLINE unexpectedly, on its own.
2009/11/13 14:38:51 VCS ERROR V-16-2-13067 (rh5masiemdb1) Agent is calling clean for resource(oradbvg) because the resource became OFFLINE unexpectedly, on its own.
2009/11/13 14:38:51 VCS ERROR V-16-2-13067 (rh5masiemdb1) Agent is calling clean for resource(redovg) because the resource became OFFLINE unexpectedly, on its own.
2009/11/13 14:38:51 VCS ERROR V-16-2-13067 (rh5masiemdb1) Agent is calling clean for resource(sysvg) because the resource became OFFLINE unexpectedly, on its own.
2009/11/13 14:38:51 VCS INFO V-16-1-10305 Resource redovg (Owner: unknown, Group: oracle) is offline on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:52 VCS INFO V-16-1-10305 Resource oradbvg (Owner: unknown, Group: oracle) is offline on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:52 VCS INFO V-16-1-10305 Resource sysvg (Owner: unknown, Group: oracle) is offline on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:52 VCS INFO V-16-1-10305 Resource archivevg (Owner: unknown, Group: oracle) is offline on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:52 VCS NOTICE V-16-1-10446 Group oracle is offline on system rh5masiemdb2
2009/11/13 14:38:52 VCS INFO V-16-10031-15005 (rh5masiemdb2) triggers:???:nfs_postoffline:(postoffline) Invoked with arguments rh5masiemdb2, oracle
2009/11/13 14:38:52 VCS INFO V-16-6-15002 (rh5masiemdb2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline rh5masiemdb2 oracle successfully
2009/11/13 14:38:52 VCS INFO V-16-6-15004 (rh5masiemdb2) hatrigger:Failed to send trigger for postoffline; script doesn't exist
2009/11/13 14:38:52 VCS INFO V-16-2-13068 (rh5masiemdb1) Resource(redovg) - clean completed successfully.
2009/11/13 14:38:52 VCS INFO V-16-2-13068 (rh5masiemdb1) Resource(oradbvg) - clean completed successfully.
2009/11/13 14:38:52 VCS INFO V-16-2-13068 (rh5masiemdb1) Resource(sysvg) - clean completed successfully.
2009/11/13 14:38:52 VCS INFO V-16-2-13068 (rh5masiemdb1) Resource(archivevg) - clean completed successfully.
2009/11/13 14:38:53 VCS INFO V-16-1-10307 Resource oradbvg (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (Not initiated by VCS)
2009/11/13 14:38:53 VCS NOTICE V-16-1-10300 Initiating Offline of Resource vip (Owner: unknown, Group: oracle) on System rh5masiemdb1
2009/11/13 14:38:53 VCS INFO V-16-1-10307 Resource sysvg (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (Not initiated by VCS)
2009/11/13 14:38:53 VCS INFO V-16-1-10307 Resource redovg (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (Not initiated by VCS)
2009/11/13 14:38:53 VCS INFO V-16-1-10307 Resource archivevg (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (Not initiated by VCS)
2009/11/13 14:38:53 VCS INFO V-16-6-15004 (rh5masiemdb1) hatrigger:Failed to send trigger for resfault; script doesn't exist
2009/11/13 14:38:53 VCS INFO V-16-6-15004 (rh5masiemdb1) hatrigger:Failed to send trigger for resfault; script doesn't exist
2009/11/13 14:38:53 VCS INFO V-16-6-15004 (rh5masiemdb1) hatrigger:Failed to send trigger for resfault; script doesn't exist
2009/11/13 14:38:53 VCS INFO V-16-6-15004 (rh5masiemdb1) hatrigger:Failed to send trigger for resfault; script doesn't exist
2009/11/13 14:38:54 VCS INFO V-16-1-10305 Resource vip (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (VCS initiated)
2009/11/13 14:38:54 VCS NOTICE V-16-1-10300 Initiating Offline of Resource archivefs (Owner: unknown, Group: oracle) on System rh5masiemdb1
2009/11/13 14:38:54 VCS NOTICE V-16-1-10300 Initiating Offline of Resource databasefs (Owner: unknown, Group: oracle) on System rh5masiemdb1
2009/11/13 14:38:54 VCS NOTICE V-16-1-10300 Initiating Offline of Resource redofs (Owner: unknown, Group: oracle) on System rh5masiemdb1
2009/11/13 14:38:54 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sysfs (Owner: unknown, Group: oracle) on System rh5masiemdb1
2009/11/13 14:38:56 VCS INFO V-16-1-10305 Resource archivefs (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (VCS initiated)
2009/11/13 14:38:56 VCS INFO V-16-1-10305 Resource databasefs (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (VCS initiated)
2009/11/13 14:38:56 VCS INFO V-16-1-10305 Resource redofs (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (VCS initiated)
2009/11/13 14:38:56 VCS INFO V-16-1-10305 Resource sysfs (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (VCS initiated)
2009/11/13 14:38:56 VCS ERROR V-16-1-10205 Group oracle is faulted on system rh5masiemdb1
2009/11/13 14:38:56 VCS NOTICE V-16-1-10446 Group oracle is offline on system rh5masiemdb1
2009/11/13 14:38:56 VCS INFO V-16-1-10493 Evaluating rh5masiemdb1 as potential target node for group oracle
2009/11/13 14:38:56 VCS INFO V-16-1-50010 Group oracle is online or faulted on system rh5masiemdb1
2009/11/13 14:38:56 VCS INFO V-16-1-10493 Evaluating rh5masiemdb2 as potential target node for group oracle
2009/11/13 14:38:56 VCS NOTICE V-16-1-10301 Initiating Online of Resource archivevg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:56 VCS NOTICE V-16-1-10301 Initiating Online of Resource oradbvg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:56 VCS NOTICE V-16-1-10301 Initiating Online of Resource redovg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:56 VCS NOTICE V-16-1-10301 Initiating Online of Resource sysvg (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:56 VCS INFO V-16-10031-15005 (rh5masiemdb1) triggers:???:nfs_postoffline:(postoffline) Invoked with arguments rh5masiemdb1, oracle
2009/11/13 14:38:56 VCS INFO V-16-6-15002 (rh5masiemdb1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline rh5masiemdb1 oracle successfully
2009/11/13 14:38:56 VCS INFO V-16-6-15004 (rh5masiemdb1) hatrigger:Failed to send trigger for postoffline; script doesn't exist
2009/11/13 14:38:58 VCS INFO V-16-1-10298 Resource sysvg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:58 VCS NOTICE V-16-1-10301 Initiating Online of Resource sysfs (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:58 VCS INFO V-16-1-10298 Resource archivevg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:58 VCS NOTICE V-16-1-10301 Initiating Online of Resource archivefs (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:58 VCS INFO V-16-1-10298 Resource oradbvg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:58 VCS NOTICE V-16-1-10301 Initiating Online of Resource databasefs (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:58 VCS INFO V-16-1-10298 Resource redovg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:58 VCS NOTICE V-16-1-10301 Initiating Online of Resource redofs (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:38:59 VCS INFO V-16-1-10298 Resource sysfs (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:59 VCS INFO V-16-1-10298 Resource databasefs (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:59 VCS INFO V-16-1-10298 Resource redofs (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:59 VCS INFO V-16-1-10298 Resource archivefs (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:59 VCS NOTICE V-16-1-10301 Initiating Online of Resource vip (Owner: unknown, Group: oracle) on System rh5masiemdb2
2009/11/13 14:39:07 VCS INFO V-16-1-10298 Resource vip (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (VCS initiated)
2009/11/13 14:39:07 VCS NOTICE V-16-1-10447 Group oracle is online on system rh5masiemdb2
2009/11/13 14:39:07 VCS NOTICE V-16-1-10448 Group oracle failed over to system rh5masiemdb2
2009/11/13 14:39:07 VCS INFO V-16-6-15004 (rh5masiemdb2) hatrigger:Failed to send trigger for postonline; script doesn't exist
2009/11/13 14:39:21 VCS INFO V-16-1-50135 User root fired command: MSG_CLUSTER_STOP_SYS from localhost
2009/11/13 14:39:21 VCS NOTICE V-16-1-10322 System rh5masiemdb1 (Node '0') changed state from RUNNING to LEAVING
2009/11/13 14:39:21 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ntfr (Owner: unknown, Group: ClusterService) on System rh5masiemdb1
2009/11/13 14:39:22 VCS INFO V-16-1-10305 Resource ntfr (Owner: unknown, Group: ClusterService) is offline on rh5masiemdb1 (VCS initiated)
2009/11/13 14:39:22 VCS NOTICE V-16-1-10446 Group ClusterService is offline on system rh5masiemdb1
2009/11/13 14:39:22 VCS NOTICE V-16-1-10010 Stopping all agents
2009/11/13 14:39:22 VCS INFO V-16-1-10493 Evaluating rh5masiemdb1 as potential target node for group ClusterService
2009/11/13 14:39:22 VCS INFO V-16-1-10494 System rh5masiemdb1 not in RUNNING state
2009/11/13 14:39:22 VCS INFO V-16-1-10493 Evaluating rh5masiemdb2 as potential target node for group ClusterService
2009/11/13 14:39:22 VCS NOTICE V-16-1-10490 Resetting Evacuating (lastval=0) for group ClusterService for node rh5masiemdb1
2009/11/13 14:39:22 VCS NOTICE V-16-1-10301 Initiating Online of Resource ntfr (Owner: unknown, Group: ClusterService) on System rh5masiemdb2
2009/11/13 14:39:22 VCS NOTICE V-16-1-10322 System rh5masiemdb1 (Node '0') changed state from LEAVING to EXITING
2009/11/13 14:39:22 VCS NOTICE V-16-1-10322 System rh5masiemdb1 (Node '0') changed state from EXITING to EXITED
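To make the trigger of the switch easier to spot in a long engine.log, the state changes that VCS reports as "Not initiated by VCS" can be filtered out with a small script. This is only a sketch: the function name and the regular expression are my own and are based solely on the line format visible in the log above, so other VCS versions may need a different pattern.

```python
import re

# Pattern derived from the engine.log lines shown above (an assumption,
# not an official description of the VCS log format).
EVENT_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS \w+ V-[\d-]+ "
    r"Resource (\S+) \(Owner: [^,]*, Group: ([^)]+)\) "
    r"is (online|offline) on (\S+) \(Not initiated by VCS\)"
)


def unexpected_state_changes(lines):
    """Return (timestamp, resource, group, state, node) for every
    resource state change that VCS itself did not initiate."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            events.append(match.groups())
    return events


# A few lines copied from the log above.
SAMPLE_LOG = """\
2009/11/13 14:38:49 VCS INFO V-16-1-10299 Resource oradbvg (Owner: unknown, Group: oracle) is online on rh5masiemdb2 (Not initiated by VCS)
2009/11/13 14:38:49 VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group oracle
2009/11/13 14:38:52 VCS INFO V-16-1-10305 Resource oradbvg (Owner: unknown, Group: oracle) is offline on rh5masiemdb2 (VCS initiated)
2009/11/13 14:38:53 VCS INFO V-16-1-10307 Resource oradbvg (Owner: unknown, Group: oracle) is offline on rh5masiemdb1 (Not initiated by VCS)
"""

if __name__ == "__main__":
    for event in unexpected_state_changes(SAMPLE_LOG.splitlines()):
        print(*event)
```

Run against the full log, this lists the four volume groups coming online on rh5masiemdb2 at 14:38:49, which is the event that raises the concurrency violation and starts the failover sequence.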
Thanks in advance for your support.
Bye
11-16-2009 08:27 AM
11-16-2009 08:09 PM
11-17-2009 01:33 AM
Hi Gaurav, thanks a lot for your support. Below is the information you asked for.
For the heartbeat links the situation is this: at the moment the two heartbeat links are connected to the second node with two crossover cables. In the future the heartbeat NICs will be connected through a switch (maybe two, for redundancy) with a VLAN dedicated to heartbeat traffic.
On each node the heartbeat links use two internal NICs of type "Broadcom NetXtreme II BCM5709 1000Base-T (C0)", named "eth0" and "eth1".
[root@rh5masiemdb1 ~]# more /etc/VRTSvcs/conf/config/main.cf
include "types.cf"
cluster rh5masiemdb (
UserNames = { admin = hKLdKFkHLgLLjTLfKI }
Administrators = { admin }
)
system rh5masiemdb1 (
)
system rh5masiemdb2 (
)
group ClusterService (
SystemList = { rh5masiemdb1 = 0, rh5masiemdb2 = 1 }
AutoStartList = { rh5masiemdb1, rh5masiemdb2 }
OnlineRetryLimit = 3
OnlineRetryInterval = 120
)
NIC csgnic (
Device = bond0
)
NotifierMngr ntfr (
SnmpConsoles = { "10.155.147.18" = Information }
SmtpServer = "10.155.147.50"
SmtpRecipients = { "arc_dbvcs_ma_soc@terna.it" = Information }
)
ntfr requires csgnic
// resource dependency tree
//
// group ClusterService
// {
// NotifierMngr ntfr
// {
// NIC csgnic
// }
// }
group oracle (
SystemList = { rh5masiemdb1 = 0, rh5masiemdb2 = 1 }
)
LVMVolumeGroup redovg (
VolumeGroup = oraredovg
StartVolumes = 1
)
Mount redofs (
MountPoint = "/oracle/redo"
BlockDevice = "/dev/mapper/oraredovg-oraredolv"
FSType = ext3
FsckOpt = "-n"
CkptUmount = 0
)
redofs requires redovg
// resource dependency tree
//
// group oracle
// {
// Mount redofs
// {
// LVMVolumeGroup redovg
// }
// }
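To compare what main.cf declares with the resources that appear in engine.log, the resource definitions and "requires" lines can be pulled out with a short script. This is a rough sketch under my own assumptions: the function name is hypothetical and the parsing only covers the simple layout shown in the listing above, not the full main.cf syntax.

```python
import re

# "Type name (" opens a definition; cluster, system and group are not resources.
DEFINITION_RE = re.compile(r"^([A-Za-z]\w*)\s+(\w+)\s*\($")
REQUIRES_RE = re.compile(r"^(\w+)\s+requires\s+(\w+)$")
NON_RESOURCE_KEYWORDS = {"cluster", "system", "group"}


def summarize_main_cf(text):
    """Return ({resource_name: resource_type}, [(parent, child), ...])."""
    resources = {}
    dependencies = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and the commented dependency tree
        match = DEFINITION_RE.match(line)
        if match:
            if match.group(1) not in NON_RESOURCE_KEYWORDS:
                resources[match.group(2)] = match.group(1)
            continue
        match = REQUIRES_RE.match(line)
        if match:
            dependencies.append((match.group(1), match.group(2)))
    return resources, dependencies


# The oracle group exactly as posted above.
SAMPLE_MAIN_CF = """\
group oracle (
SystemList = { rh5masiemdb1 = 0, rh5masiemdb2 = 1 }
)
LVMVolumeGroup redovg (
VolumeGroup = oraredovg
StartVolumes = 1
)
Mount redofs (
MountPoint = "/oracle/redo"
BlockDevice = "/dev/mapper/oraredovg-oraredolv"
FSType = ext3
FsckOpt = "-n"
CkptUmount = 0
)
redofs requires redovg
"""

if __name__ == "__main__":
    resources, dependencies = summarize_main_cf(SAMPLE_MAIN_CF)
    print(resources)
    print(dependencies)
```

For the listing above this finds only redovg and redofs in group oracle, while the engine.log also mentions oradbvg, sysvg, archivevg, their file systems and vip.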
[root@rh5masiemdb1 ~]# lltstat -vvn | head
LLT node information:
Node State Link Status Address
* 0 rh5masiemdb1 OPEN
eth0 UP 00:21:5E:36:44:24
eth1 UP 00:21:5E:36:44:26
1 rh5masiemdb2 OPEN
eth0 UP 00:21:5E:36:33:D8
eth1 UP 00:21:5E:36:33:DA
2 CONNWAIT
eth0 DOWN
[root@rh5masiemdb1 ~]# lltstat
LLT statistics:
61637 Snd data packets
0 Snd retransmit data
5983644 Snd connect packets
1601709 Snd independent ACKs
19992 Snd piggyback ACKs
0 Snd independent NACKs
0 Snd piggyback NACKs
91117 Snd loopback packets
31607 Rcv data packets
0 Rcv out of window
0 Rcv duplicates
0 Rcv datagrams dropped
0 Rcv multiblock data
0 Rcv misaligned data
0 Snd chained header
LLT errors:
2 Rcv not connected
0 Rcv unconfigured
0 Rcv bad dest address
0 Rcv bad source address
0 Rcv bad generation
0 Rcv no buffer
0 Rcv malformed packet
0 Rcv bad SAP
0 Rcv bad STREAM primitive
0 Rcv bad DLPI primitive
0 Rcv DLPI error
90 Snd not connected
0 Snd no buffer
0 Snd stream flow drops
0 Snd no links up
0 Rcv bad checksum
0 Rcv bad udp/ether source address
0 Snd udp queue full drops
[root@rh5masiemdb1 ~]# cat /etc/llttab
set-node rh5masiemdb1
set-cluster 0
link eth0 eth-00:21:5e:36:44:24 - ether - -
link eth1 eth-00:21:5e:36:44:26 - ether - -
[root@rh5masiemdb1 ~]# cat /etc/llthosts
0 rh5masiemdb1
1 rh5masiemdb2
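The heartbeat state in the `lltstat -vvn` output above can also be checked by script, which helps when the output has to be collected repeatedly around the time of a switch. This is a minimal sketch: the function names are my own and the parser assumes the whitespace-separated layout shown in this post, nothing more.

```python
def parse_lltstat_nodes(text):
    """Parse `lltstat -vvn` style output into a list of node dicts."""
    nodes = []
    for raw in text.splitlines():
        # The leading "*" only marks the local node; drop it.
        tokens = raw.replace("*", " ").split()
        if not tokens:
            continue
        if tokens[0].isdigit():
            # Node line: "<id> <name> <state>" or "<id> <state>" for unused slots.
            name = tokens[1] if len(tokens) == 3 else None
            nodes.append({"id": int(tokens[0]), "name": name,
                          "state": tokens[-1], "links": {}})
        elif nodes and len(tokens) >= 2 and tokens[1] in ("UP", "DOWN"):
            # Link line: "<device> <UP|DOWN> [<address>]".
            nodes[-1]["links"][tokens[0]] = tokens[1]
    return nodes


def links_down(nodes):
    """Return (node_name, device) for links that are not UP on named nodes."""
    return [(node["name"], device)
            for node in nodes if node["name"]
            for device, status in node["links"].items()
            if status != "UP"]


# Output copied from the post above.
SAMPLE_LLTSTAT = """\
LLT node information:
Node State Link Status Address
* 0 rh5masiemdb1 OPEN
eth0 UP 00:21:5E:36:44:24
eth1 UP 00:21:5E:36:44:26
1 rh5masiemdb2 OPEN
eth0 UP 00:21:5E:36:33:D8
eth1 UP 00:21:5E:36:33:DA
2 CONNWAIT
eth0 DOWN
"""

if __name__ == "__main__":
    nodes = parse_lltstat_nodes(SAMPLE_LLTSTAT)
    print(links_down(nodes))
```

For the output posted here the list of down links on the two configured nodes is empty: both eth0 and eth1 are UP on rh5masiemdb1 and rh5masiemdb2, and the only DOWN entry belongs to the unused node slot 2.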
11-17-2009 02:06 AM
01-25-2010 07:13 AM
01-27-2010 04:50 AM
05-09-2010 09:20 AM
Hi lallo,
I'm having the same issue with my VCS setup: VCS 5.0 MP4 on two Red Hat Linux 5.4 x86_64 servers connected to an EVA4400 disk array via FC. I'm using the built-in Linux multipath daemon.
Could you please tell us what the fix was, or whether you got an official response from Symantec?
Ratheeshk: is what you wrote above the only official procedure provided by Symantec, or is it one solution among others?
Thanks for your help,
07-09-2010 01:16 PM
07-12-2010 08:19 AM