VCS dependency question - Service group is not running on the intended node upon cluster startup.
Dear All, I have a question regarding service group dependencies. I have two parent service groups, "Group1" and "Group2". They depend on a child service group, "ServerGroup1_DG", which is configured as a parallel group. In main.cf I have configured "Group1" to start on Node1 and "Group2" to start on Node2. However, I don't understand why the cluster does not bring "ServerGroup1_DG" online on Node2 before starting "Group2". During cluster startup the cluster evaluated Node1 as a potential target node for "Group2" and went on to online that group on Node1. This is not the behaviour I want: I would like "Group1" running on Node1 and "Group2" running on Node2 when the cluster starts up. Can anyone shed some light on how I can solve this problem? Thanks.

P/S: The VCS version is 5.0 on the Solaris platform.

main.cf:

group Group1 (
    SystemList = { Node1 = 1, Node2 = 2 }
    AutoStartList = { Node1, Node2 }
    )

    requires group ServerGroup1_DG online local firm

group Group2 (
    SystemList = { Node1 = 2, Node2 = 1 }
    AutoStartList = { Node2, Node1 }
    )

    requires group ServerGroup1_DG online local firm

group ServerGroup1_DG (
    SystemList = { Node1 = 0, Node2 = 1 }
    AutoFailOver = 0
    Parallel = 1
    AutoStartList = { Node1, Node2 }
    )

    CFSMount cfsmount2 (
        Critical = 0
        MountPoint = "/var/opt/xxxx/ServerGroup1"
        BlockDevice = "/dev/vx/dsk/xxxxdg/vol01"
        MountOpt @Node1 = "cluster"
        MountOpt @Node2 = "cluster"
        NodeList = { Node1, Node2 }
        )

    CVMVolDg cvmvoldg2 (
        Critical = 0
        CVMDiskGroup = xxxxdg
        CVMActivation @Node1 = sw
        CVMActivation @Node2 = sw
        )

    requires group cvm online local firm
    cfsmount2 requires cvmvoldg2

Engine log:

2010/06/25 16:05:47 VCS NOTICE V-16-1-10447 Group ServerGroup1_DG is online on system Node1
2010/06/25 16:05:47 VCS WARNING V-16-1-50045 Initiating online of parent group Group3, PM will select the best node
2010/06/25 16:05:47 VCS WARNING V-16-1-50045 Initiating online of parent group Group2, PM will select the best node
2010/06/25 16:05:47 VCS WARNING V-16-1-50045 Initiating online of parent group Group1, PM will select the best node
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node2 as potential target node for group Group3
2010/06/25 16:05:47 VCS INFO V-16-1-10163 Group dependency is not met if group Group3 goes online on system Node2
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node1 as potential target node for group Group3
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node2 as potential target node for group Group2
2010/06/25 16:05:47 VCS INFO V-16-1-10163 Group dependency is not met if group Group2 goes online on system Node2
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node1 as potential target node for group Group2
2010/06/25 16:06:15 VCS NOTICE V-16-1-10447 Group ServerGroup1_DG is online on system Node2

Regards, Ryan
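The engine log above suggests why this happens: the parallel child group was online on Node1 at 16:05:47 but did not come online on Node2 until 16:06:15, so when the policy module evaluated Group2 the "online local firm" dependency could only be met on Node1. One hedged workaround (a sketch, not a definitive fix) is to restrict each parent's AutoStartList to its intended node, accepting the trade-off that the group will not autostart at all if that node is down, or simply to switch the group after startup:

haconf -makerw

# Let each parent autostart only on the node it is meant for
hagrp -modify Group1 AutoStartList Node1
hagrp -modify Group2 AutoStartList Node2

haconf -dump -makero

# Manual fallback after startup: move a group that landed on the wrong node
hagrp -switch Group2 -to Node2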
Dynamic multipath using EMC storage Veritas 3.5

I am trying to set up multipathing with an EMC CLARiiON. The problem is that vxdisk list fabric_0 only shows one path. The EMC array is in auto-trespass mode. This is Solaris 8, and format shows two paths.

# vxdisk list fabric_2
Device:    fabric_2
devicetag: fabric_2
type:      sliced
hostid:    ncsun1
disk:      name=disk05 id=1302111549.6037.ncsun1
group:     name=rootdg id=1072877341.1025.nc1
info:      privoffset=1
flags:     online ready private autoconfig autoimport imported
pubpaths:  block=/dev/vx/dmp/fabric_2s4 char=/dev/vx/rdmp/fabric_2s4
privpaths: block=/dev/vx/dmp/fabric_2s3 char=/dev/vx/rdmp/fabric_2s3
version:   2.2
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=4 offset=0 len=1048494080
private:   slice=3 offset=1 len=32511
update:    time=1302111558 seqno=0.5
headers:   0 248
configs:   count=1 len=23969
logs:      count=1 len=3631
Defined regions:
 config   priv 000017-000247[000231]: copy=01 offset=000000 enabled
 config   priv 000249-023986[023738]: copy=01 offset=000231 enabled
 log      priv 023987-027617[003631]: copy=01 offset=000000 enabled
Multipathing information:
numpaths:   1
c10t500601613B241045d5s2   state=enabled

# format (excerpt)
 8. c10t500601613B241045d0 <DGC-RAID5-0428 cyl 63998 alt 2 hd 256 sec 64>
    /ssm@0,0/pci@19,700000/SUNW,qlc@2/fp@0,0/ssd@w500601613b241045,0
16. c16t500601603B241045d0 <DGC-RAID5-0428 cyl 63998 alt 2 hd 256 sec 64>
    /ssm@0,0/pci@18,700000/SUNW,qlc@1/fp@0,0/ssd@w500601603b241045,0

vxdisk -o alldgs list shows both paths. Two things here: it should show only one of the paths, and the second path is shown with its disk group in parentheses ( ). Another issue is why the disks don't show up as EMC_0 or similar.

Note: the server is connected to both a T3 and the EMC, and we are migrating from the T3 to the EMC. The fabric_N devices follow the fabric naming convention and are the EMC LUNs.

# vxdisk -o alldgs list
DEVICE       TYPE      DISK       GROUP     STATUS
T30_0        sliced    disk01     rootdg    online
T30_1        sliced    disk02     rootdg    online
T31_0        sliced    disk03     rootdg    online
T31_1        sliced    disk04     rootdg    online
T32_0        sliced    rootdg00   rootdg    online
T32_1        sliced    rootdg01   rootdg    online
c1t0d0s2     sliced    -          -         error
c1t1d0s2     sliced    -          -         error
fabric_0     sliced    -          -         error
fabric_1     sliced    -          -         error
fabric_2     sliced    disk05     rootdg    online
fabric_3     sliced    disk06     rootdg    online
fabric_4     sliced    disk07     rootdg    online
fabric_5     sliced    disk08     rootdg    online
fabric_6     sliced    disk09     rootdg    online
fabric_7     sliced    disk10     rootdg    online
fabric_8     sliced    -          -         error
fabric_9     sliced    -          -         error
fabric_10    sliced    -          (rootdg)  online
fabric_11    sliced    -          (rootdg)  online
fabric_12    sliced    -          (rootdg)  online
fabric_13    sliced    -          (rootdg)  online
fabric_14    sliced    -          (rootdg)  online
fabric_15    sliced    -          (rootdg)  online

Here is the ASL (there is no APM prior to Veritas 4.0). vxddladm listsupport output, snipped for brevity:
libvxDGCclariion.so    A/P    DGC    CLARiiON

The c10 and c16 controllers are the paths to the EMC.

# vxdmpadm listctlr all
CTLR-NAME   ENCLR-TYPE     STATE     ENCLR-NAME
=====================================================
c1          OTHER_DISKS    ENABLED   OTHER_DISKS
c10         OTHER_DISKS    ENABLED   OTHER_DISKS
c16         OTHER_DISKS    ENABLED   OTHER_DISKS

# vxdmpadm getsubpaths ctlr=c10
NAME                       STATE     PATH-TYPE   DMPNODENAME   ENCLR-TYPE    ENCLR-NAME
======================================================================
c10t500601613B241045d7s2   ENABLED   -           fabric_0      OTHER_DISKS   OTHER_DISKS
c10t500601613B241045d6s2   ENABLED   -           fabric_1      OTHER_DISKS   OTHER_DISKS
c10t500601613B241045d5s2   ENABLED   -           fabric_2      OTHER_DISKS   OTHER_DISKS
c10t500601613B241045d4s2   ENABLED   -           fabric_3      OTHER_DISKS   OTHER_DISKS
c10t500601613B241045d3s2   ENABLED   -           fabric_4      OTHER_DISKS   OTHER_DISKS
c10t500601613B241045d2s2   ENABLED   -           fabric_5      OTHER_DISKS   OTHER_DISKS
c10t500601613B241045d1s2   ENABLED   -           fabric_6      OTHER_DISKS   OTHER_DISKS
c10t500601613B241045d0s2   ENABLED   -           fabric_7      OTHER_DISKS   OTHER_DISKS

# vxdmpadm getsubpaths ctlr=c16
NAME                       STATE     PATH-TYPE   DMPNODENAME   ENCLR-TYPE    ENCLR-NAME
======================================================================
c16t500601603B241045d7s2   ENABLED   -           fabric_8      OTHER_DISKS   OTHER_DISKS
c16t500601603B241045d6s2   ENABLED   -           fabric_9      OTHER_DISKS   OTHER_DISKS
c16t500601603B241045d5s2   ENABLED   -           fabric_10     OTHER_DISKS   OTHER_DISKS
c16t500601603B241045d4s2   ENABLED   -           fabric_11     OTHER_DISKS   OTHER_DISKS
c16t500601603B241045d3s2   ENABLED   -           fabric_12     OTHER_DISKS   OTHER_DISKS
c16t500601603B241045d2s2   ENABLED   -           fabric_13     OTHER_DISKS   OTHER_DISKS
c16t500601603B241045d1s2   ENABLED   -           fabric_14     OTHER_DISKS   OTHER_DISKS
c16t500601603B241045d0s2   ENABLED   -           fabric_15     OTHER_DISKS   OTHER_DISKS

Thanks for any help.
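The getsubpaths output shows every LUN claimed as OTHER_DISKS rather than as a DGC/CLARiiON enclosure, which usually means the CLARiiON ASL is not actually claiming these devices, so DMP treats each path as its own fabric_N disk instead of pairing the c10 and c16 paths under one DMP node. A few checks worth running as a starting point (the option syntax here is generic; confirm it against the VxVM 3.5 man pages):

# Is the CLARiiON library really listed as supported on this host?
vxddladm listsupport | grep -i DGC

# Force VxVM to rescan devices and rebuild its DMP database
vxdctl enable

# Re-check the enclosure grouping and the path count afterwards
vxdmpadm listctlr all
vxdisk list fabric_2 | grep numpaths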
adding new volumes to a DG that has a RVG under VCS cluster

Hi, I have a VCS cluster with GCO and VVR. On each node of the cluster I have a disk group with an associated RVG; this RVG contains 11 data volumes for an Oracle database. These volumes are getting full, so I am going to add new disks to the disk group and create new volumes and mount points to be used by the Oracle database.

My question: can I add the disks to the disk group, and the volumes to the RVG, while the database is up and replication is on? If the answer is no, please let me know what should be performed on the RVG and rlink to add these volumes, and also what to do on the database resource group so that it does not fail over. Thanks in advance.
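For reference, the usual online sequence looks roughly like the sketch below. All disk group, RVG, volume, disk and service group names are placeholders (not from the post), the size is an example, and vradmin addvol expects a volume of the same name and size to already exist on the secondary; check the VVR Administrator's Guide for your release (SRL sizing in particular) before doing this on a live system.

# Optionally freeze the database group so VCS does not react while you work
hagrp -freeze oradb_grp -persistent

# Add the new disk(s) to the existing disk group
vxdg -g oradatadg adddisk oradatadg05=emc0_0123

# Create the new data volume (repeat on the secondary with the same name/size)
vxassist -g oradatadg make oradata_vol12 50g

# Associate the new volume with the RVG; run from the primary
vradmin -g oradatadg addvol oradata_rvg oradata_vol12

# Unfreeze once the new mount point and resources are configured
hagrp -unfreeze oradb_grp -persistent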
NFS share doesn't failover due to being busy

Hello! We are trying to implement a failover cluster which hosts a database and files on a clustered NFS share. The files are used by the clustered application itself and by several other hosts. The problem is that when the active node fails (an ungraceful server shutdown, or some clustered service stops), the other hosts continue to use files on the cluster-hosted NFS share. That leads to the NFS share "hanging": it no longer works on the first node, yet it still cannot be brought online on the second node. The other hosts also see their requests to that NFS share hang. Later I will attach logs where the problem can be observed. The only corrective action we have found is a total shutdown and sequential restart of all cluster nodes and the other hosts. Please recommend best-practice actions for using an NFS share on Veritas Cluster Server (perhaps some start/stop/clean scripts included as a cluster resource, or additional cluster configuration options). Thank you in advance! Best regards, Maxim Semenov.
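In most VCS NFS configurations the share is managed by the bundled NFS, Share, NFSRestart and IP agents (with NFS lock failover configured), so reviewing that resource stack against the Bundled Agents Reference Guide is likely the real fix. The commands below are only a hedged sketch for unsticking a share after a failed switchover; group, node and path names are placeholders.

# See which NFS-related resources are faulted and where
hares -state | grep -i nfs

# Flush a group that is stuck waiting on an online/offline operation
hagrp -flush nfs_grp -sys node2

# Clear the fault once the underlying cause is dealt with
hagrp -clear nfs_grp

# On the node still holding the export, kill processes that keep the
# share's mount point busy (Solaris fuser syntax)
fuser -ck /export/shared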
missing disks and reboot won't solve it

I am very new to Veritas. We have an AIX 7.1 server using Veritas DMP. When I look at the VIO server, all of the virtual Fibre Channel adapters are logged in, but the LPAR fails to see any disks on fscsi0 and fscsi1. I have been going back and forth with IBM and Symantec and cannot get this resolved, so I decided to pick your brains here.

# lsdev | grep fscsi
fscsi0 Available 01-T1-01 FC SCSI I/O Controller Protocol Device
fscsi1 Available 02-T1-01 FC SCSI I/O Controller Protocol Device
fscsi2 Available 03-T1-01 FC SCSI I/O Controller Protocol Device
fscsi3 Available 04-T1-01 FC SCSI I/O Controller Protocol Device
fscsi4 Available 05-T1-01 FC SCSI I/O Controller Protocol Device
fscsi5 Available 06-T1-01 FC SCSI I/O Controller Protocol Device
fscsi6 Available 07-T1-01 FC SCSI I/O Controller Protocol Device
fscsi7 Available 08-T1-01 FC SCSI I/O Controller Protocol Device

# vxdmpadm listctlr
CTLR_NAME   ENCLR_TYPE    STATE     ENCLR_NAME     PATH_COUNT
=========================================================================
fscsi2      Hitachi_VSP   ENABLED   hitachi_vsp0   44
fscsi3      Hitachi_VSP   ENABLED   hitachi_vsp0   44
fscsi4      Hitachi_VSP   ENABLED   hitachi_vsp0   44
fscsi5      Hitachi_VSP   ENABLED   hitachi_vsp0   44
fscsi6      Hitachi_VSP   ENABLED   hitachi_vsp0   44
fscsi7      Hitachi_VSP   ENABLED   hitachi_vsp0   44

Above you can see that fscsi0 and fscsi1, which the OS sees, are not seen by Veritas. How can I force them into Veritas? I have already tried rebooting the VIO server and the LPAR and that does not seem to help. FWIW, I deleted the disks that were in Defined state. Usually when MPIO is in use and we lose a path, deleting the disks and the virtual Fibre Channel adapter and running cfgmgr solves the issue, but that does not seem to help here.
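Because fscsi0 and fscsi1 are missing from vxdmpadm entirely (not just disabled), it is worth confirming first whether AIX itself builds any hdisks under those adapters; if it does not, the problem sits below VxVM (zoning, LUN masking, NPIV mapping) rather than in DMP. A generic checklist, nothing here specific to this environment:

# Rescan only the two quiet adapters and list their child devices
cfgmgr -l fscsi0
cfgmgr -l fscsi1
lsdev -p fscsi0
lsdev -p fscsi1

# Check adapter/protocol device attributes (fcs0 usually pairs with fscsi0)
lsattr -El fscsi0
fcstat fcs0

# If AIX does see the hdisks but VxVM does not, rediscover in VxVM/DMP
vxdisk scandisks
vxdctl enable
vxdmpadm listctlr all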
SFHACFS vs Hitachi-VSP & Host mode option 22

Hi! I'm installing several new SFHA CFS clusters, and during failover testing I ran into an annoying problem: when I fence off one node of the cluster, DMP logs a high number of path down/path up events, which in the end causes the disks to disconnect even on the other, active nodes. We found out that our disks had been exported without host mode option 22, so we fixed that on the storage side. Even after this, the clusters behaved the same. Later I read somewhere on the internet that it is a good idea to relabel the disks, so I requested new disks from storage and used vxevac to move onto the new disks. This fixed two of our clusters, but the other two still behave the same. Has anybody experienced anything similar? Do you know anything I can test or check on the servers to determine the difference? The only difference between the environments is that the misbehaving clusters have their disks mirrored from two storage systems, while the working ones have data disks from only one storage system.
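Since only the clusters that mirror across two VSP arrays misbehave, a reasonable first step is to diff the DMP view and tunables between a working and a non-working cluster, per enclosure. These are read-only checks apart from the commented settune example; the enclosure name is a placeholder and the tunable value is illustrative, not a recommendation:

# Compare DMP tunables between a good and a bad cluster
vxdmpadm gettune all

# List enclosures and check the I/O policy on each VSP enclosure
vxdmpadm listenclosure all
vxdmpadm getattr enclosure hitachi_vsp0 iopolicy

# Watch path state per enclosure while a node is being fenced
vxdmpadm getsubpaths enclosure=hitachi_vsp0

# Example only: slow down path restore probing if path flapping is the issue
# vxdmpadm settune dmp_restore_interval=300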
Listener resource remains faulted

Hello, we are doing some failure tests for a customer. We have VCS 6.2 running on Solaris 10. We have an Oracle database and, of course, the listener associated with it. We try to simulate different kinds of failures, one of which is killing the listener. In this situation the cluster notices that the listener has died and fails the service over to the other node, BUT the listener resource remains in FAULTED state on the original node, and the group it belongs to is left OFFLINE|FAULTED there. In this situation, if something goes wrong on the second node, the service will not fail back to the original node until we manually run hagrp -clear. Is there anything we can do to fix this, i.e. to have the clear done automatically?

Here are some lines from the log:

2015/03/30 17:26:10 VCS ERROR V-16-2-13067 (node2p) Agent is calling clean for resource(ora_listener-res) because the resource became OFFLINE unexpectedly, on its own.
2015/03/30 17:26:11 VCS INFO V-16-2-13068 (node2p) Resource(ora_listener-res) - clean completed successfully.
2015/03/30 17:26:11 VCS INFO V-16-1-10307 Resource ora_listener-res (Owner: Unspecified, Group: oracle_rg) is offline on node2p (Not initiated by VCS)

These say the clean for the resource completed successfully, but the resource is still faulted. If I run hares -clear manually, the fault goes away:

20150330-173628:root@node1p:~# hares -state ora_listener-res
#Resource          Attribute   System   Value
ora_listener-res   State       node1p   ONLINE
ora_listener-res   State       node2p   FAULTED
20150330-173636:root@node1p:~# hares -clear ora_listener-res
20150330-173653:root@node1p:~# hares -state ora_listener-res
#Resource          Attribute   System   Value
ora_listener-res   State       node1p   ONLINE
ora_listener-res   State       node2p   OFFLINE
20150330-173655:root@node1p:~#
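One approach that is sometimes used here (a sketch, not the official answer) is a resfault event trigger that clears the fault a short while after the failover. It assumes the standard trigger location /opt/VRTSvcs/bin/triggers and the documented argument order of system, resource and previous state; verify both against the VCS 6.2 Administrator's Guide. The script must exist and be executable on every node, and auto-clearing faults hides the event, so keep notification in place so someone still investigates why the listener died.

#!/bin/sh
# /opt/VRTSvcs/bin/triggers/resfault  (sketch)
# Assumed args per the admin guide: $1 = system, $2 = faulted resource, $3 = previous state
SYSTEM="$1"
RESOURCE="$2"

# Give the failover time to settle before clearing the fault
sleep 60

# Clear the fault only on the node where it occurred
/opt/VRTSvcs/bin/hares -clear "$RESOURCE" -sys "$SYSTEM"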
Have VCS send email to admin upon fault

We are looking for a way to have VCS send an e-mail alert to an admin when the cluster faults. Looking at some of the previous solutions, I have come across two possible approaches. First, I need to set up the traps to detect the fault:
https://www-secure.symantec.com/connect/forums/vcs-notifierhanotify
Then I need to create a resource to send an alert:
https://www-secure.symantec.com/connect/forums/alerting-feature-applicationha
Is this correct? I might need help with finding the correct trap status for the fault alert. Thanks
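For e-mail (as opposed to SNMP traps) the usual mechanism is the bundled NotifierMngr resource in the ClusterService group. A rough sketch of the manual configuration follows; the resource name, SMTP server and recipient are placeholders, the severity levels are Information, Warning, Error and SevereError, and the "=" association-attribute syntax is from my recollection of the VCS Administrator's Guide, so verify it for your version (there is also a notifier wizard that does the same thing).

haconf -makerw

# Add a notifier resource to the ClusterService group
hares -add ntfr NotifierMngr ClusterService
hares -modify ntfr SmtpServer "smtp.example.com"

# Mail the admin for events of severity Error and above
hares -modify ntfr SmtpRecipients "admin@example.com" = Error

hares -modify ntfr Enabled 1
haconf -dump -makero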
Application HA clustering has dropped a disk... :-/

Hi, after I completed the ApplicationHA clustering of SQL Server 2008 across two Windows 2008 R2 nodes, I found that the SQL installation's virtual backup disk (M: for reference) remained attached to the first node after I initiated a switch to the second node via the High Availability tab in the vSphere Client. The other disks re-attached to the second node OK. On closer inspection in Cluster Explorer on one of the nodes, I discovered that the mount point resource for that disk was completely missing! I attempted to create the mount point manually and brought the resource online in the cluster on the first node, but attempting the switch operation again failed because the virtual disk did not re-attach to the second node. How can I fix this without blatting the clustering / SQL installations? Thanks!
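The VCS command line is also available on the Windows nodes, so one hedged way to investigate is to check whether the M: mount resource exists and is linked beneath the resource that attaches the VMware disk. On ApplicationHA/SFW HA for Windows the disk attach is typically handled by a VMwareDisks-type resource and the drive letter by a MountV-type resource, but the resource names below are placeholders and the exact types should be confirmed in your own configuration:

REM Show resource states and the dependency tree for the group
hares -state
hares -dep

REM Check the mount path configured on the suspect mount resource
hares -display Backup_MountV -attribute MountPath

REM If the mount resource is not linked to the disk resource, link it so the
REM disk is attached before the drive letter is brought online
haconf -makerw
hares -link Backup_MountV Backup_VMwareDisks
haconf -dump -makero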