Veritas cluster MultiNICA, IP Conservation Mode (ICM), does not recover correctly
Hi all, I have a VCS 5.1 cluster on Red Hat (latest release and patches) with two network interfaces, eth0 and eth4, configured as a MultiNICA resource in IP Conservation Mode (these are external IPs and we do not want to pay for more addresses). With simulated addresses the layout is:

machine1 eth0 has 192.168.1.10 (in the Red Hat network configuration this eth0 is assigned 192.168.1.10)
machine1 eth4 has 192.168.1.10 (in the Red Hat network configuration this eth4 has no IP)
machine2 eth0 has 192.168.1.11 (in the Red Hat network configuration this eth0 is assigned 192.168.1.11)
machine2 eth4 has 192.168.1.11 (in the Red Hat network configuration this eth4 has no IP)
The service IP is 192.168.1.12.

Today I did some failover testing with the service running. Unplugging eth0 on machine1, the service and the IP moved to eth4 - OK. Then unplugging eth4 on machine1, the service faulted and moved to machine2 - OK. But after plugging eth0 and eth4 back in on machine1, the resource keeps reporting a fault. This is the message:

MultiNICA:nic205:monitor:On eth0 current statistics 0 are same as that of Previous statistics 0

When I log in to the machine and run ifconfig -a, no interface has an IP configured at all (if I do a network restart the system goes back to OK; with Performance Mode the system recovers fine and I do not need to restart the network). I do not want to use the ip command to configure the addresses, which is why I set Options "broadcast 192.168.1.255" - I do not want the default interface configuration cleared. Any idea why this is happening? Regards!!
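
For reference, a minimal sketch of what an IP Conservation Mode MultiNICA definition typically looks like in main.cf. The resource name nic205 is taken from the log message above; the netmask value and exact attribute layout are assumptions, so check them against your own configuration. Giving every device the same base address per system is what selects ICM, whereas a unique base address per device would select Performance Mode:

MultiNICA nic205 (
    Device @machine1 = { eth0 = "192.168.1.10", eth4 = "192.168.1.10" }
    Device @machine2 = { eth0 = "192.168.1.11", eth4 = "192.168.1.11" }
    NetMask = "255.255.255.0"
    Options = "broadcast 192.168.1.255"
    )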

We have a 2-node cluster on version 5.1. We experienced an outage and I think it was due to the error messages below. Can someone shed some light on these messages?

qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Loop OFFLINE
qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Loop ONLINE
fctl: [ID 999315 kern.warning] WARNING: fctl(4): AL_PA=0xe8 doesn't exist in LILP map
scsi: [ID 107833 kern.warning] WARNING: /pci@0,600000/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w203400a0b875f9d9,0 (ssd3): Command failed to complete...Device is gone
scsi: [ID 107833 kern.warning] WARNING: /pci@0,600000/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w203400a0b875f9d9,0 (ssd3): Command failed to complete...Device is gone
scsi: [ID 107833 kern.warning] WARNING: /pci@0,600000/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w203400a0b875f9d9,0 (ssd3): Command failed to complete...Device is gone
scsi: [ID 243001 kern.info] /pci@0,600000/pci@0/pci@9/SUNW,qlc@0/fp@0,0 (fcp4): offlining lun=0 (trace=0), target=e8 (trace=2800004)
vxdmp: [ID 631182 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 removed disk array 600A0B800075F9D9000000004D2334F5, datype = ST2540-
vxdmp: [ID 443116 kern.notice] NOTICE: VxVM vxdmp V-5-0-0 i/o error occured (errno=0x6) on dmpnode 334/0x2c
last message repeated 59 times
vxdmp: [ID 480808 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x18 belonging to the dmpnode 334/0x28 due to open failure
vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 334/0x28

What does "dmpnode 334/0x28" signify? I forget how to map this back to a device; I only remember that it is in hexadecimal. Also, what could be the cause? Is it the HBA, since the issue starts with messages like these?

qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Loop OFFLINE
qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Loop ONLINE
fctl: [ID 999315 kern.warning] WARNING: fctl(4): AL_PA=0xe8 doesn't exist in LILP map
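
As a rough illustration of how a dmpnode major/minor pair can be mapped back to a device name on Solaris - the numbers come from the log above, the grep pattern depends on the exact spacing of your ls output, and the dmpnode name in the last command is a placeholder:

printf "%d\n" 0x28                                    # minor number in decimal -> 40 (bash shown)
ls -lL /dev/vx/rdmp | grep "334, *40"                 # match major 334, minor 40 to a DMP device node
vxdmpadm getsubpaths dmpnodename=<dmp_device_name>    # then list the paths behind that DMP node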

VCS Error Codes for all platforms

Hello gents, do we have a list of all error codes for VCS? Also, are the error codes generic and common to all platforms (including Linux, Solaris, Windows and AIX)? I need this confirmation urgently, as I am planning to design a common monitoring agent. Best Regards, Nimish
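
For what it's worth, VCS engine-log messages carry a unique message identifier (UMI) of the form V-16-category-message, and these identifiers are intended to stay stable across platforms even where the surrounding text differs. A quick, hedged way to see which codes a node is actually producing (GNU grep assumed; the log path is the default location and may differ on your install):

grep -o 'V-16-[0-9]*-[0-9]*' /var/VRTSvcs/log/engine_A.log | sort | uniq -c | sort -rn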

Best practices for failover without heartbeat

I don't think there is a solution to this, because it would violate HA, but I need to ask the question.

Prerequisite: the HA setup is up and running.

Steps:
1) Unplug the heartbeat cable (eth1).
2) Unplug/fail over the primary system.

The secondary system does not bring the service up without manually enabling the autodisabled group against the faulted primary. They want the secondary system to come up automatically. Is this possible? Is it recommended?
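
A brief note on the mechanics, with a hedged example: when the last heartbeat link is lost, VCS cannot tell a dead node from a network partition, so it autodisables the group for the missing node rather than risk a split brain. Bringing the group online elsewhere then requires an operator to confirm the node really is down and clear the flag; the group and system names below are placeholders:

hagrp -autoenable appsg -sys node1      # only after confirming node1 is truly down

Rather than automating that step, the usual recommendation is to add a low-priority heartbeat over the public network (a "link-lowpri" entry in /etc/llttab) and/or I/O fencing, so that a single unplugged cable never leaves the cluster with zero heartbeats.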

SG is not switching to next node.

Hi all, I am new to VCS but good with HACMP. In our environment we are using VCS 6.0. On one server we found that the SG does not move from one node to the other when we try a manual failover with the command below:

hagrp -switch <SGname> -to <sysname>

We can see that the SG goes offline on the current node, but it does not come online on the secondary node. There is no error logged in engine_A.log except the entry below:

cpus load more than 60% <Secondary node name>

Can anyone help me find a solution for this? I will provide the output of any commands if you need more info to help troubleshoot. :) Thanks,
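
A few hedged first checks when a switch takes the group offline but never brings it online on the target - group and node names are placeholders, and the load message in the log suggests a load- or limits-based failover policy may be blocking the online:

hastatus -sum                                    # overall state of groups and resources on both nodes
hagrp -state appsg                               # per-system state of the group
hares -state | grep -i node2                     # look for a resource stuck or FAULTED on the target node
hagrp -display appsg -attribute FailOverPolicy   # check for Load/Capacity or Limits/Prerequisites policies
hagrp -clear appsg -sys node2                    # clear leftover faults before retrying the switch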

Fencing and Reservation Conflict

Hi all, I have Red Hat Linux 5.9 64-bit with SFHA 5.1 SP1 RP4 and fencing enabled (our storage device is an IBM Storwize V3700 SFF, SCSI-3 compliant).

[root@mitoora1 ~]# vxfenadm -d
I/O Fencing Cluster Information:
================================
Fencing Protocol Version: 201
Fencing Mode: SCSI3
Fencing SCSI3 Disk Policy: dmp
Cluster Members:
* 0 (mitoora1)
  1 (mitoora2)
RFSM State Information:
node 0 in state 8 (running)
node 1 in state 8 (running)

In /etc/vxfenmode: scsi3_disk_policy=dmp and vxfen_mode=scsi3

vxdctl scsi3pr
scsi3pr: on

[root@mitoora1 etc]# more /etc/vxfentab
#
# /etc/vxfentab:
# DO NOT MODIFY this file as it is generated by the
# VXFEN rc script from the file /etc/vxfendg.
#
/dev/vx/rdmp/storwizev70000_000007
/dev/vx/rdmp/storwizev70000_000008
/dev/vx/rdmp/storwizev70000_000009

[root@mitoora1 etc]# vxdmpadm listctlr all
CTLR-NAME   ENCLR-TYPE      STATE     ENCLR-NAME
=====================================================
c0          Disk            ENABLED   disk
c10         StorwizeV7000   ENABLED   storwizev70000
c7          StorwizeV7000   ENABLED   storwizev70000
c8          StorwizeV7000   ENABLED   storwizev70000
c9          StorwizeV7000   ENABLED   storwizev70000

main.cf:
cluster drdbonesales (
    UserNames = { admin = hlmElgLimHmmKumGlj }
    ClusterAddress = "10.90.15.30"
    Administrators = { admin }
    UseFence = SCSI3
    )

I configured coordinator fencing, so I have 3 LUNs in a Veritas disk group (DMP coordinator). Everything seems to work fine, but I noticed a lot of reservation conflicts in the messages on both nodes. In the server log I constantly see these messages:

/var/log/messages
Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:3: reservation conflict

Do you have any idea? Best Regards, Vincenzo
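
As a hedged starting point for investigating the conflicts, it can help to confirm which SCSI-3 registration keys are actually present on the coordinator disks - each cluster node should have its own key on every coordinator LUN. The commands below use the device names from /etc/vxfentab above; on 5.1 the key-listing option should be -s (older releases used different flags), so verify against your vxfenadm man page:

vxfenadm -s /dev/vx/rdmp/storwizev70000_000007    # keys registered on one coordinator disk
vxfenadm -s all -f /etc/vxfentab                  # keys on all coordinator disks in one pass

Reservation-conflict lines in /var/log/messages often come from something on the host (multipath tooling, monitoring, or a device scan) touching LUN paths that the other node holds a reservation on, so comparing the sd numbers in the messages against the paths behind each DMP node (vxdmpadm getsubpaths) is usually the next step.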

VCS failovers and copies the crontabs

Hello, I am using VCS on Oracle M9000 machines in a three-node cluster. The question: when I fail services over from one node to another, I want all the crontabs to be copied to the other live node as well, which does not seem to be working in my domain at the moment. Can you please help me out with where to define this "copy cron" procedure, so that every time one environment fails over to another node it also carries the same crontab from the previous system? Alternatively, is there a procedure that copies the crontabs of every user to all cluster nodes daily? I need to know if this can be configured in VCS. All useful replies are welcome. Best Regards, Mohammad Ali Sarwar
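
One way this is sometimes handled is with a VCS postonline trigger that installs the right crontab whenever the group comes online on a node. The sketch below is only illustrative: the group name app_sg, the user appuser and the shared path /shared/appcron are made-up placeholders, and the script has to be placed at /opt/VRTSvcs/bin/triggers/postonline on every node:

#!/bin/sh
# /opt/VRTSvcs/bin/triggers/postonline -- VCS passes the system and group names as arguments
SYS=$1
GROUP=$2
if [ "$GROUP" = "app_sg" ] && [ -f /shared/appcron/appuser.cron ]; then
    # install the crontab kept on shared storage for this service group (Solaris syntax)
    su - appuser -c "crontab /shared/appcron/appuser.cron"
fi
exit 0

A simpler alternative is a daily rsync or scp of /var/spool/cron/crontabs between the nodes, outside of VCS.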

Veritas Cluster -engine_A.log

Getting the following errors in engine_A.log:

2013/05/31 14:16:12 VCS INFO V-16-1-10307 Resource cvmvoldg5 (Owner: unknown, Group: lic_DG) is offline on dwlemm2b (Not initiated by VCS)
2013/05/31 14:16:14 VCS INFO V-16-6-15004 (dwlemm2b) hatrigger:Failed to send trigger for resfault; script doesn't exist
2013/05/31 14:16:15 VCS NOTICE V-16-10001-5510 (dwlemm2b) CFSMount:cfsmount5:offline:Attempting fuser TERM : Mount Point : /var/opt/sentinel
2013/05/31 14:16:18 VCS NOTICE V-16-10001-5510 (dwlemm2b) CFSMount:cfsmount2:offline:Attempting fuser TERM : Mount Point : /var/opt/BGw/ServerGroup1
2013/05/31 15:11:51 VCS INFO V-16-1-10307 Resource fmm (Owner: unknown, Group: FMMgrp) is offline on dwlemm2a (Not initiated by VCS)

and many more similar kinds of errors. Does anyone know the cause of any of these?
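
For the trigger line specifically, a small hedged check: the V-16-6-15004 message simply says hatrigger went looking for the resfault trigger script and did not find it, which is usually informational unless you rely on that trigger. The directory below is the default location and may differ on your installation:

ls -l /opt/VRTSvcs/bin/triggers/                        # resfault, nofailover, postonline, ... live here if configured
grep -c 'V-16-1-10307' /var/VRTSvcs/log/engine_A.log    # how often resources went offline outside of VCS control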

Freeze SG or Freeze Node ?

Hi there, I have a scenario where applications are under cluster control (2-node cluster). I had an issue in the past where hastop -local was executed before the applications were brought down manually, which resulted in the app SG going into a faulted state. My question: should I freeze the service groups individually, or should I freeze the nodes, while performing OS patch maintenance? Does freezing a node serve the same purpose as freezing the SGs on that particular node? Regards, Saurabh
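
A hedged sketch of both options - the group and node names are placeholders, and -persistent keeps the freeze across a restart of HAD, which is usually what you want for patching (persistent freezes need the configuration writable):

haconf -makerw
hagrp -freeze app_sg -persistent              # option 1: freeze one service group
hasys -freeze -persistent -evacuate node1     # option 2: freeze the whole node (-evacuate moves its groups off first)
haconf -dump -makero

# after patching, reverse whichever one you used (again with the configuration writable):
hagrp -unfreeze app_sg -persistent
hasys -unfreeze -persistent node1

The two are close but not identical: a node freeze prevents groups from being brought online on, or switched to, that node, while a group freeze stops VCS from reacting to that group wherever it runs. Many sites freeze the groups when the work is application-side and freeze the node when the whole box is being patched or rebooted.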

Heartbeat timeout value

Hi, recently one of the cluster nodes got rebooted because all heartbeat networks were down (due to some changes on the switch; it took about 60 seconds). We informed the network team about the reboot, and they in turn suggested changing the heartbeat timeout value to 60 seconds. Requesting your help - is it advisable to change the heartbeat timeout value to 60 seconds? I think the default value is 15 seconds. If we change the value from the default, what are the consequences? Please advise. Divakar
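
For context, a hedged illustration of where that timer lives: the value in question is the LLT peer-inactivity timeout, expressed in hundredths of a second (the shipped default on most releases is 1600, i.e. 16 seconds). The 6000 below simply mirrors the 60-second suggestion and is not a recommendation - raising it also delays detection of a genuinely dead node by the same amount:

lltconfig -T query                 # show the current LLT timer values on this node
lltconfig -T peerinact:6000        # set peer inactivity to 60 s at runtime (repeat on every node)
# to make it permanent, add the equivalent line to /etc/llttab on each node:
# set-timer peerinact:6000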