Fencing and Reservation Conflict
Hi all,

I have Red Hat Linux 5.9 64-bit with SFHA 5.1 SP1 RP4 and fencing enabled (our storage device is an IBM Storwize V3700 SFF, SCSI-3 compliant).

[root@mitoora1 ~]# vxfenadm -d

I/O Fencing Cluster Information:
================================

 Fencing Protocol Version: 201
 Fencing Mode: SCSI3
 Fencing SCSI3 Disk Policy: dmp
 Cluster Members:

        * 0 (mitoora1)
          1 (mitoora2)

 RFSM State Information:
        node 0 in state 8 (running)
        node 1 in state 8 (running)

In /etc/vxfenmode: scsi3_disk_policy=dmp and vxfen_mode=scsi3.

[root@mitoora1 ~]# vxdctl scsi3pr
scsi3pr: on

[root@mitoora1 etc]# more /etc/vxfentab
#
# /etc/vxfentab:
# DO NOT MODIFY this file as it is generated by the
# VXFEN rc script from the file /etc/vxfendg.
#
/dev/vx/rdmp/storwizev70000_000007
/dev/vx/rdmp/storwizev70000_000008
/dev/vx/rdmp/storwizev70000_000009

[root@mitoora1 etc]# vxdmpadm listctlr all
CTLR-NAME    ENCLR-TYPE      STATE      ENCLR-NAME
=====================================================
c0           Disk            ENABLED    disk
c10          StorwizeV7000   ENABLED    storwizev70000
c7           StorwizeV7000   ENABLED    storwizev70000
c8           StorwizeV7000   ENABLED    storwizev70000
c9           StorwizeV7000   ENABLED    storwizev70000

main.cf:

cluster drdbonesales (
        UserNames = { admin = hlmElgLimHmmKumGlj }
        ClusterAddress = "10.90.15.30"
        Administrators = { admin }
        UseFence = SCSI3
        )

I configured coordinator fencing, so I have 3 LUNs in a Veritas disk group (dmp coordinator). Everything seems to work fine, but I noticed a lot of reservation conflicts in the messages file on both nodes. In the server log I constantly see these messages:

/var/log/messages
Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:1:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 9:0:0:1: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 7:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:0:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 8:0:1:3: reservation conflict
Nov 26 15:14:09 mitoora2 kernel: sd 10:0:1:3: reservation conflict

Do you have any idea?

Best Regards
Vincenzo
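A useful first check when these messages appear is to compare the SCSI-3 registration keys actually present on the coordinator and data disks with what each node and each path should have registered. The commands below are a minimal sketch, assuming the 5.1-style vxfenadm options; the data-disk device name is illustrative, not taken from the post.

# List the keys registered on the coordinator disks named in /etc/vxfentab
vxfenadm -s all -f /etc/vxfentab

# Check the keys on one data disk of a fenced (UseFence=SCSI3) disk group
# (device name below is illustrative -- substitute a LUN from your data disk group)
vxfenadm -s /dev/vx/rdmp/storwizev70000_000010

# Each node should show one key per path to the disk; a path or initiator that
# has no key registered is a likely source of "reservation conflict" messages
# when it attempts I/O to a reserved LUN.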
VCS ERROR V-16-1-10600 Cannot connect to VCS engine

I have installed Symantec Storage Foundation 4.0 for Linux and Veritas Cluster Server 4.1 with the Oracle agents on Red Hat Enterprise Linux 4 update 8. The cluster has two nodes and everything worked perfectly. Then there was a power failure, and after that one node changed status to UNKNOWN in the log file. I tried to implement the recommendations in http://www.symantec.com/business/support/index?page=content&id=TECH54873 but it did not help. What should I do to start the other node (erpnode2)?
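When had reports V-16-1-10600 after an outage, the usual cause is that LLT/GAB did not come back up cleanly on that node, so the engine has nothing to join. A hedged sketch of the checks and restart sequence on the UNKNOWN node (seed count and init details vary by release):

# 1. Check that LLT sees both nodes and which links are up
lltstat -nvv | more

# 2. Check GAB membership; port a (GAB) and port h (had) should be listed
gabconfig -a

# 3. If LLT/GAB are not configured, bring them up from their config files
lltconfig -c                 # reads /etc/llttab
gabconfig -c -n2             # use the seed count from your /etc/gabtab

# 4. Start the VCS engine and verify
hastart
hastatus -sum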
solution needed for vxfen issue

There is a two-node cluster, and we split the cluster for an upgrade. The isolated node is not coming up because vxfen is not starting:

2013/02/01 11:52:05 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:52:20 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:52:35 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:52:50 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:53:05 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:53:20 VCS CRITICAL V-16-1-10031 VxFEN driver not configured. VCS Stopping. Manually restart VCS after configuring fencing

The I/O fencing configuration seems okay on node 2: /etc/vxfentab has the entries for the coordinator disks, /etc/vxfendg has the disk group entry vxfendg2, and the disks and disk group are visible too.

DEVICE                  TYPE          DISK  GROUP        STATUS
c0t5006016047201339d0s2 auto:sliced   -     (ossdg)      online
c0t5006016047201339d1s2 auto:sliced   -     (sybasedg)   online
c0t5006016047201339d2s2 auto:sliced   -     (vxfendg1)   online
c0t5006016047201339d3s2 auto:sliced   -     (vxfendg1)   online
c0t5006016047201339d4s2 auto:sliced   -     (vxfendg1)   online
c0t5006016047201339d5s2 auto:sliced   -     (vxfendg2)   online
c0t5006016047201339d6s2 auto:sliced   -     (vxfendg2)   online
c0t5006016047201339d7s2 auto:sliced   -     -            online
c2t0d0s2                auto:SVM      -     -            SVM
c2t1d0s2                auto:SVM      -     -            SVM

On checking vxfen.log:

Invoked S97vxfen. Starting
Fri Feb 1 11:50:37 CET 2013 starting vxfen..
Fri Feb 1 11:50:37 CET 2013 calling start_fun
Fri Feb 1 11:50:38 CET 2013 found vxfenmode file
Fri Feb 1 11:50:38 CET 2013 calling generate_vxfentab
Fri Feb 1 11:50:38 CET 2013 checking for /etc/vxfendg
Fri Feb 1 11:50:38 CET 2013 found /etc/vxfendg.
Fri Feb 1 11:50:38 CET 2013 calling generate_disklist
Fri Feb 1 11:50:38 CET 2013 Starting vxfen.. Done.
Fri Feb 1 11:50:38 CET 2013 starting in vxfen-startup
Fri Feb 1 11:50:38 CET 2013 calling regular vxfenconfig
Fri Feb 1 11:50:38 CET 2013 return value from above operation is 1
Fri Feb 1 11:50:38 CET 2013 output was VXFEN vxfenconfig ERROR V-11-2-1003 At least three coordinator disks must be defined
Log Buffer: 0xfffffffff4041090

refadm2-oss1{root} # cat /etc/vxfendg
vxfendg2

and there are only the two disks mentioned below in vxfendg2:

c0t5006016047201339d5s2 auto:sliced   -     (vxfendg2)   online
c0t5006016047201339d6s2 auto:sliced   -     (vxfendg2)   online

Is this due to having two disks in the coordinator disk group? Is it a known issue?
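The vxfenconfig error is explicit: a SCSI-3 coordinator disk group needs at least three disks (an odd number), and vxfendg2 contains only two. Below is a minimal sketch of adding a third disk, assuming the unassigned disk c0t5006016047201339d7s2 from the vxdisk output is SCSI-3 capable and already initialized; the disk media name vxfendg2_3 is hypothetical. It is worth verifying the new LUN with vxfentsthdw before relying on it.

# Temporarily import the coordinator disk group on one node
vxdg -tfC import vxfendg2

# The coordinator flag blocks configuration changes; turn it off first
vxdg -g vxfendg2 set coordinator=off

# Add a third SCSI-3 capable disk (disk media name is hypothetical)
vxdg -g vxfendg2 adddisk vxfendg2_3=c0t5006016047201339d7s2

# Re-enable the coordinator flag and deport the group again
vxdg -g vxfendg2 set coordinator=on
vxdg deport vxfendg2

# Restart fencing so /etc/vxfentab is regenerated with all three disks
/etc/init.d/vxfen start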
Agent failed in VCS

The agents are in FAILED status. Below are the messages from the engine_A.log file. Please let me know the cause of this issue.

VCS WARNING V-16-1-53025 Agent Script has faulted; ipm connection was lost; restarting the agent
VCS ERROR V-16-1-10015 Cannot start /opt/VRTSvcs/bin/Script/ScriptAgent please check file
VCS WARNING V-16-1-53025 Agent NIC has faulted; ipm connection was lost; restarting the agent
VCS ERROR V-16-1-10008 Agent NIC has faulted 6 times since
VCS ERROR V-16-1-10015 Cannot start /opt/VRTSvcs/bin/NIC/NICAgent please check file
VCS WARNING V-16-10001-4028 (unix) IP:Unix-G1-IP:monitor:Empty NetMask is supplied, default netmask will be used.
VCS WARNING V-16-1-10023 Agent DiskGroup not sending alive messages since
VCS WARNING V-16-1-53025 Agent DiskGroup has faulted; ipm connection was lost; restarting the agent
VCS ERROR V-16-1-10015 Cannot start /opt/VRTSvcs/bin/DiskGroup/DiskGroupAgent please check file
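V-16-1-10015 generally means had could not execute the agent binary it names: the file is missing, not executable, or the system is short on resources (memory, process slots). A minimal set of hedged checks, using the paths straight from the log; <node> is a placeholder for the affected system name.

# Confirm the agent binaries referenced in the log exist and are executable
ls -l /opt/VRTSvcs/bin/Script/ScriptAgent
ls -l /opt/VRTSvcs/bin/NIC/NICAgent
ls -l /opt/VRTSvcs/bin/DiskGroup/DiskGroupAgent

# Check how VCS sees each agent (state, faults, restart counters)
haagent -display NIC
haagent -display DiskGroup

# After fixing the underlying cause, restart a stopped agent on the affected node
haagent -start NIC -sys <node>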
Veritas Cluster Server heartbeat link down, jeopardy state

Hello everyone,

I am having a problem with the VCS heartbeat links. VCS 4.0 is running on Solaris 9 on Sun Fire V440 machines; I know it's old and EOL, I'm just hoping to pinpoint the solution to this problem. The heartbeat links run on two separate VLANs in a two-node cluster. Recently the old switch was taken out and a new Cisco 3750 switch was added. The switch shows the cables are connected and I can see link up from the switch side, but the links on ce4 of both servers are not coming up. Any ideas besides a faulty VLAN? How do I test the communication on that particular VLAN? Here are the results of various commands; any help is appreciated. Thank you!

On node2:

# lltstat -n
LLT node information:
Node           State     Links
  0 node1      OPEN      1
* 1 node2      OPEN      2

# lltstat -nvv | head
LLT node information:
Node          State    Link  Status  Address
  0 node1     OPEN     ce4   DOWN
                       ce6   UP      00:03:BA:94:F8:61
* 1 node2     OPEN     ce4   UP      00:03:BA:94:A4:6F
                       ce6   UP      00:03:BA:94:A4:71
  2           CONNWAIT ce4   DOWN

On node1:

# lltstat -n
LLT node information:
Node           State     Links
* 0 node1      OPEN      2
  1 node2      OPEN      1

# lltstat -nvv | head
LLT node information:
Node          State    Link  Status  Address
* 0 node1     OPEN     ce4   UP      00:03:BA:94:F8:5F
                       ce6   UP      00:03:BA:94:F8:61
  1 node2     OPEN     ce4   DOWN
                       ce6   UP      00:03:BA:94:A4:71
  2           CONNWAIT ce4   DOWN

# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   49c917 membership 01
Port a gen   49c917 jeopardy   ;1
Port h gen   49c91e membership 01
Port h gen   49c91e jeopardy   ;
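To test whether frames actually pass over the ce4 VLAN independently of LLT, the dlpiping utility shipped with the VRTSllt package can be used between the two nodes. A minimal sketch, using node1's ce4 MAC address as shown in the lltstat output above:

# On node1: run the dlpiping responder on the suspect interface
/opt/VRTSllt/dlpiping -s /dev/ce:4

# On node2: send test frames to node1's ce4 MAC address over the same interface
/opt/VRTSllt/dlpiping -c /dev/ce:4 00:03:BA:94:F8:5F

# If dlpiping gets no response, the problem is in the network path
# (VLAN, switch port, cabling, or NIC), not in the LLT configuration.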
VCS with VxVM - newgroup is auto-disabled in cluster

Hi guys, I'm new to the Veritas world. I'm trying to create an active/passive service group on a two-node VCS cluster with VxVM. I installed the SFCFS filesets on two AIX nodes and have a shared 5 GB disk on both nodes. I am able to deport the disk group, import it on the second node, and mount the file system. The error I'm getting is:

# hagrp -online newgroup -sys tiefaphap603
VCS WARNING V-16-1-10159 Group newgroup is auto-disabled in cluster. This can happen if group is not probed on all alive nodes in group's SystemList or VCS engine is not running on all alive nodes in group's SystemList.

# vi main.cf
include "OracleASMTypes.cf"
include "types.cf"
include "CFSTypes.cf"
include "Db2udbTypes.cf"
include "OracleTypes.cf"
include "SybaseTypes.cf"

cluster tstcls (
        UserNames = { admin = bklJknKikEkqGfhFhh }
        ClusterAddress = "10.68.73.180"
        Administrators = { admin }
        )

system tiefaphap603 (
        )

system tiefaphap604 (
        )

group ClusterService (
        SystemList = { tiefaphap603 = 0, tiefaphap604 = 1 }
        AutoStartList = { tiefaphap603, tiefaphap604 }
        OnlineRetryLimit = 3
        OnlineRetryInterval = 120
        )

        IP webip (
                Device = en0
                Address = "10.68.73.180"
                NetMask = "255.255.255.0"
                )

        NIC csgnic (
                Device = en0
                NetworkHosts @tiefaphap603 = { "10.68.73.205" }
                NetworkHosts @tiefaphap604 = { "10.68.73.184" }
                )

        webip requires csgnic

        // resource dependency tree
        //
        //      group ClusterService
        //      {
        //      IP webip
        //          {
        //          NIC csgnic
        //          }
        //      }

group newgroup (
        SystemList = { tiefaphap603 = 0, tiefaphap604 = 1 }
        AutoStartList = { tiefaphap603 }
        )

        DiskGroup data_dg (
                DiskGroup = clshareddg
                )

        Mount mnt (
                MountPoint = "/clsfs02"
                BlockDevice = "/dev/vx/dsk/clshareddg/clslv02"
                FSType = vxfs
                )

        mnt requires data_dg

        // resource dependency tree
        //
        //      group newgroup
        //      {
        //      Mount mnt
        //          {
        //          DiskGroup data_dg
        //          }
        //      }
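V-16-1-10159 means the group's resources have not yet been probed on every running node in its SystemList, or had is not running on one of those nodes. A few hedged checks, using the system names from the main.cf above:

# Confirm the engine is RUNNING on both systems in the SystemList
hasys -state

# See whether newgroup's resources are still unprobed anywhere
hastatus -sum

# Probe the resources explicitly on the node where they were not probed
hares -probe data_dg -sys tiefaphap604
hares -probe mnt -sys tiefaphap604

# If tiefaphap604 is down on purpose, the group can be re-enabled manually
hagrp -autoenable newgroup -sys tiefaphap604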
Service group does not fail over to another node on forced power down

VCS 6.0.1. Hi, I have configured a two-node cluster with local storage, running two service groups. Both run fine and I can switch them over to either node, but when I forcibly power down the node where both service groups are active, only one service group fails over to the other node; the one running the Apache resource faults and does not fail over. Below are the contents of the main.cf file:

# cat /etc/VRTSvcs/conf/config/main.cf
include "OracleASMTypes.cf"
include "types.cf"
include "Db2udbTypes.cf"
include "OracleTypes.cf"
include "SybaseTypes.cf"

cluster mycluster (
        UserNames = { admin = IJKcJEjGKfKKiSKeJH, root = ejkEjiIhjKjeJh }
        ClusterAddress = "192.168.25.101"
        Administrators = { admin, root }
        )

system server3 (
        )

system server4 (
        )

group ClusterService (
        SystemList = { server3 = 0, server4 = 1 }
        AutoStartList = { server3, server4 }
        OnlineRetryLimit = 3
        OnlineRetryInterval = 120
        )

        IP webip (
                Device = eth0
                Address = "192.168.25.101"
                NetMask = "255.255.255.0"
                )

        NIC csgnic (
                Device = eth0
                )

        webip requires csgnic

        // resource dependency tree
        //
        //      group ClusterService
        //      {
        //      IP webip
        //          {
        //          NIC csgnic
        //          }
        //      }

group httpsg (
        SystemList = { server3 = 0, server4 = 1 }
        AutoStartList = { server3, server4 }
        OnlineRetryLimit = 3
        OnlineRetryInterval = 15
        )

        Apache apachenew (
                httpdDir = "/usr/sbin"
                ConfigFile = "/etc/httpd/conf/httpd.conf"
                )

        IP ipresource (
                Device = eth0
                Address = "192.168.25.102"
                NetMask = "255.255.255.0"
                )

        apachenew requires ipresource

        // resource dependency tree
        //
        //      group httpsg
        //      {
        //      Apache apachenew
        //          {
        //          IP ipresource
        //          }
        //      }

The engine log while the power-down occurs says:

2013/03/12 16:33:02 VCS INFO V-16-1-10077 Received new cluster membership
2013/03/12 16:33:02 VCS NOTICE V-16-1-10112 System (server3) - Membership: 0x1, DDNA: 0x0
2013/03/12 16:33:02 VCS ERROR V-16-1-10079 System server4 (Node '1') is in Down State - Membership: 0x1
2013/03/12 16:33:02 VCS ERROR V-16-1-10322 System server4 (Node '1') changed state from RUNNING to FAULTED
2013/03/12 16:33:02 VCS NOTICE V-16-1-10449 Group httpsg autodisabled on node server4 until it is probed
2013/03/12 16:33:02 VCS NOTICE V-16-1-10449 Group VCShmg autodisabled on node server4 until it is probed
2013/03/12 16:33:02 VCS NOTICE V-16-1-10446 Group ClusterService is offline on system server4
2013/03/12 16:33:02 VCS NOTICE V-16-1-10446 Group httpsg is offline on system server4
2013/03/12 16:33:02 VCS ERROR V-16-1-10205 Group ClusterService is faulted on system server4
2013/03/12 16:33:02 VCS NOTICE V-16-1-10446 Group ClusterService is offline on system server4
2013/03/12 16:33:02 VCS INFO V-16-1-10493 Evaluating server3 as potential target node for group ClusterService
2013/03/12 16:33:02 VCS INFO V-16-1-10493 Evaluating server4 as potential target node for group ClusterService
2013/03/12 16:33:02 VCS INFO V-16-1-10494 System server4 not in RUNNING state
2013/03/12 16:33:02 VCS NOTICE V-16-1-10301 Initiating Online of Resource webip (Owner: Unspecified, Group: ClusterService) on System server3
2013/03/12 16:33:02 VCS WARNING V-16-1-11141 LLT heartbeat link status changed. Previous status =eth1, UP; Current status =eth1, DOWN.
2013/03/12 16:33:02 VCS INFO V-16-6-15015 (server3) hatrigger:/opt/VRTSvcs/bin/triggers/sysoffline is not a trigger scripts directory or can not be executed
2013/03/12 16:33:14 VCS INFO V-16-1-10298 Resource webip (Owner: Unspecified, Group: ClusterService) is online on server3 (VCS initiated)
2013/03/12 16:33:14 VCS NOTICE V-16-1-10447 Group ClusterService is online on system server3

As per the above log, the default ClusterService group fails over to the surviving node, but the httpsg group does not. Please suggest. Thanks.
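One thing the log shows is that httpsg is never even evaluated for failover to server3, which usually points to a pre-existing fault, a frozen state, or an auto-disable for that group on the surviving node rather than to the crash itself. A few hedged checks and the standard way to clear a recorded fault:

# Check the current state of httpsg and its resources on the surviving node
hastatus -sum
hagrp -state httpsg
hares -state apachenew

# If a previous fault of the Apache resource is recorded on server3, clear it
hagrp -clear httpsg -sys server3

# Then try to bring the group up on the surviving node
hagrp -online httpsg -sys server3

# If the online fails, the Apache agent log shows the real reason
tail -f /var/VRTSvcs/log/Apache_A.log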
Unable to use VCS commands after installation of VCS

Hi, after installing VCS 6.0 on Solaris 10, I am unable to run even the hastatus -sum command; it says "command not found". However, llttab shows the following output:

bash-3.00# cat /etc/llttab
set-node papa
set-cluster 37074
link e1000g0 /dev/e1000g:0 - ether - -
link e1000g1 /dev/e1000g:1 - ether - -
link-lowpri e1000g2 /dev/e1000g:2 - ether - -

bash-3.00# cat /etc/gabtab
/sbin/gabconfig -c -n1

Checked with pkginfo:

bash-3.00# pkginfo |grep vcs
system      VRTSvcs       Veritas Cluster Server by Symantec
system      VRTSvcsag     Veritas Cluster Server Bundled Agents by Symantec
system      VRTSvcsea     Veritas High Availability Enterprise Agents by Symantec

bash-3.00# df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c1t0d0s0       14G   2.3G    12G    17%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   2.2G   1.1M   2.2G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/dev/dsk/c1t0d0s5      9.6G   4.3G   5.3G    45%    /usr
/usr/lib/libc/libc_hwcap1.so.1  9.6G   4.3G   5.3G    45%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/dsk/c1t0d0s3      1.9G    93M   1.7G     5%    /var
swap                   2.2G    44K   2.2G     1%    /tmp
swap                   2.2G    32K   2.2G     1%    /var/run
/dev/dsk/c1t0d0s4      2.9G   819M   2.0G    29%    /opt
/dev/dsk/c1t0d0s6      103M   1.0M    92M     2%    /temp
/dev/dsk/c1t0d0s7       18G    19M    18G     1%    /export/home
/vol/dev/dsk/c0t0d0/sol_10_910_x86  2.0G   2.0G     0K   100%    /cdrom/sol_10_910_x86
/dev/odm                 0K     0K     0K     0%    /dev/odm
/dev/vx/dsk/datadg/datavol  200M   103M    91M    54%    /pictor

bash-3.00# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
disk_0       auto:none       -            -            online invalid
disk_1       auto:none       -            -            online invalid
disk_2       auto:cdsdisk    datadg01     datadg       online
disk_3       auto:cdsdisk    datadg02     datadg       online

Please help.
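The packages are clearly installed, so "command not found" most likely just means /opt/VRTSvcs/bin is not in root's PATH. A minimal sketch:

# Run it once with the full path to confirm the binary is present
/opt/VRTSvcs/bin/hastatus -sum

# Add the VCS and SF command directories to root's PATH (e.g. in ~/.profile)
PATH=$PATH:/opt/VRTSvcs/bin:/opt/VRTS/bin
export PATH

hastatus -sum

If hastatus then reports that it cannot connect to the engine, the cluster simply has not been started yet; hastart on each node should bring it up.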
Error During VCS configuration

Hi all,

I have set up a VCS lab on my own laptop. The environment is as follows:

- VMware Workstation 8.0
- 2 VMs with 1.5 GB RAM each, running Solaris 10 U9 64-bit, with all prerequisites such as 3 NICs: 1 for the public interface, 2 for the private interconnects (i.e. for heartbeats/I/O fencing)
- Authentication without root password
- For shared storage, one more VM with Solaris 10 presents a few disks to the two-node cluster over the iSCSI protocol

I was able to install x86 VCS 5.1 on this two-node cluster successfully, but when I try to configure the cluster using ./installvcs -configure I get the following error:

Veritas Cluster Server 5.1 SP1 Configure Program
prod dev

  1)  Configure heartbeat links using LLT over Ethernet
  2)  Configure heartbeat links using LLT over UDP
  3)  Automatically detect configuration for LLT over Ethernet
  b)  Back to previous menu

How would you like to configure heartbeat links? [1-3,b,q,?] (3) 1

Discovering NICs on prod ...................... Discovered e1000g0 e1000g1 e1000g2 e1000g3

Enter the NIC for the first private heartbeat link on prod: [b,q,?] (e1000g2)
Would you like to configure a second private heartbeat link? [y,n,q,b,?] (n) y
Enter the NIC for the second private heartbeat link on prod: [b,q,?] (e1000g3)
Would you like to configure a third private heartbeat link? [y,n,q,b,?] (n)
Do you want to configure an additional low-priority heartbeat link? [y,n,q,b,?] (n)
Are you using the same NICs for private heartbeat links on all systems? [y,n,q,b,?] (y)

Can't use string ("VCS51") as a HASH ref while "strict refs" in use at ../scripts/CPIP/Prod/VCS51.pm line 3870, <STDIN> line 10.

I tried to google the above problem but could not find the answer. If anybody has encountered and resolved this error, please help me resolve it. If any further information is required, please reply.

Thanks & Regards
Anish
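The installer dies in its own Perl code before it writes any cluster configuration. If re-running a patched/current copy of the installer does not help, one possible workaround is to lay down the LLT and GAB configuration by hand, as the install guide describes, and then start the stack. This is a sketch only: the node names prod/dev are taken from the installer output above, but the cluster ID 100 is a placeholder and must be unique on your network; repeat the steps on the second node with set-node adjusted.

# /etc/llthosts -- node IDs and names, identical on both nodes
cat > /etc/llthosts <<'EOF'
0 prod
1 dev
EOF

# /etc/llttab -- shown for "prod"; change set-node on "dev"
cat > /etc/llttab <<'EOF'
set-node prod
set-cluster 100
link e1000g2 /dev/e1000g:2 - ether - -
link e1000g3 /dev/e1000g:3 - ether - -
EOF

# /etc/gabtab -- seed when both nodes are up
cat > /etc/gabtab <<'EOF'
/sbin/gabconfig -c -n2
EOF

# Start LLT and GAB and verify membership before configuring main.cf
lltconfig -c
sh /etc/gabtab
gabconfig -a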
lltconfig ERROR V-14-2-15121 LLT unconfigure aborted

Hello,

I have a two-node cluster with one CP server and two fencing disks, but I think something is wrong in my fencing configuration.

OS: RHEL 6.6
VCS: 6.2.1

# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   816e15 membership 01
Port a gen   816e15 jeopardy   ;1
Port b gen   816e1b membership 01
Port b gen   816e1b jeopardy   ;1
Port h gen   816e1e membership 01
Port h gen   816e1e jeopardy   ;1

# lltconfig -a list
Link 0 (bond0):
        Node   0 Node01   : 00:XX:XX:XX:XX:YY
        Node   1 Node02   : 00:XX:XX:XX:XX:YX permanent

# cat /etc/llttab
set-node Node02
set-cluster 229
link bond0 bond0 - ether - -
link eth2 eth-00:11:AA:77:BB:22 - ether - -

I don't have an eth2 on this server, so I tried to remove the second line of /etc/llttab and stop and restart LLT:

# /etc/init.d/llt stop
Stopping LLT:
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 4 port(s)
LLT:Warning: lltconfig failed. Retrying [1]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 4 port(s)
LLT:Warning: lltconfig failed. Retrying [2]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 4 port(s)
LLT:Warning: lltconfig failed. Retrying [3]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 4 port(s)
LLT:Warning: lltconfig failed. Retrying [4]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 4 port(s)
LLT:Warning: lltconfig failed. Retrying [5]
LLT:Error: lltconfig failed

Any suggestions will be welcome. Thanks.

Cédric
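V-14-2-15121 simply means LLT still has clients registered (the GAB ports a, b and h visible in the gabconfig output above), so it refuses to unconfigure. The usual approach is to stop the stack top-down on that node before editing /etc/llttab, then bring it back up. A hedged sketch for a RHEL 6 / VCS 6.2 node:

# Stop the stack top-down on the node whose llttab needs editing
hastop -local              # stops the engine (port h); add -evacuate to migrate groups first
/etc/init.d/vxfen stop     # stop I/O fencing (port b)
/etc/init.d/amf stop       # stop AMF if it is running
/etc/init.d/gab stop       # stop GAB (port a)
/etc/init.d/llt stop       # LLT now has no registered ports and unconfigures cleanly

# Edit /etc/llttab (remove the bogus eth2 link), then restart bottom-up
/etc/init.d/llt start
/etc/init.d/gab start
/etc/init.d/vxfen start
hastart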