I/O fencing disk test
Hello,

While testing the disk to be used for fencing, prior to configuration, I am prompted for the password for both nodes over and over again. Kindly advise.

vxfentsthdw -r -w /dev/hdiskpower2 -n use /usr/bin/rsh

VERITAS vxfentsthdw version 5.1.1.0 AIX

The utility vxfentsthdw works on the two nodes of the cluster.
The utility verifies that the shared storage one intends to use is
configured to support I/O fencing. It issues a series of vxfenadm
commands to setup SCSI-3 registrations on the disk, verifies the
registrations on the disk, and removes the registrations from the disk.

The logfile generated for vxfentsthdw is /var/VRTSvcs/log/vxfen/vxfentsthdw.log.1

Enter the first node of the cluster: ibwebprod
root@ibwebprod's password:
Enter the second node of the cluster: ibwebstdby
root@ibwebstdby's password:
root@ibwebprod's password:

Solved, 625 views, 6 likes, 2 comments

Can LLT heartbeats communicate between NICs with different device names?
We have a 2-node VCS cluster in which the heartbeat NICs are eth2 and eth3 on each node. If eth2 on node1 goes down and eth3 on node2 goes down, does this mean that both heartbeat links are down and the cluster is in a split-brain situation? Can LLT heartbeats communicate between NIC eth2 and NIC eth3?

The VCS Installation Guide requires the 2 heartbeat links to be in different networks, so we should put eth2 of both nodes in one VLAN (VLAN1) and eth3 of both nodes in another VLAN (VLAN2). In that situation heartbeats cannot communicate between eth2 and eth3.

However, in a production cluster we found that all 4 NICs (eth2 and eth3 of both nodes) are in the same VLAN, which led me to post this question: if eth2 on node1 goes down and eth3 on node2 goes down, what will happen to the cluster (which is in active-standby mode)?

Thanks!

Solved, 1.6K views, 5 likes, 5 comments

Failed to execute postonline trigger
The postonline trigger fails to execute when a resource is brought online. I get the error below.

2015/02/27 01:25:47 VCS INFO V-16-6-15003 (dsacl2ip) hatrigger:Failed to execute /opt/VRTSvcs/bin/triggers/postonline dsacl2ip SG-APP-TM-003
2015/02/27 01:29:27 VCS INFO V-16-6-15003 (dsacl2ip) hatrigger:Failed to execute /opt/VRTSvcs/bin/triggers/postonline dsacl2ip SG-APP-TM-003

I manually executed the command

hatrigger -postonline 0 dsacl2ip SG-APP-TM-003

but got the same error again. Please let me know what the issue could be. Below is my script.

ccmId="@(#)%name: postonline % %version: 5 %, %date_modified: Thu Feb 19 12:13:59 2015 %"
#####################################################
# Configure required IP addresses for TM in this section
#####################################################
export TMVIRTIP=10.225.230.57
export TMIPADDRS="
10.15.14.139
10.15.15.137
10.215.17.0
"
#####################################################
# Configure required IP addresses for APP in this section
#####################################################
export RMVIRTIP=10.225.238.58
export RMIPADDRS="
192.168.17.4
"
#####################################################
#
# DO NOT CHANGE ANYTHING BELOW THIS LINE
#
#####################################################
#####################################################
if [ -z "$VCS_HOME" ]; then
    VCS_HOME=/opt/VRTSvcs
fi
#####################################################
# Source VCS helper script
if [ -r $VCS_HOME/bin/ag_i18n_inc.sh ]; then
    . $VCS_HOME/bin/ag_i18n_inc.sh
else
    function VCSAG_LOG_MSG {
        msgtype=$1
        msgtxt="$2"
        category="$3"
        case $msgtype in
            E) msgtype="ERROR:";;
            W) msgtype="WARNING:";;
            I) msgtype="INFO:";;
        esac
        echo "`date +'%m/%d %T'`: $msgtype $msgtxt ($category)"
    }
fi
#####################################################
# Check if relevant arguments were provided
system=$1
group=$2
if [ -z "$system" ]; then
    VCSAG_LOG_MSG "W" "Failed to continue; undefined system name" 15028
    exit
elif [ -z "$group" ]; then
    VCSAG_LOG_MSG "W" "Failed to continue; undefined group name" 15031
    exit
fi
#####################################################
# Determine the interface that the APP virtual IP is bound to
function getInterface {
    /bin/netstat -ie | grep -B1 "$1" | sed -ne "s/\(^$2.*\) *Link.*/\1/p"
}
#####################################################
function addRoutes {
    typeset device=$1
    typeset gwAddr=$2
    shift 2
    typeset maskAddr
    for ipAddr in $*; do
        if [ "${ipAddr##*.}" = "0" ];then
            maskAddr=255.255.255.0
        else
            maskAddr=255.255.255.255
        fi
        echo /sbin/route add -net $ipAddr netmask $maskAddr gw $gwAddr dev $device
    done
}
#####################################################
case $group in
    SG-APP-TM-003)
        virtInterface=$(getInterface $TMVIRTIP bond)
        if [ -z "$virtInterface" ]; then
            VCSAG_LOG_MSG "E" "Could not determine interface for virtual IP" 15032
        else
            VCSAG_LOG_MSG "I" "Adding routes for interface $virtInterface" 15032
            addRoutes $virtInterface 10.225.238.2 $TMIPADDRS
        fi
        ;;
    SG-APP-DS_RM-004)
        virtInterface=$(getInterface $RMVIRTIP bond)
        if [ -z "$virtInterface" ]; then
            VCSAG_LOG_MSG "E" "Could not determine interface for virtual IP" 15032
        else
            VCSAG_LOG_MSG "I" "Adding routes for interface $virtInterface" 15032
            addRoutes $virtInterface 10.225.238.5 $RMIPADDRS
        fi
        ;;
    *)
        ;;
esac

Solved, 940 views, 4 likes, 1 comment

Need inputs on configuring alerts to send email upon faults
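For the "Failed to execute postonline trigger" post above, a few generic file-level checks can be ruled out before digging into the script logic. The sketch below is not a diagnosis of that post: it only assumes that a trigger script has to exist, be executable, and start with a `#!` interpreter line so that it can be run directly. The function name `check_trigger` is hypothetical and not part of VCS.

```shell
#!/bin/sh
# Sketch: basic file-level checks for a trigger script.
# check_trigger is a hypothetical helper, not a VCS command.
# Returns 0 if all checks pass, 1 on a hard failure,
# 2 if only the interpreter line is missing.

check_trigger() {
    script=$1

    # The file must exist as a regular file.
    [ -f "$script" ] || { echo "FAIL: $script not found"; return 1; }

    # The file must carry an execute permission bit.
    [ -x "$script" ] || { echo "FAIL: $script is not executable"; return 1; }

    # The first line should name an interpreter, e.g. #!/bin/sh.
    first=$(head -n 1 "$script")
    case $first in
        '#!'*)
            echo "OK: $script is executable, interpreter line: $first"
            ;;
        *)
            echo "WARN: $script has no #! interpreter line"
            return 2
            ;;
    esac
}
```

It could be called as `check_trigger /opt/VRTSvcs/bin/triggers/postonline` on each node of the cluster; the path is the one shown in the log lines of the post.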
I am planning to set up alerts for my application to notify users upon faults. I understand that I need to configure a NotifierMngr resource under ClusterService. Please answer the following questions.

1) Should I configure a new ClusterService group or use the existing one? Currently I have the following resources in my existing ClusterService group.

# hagrp -resources ClusterService
webip
csgnic

Wouldn't it be a risk if the Notifier resource is configured in this group itself? If I need to configure a separate ClusterService group, could you please guide me on how to do that using the command line?

2) How do I give multiple severities to a single recipient? That is, if I want Error, SevereError and Warning alerts to reach my email ID, how do I configure it? Is the command below fine?

hares -modify ntfr SmtpRecipients parakh.agarwal@abc.com Error Warning SevereError

3) How do I give multiple email IDs with their respective severity labels? Do I need to modify the main.cf file and give the email IDs semicolon-separated, or is it possible through the command line as well?

4) Are the commands below OK for adding the Notifier resource?

haconf -makerw
hares -add ntfr NotifierMngr ClusterService
hares -modify ntfr SmtpServer mailhost
hares -modify ntfr SmtpRecipients parakh.agarwal@abc.com Error Warning
hares -modify ntfr Enabled 1
haconf -dump -makero

5) Is there anything that needs to be done apart from this?

Solved, 1.5K views, 4 likes, 14 comments

Veritas cluster issue
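For the notifier post above, the command sequence in question 4 can be generated rather than typed, which makes it easy to review before anything touches the cluster. The sketch below only prints commands; it does not run them. It rests on two assumptions that should be checked against the bundled agents guide for the installed release: that SmtpRecipients is an association of address to severity, updated with the `-add` form of `hares -modify`, and that a single severity acts as a threshold, so a recipient set to Warning also receives Error and SevereError messages. The function name `notifier_commands` and the example.com addresses are hypothetical.

```shell
#!/bin/sh
# Sketch: print (do not run) the commands that add a NotifierMngr resource.
# Usage: notifier_commands RESOURCE SMTP_SERVER ADDRESS=SEVERITY...
# notifier_commands is a hypothetical helper, not a VCS command.

notifier_commands() {
    res=$1
    server=$2
    shift 2

    # Validate every severity first so that no partial output is printed.
    for pair in "$@"; do
        sev=${pair#*=}
        case $sev in
            Information|Warning|Error|SevereError) ;;
            *) echo "unknown severity: $sev" >&2; return 1 ;;
        esac
    done

    echo "haconf -makerw"
    echo "hares -add $res NotifierMngr ClusterService"
    echo "hares -modify $res SmtpServer $server"
    for pair in "$@"; do
        addr=${pair%%=*}
        sev=${pair#*=}
        # Assumption: one key/value pair per recipient, added with -add.
        echo "hares -modify $res SmtpRecipients -add $addr $sev"
    done
    echo "hares -modify $res Enabled 1"
    echo "haconf -dump -makero"
}
```

For example, `notifier_commands ntfr mailhost admin@example.com=Warning ops@example.com=SevereError` prints seven lines that can be reviewed and then pasted into a root shell on one node.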
Hi,

We have a two-node VCS cluster. Two file systems configured from VxVM are under VCS control. One file system became 100% full. We rebooted the cluster node and the mount points started to show, but after some time that file system disappeared. We checked the corresponding disk group and found that it had also been disabled. We tried to import it manually and succeeded. After starting the volumes we tried to mount the file system on the specified mount point, but it gives the error below.

# mount /dev/vx/dsk/bgw1dg/vol01 /var/opt/BGw/Server1
mount: /dev/vx/dsk/bgw1dg/vol01 is not this fstype
#

Kindly suggest what could be the cause.

Regards,
Arup

Solved, 2.2K views, 4 likes, 5 comments

File permission change for directory in vcs
Hi All,

We have a directory, say /test, which is under VCS on AIX. The filesystem is JFS2. The owner and group of the directory are root:root, and the filesystem is currently mounted by VCS. Can we change the ownership with the chown command?

NOTE: I don't see any ownership assigned for the mount point when I check via hares -display <Mountpoint>.

Regards,
Dilip

Solved, 1.8K views, 3 likes, 4 comments

vcs resource status abnormal
I am facing a problem: I found that a resource has gone into an online, monitor timed out state, and I don't know how to fix it. Could anybody give me some suggestions? Thanks.

# hastatus
..............
group           resource             system               message
--------------- -------------------- -------------------- --------------------
                cfsmount1            CCCG09BER            ONLINE
                cfsmount1            CCCG10BER            ONLINE
                cfsmount2            CCCG09BER            ONLINE
                cfsmount2            CCCG10BER            ONLINE
-------------------------------------------------------------------------
                cvmvoldg1            CCCG09BER            ONLINE
                cvmvoldg1            CCCG10BER            ONLINE
group           resource             system               message
--------------- -------------------- -------------------- --------------------
                cvmvoldg1            CCCG09BER            |MONITOR TIMEDOUT|UNABLE TO OFFLINE|
                cvmvoldg2            CCCG09BER            ONLINE
                cvmvoldg2            CCCG10BER            ONLINE
                cvmvoldg2            CCCG09BER            |MONITOR TIMEDOUT|UNABLE TO OFFLINE|
-------------------------------------------------------------------------

# hares -state cvmvoldg1
#Resource    Attribute    System       Value
cvmvoldg1    State        CCCG09BER    ONLINE|MONITOR TIMEDOUT
cvmvoldg1    State        CCCG10BER    ONLINE
#

# vxdisk list
DEVICE       TYPE            DISK               GROUP      STATUS
c0t0d0s2     auto:SVM        -                  -          SVM
c0t1d0s2     auto:SVM        -                  -          SVM
c0t2d0s2     auto:none       -                  -          online invalid
c0t3d0s2     auto:none       -                  -          online invalid
c3t0d0       auto:cdsdisk    -                  -          online
c3t0d1       auto:sliced     mmdatadgemcpower4  mmdatadg   online shared
c3t0d2       auto:cdsdisk    -                  -          online
c3t0d3       auto:cdsdisk    -                  -          online
c3t0d4       auto:sliced     mmdbdgemcpower2    mmdbdg     online shared
c3t0d5s2     auto:sliced     mmdatadgc3t0d5     mmdatadg   online shared
c3t0d6s2     auto:sliced     mmdatadgc3t0d6     mmdatadg   online shared
c3t0d7s2     auto:sliced     mmdatadgc3t0d7     mmdatadg   online shared
c3t0d8s2     auto:sliced     mmdatadgc3t0d8     mmdatadg   online shared
c3t0d9s2     auto:sliced     mmdatadgc3t0d9     mmdatadg   online shared
c3t0d10s2    auto:sliced     mmdatadgc3t0d10    mmdatadg   online shared

# vxprint
Disk group: mmdbdg

TY NAME                  ASSOC      KSTATE   LENGTH       PLOFFS       STATE   TUTIL0  PUTIL0
dg mmdbdg                mmdbdg     -        -            -            -       -       -
dm mmdbdgemcpower2       c3t0d4     -        419348384    -            -       -       -
v  vol01                 fsgen      ENABLED  419346432    -            ACTIVE  -       -
pl vol01-01              vol01      ENABLED  419346432    -            ACTIVE  -       -
sd mmdbdgemcpower2-01    vol01-01   ENABLED  419346432    0            -       -       -

Disk group: mmdatadg

TY NAME                  ASSOC      KSTATE   LENGTH       PLOFFS       STATE   TUTIL0  PUTIL0
dg mmdatadg              mmdatadg   -        -            -            -       -       -
dm mmdatadgc3t0d5        c3t0d5s2   -        1467866928   -            -       -       -
dm mmdatadgc3t0d6        c3t0d6s2   -        1887246288   -            -       -       -
dm mmdatadgc3t0d7        c3t0d7s2   -        1836919616   -            -       -       -
dm mmdatadgc3t0d8        c3t0d8s2   -        1887246288   -            -       -       -
dm mmdatadgc3t0d9        c3t0d9s2   -        1887246288   -            -       -       -
dm mmdatadgc3t0d10       c3t0d10s2  -        1417511040   -            -       -       -
dm mmdatadgemcpower4     c3t0d1     -        1887354784   -            -       -       -
v  vol01                 fsgen      ENABLED  12271390720  -            ACTIVE  -       -
pl vol01-01              vol01      ENABLED  12271390720  -            ACTIVE  -       -
sd mmdatadgemcpower4-01  vol01-01   ENABLED  1887354784   0            -       -       -
sd mmdatadgc3t0d5-01     vol01-01   ENABLED  1467866928   1887354784   -       -       -
sd mmdatadgc3t0d6-01     vol01-01   ENABLED  1887246288   3355221712   -       -       -
sd mmdatadgc3t0d7-01     vol01-01   ENABLED  1836919616   5242468000   -       -       -
sd mmdatadgc3t0d8-01     vol01-01   ENABLED  1887246288   7079387616   -       -       -
sd mmdatadgc3t0d9-01     vol01-01   ENABLED  1887246288   8966633904   -       -       -
sd mmdatadgc3t0d10-01    vol01-01   ENABLED  1417510528   10853880192  -       -       -

Messages in the engine_A log:

2015/03/06 13:44:21 VCS ERROR V-16-2-13027 (CCCG09BER) Resource(cvmvoldg2) - monitor procedure did not complete within the expected time.
2015/03/06 13:44:22 VCS ERROR V-16-2-13027 (CCCG09BER) Resource(cvmvoldg1) - monitor procedure did not complete within the expected time.
2015/03/06 13:46:28 VCS INFO V-16-1-50086 CPU usage on CCCG09BER is 61%
2015/03/06 13:50:21 VCS ERROR V-16-2-13210 (CCCG09BER) Agent is calling clean for resource(cvmvoldg2) because 4 successive invocations of the monitor procedure did not complete within the expected time.
2015/03/06 13:50:21 VCS ERROR V-16-2-13210 (CCCG09BER) Agent is calling clean for resource(cvmvoldg1) because 4 successive invocations of the monitor procedure did not complete within the expected time.
2015/03/06 13:50:39 VCS INFO V-16-2-13068 (CCCG09BER) Resource(cvmvoldg2) - clean completed successfully.
2015/03/06 13:50:53 VCS INFO V-16-2-13068 (CCCG09BER) Resource(cvmvoldg1) - clean completed successfully.
2015/03/06 13:51:40 VCS ERROR V-16-2-13077 (CCCG09BER) Agent is unable to offline resource(cvmvoldg2). Administrative intervention may be required.
2015/03/06 13:51:55 VCS ERROR V-16-2-13077 (CCCG09BER) Agent is unable to offline resource(cvmvoldg1). Administrative intervention may be required.

Solved, 1.1K views, 3 likes, 1 comment

VCS verification and Automation
Hello Guys,

We configure many VCS clusters and we would like to verify all of the configuration before we hand a cluster over to the customer. We would like to automate the verification. Here is my plan for basic verification:

- Verify LLT config
- Verify GAB config
- Check main.cf, and see that the volume group or VIP is not configured at boot-up

Is there anything else I need to check? Feel free to share any ideas.

Solved, 942 views, 3 likes, 2 comments

gco fencing
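For the "VCS verification and Automation" post above, the first two items of the plan (LLT and GAB config) can be sketched as file checks. This is a minimal sketch under stated assumptions, not a complete verification suite: it assumes the usual layout in which /etc/llttab carries `set-node`, `set-cluster` and one `link` line per heartbeat link, and /etc/gabtab carries a `gabconfig -c -n<count>` entry. The function names `check_llttab` and `check_gabtab` are hypothetical, and file paths are passed in so the checks can be tried on copies first.

```shell
#!/bin/sh
# Sketch: sanity checks on copies of the LLT and GAB configuration files.
# check_llttab and check_gabtab are hypothetical helpers, not VCS commands.

# check_llttab FILE
# Expects set-node, set-cluster and at least two "link" lines.
check_llttab() {
    file=$1
    [ -r "$file" ] || { echo "FAIL: cannot read $file"; return 1; }
    grep -q '^set-node' "$file" || { echo "FAIL: no set-node in $file"; return 1; }
    grep -q '^set-cluster' "$file" || { echo "FAIL: no set-cluster in $file"; return 1; }
    # Count high-priority links only; link-lowpri lines are not matched.
    links=$(grep -c '^link[[:space:]]' "$file")
    if [ "$links" -lt 2 ]; then
        echo "FAIL: only $links LLT link(s) in $file"
        return 1
    fi
    echo "OK: $file defines $links LLT links"
}

# check_gabtab FILE EXPECTED_NODES
# Expects a "gabconfig -c -n<count>" entry whose count matches the cluster size.
check_gabtab() {
    file=$1
    expected=$2
    [ -r "$file" ] || { echo "FAIL: cannot read $file"; return 1; }
    seed=$(sed -n 's/.*gabconfig.*-n *\([0-9][0-9]*\).*/\1/p' "$file" | head -n 1)
    if [ -z "$seed" ]; then
        echo "FAIL: no gabconfig -n<count> entry in $file"
        return 1
    fi
    if [ "$seed" -ne "$expected" ]; then
        echo "FAIL: GAB seeds with $seed node(s), expected $expected"
        return 1
    fi
    echo "OK: GAB seeds with $seed node(s)"
}
```

On a two-node cluster this might be run as `check_llttab /etc/llttab && check_gabtab /etc/gabtab 2`. The file checks say nothing about runtime state, so they would normally be paired with the output of `lltstat -nvv`, `gabconfig -a` and `hacf -verify /etc/VRTSvcs/conf/config` on each node.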
Hi,

I think that vxfencing can be set up only within each cluster at each site. I am not aware whether vxfencing can be set up as global fencing. Suppose I have one node in site 1 and another node in site 2, and a failover group uses node1 or node2. If the heartbeat link between node1 and node2 fails, a split brain happens. How is this resolved so that data is not corrupted?

In my opinion, the advantage of GCO over classic disaster recovery is the speed of recovery: the failover groups fail over quickly to the other site if a site goes down. Is that right?

Thanks a lot.

Solved, 1.6K views, 3 likes, 6 comments

SQL Server Service Group can't bring online on the second node
Hello everyone,

I need a solution for my SQL cluster in a test environment. I have an issue with the service group: I can't bring it online on the second node, but on the first node it comes online without errors.

To test the integrity of the data on the iSCSI partition, I deleted the service group, mounted the partition on the second node, and recreated the service group. I could then bring it online easily. When I switch it to the other node (the first node of the cluster), I get the same error: it can't be brought online.

Below are some errors from the event viewer about this problem:

Thank you in advance for your help.

Solved, 1.5K views, 3 likes, 5 comments