VCS stop and service group offline question
I have a quick question regarding service group offline behaviour. We only want to bring services up or down when we run hagrp -online/-offline manually. When VCS shuts down, we don't want it to take the service group offline. I know that if we freeze the group it won't run the offline entry point. Do we have any other option? For example, for online we can disable AutoStart. Do we have anything similar for offline?
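To make the question concrete, here is a rough sketch of what I mean; appsg and nodeA are placeholder names, and none of this is offered as the answer, it just illustrates the pieces I already know about:

# what we do today, manually, per service group
hagrp -online appsg -sys nodeA
hagrp -offline appsg -sys nodeA

# stopping HAD without running the offline entry points; my understanding
# is that -force leaves the applications running
hastop -all -force

# the freeze option I mentioned, plus disabling automatic startup
haconf -makerw
hagrp -freeze appsg -persistent
hagrp -modify appsg AutoStart 0
haconf -dump -makero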
DSI

Hello folks. I am running SFRAC 6.1 under AIX 6.1. Following a few storage issues, I got disassociated plexes. While trying to reassociate them with their volumes using vxplex -g <mydg> att, I am hitting a panic on the server. The panic string is DSI (DATA STORAGE INTERRUPT). The panic immediately crashes the OS, which starts dumping and then restarts. Did anyone experience similar issues before? Thanks
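For completeness, this is roughly the sequence involved; mydg, vol01 and vol01-02 are placeholder names, and the att command is the one that triggers the panic:

# state of volumes, plexes and disks in the affected disk group
vxprint -g mydg -ht
vxdisk -g mydg list

# the reattach that leads to the DSI panic
vxplex -g mydg att vol01 vol01-02

# watching the attach/resync task, when the node stays up long enough
vxtask list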
Remove or modify HA Fire Drill actions

Looking for a method to modify or remove particular action points for the HA Fire Drill. As an example, the Mount agent action 'mountentry.vfd' specifically checks for mount entries in /etc/filesystems. By chance, our environment and admins require entries to be left in /etc/filesystems but with auto mount set to FALSE. The configuration is 'ok' for VCS, but it will fail the Virtual Fire Drill every time. Removing this action point, or modifying it to accept the filesystem as configured, would allow the HA Fire Drill to give a valid 'Success' or 'Failure' response for this environment.

Example /etc/filesystems stanza for a VCS resource:

/ora/eomdma/admin:
        dev     = /dev/eomdma_admin
        vfs     = jfs2
        log     = /dev/eomdma_log
        mount   = false
        check   = false
        account = false

Example non-VCS filesystem:

/app/tws/twsq:
        dev     = /dev/lv_TWSq
        vfs     = jfs2
        log     = /dev/hd8
        mount   = true
        quota   = no
        account = false

Is the only option to forcibly modify the Mount agent attributes?
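To show the direction I was considering, purely as an untested sketch; I don't know whether SupportedActions is actually meant to be edited this way, which is part of the question:

# see which fire-drill action tokens the Mount type advertises
hatype -display Mount | grep SupportedActions

# possible (untested) approach: drop the mountentry.vfd token so the
# Virtual Fire Drill no longer runs that particular check
haconf -makerw
hatype -modify Mount SupportedActions -delete mountentry.vfd
haconf -dump -makero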
I want to switch the application to minos2, but I cannot select the system

I want to switch the application to minos2, but I cannot select the system. No system, why?

root@minos1(/)# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  minos1               RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  minos1               Y          N               ONLINE
B  EMS             minos1               Y          N               ONLINE
B  replication     minos1               Y          N               ONLINE

-- REMOTE CLUSTER STATE
-- Cluster         State

N  minos2           RUNNING

-- REMOTE SYSTEM STATE
-- cluster:system        State                Frozen

O  minos2:minos2         RUNNING              0
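For reference, since minos2 shows up above as a remote cluster rather than a local system, I was expecting something along these lines to work; EMS is only used as the example group and the syntax still needs to be confirmed for this release:

# where the global group is currently online
hagrp -state EMS

# switch the group to the remote cluster minos2 instead of a local system
hagrp -switch EMS -any -clus minos2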
disk group import failure

I created a full snapshot volume some months ago from Server A to Server B. Both servers are running Veritas cluster 6.1 under AIX. Now I want to refresh the snap volume from its original source, so I stopped all cluster services on Server B and deported the disk group that contains the snap volume. But when I try to import it back on Server A, I get the following message:

VxVM vxdg ERROR V-5-1-10978 Disk group sa_com_dgsp: import failed: Disk write failure
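For what it's worth, these are the checks I can run from Server A before retrying; the disk name is a placeholder, and the -C variant is only something I am considering, not something I have run:

# confirm the disks of sa_com_dgsp are visible and not in an error state on Server A
vxdisk -o alldgs list
vxdisk list <one_of_the_dg_disks>

# only if no other host still has the group imported: retry the import while
# clearing any stale host locks left over from the deport on Server B
vxdg -C import sa_com_dgsp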
after database integration LPAR went to HUNG state

Hi, I have integrated an Oracle database into a VCS active-passive LPAR cluster. After the integration I brought the service group online on node A, which completed successfully. For testing I switched the DB service group over to the other node, but on node B the Oracle home was not set properly, so the online timed out and the group failed back over to node A. When the DB service group came back to node A, node A went into a HUNG state (unable to log in, but we could still ping the server), so we rebooted the LPAR. Once the LPAR came up it worked for a while (almost 5 minutes) but then hung again, so we rebooted node A once more. When the server came up I stopped the cluster services on node A and it worked fine. After about an hour I ran hastart on node A and it is still working fine now. Can you help me find out why the server went into a hung state after the DB switched back to node A? Also, when I tested the switch over and switch back again, node A went into the hung state again.
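To add some detail, this is what I plan to capture around the next switch back, in case it helps; these are just the standard AIX tools and the default VCS log location, nothing specific to our build:

# AIX error report around the time of the hang
errpt | head -20
errpt -a | more

# VCS engine log on node A for the failback window
tail -n 200 /var/VRTSvcs/log/engine_A.log

# quick look at memory and paging pressure while the DB group comes online
vmstat 5 5
svmon -G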
unexpected reboot

Hi all, I am running an SFRAC environment and one of my two cluster nodes frequently reboots unexpectedly. I went through the OS logs and the VCS engine_A log but didn't find any clue. Is that an eviction? When I run lltstat, I can see some error counters; what do they mean exactly?

LLT errors:
        0      Rcv not connected
        0      Rcv unconfigured
        0      Rcv bad dest address
        0      Rcv bad source address
        0      Rcv bad generation
        0      Rcv no buffer
        0      Rcv malformed packet
        0      Rcv wrong length packet
        0      Rcv bad SAP
        0      Rcv bad STREAM primitive
        0      Rcv bad DLPI primitive
        0      Rcv DLPI error
        120    Snd not connected
        0      Snd no buffer
        0      Snd stream flow drops
        42867  Snd no links up
        0      Rcv bad checksum
        0      Rcv bad udp/ether source address
        0      Rcv DLPI link-down error

How can I efficiently trace these reboots?
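For reference, these are the checks I already know how to run; I am not sure which of them actually points at an eviction, so suggestions are welcome (commands are the standard SFRAC/AIX ones):

# per-link LLT status on every node; my assumption is that the growing
# "Snd no links up" counter means heartbeat traffic had nowhere to go
lltstat -nvv | more

# GAB port membership, to see whether any port lost membership before the reboot
gabconfig -a

# on AIX, whether the last reboot left a system dump behind
sysdumpdev -L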
how to change a DISKID?

I have one broken DG:

localhost.vtb24.ru# vxprint -hqt
Disk group: somedg

dg somedg       default      default  11000    1364281050.8.localhost

dm disk1        disk1        auto     65536    2147417808 -
dm disk2        -            -        -        -          NODEVICE

v  vol01        -            DISABLED ACTIVE   4294834176 SELECT    -       fsgen
pl vol01-01     vol01        DISABLED NODEVICE 4294834176 STRIPE    2/128   RW
sd disk1-01     vol01-01     disk1    0        2147417088 0/0       disk1   ENA
sd disk2-01     vol01-01     disk2    0        2147417088 1/0       -       NDEV

localhost.vtb24.ru# vxprint -g somedg -dF'%last_da_name %name %diskid'
disk1 disk1 1431951508.19.localhost
disk2 disk2 1431951509.21.localhost

localhost.vtb24.ru# vxdisk -x DISKID -x DGID -p list
DEVICE        DISKID                    DGID
ibm_vscsi0_0  -                         -
ibm_vscsi0_1  -                         -
disk1         1431951508.19.localhost   1364281050.8.localhost
disk2         1431951508.19.localhost   1364281050.8.localhost

Both disks have the same diskid. How can I fix this?
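For completeness, these are the read-only commands I have used so far to confirm the duplication; I have not attempted any change to the private regions yet:

# full private-region details for each disk, including disk id, group and hostid
vxdisk list disk1
vxdisk list disk2

# short per-disk summary across all devices, to compare the two private regions
vxdisk -s list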
Resource faults issue

Hi Team, many times we get resource fault alerts on our AIX servers, but when we check, everything is running fine. Also, when we check the logs, the resources come back online marked '(Not initiated by VCS)'. Our main concern is to troubleshoot why we get these alerts, and if we do get them, why the cluster resources are not showing as faulted. Below are the logs:

-- SYSTEM STATE
-- System               State                Frozen

A  xxxibm012            RUNNING              0
A  xxxibm014            RUNNING              0

-- GROUP STATE
-- Group            System               Probed     AutoDisabled    State

B  ClusterService   xxxibm012            Y          N               ONLINE
B  ClusterService   xxxibm014            Y          N               OFFLINE
B  DB_INSIGHT_STAGE xxxibm012            Y          N               ONLINE
B  DB_INSIGHT_STAGE xxxibm014            Y          N               OFFLINE

=============================================================

2015/04/21 10:14:53 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2015/04/21 12:57:32 VCS WARNING V-16-10011-5611 (clnibm014) NIC:csgnic:monitor:Second PingTest failed for Virtual Interface en4. Resource is OFFLINE
2015/04/21 12:57:32 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg from localhost
2015/04/21 12:57:33 VCS ERROR V-16-1-54031 Resource csgnic (Owner: Unspecified, Group: ClusterService) is FAULTED on sys clnibm014
2015/04/21 12:57:33 VCS INFO V-16-6-0 (clnibm014) resfault:(resfault) Invoked with arg0=clnibm014, arg1=csgnic, arg2=ONLINE
2015/04/21 12:57:49 VCS INFO V-16-6-15002 (clnibm014) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault clnibm014 csgnic ONLINE successfully
2015/04/21 12:58:18 VCS ERROR V-16-1-54031 Resource proxy_DB_INSPRD (Owner: Unspecified, Group: DB_INSIGHT_STAGE) is FAULTED on sys clnibm014
2015/04/21 12:58:18 VCS INFO V-16-6-0 (clnibm014) resfault:(resfault) Invoked with arg0=clnibm014, arg1=proxy_DB_INSPRD, arg2=ONLINE
2015/04/21 12:58:29 VCS INFO V-16-6-15002 (clnibm014) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault clnibm014 proxy_DB_INSPRD ONLINE successfully
2015/04/21 12:58:33 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 12:58:34 VCS INFO V-16-1-10299 Resource csgnic (Owner: Unspecified, Group: ClusterService) is online on clnibm014 (Not initiated by VCS)
2015/04/21 12:58:34 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group ClusterService on all nodes
2015/04/21 12:58:34 VCS NOTICE V-16-1-51034 Failover group ClusterService is already active. Ignoring Restart
2015/04/21 12:59:18 VCS INFO V-16-1-10299 Resource proxy_DB_INSPRD (Owner: Unspecified, Group: DB_INSIGHT_STAGE) is online on clnibm014 (Not initiated by VCS)
2015/04/21 12:59:18 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group DB_INSIGHT_STAGE on all nodes
2015/04/21 12:59:18 VCS NOTICE V-16-1-51034 Failover group DB_INSIGHT_STAGE is already active. Ignoring Restart
2015/04/21 12:59:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 13:18:53 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Relying on secondary test to confirm Online status. from localhost
2015/04/21 13:19:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 13:44:49 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Relying on secondary test to confirm Online status. from localhost
2015/04/21 13:45:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 14:14:54 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2015/04/21 16:48:59 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Relying on secondary test to confirm Online status. from localhost
2015/04/21 16:49:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded.
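Since the warnings come from the NIC agent's PingTest, this is where I was planning to look next; the resource and node names below are taken from the logs above, and the IP addresses in the last command are placeholders only, so treat it as an untested sketch:

# how the NIC resource is configured on the node that reports the faults
hares -display csgnic -sys clnibm014

# whether NetworkHosts is set; if it is empty, the agent falls back to a
# broadcast ping, which seems more likely to fail intermittently
hares -value csgnic NetworkHosts

# one possible mitigation (untested here): pin the ping targets to stable
# addresses on the en4 subnet -- placeholder IPs below
haconf -makerw
hares -modify csgnic NetworkHosts 10.0.0.1 10.0.0.2
haconf -dump -makero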