VCS Warning for Unknown State
Hi, I am just curious why I received a warning notification for the Netlsnr resource group when no corresponding error is logged in engine_A.log. I have read the VCS documentation, but the only hint I have is this: "Resource state is unknown. Warning. VCS cannot identify the state of the resource." Can anyone provide a better explanation of what could have caused VCS to send the warning email?

Cannot unload GAB and LLT on RHEL 6.0
Hi all, I have the following problem:

# lltstat -n
LLT node information:
Node State Links
  0 srv-n1 OPEN 2
* 1 srv-n2 OPEN 2

# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 7b4d01 membership 01
Port b gen 7b4d05 membership 01
Port d gen 7b4d04 membership 01
Port h gen 7b4d11 membership 01

# /opt/VRTSvcs/bin/haconf -dump -makero
VCS WARNING V-16-1-10369 Cluster not writable.

# /opt/VRTSvcs/bin/hastop -all -force

# /etc/init.d/vxfen stop
Stopping vxfen..
Stopping vxfen.. Done

# /etc/init.d/gab stop
Stopping GAB: ERROR! Cannot unload GAB module. Clients still exist
Kill/Stop clients corresponding to following ports.
GAB Port Memberships
===============================================================
Port d gen 7b4d04 membership 01

# /etc/init.d/llt stop
Stopping LLT: LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 3 port(s)
LLT:Warning: lltconfig failed. Retrying [1]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 3 port(s)
LLT:Warning: lltconfig failed. Retrying [2]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 3 port(s)
LLT:Warning: lltconfig failed. Retrying [3]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 3 port(s)
LLT:Warning: lltconfig failed. Retrying [4]
LLT lltconfig ERROR V-14-2-15121 LLT unconfigure aborted, unregister 3 port(s)
LLT:Warning: lltconfig failed. Retrying [5]
LLT:Error: lltconfig failed

OK, I looked at lsmod and modinfo for details...
# lsmod
Module                  Size  Used by
vxodm                 206291  1
vxgms                 284352  0
vxglm                 289848  0
gab                   283317  4
llt                   180985  5 gab
autofs4                27683  3
sunrpc                241630  1
dmpCLARiiON            11771  1
dmpap                   9390  1
vxspec                  3174  6
vxio                 3261814  1 vxspec
vxdmp                 377776  20 vxspec,vxio
cpufreq_ondemand       10382  1
acpi_cpufreq            8593  3
freq_table              4847  2 cpufreq_ondemand,acpi_cpufreq
ipv6                  321209  60
vxportal                5940  0
fdd                    53457  1 vxodm
vxfs                 2957815  2 vxportal,fdd
exportfs                4202  0
serio_raw               4816  0
i2c_i801               11190  0
iTCO_wdt               11708  0
iTCO_vendor_support     3022  1 iTCO_wdt
ioatdma                57872  9
dca                     7099  1 ioatdma
i5k_amb                 5039  0
hwmon                   2464  1 i5k_amb
i5000_edac              8833  0
edac_core              46055  3 i5000_edac
sg                     30186  0
shpchp                 33448  0
e1000e                140051  0
ext4                  353979  3
mbcache                 7918  1 ext4
jbd2                   89033  1 ext4
dm_mirror              14003  1
dm_region_hash         12200  1 dm_mirror
dm_log                 10088  3 dm_mirror,dm_region_hash
sr_mod                 16162  0
cdrom                  39769  1 sr_mod
sd_mod                 37221  18
crc_t10dif              1507  1 sd_mod
pata_acpi               3667  0
ata_generic             3611  0
ata_piix               22588  0
ahci                   39105  4
qla2xxx               280129  24
scsi_transport_fc      50893  1 qla2xxx
scsi_tgt               12107  1 scsi_transport_fc
radeon                797054  1
ttm                    46942  1 radeon
drm_kms_helper         32113  1 radeon
drm                   200778  3 radeon,ttm,drm_kms_helper
i2c_algo_bit            5664  1 radeon
i2c_core               31274  5 i2c_i801,radeon,drm_kms_helper,drm,i2c_algo_bit
dm_mod                 76856  20 dm_mirror,dm_log

# rmmod gab
ERROR: Module gab is in use

[root@srv-vrts-n2 ~]# modinfo gab
filename:       /lib/modules/2.6.32-71.el6.x86_64/veritas/vcs/gab.ko
license:        Proprietary. Send bug reports to support@veritas.com
description:    Group Membership and Atomic Broadcast 5.1.120.000-SP1PR2
author:         VERITAS Software Corp.
srcversion:     F43C75576C05662FB0ED8C8
depends:        llt
vermagic:       2.6.32-71.el6.x86_64 SMP mod_unload modversions
parm:           gab_logflag:int
parm:           gab_numnids:maximum nodes in the cluster (1-128) (int)
parm:           gab_numports:maximum gab ports allowed (1-32) (int)
parm:           gab_flowctrl:queue depth that causes flow-control (1-128) (int)
parm:           gab_logbufsize:internal log buffer size in bytes (8100-65400) (int)
parm:           gab_msglogsize:maximum messages in internal message log (128-4096) (int)
parm:           gab_isolate_time:maximum time to wait for isolated client (16000-240000) (int)
parm:           gab_kill_ntries:number of times to attempt to kill client (3-10) (int)
parm:           gab_kstat_size:Number of system statistics to maintain in GAB 60-240 (int)
parm:           gab_conn_wait:maximum number of wait for CONNECTS message (1-256) (int)
parm:           gab_ibuf_count:maximum number of intermediate buffers (0-32) (int)

# modinfo llt
filename:       /lib/modules/2.6.32-71.el6.x86_64/veritas/vcs/llt.ko
license:        Proprietary. Send bug reports to support@veritas.com
author:         VERITAS Software Corp.
description:    Low Latency Transport 5.1.120.000-SP1PR2
srcversion:     AF11D9C04A71073E1ADCFC8
depends:
vermagic:       2.6.32-71.el6.x86_64 SMP mod_unload modversions
parm:           llt_maxnids:maximum nodes in the cluster (1-128) (int)
parm:           llt_maxports:maximum llt ports allowed (1-32) (int)
parm:           llt_nqthread:number of kernel threads to use (2-5) (int)
parm:           llt_basetimer:frequency of base timer ((10 * 1000)-(500 * 1000)) (int)

Hm... OK, I ran /etc/init.d/gab stop with debug tracing:

+ echo 'Stopping GAB: '
Stopping GAB:
+ mod_isloaded
++ lsmod
++ grep '^gab\ '
+ return
+ mod_isconfigured
++ LANG=C
++ LC_ALL=C
++ /sbin/gabconfig -l
++ grep 'Driver state'
++ grep -q Configured
+ return
+ /sbin/gabconfig -U
+ ret=1
+ '[' '!' 1 -eq 0 ']'
+ echo 'ERROR! Cannot unload GAB module. Clients still exist'
ERROR! Cannot unload GAB module. Clients still exist
+ echo 'Kill/Stop clients corresponding to following ports.'
Kill/Stop clients corresponding to following ports.
+ LANG=C
+ LC_ALL=C
+ /sbin/gabconfig -a
+ grep -v 'Port a gen'
GAB Port Memberships
...

OK, I tried /sbin/gabconfig -l and -U:

# /sbin/gabconfig -l
GAB Driver Configuration
Driver state         : Configured
Partition arbitration: Disabled
Control port seed    : Enabled
Halt on process death: Disabled
Missed heartbeat halt: Disabled
Halt on rejoin       : Disabled
Keep on killing      : Disabled
Quorum flag          : Disabled
Restart              : Enabled
Node count           : 2
Send queue limit     : 128
Recv queue limit     : 128
IOFENCE timeout (ms) : 15000
Stable timeout (ms)  : 5000

# /sbin/gabconfig -U
GAB /sbin/gabconfig ERROR V-15-2-25014 clients still registered

but it did not help. Then I found this topic https://www-secure.symantec.com/connect/forums/unable-stop-gab-llt-vcs51solaris-10 and stopped ODM:

# /etc/init.d/vxodm stop
Stopping ODM

and ran /etc/init.d/gab stop again, but:

# /etc/init.d/gab stop
Stopping GAB: GAB has usage count greater than zero. Cannot unload

I checked /sbin/gabconfig -l again:

# /sbin/gabconfig -l
GAB Driver Configuration
Driver state         : Unconfigured
Partition arbitration: Disabled
Control port seed    : Disabled
Halt on process death: Disabled
Missed heartbeat halt: Disabled
Halt on rejoin       : Disabled
Keep on killing      : Disabled
Quorum flag          : Disabled
Restart              : Disabled
Node count           : 0
Send queue limit     : 128
Recv queue limit     : 128
IOFENCE timeout (ms) : 15000
Stable timeout (ms)  : 5000

[root@srv-vrts-n2 ~]# /sbin/gabconfig -a
GAB Port Memberships
===============================================================

and ran /etc/init.d/gab stop with debug tracing once more...
+ echo 'Stopping GAB: '
Stopping GAB:
+ mod_isloaded
++ lsmod
++ grep '^gab\ '
+ return
+ mod_isconfigured
++ LANG=C
++ LC_ALL=C
++ /sbin/gabconfig -l
++ grep 'Driver state'
++ grep -q Configured
+ return
+ mod_unload
++ lsmod
++ grep '^gab '
++ awk '{print $3}'
+ USECNT=1
+ '[' -z 1 ']'
+ '[' 5 '!=' 0 ']'
+ ps -e
+ grep gablogd
+ '[' 1 -ne 0 ']'
+ GAB_UNLOAD_RETRIES=0
+ '[' 0 '!=' 0 ']'
++ lsmod
++ grep '^gab '
++ awk '{print $3}'
+ USECNT=1
+ '[' 1 -gt 0 ']'
+ echo 'GAB has usage count greater than zero. Cannot unload'
GAB has usage count greater than zero. Cannot unload
+ return 1

and ran lsmod again:

# lsmod | grep gab
gab                   283317  1
llt                   180985  1 gab
[root@srv-vrts-n2 ~]# rmmod gab
ERROR: Module gab is in use

What can be done in such a situation?

IP agent for same MAC address interface
Hi all, our environment is as follows:

OS: Red Hat 6.5
VCS: 6.2.1

Our servers have two physical network ports, eth0 and eth1. We created tagged VLANs (vlan515, vlan516, vlan518, vlan520) on top of eth0 and eth1. We were able to create an IP resource on vlan518 and fail it over between the two nodes. However, when we create an IP resource on vlan515, it cannot be brought online. According to https://support.symantec.com/en_US/article.TECH214469.html, a duplicate MAC address can cause this problem. However, I can't figure out where the "MACAddress" attribute mentioned in the solution is in the VCS Java Console. I manually added a "MACAddress" attribute in main.cf, on both the NIC and the IP resource, but haconf -verify reports it as unsupported. Any hints or solutions for configuring an IP agent resource on interfaces that share the same MAC address? Thanks, Xentar

How to recover from failed attempt to switch to a different node in cluster
Hello everyone. I have a two-node cluster that serves as an Oracle database server. The Oracle binaries are installed on disks local to each node (so they are outside the control of the Cluster Manager). I have a disk group made of three 1TB LUNs from my SAN, six volumes on the disk group (u02 through u07), six mount points (/u02 through /u07), a database listener, and the actual Oracle database. I was able to bring up these individual components manually and confirmed that the database was up and running. I then tried a "Switch To" operation to see if everything would move to the other node of the cluster. It turns out this was a bad idea. In the Cluster Manager GUI, the disk group has a State of Online, an IState of "Waiting to go offline propogate", and a Flag of "Unable to offline". The volumes show as "Offline on all systems", but the mounts still show as online with "Status Unknown". When I try to take the mount points offline, I get the message "VCS ERROR V-16-1-10277 The Service Group i1025prd to which Resource Mnt_scratch belongs has failed or switch, online, and offline operations are prohibited." Can anyone tell me how I can fix this? Ken

VCS dependency question - Service group is not running on the intended node upon cluster startup
Dear All, I have a question regarding service group dependencies. I have two parent service groups, "Group1" and "Group2". They depend on a child service group, "ServerGroup1_DG", which is configured as a parallel service group. In the main.cf file, I configured service group "Group1" to start on Node1 and service group "Group2" to start on Node2. However, I don't know why the cluster does not start "ServerGroup1_DG" on Node2 before starting "Group2". During cluster startup, the cluster evaluated Node1 as a target node for service group "Group2" and went on to online the group on Node1. This is not the wanted behaviour: I would like service group "Group1" running on Node1 and service group "Group2" running on Node2 when the cluster starts up. Can anyone help to shed some light on how I can solve this problem? Thanks.

p/s: The VCS version is 5.0 on the Solaris platform.

main.cf:

group Group1 (
    SystemList = { Node1 = 1, Node2 = 2 }
    AutoStartList = { Node1, Node2 }
    )

requires group ServerGroup1_DG online local firm

group Group2 (
    SystemList = { Node1 = 2, Node2 = 1 }
    AutoStartList = { Node2, Node1 }
    )

requires group ServerGroup1_DG online local firm

group ServerGroup1_DG (
    SystemList = { Node1 = 0, Node2 = 1 }
    AutoFailOver = 0
    Parallel = 1
    AutoStartList = { Node1, Node2 }
    )

CFSMount cfsmount2 (
    Critical = 0
    MountPoint = "/var/opt/xxxx/ServerGroup1"
    BlockDevice = "/dev/vx/dsk/xxxxdg/vol01"
    MountOpt @Node1 = "cluster"
    MountOpt @Node2 = "cluster"
    NodeList = { Node1, Node2 }
    )

CVMVolDg cvmvoldg2 (
    Critical = 0
    CVMDiskGroup = xxxxdg
    CVMActivation @Node1 = sw
    CVMActivation @Node2 = sw
    )

requires group cvm online local firm
cfsmount2 requires cvmvoldg2

Engine log:

2010/06/25 16:05:47 VCS NOTICE V-16-1-10447 Group ServerGroup1_DG is online on system Node1
2010/06/25 16:05:47 VCS WARNING V-16-1-50045 Initiating online of parent group Group3, PM will select the best node
2010/06/25 16:05:47 VCS WARNING V-16-1-50045 Initiating online of parent group Group2, PM will select the best node
2010/06/25 16:05:47 VCS WARNING V-16-1-50045 Initiating online of parent group Group1, PM will select the best node
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node2 as potential target node for group Group3
2010/06/25 16:05:47 VCS INFO V-16-1-10163 Group dependency is not met if group Group3 goes online on system Node2
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node1 as potential target node for group Group3
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node2 as potential target node for group Group2
2010/06/25 16:05:47 VCS INFO V-16-1-10163 Group dependency is not met if group Group2 goes online on system Node2
2010/06/25 16:05:47 VCS INFO V-16-1-10493 Evaluating Node1 as potential target node for group Group2
2010/06/25 16:06:15 VCS NOTICE V-16-1-10447 Group ServerGroup1_DG is online on system Node2

Regards, Ryan

Reservation conflict on all LUNs of the cluster nodes
Hi All, we have recently done a Solaris live upgrade from Solaris 9 to Solaris 10, along with a maintenance-patch upgrade from 5.0MP1 to 5.0MP3. After making the alternate boot environment active, we got reservation conflict errors for all the LUNs, and it also created multipathing problems. Below are the logs from the server. I would be thankful for any suggestions or recommendations.

Aug 16 03:28:27 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:27 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:27 P0111CRMDB gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port w gen 103c72d membership 0
Aug 16 03:28:27 P0111CRMDB gab: [ID 674723 kern.notice] GAB INFO V-15-1-20038 Port w gen 103c72d k_jeopardy ;1
Aug 16 03:28:27 P0111CRMDB gab: [ID 513393 kern.notice] GAB INFO V-15-1-20040 Port w gen 103c72d visible ;1
Aug 16 03:28:27 P0111CRMDB gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port v gen 103c72b membership 0
Aug 16 03:28:27 P0111CRMDB gab: [ID 674723 kern.notice] GAB INFO V-15-1-20038 Port v gen 103c72b k_jeopardy ;1
Aug 16 03:28:27 P0111CRMDB gab: [ID 513393 kern.notice] GAB INFO V-15-1-20040 Port v gen 103c72b visible ;1
Aug 16 03:28:28 P0111CRMDB vxfs: [ID 779698 kern.notice] GLM recovery : gen 103c727 mbr 1 0 0 0 flags 0
Aug 16 03:28:28 P0111CRMDB vxfs: [ID 702911 kern.notice] NOTICE: msgcnt 2 mesg 125: V-2-125: GLM restart callback, protocol flag 0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,90 (ssd618):
Aug 16 03:28:28 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 12813040 Error Block: 12813040
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 04386244F
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-7899 CVM_VOLD_CHANGE command received
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,a9 (ssd987):
Aug 16 03:28:28 P0111CRMDB Error for Command: write(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 55124544 Error Block: 55124544
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862106
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,94 (ssd614):
Aug 16 03:28:28 P0111CRMDB Error for Command: write(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 65844000 Error Block: 65844000
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862450
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-13170 Preempting CM NID 1
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@18,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438645,97 (ssd7):
Aug 16 03:28:28 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 69328 Error Block: 69328
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862051
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,ae (ssd982):
Aug 16 03:28:28 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 14741104 Error Block: 14741104
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862507
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@18,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438647,93 (ssd170):
Aug 16 03:28:28 P0111CRMDB Error for Command: write(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 65844000 Error Block: 65844000
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862050
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@18,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438647,8c (ssd177):
Aug 16 03:28:28 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 69475488 Error Block: 69475488
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 04386244E
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,97 (ssd611):
Aug 16 03:28:28 P0111CRMDB Error for Command: write(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 69328 Error Block: 69328
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862051
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,84 (ssd630):
Aug 16 03:28:28 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 39521056 Error Block: 39521056
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 04386244C
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@18,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438647,ab (ssd913):
Aug 16 03:28:28 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 31795296 Error Block: 31795296
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862906
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:28 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,ac (ssd984):
Aug 16 03:28:29 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 14757984 Error Block: 14757984
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862D06
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,7f (ssd635):
Aug 16 03:28:29 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 39857696 Error Block: 39857696
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 04386204B
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,87 (ssd627):
Aug 16 03:28:29 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 17532480 Error Block: 17532480
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 04386204D
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438657,86 (ssd628):
Aug 16 03:28:29 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 39816976 Error Block: 39816976
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862C4C
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@18,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438645,80 (ssd30):
Aug 16 03:28:29 P0111CRMDB Error for Command: read(10) Error Level: Retryable
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 50250576 Error Block: 50250576
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 04386244B
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@1a,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438655,93 (ssd456):
Aug 16 03:28:29 P0111CRMDB Error for Command: write(10) Error Level: Retryable
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 65844240 Error Block: 65844240
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Vendor: HITACHI Serial Number: 50 043862050
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] ASC: 0x2a (reservations released), ASCQ: 0x4, FRU: 0x0
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.warning] WARNING: /ssm@0,0/pci@18,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50060e8005438645,92 (ssd12):
Aug 16 03:28:29 P0111CRMDB Error for Command: write(10) Error Level: Retryable
Aug 16 03:28:29 P0111CRMDB scsi: [ID 107833 kern.notice] Requested Block: 65843280 Error Block: 65843280

NFS share doesn't failover due to being busy
Hello! We are trying to implement a failover cluster which hosts a database and files on a clustered NFS share. The files are used by the clustered application itself and by several other hosts. The problem is that when the active node fails (I mean an ungraceful server shutdown or a stop of some clustered service), the other hosts still continue to use files on our cluster-hosted NFS share. That leads to the NFS share "hanging": it no longer works on the first node, yet still cannot be brought online on the second node. The other hosts also experience hanging requests to that NFS share. Later, I will attach logs where the problem can be observed. The only corrective action we have found is a total shutdown and sequential start of all cluster nodes and other hosts. Please recommend best-practice actions for using an NFS share on Veritas Cluster Server (maybe some start/stop/clean scripts included as a cluster resource, or additional cluster configuration options). Thank you in advance! Best regards, Maxim Semenov.

Upgrading the Linux kernel with VCS installed
Hi all, how can I update/upgrade the Linux kernel on a system with VCS installed? I updated 2.6.32-71.el6.x86_64 to 2.6.32-279.el6.x86_64, but I still have the old kernel modules:

# ls /etc/vx/kernel/
dmpaaa.ko.2.6.32-71.el6.x86_64
dmpCLARiiON.ko.2.6.32-71.el6.x86_64
dmpsvc.ko.2.6.32-71.el6.x86_64
vxio.ko.2.6.32-71.el6.x86_64
dmpaa.ko.2.6.32-71.el6.x86_64
dmpEngenio.ko.2.6.32-71.el6.x86_64
fdd.ko.2.6.32-71.el6.x86_64
vxodm.ko.2.6.32-71.el6.x86_64
dmpalua.ko.2.6.32-71.el6.x86_64
dmphuawei.ko.2.6.32-71.el6.x86_64
vxdmp.ko.2.6.32-71.el6.x86_64
vxportal.ko.2.6.32-71.el6.x86_64
dmpapf.ko.2.6.32-71.el6.x86_64
dmpjbod.ko.2.6.32-71.el6.x86_64
vxfs.ko.2.6.32-71.el6.x86_64
vxspec.ko.2.6.32-71.el6.x86_64
dmpapg.ko.2.6.32-71.el6.x86_64
dmpnetapp.ko.2.6.32-71.el6.x86_64
vxglm.ko.2.6.32-71.el6.x86_64
dmpap.ko.2.6.32-71.el6.x86_64
dmpsun7x10alua.ko.2.6.32-71.el6.x86_64
vxgms.ko.2.6.32-71.el6.x86_64

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.3 (Santiago)

How can I generate new modules for this kernel?
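The directory listing above shows that the prebuilt modules in /etc/vx/kernel are named with the kernel-version suffix they were built for, so after a kernel upgrade a quick first check is whether any *.ko.&lt;new-kernel&gt; file exists there at all; if not, the module packages matching the new kernel have to be installed before the stack can load. A minimal sketch of that check; the function name and output strings are illustrative, not part of any Veritas tooling:

```shell
# check_vx_modules DIR KVER
# Report whether DIR contains a prebuilt module for kernel KVER, relying
# on the "<module>.ko.<kernel-version>" naming shown in the listing above.
check_vx_modules() {
    dir=$1
    kver=$2
    # expand the glob; if nothing matches, $1 stays the literal pattern
    set -- "$dir"/*.ko."$kver"
    if [ -e "$1" ]; then
        echo "prebuilt modules exist for $kver"
    else
        echo "no prebuilt modules for $kver"
    fi
}

check_vx_modules /etc/vx/kernel "$(uname -r)"
```

If the check fails for the running kernel, booting back into the old kernel (still listed in the boot loader) keeps the stack usable until matching module packages are in place.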
UPD:

[root@s-vrts-n2 ~]# lsmod
Module                  Size  Used by
vxglm                 289848  0
vxfen                 297667  0
gab                   283317  1 vxfen
llt                   180985  5 gab
ipv6                  322541  40
exportfs                4236  0
e1000e                222041  0
serio_raw               4818  0
i2c_i801               11231  0
sg                     30124  0
iTCO_wdt               13934  0
iTCO_vendor_support     3088  1 iTCO_wdt
i5000_edac              8867  0
edac_core              46773  3 i5000_edac
i5k_amb                 5105  0
ioatdma                58482  12
dca                     7197  1 ioatdma
shpchp                 33482  0
ext4                  371331  3
mbcache                 8144  1 ext4
jbd2                   93312  1 ext4
sd_mod                 39488  2
crc_t10dif              1541  1 sd_mod
sr_mod                 16228  0
cdrom                  39803  1 sr_mod
qla2xxx               400728  0
scsi_transport_fc      55235  1 qla2xxx
scsi_tgt               12173  1 scsi_transport_fc
ahci                   40871  2
pata_acpi               3701  0
ata_generic             3837  0
ata_piix               22846  0
radeon                846316  1
ttm                    82562  1 radeon
drm_kms_helper         34122  1 radeon
drm                   250067  3 radeon,ttm,drm_kms_helper
i2c_algo_bit            5762  1 radeon
i2c_core               31276  5 i2c_i801,radeon,drm_kms_helper,drm,i2c_algo_bit
dm_mirror              14101  1
dm_region_hash         12170  1 dm_mirror
dm_log                 10122  3 dm_mirror,dm_region_hash
dm_mod                 81692  19 dm_mirror,dm_log

[root@s-vrts-n2 ~]# modinfo gab
filename:       /lib/modules/2.6.32-279.el6.x86_64/veritas/vcs/gab.ko
license:        Proprietary. Send bug reports to support@veritas.com
description:    Group Membership and Atomic Broadcast 5.1.120.000-SP1PR2
author:         VERITAS Software Corp.
srcversion:     F43C75576C05662FB0ED8C8
depends:        llt
vermagic:       2.6.32-71.el6.x86_64 SMP mod_unload modversions
parm:           gab_logflag:int
parm:           gab_numnids:maximum nodes in the cluster (1-128) (int)
parm:           gab_numports:maximum gab ports allowed (1-32) (int)
parm:           gab_flowctrl:queue depth that causes flow-control (1-128) (int)
parm:           gab_logbufsize:internal log buffer size in bytes (8100-65400) (int)
parm:           gab_msglogsize:maximum messages in internal message log (128-4096) (int)
parm:           gab_isolate_time:maximum time to wait for isolated client (16000-240000) (int)
parm:           gab_kill_ntries:number of times to attempt to kill client (3-10) (int)
parm:           gab_kstat_size:Number of system statistics to maintain in GAB 60-240 (int)
parm:           gab_conn_wait:maximum number of wait for CONNECTS message (1-256) (int)
parm:           gab_ibuf_count:maximum number of intermediate buffers (0-32) (int)

[root@s-vrts-n2 ~]# uname -a
Linux s-vrts-n2.lab 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@s-vrts-n2 ~]#

HAD unregistration problem with GAB (failed): VCS 6.0
Dear Team, I'm a newbie here. We have a VCS setup (SFCFSHA 6.0) on two Linux (6.4) nodes. When the customer tried to reboot the master node, error messages confirming that "HAD has unregistered with VCS (retry 1) FAILED" appeared on the iLOM, and it caused a panic on the slave node too. Now the node is up and everything looks okay, but we have to find the cause of this error message. Please check the attached file for a screenshot of the error message. Looking forward to your help. Thanks and Regards, Barani

Listener resource remains faulted
Hello, we are doing some failure tests for a customer. We have VCS 6.2 running on Solaris 10, with an Oracle database and of course the listener associated with it. We try to simulate different kinds of failures, one of which is killing the listener. In this situation the cluster observes that the listener has died and fails the service over to the other node, BUT the listener resource remains in FAULTED state on the original node, and the group it belongs to stays in OFFLINE FAULTED state. In this situation, if something goes wrong on the second node, the service will not fail back to the original one until we manually run hagrp -clear. Is there anything we can do to fix this (to have the clear done automatically)?

Here are some lines from the log:

2015/03/30 17:26:10 VCS ERROR V-16-2-13067 (node2p) Agent is calling clean for resource(ora_listener-res) because the resource became OFFLINE unexpectedly, on its own.
2015/03/30 17:26:11 VCS INFO V-16-2-13068 (node2p) Resource(ora_listener-res) - clean completed successfully.
2015/03/30 17:26:11 VCS INFO V-16-1-10307 Resource ora_listener-res (Owner: Unspecified, Group: oracle_rg) is offline on node2p (Not initiated by VCS)

These say that clean for the resource completed successfully, but the resource is still faulted:

20150330-173628:root@node1p:~# hares -state ora_listener-res
#Resource         Attribute  System  Value
ora_listener-res  State      node1p  ONLINE
ora_listener-res  State      node2p  FAULTED

But if I run hares -clear manually, the fault goes away:

20150330-173636:root@node1p:~# hares -clear ora_listener-res
20150330-173653:root@node1p:~# hares -state ora_listener-res
#Resource         Attribute  System  Value
ora_listener-res  State      node1p  ONLINE
ora_listener-res  State      node2p  OFFLINE
20150330-173655:root@node1p:~#
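One common way to automate this kind of cleanup is through VCS event triggers (a resfault trigger under /opt/VRTSvcs/bin/triggers, enabled per group; the exact enabling attribute varies by version, so check the Administrator's Guide for your release). A rough sketch of the decision logic such a trigger could contain; the function name is made up, and the argument order (system, then resource) is an assumption to verify against your release's trigger interface. Auto-clearing faults also hides real failures, so limit it to resources you have thought through:

```shell
# resfault_action SYSTEM RESOURCE
# Decide whether a faulted resource should be cleared automatically and
# print the command a resfault trigger script would run (empty = do nothing).
resfault_action() {
    system=$1
    resource=$2
    case $resource in
        *listener*)
            # listener kills are part of our failure tests: clear the
            # fault so the group can fail back later
            echo "/opt/VRTSvcs/bin/hares -clear $resource -sys $system"
            ;;
        *)
            # leave every other fault for manual inspection
            echo ""
            ;;
    esac
}
```

A trigger script would call this with the arguments VCS passes it, wait a short settle time for the failover to complete, and then run the printed command.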