Node freeze vs. service group freeze
Hi Team,

I came across a DIMM replacement activity on one of our Solaris servers, which is part of a cluster. Before proceeding with the activity, should I freeze the service group, or do I need to freeze the node?

Also, kindly explain in which scenarios we should freeze the service group (SG) and in which we should freeze the node.

Thanks.

Solved | 6.8K Views | 1 like | 1 Comment

vxdisk list showing errors on multiple disks, and I am unable to start cluster on slave node
Hello,

If anybody has had the same experience and can help me, I would be very thankful.

I am using Solaris 10 (x86 141445-09) + EMC PowerPath (5.5.P01_b002) + VxVM (5.0,REV=04.15.2007.12.15) on a two-node cluster. This is a file server cluster.

I added a couple of new LUNs, and when I tried to scan for the new disks, the "vxdisk scandisks" command hung. After that I was unable to run any VxVM operation on that node; every command hung. I rebooted the server in a maintenance window (before the reboot I switched all SGs to the 2nd node). After that reboot the node is unable to join the cluster, with this reason:

2014/04/13 01:04:48 VCS WARNING V-16-10001-1002 (filesvr1) CVMCluster:cvm_clus:online:CVMCluster start failed on this node.
2014/04/13 01:04:49 VCS INFO V-16-2-13001 (filesvr1) Resource(cvm_clus): Output of the completed operation (online) ERROR:
2014/04/13 01:04:49 VCS ERROR V-16-10001-1005 (filesvr1) CVMCluster:???:monitor:node - state: out of cluster reason: Cannot find disk on slave node: retry to add a node failed
Apr 13 01:10:09 s_local@filesvr1 vxvm: vxconfigd: [ID 702911 daemon.warning] V-5-1-8222 slave: missing disk 1306358680.76.filesvr1
Apr 13 01:10:09 s_local@filesvr1 vxvm: vxconfigd: [ID 702911 daemon.warning] V-5-1-7830 cannot find disk 1306358680.76.filesvr1
Apr 13 01:10:09 s_local@filesvr1 vxvm: vxconfigd: [ID 702911 daemon.error] V-5-1-11092 cleanup_client: (Cannot find disk on slave node) 222

Here is the output from the 2nd node (working fine):

Disk: emcpower33s2
type: auto
flags: online ready private autoconfig shared autoimport imported
guid: {665c6838-1dd2-11b2-b1c1-00238b8a7c90}
udid: DGC%5FVRAID%5FCKM00111001420%5F6006016066902C00915931414A86E011
site: -
diskid: 1306358680.76.filesvr1
dgname: fileimgdg
dgid: 1254302839.50.filesvr1
clusterid: filesvrvcs
info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2

And here is the output from the node where I see the problem:

Device: emcpower33s2
devicetag: emcpower33
type: auto
flags: error private autoconfig
pubpaths: block=/dev/vx/dmp/emcpower33s2 char=/dev/vx/rdmp/emcpower33s2
guid: {665c6838-1dd2-11b2-b1c1-00238b8a7c90}
udid: DGC%5FVRAID%5FCKM00111001420%5F6006016066902C00915931414A86E011
site: -
errno: Configuration request too large
Multipathing information:
numpaths: 1
emcpower33c state=enabled

Can anybody help me? I am not sure about the "Configuration request too large" error.

Solved | 5.7K Views | 1 like | 16 Comments

Adding new volumes to a DG that has an RVG under VCS cluster
Hi,

I have a VCS cluster with GCO and VVR. On each node of the cluster I have a DG with an associated RVG. This RVG contains 11 data volumes for an Oracle database. These volumes are getting full, so I am going to add new disks to the DG and create new volumes and mount points to be used by the Oracle database.

My question: can I add the disks to the DG and the volumes to the RVG while the database is up and replication is on?

If the answer is no, please let me know what should be performed on the RVG and RLINK to add these volumes, and also what to do on the database resource group so that it does not fail over.

Thanks in advance.

Solved | 4.4K Views | 0 likes | 14 Comments

System state FAULTED
Hi,

I have VCS 6.0 running on Solaris 10 on a 2-node cluster. All the nodes were rebooted as part of a scheduled job, during which I got a number of messages in the engine log that I am trying to understand. The sequence of events in the log is:

1) All nodes showed a jeopardy state for a moment after boot. Why? Is it that one of the links was down for a moment just after booting?
2) After that, the log says the system changed state from RUNNING to FAULTED. I have never seen a system go to the FAULTED state. Why did it go to the FAULTED state?
3) After this, the log shows that service groups became autodisabled on all these nodes.
4) After this: System (hostname) is in Down State - Membership: 0x4a
5) VCS:10451:Cleared attribute-'autodisabled' for Group on node. Does the autodisabled flag get cleared on its own? Many times I have faced situations where I had to clear the autodisabled flag for a service group manually.

Does somebody know about these error messages and what could be the reason behind them, specifically the system going to the FAULTED state and the service groups getting autodisabled?

Thanks

Solved | 3.5K Views | 0 likes | 1 Comment

Resource group in STARTING|PARTIAL
Can anyone explain, step by step, what we usually do when a resource group is in the STARTING|PARTIAL state in VCS? See the log below:

2014/11/25 16:58:51 VCS ERROR V-16-2-13066 (localhost) Agent is calling clean for resource(cfsmount3) because the resource is not up even after online completed.
2014/11/25 16:58:52 VCS INFO V-16-2-13068 (localhost) Resource(cfsmount3) - clean completed successfully.
2014/11/25 16:58:52 VCS INFO V-16-2-13071 (localhost) Resource(cfsmount3): reached OnlineRetryLimit(0).
2014/11/25 16:58:52 VCS ERROR V-16-1-10303 Resource cfsmount3 (Owner: unknown, Group: vrts_vea_cfs_int_cfsmount2) is FAULTED (timed out) on sys (localhost)
2014/11/25 16:58:52 VCS INFO V-16-6-15004 (localhost) hatrigger:Failed to send trigger for resfault; script doesn't exist

Solved | 3.2K Views | 0 likes | 1 Comment

Service group concurrency violation
Hi Team,

We have alerts of a concurrency violation. We have two servers in the cluster: mapibm625 and mapibm626.

The logs are:

2014/12/26 19:37:03 VCS INFO V-16-1-10299 Resource App_saposcol (Owner: Unspecified, Group: sapgtsprd) is online on mapibm625 (Not initiated by VCS)
2014/12/26 19:37:03 VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sapgtsprd
2014/12/26 19:37:03 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group sapgtsprd on all nodes
2014/12/26 19:37:04 VCS WARNING V-16-6-15034 (mapibm625) violation:Offlining group sapgtsprd on system mapibm625
2014/12/26 19:37:04 VCS INFO V-16-1-50135 User root fired command: hagrp -offline sapgtsprd mapibm625 from localhost
2014/12/26 19:37:04 VCS NOTICE V-16-1-10167 Initiating manual offline of group sapgtsprd on system mapibm625
2014/12/26 19:37:04 VCS NOTICE V-16-1-10300 Initiating Offline of Resource App_saposcol (Owner: Unspecified, Group: sapgtsprd) on System mapibm625
2014/12/26 19:37:04 VCS INFO V-16-6-15002 (mapibm625) hatrigger:hatrigger executed /opt/VRTSvcs/bin/internal_triggers/violation mapibm625 sapgtsprd successfully
2014/12/26 19:37:04 VCS INFO V-16-10011-306 (mapibm625) Application:App_saposcol:offline:Execution of Stop Program (/opt/VRTSvcs/bin/Saposcol/offline) returned (0).
2014/12/26 19:37:05 VCS INFO V-16-2-13716 (mapibm625) Resource(App_saposcol): Output of the completed operation (offline)
==============================================
2014/12/26 19:37:06 VCS INFO V-16-1-10305 Resource App_saposcol (Owner: Unspecified, Group: sapgtsprd) is offline on mapibm625 (VCS initiated)
2014/12/26 19:37:06 VCS NOTICE V-16-1-10446 Group sapgtsprd is offline on system mapibm625
========================================================================================

I asked the application team to check whether they were working on the servers, because the resource belongs to SAP (resource App_saposcol). However, the application team replied that they are not working on it, and that App_saposcol might be online on both servers, which causes the issue.

Then I checked the status of the resource on both servers, and it says:

[root@mapibm626]: # hares -state
#Resource     Attribute  System     Value
App_saposcol  State      mapibm625  OFFLINE
App_saposcol  State      mapibm626  ONLINE

[root@mapibm625]: # hares -state
#Resource     Attribute  System     Value
App_saposcol  State      mapibm625  OFFLINE
App_saposcol  State      mapibm626  ONLINE

I also checked the current logs of the server, but found only:

2014/12/27 13:03:42 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/27 17:03:43 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/27 21:03:44 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/28 01:03:45 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/28 05:03:46 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/28 09:03:47 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/28 10:56:14 VCS INFO V-16-1-50086 CPU usage on mapibm625 is 61%
2014/12/28 11:26:14 VCS INFO V-16-1-50086 CPU usage on mapibm625 is 61%
2014/12/28 13:03:48 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/28 14:26:14 VCS INFO V-16-1-50086 CPU usage on mapibm625 is 60%
2014/12/28 17:03:49 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/28 21:03:50 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/29 01:03:51 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/29 05:03:52 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/29 09:03:53 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2014/12/29 13:03:55 VCS INFO V-16-1-53504 VCS Engine Alive message!!
==========================================================================

Please advise what the possible reasons for this could be, and how to avoid it in future.

Thanks,
Allaboutunix

Solved | 2.6K Views | 1 like | 7 Comments

Service group shows online and cluster service offline
Hi Team,

I have attached the output of hastatus. It shows ClusterService online on system 35 and offline on the rest of the systems. However, the service group POWERCENTERSERVICEMANAGER shows online on all systems: 35, 36, 37, 38, 39, and 40.

I am unable to understand why the service group shows online on all the other systems when ClusterService is online only on system 35 and offline on the others. Is this an Active-Active cluster scenario?

Please help me to understand this scenario.

Solved | 2.2K Views | 1 like | 2 Comments

Plan for DIMM replacement activity (VCS nodes)
Hi Team,

We have to replace a DIMM on the passive node, on which VCS services are currently not running. The crossover LLT cables are badly tangled with each other, and the Symantec engineer told us that he will manage the cable issue and that there is no need to bring down the active node.

Kindly provide a step-by-step procedure for this activity, and please also suggest the prerequisites before starting it. This is a very crucial activity, as a Class A application is running on the active node.

Currently, the SG is running on the Sydney server. We have to perform the activity on the Madagascar server.

-- SYSTEM STATE
-- System        State     Frozen
A  Sydney        RUNNING   0
A  Madagascar    RUNNING   0

-- GROUP STATE
-- Group           System      Probed  AutoDisabled  State
B  ClusterService  Sydney      Y       N             ONLINE
B  ClusterService  Madagascar  Y       N             OFFLINE
B  ORA_SG_Group    Sydney      Y       N             ONLINE
B  ORA_SG_Group    Madagascar  Y       N             OFFLINE

Kindly suggest as soon as possible.

Thanks in advance.

Allaboutunix

Solved | 2.1K Views | 1 like | 6 Comments