Tag:"Service group"

Need a way to create VCS group attributes
We have long been able to create new VCS resource attributes using the hatype and haattr commands. We would like to be able to create new group attributes as well. In our particular use case, we would like to create a temporary group attribute called SleepMonitor that someone with operator privileges can update (similar to the TFrozen group attribute). We would set SleepMonitor to an integer time() value that specifies when our custom monitoring script will stop ignoring the state of a service group and resume alerting the GroupOwner when something is wrong. We want this attribute to be temporary so that it doesn't clutter up main.cf or require main.cf to be updated when a user wants to sleep monitoring. The group attribute UserIntGlobal could work if it weren't a permanent attribute that requires the VCS configuration to be open and administrator privileges to modify. From time to time we come up with other ideas that would require new group attributes. Having the ability to create new group attributes would be phenomenal. -Seann
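A minimal sketch of the check such a monitoring script might perform, assuming the SleepMonitor value has already been read as text. How the value is read from VCS is not shown, and the function name `monitoring_is_sleeping` is hypothetical.

```python
import time


def monitoring_is_sleeping(sleep_monitor_value, now=None):
    """Return True while alerts for the group should still be suppressed.

    sleep_monitor_value is the (hypothetical) SleepMonitor attribute as text:
    an integer epoch timestamp, or empty/0 when no sleep was requested.
    """
    if now is None:
        now = int(time.time())
    try:
        sleep_until = int(sleep_monitor_value)
    except (TypeError, ValueError):
        # Unset or malformed values never suppress alerts.
        return False
    # Alerts resume as soon as the stored timestamp has been reached.
    return now < sleep_until
```

Treating an unset or malformed value as "not sleeping" means a bad update fails toward alerting rather than toward silence.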
Solved
S_Herdejurgen
9 years ago Place Cluster Server
1.4K Views
1 like
1 Comment
Deleting an rlink that has the "secondary_config_err" flag
Hello,

In my VCS global cluster, my ORAGrp resource group is partially online because my rvgres is offline. I suspect the problem is the rlink below. I am trying to dissociate this rlink (below: rlk_sys1-DB-rep_DB_r) and detach it in order to delete it, but I have not succeeded. Below is some output from the system.

root@sys2# vxprint -P
Disk group: DBhrDG
TY NAME                  ASSOC   KSTATE   LENGTH  PLOFFS  STATE   TUTIL0  PUTIL0
rl rlk_sys1-DB-rep_DB_r  DB_rvg  CONNECT  -       -       ACTIVE  -       -
rl rlk_sys1-rep_DB-rvg   DB-rvg  ENABLED  -       -       PAUSE   -       -

root@sys2# vxrlink -g DBhrDG dis rlk_sys1-rep_DB-rvg
VxVM VVR vxrlink ERROR V-5-1-3520 Rlink rlk_sys1-rep_DB-rvg can not be dissociated if it is attached

root@sys2# vxrlink -g DBhrDG det rlk_sys1-rep_DB-rvg
VxVM VVR vxrlink ERROR V-5-1-10128 Operation not allowed with attached rlinks

root@sys2# vxedit -g DBhrDG rm rlk_sys1-rep_DB-rvg
VxVM vxedit ERROR V-5-1-3540 Rlink rlk_sys1-rep_DB-rvg is not disabled, use -f flag

root@sys2# vxedit -g DBhrDG -f rm rlk_sys1-rep_DB-rvg
VxVM vxedit ERROR V-5-1-3541 Rlink rlk_sys1-rep_DB-rvg is not dissociated

root@sys2# vxprint -Vl
Disk group: DBhrDG

Rvg: DB-rvg
info: rid=0.1317 version=0 rvg_version=41 last_tag=11
state: state=CLEAN kernel=DISABLED
assoc: datavols=(none) srl=(none) rlinks=rlk_sys1-rep_DB-rvg exports=(none) vsets=(none)
att: rlinks=rlk_sys1-rep_DB-rvg
flags: closed secondary disabled detached passthru logging
device: minor=26012 bdev=343/26012 cdev=343/26012 path=/dev/vx/dsk/DBhrDG/DB-rvg
perms: user=root group=root mode=0600

Rvg: DB_rvg
info: rid=0.1386 version=13 rvg_version=41 last_tag=12
state: state=ACTIVE kernel=ENABLED
assoc: datavols=sys1_DB_Process,sys1_DB_Script,... srl=sys1_DB_SRL rlinks=rlk_sys1-DB-rep_DB_r exports=(none) vsets=(none)
att: rlinks=rlk_sys1-DB-rep_DB_r
flags: closed secondary enabled attached logging
device: minor=26014 bdev=343/26014 cdev=343/26014 path=/dev/vx/dsk/DBhrDG/DB_rvg
perms: user=root group=root mode=0600

Please advise.

Regards
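The `vxprint -P` output quoted above lists two rlinks that belong to two different RVGs (DB_rvg and DB-rvg), which is easy to miss in a wall of text. A small sketch that pairs each rlink with its RVG and state; the column positions are an assumption taken from the header row in the post, and `parse_vxprint_rlinks` is a hypothetical helper, not part of any Veritas tooling.

```python
def parse_vxprint_rlinks(output):
    """Parse `vxprint -P` text into a list of dicts, one per rlink row."""
    rlinks = []
    for line in output.splitlines():
        fields = line.split()
        # Rlink rows start with the record type "rl"; STATE is the 7th column.
        if len(fields) >= 7 and fields[0] == "rl":
            rlinks.append({
                "name": fields[1],
                "rvg": fields[2],
                "kstate": fields[3],
                "state": fields[6],
            })
    return rlinks


# Sample copied from the post above.
sample = """\
Disk group: DBhrDG
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
rl rlk_sys1-DB-rep_DB_r DB_rvg CONNECT - - ACTIVE - -
rl rlk_sys1-rep_DB-rvg DB-rvg ENABLED - - PAUSE - -
"""

for rl in parse_vxprint_rlinks(sample):
    print(rl["name"], "->", rl["rvg"], rl["kstate"], rl["state"])
```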
grigori
9 years ago Place Storage Foundation
1.4K Views
0 likes
4 Comments
Trigger after failed cleanup script
Hi there,

I have a system where the cleanup script can fail or time out, and I want to execute another script if this happens. I was wondering what the best way of doing this is. In the Veritas Cluster Server Administrator's Guide for Linux I found the RESNOTOFF trigger. From the documentation, my understanding is that this trigger is invoked in the following cases:

1. A resource fails to go offline (offline initiated by VCS) and the cleanup fails.
2. A resource goes offline unexpectedly and the cleanup fails.

I have tested this, and RESNOTOFF works in the first scenario but not in the second. To test the second scenario I kill the service, and I can see the following message in engine_A.log:

VCS ERROR V-16-2-13067 (node1) Agent is calling clean for resource(service1) because the resource became OFFLINE unexpectedly, on its own.

When the cleanup fails I would expect the resource to become UNABLE TO OFFLINE. However, the status of the resource is still ONLINE:

# hares -state service1
#Resource  Attribute  System  Value
service1   State      node1   ONLINE
service1   State      node2   OFFLINE

So the resource is ONLINE and VCS keeps running the cleanup command indefinitely (and it keeps failing). I was wondering if I need to configure something else to make RESNOTOFF work in this particular scenario.

Thanks,
Solved
javierrv
9 years ago Place Cluster Server
936 Views
0 likes
3 Comments
RESNOTOFF not triggered.
Hi,

I am following the Veritas Cluster Server Administrator's Guide for Linux and trying to trigger the resnotoff script. From the documentation, my understanding is that if a resource faults and the clean command returns 1, resnotoff should be triggered. To begin, my service group is in an ONLINE state:

[root@node1 ~]# hastatus -sum | grep test
B Grp_CS_c1_testservice node1 Y N ONLINE
B Grp_CS_c1_testservice node2 Y N ONLINE

I have the clean limit set to 1 and the clean script set to /bin/false to force it to return an error exit code.

Res_App_c1_fmmed1_testapplication ArgListValues node1 User 1 root StartProgram 1 "/usr/share/litp/vcs_lsb_start vmservice 5" StopProgram 1 "/usr/share/litp/vcs_lsb_stop vmservice 5" CleanProgram 1 /bin/false MonitorProgram 1 "/usr/share/litp/vcs_lsb_status vmservice" PidFiles 0 MonitorProcesses 0 EnvFile 1 "" UseSUDash 1 0 State 1 2 IState 1 0
Res_App_c1_fmmed1_testapplication ArgListValues node2 User 1 root StartProgram 1 "/usr/share/litp/vcs_lsb_start vmservice 5" StopProgram 1 "/usr/share/litp/vcs_lsb_stop vmservice 5" CleanProgram 1 /bin/false MonitorProgram 1 "/usr/share/litp/vcs_lsb_status vmservice" PidFiles 0 MonitorProcesses 0 EnvFile 1 "" UseSUDash 1 0 State 1 2 IState 1 0
Res_App_c1_fmmed1_testapplication CleanProgram global /bin/false
Res_App_c1_fmmed1_testapplication CleanRetryLimit global 1

The resnotoff trigger is enabled for this resource:

Res_App_c1_fmmed1_testapplication TriggersEnabled global RESNOTOFF

Now I manually kill the service Grp_CS_c1_testservice on node1 and see the following in /var/log/messages:

Jun 16 17:02:33 node1 AgentFramework[10323]: VCS ERROR V-16-2-13067 Thread(4147325808) Agent is calling clean for resource(Res_App_c1_fmmed1_testapplication) because the resource became OFFLINE unexpectedly, on its own.
Jun 16 17:02:33 node1 Had[9975]: VCS ERROR V-16-2-13067 (node1) Agent is calling clean for resource(Res_App_c1_fmmed1_testapplication) because the resource became OFFLINE unexpectedly, on its own.
Jun 16 17:02:34 node1 AgentFramework[10323]: VCS ERROR V-16-2-13069 Thread(4147325808) Resource(Res_App_c1_fmmed1_testapplication) - clean failed.

and in engine_A.log:

2015/06/16 17:02:33 VCS ERROR V-16-2-13067 (node1) Agent is calling clean for resource(Res_App_c1_fmmed1_testapplication) because the resource became OFFLINE unexpectedly, on its own.
2015/06/16 17:02:34 VCS INFO V-16-10031-504 (node1) Application:Res_App_c1_fmmed1_testapplication:clean:Executed /bin/false as user root
2015/06/16 17:02:35 VCS ERROR V-16-2-13069 (node1) Resource(Res_App_c1_fmmed1_testapplication) - clean failed.
2015/06/16 17:03:35 VCS ERROR V-16-1-50148 ADMIN_WAIT flag set for resource Res_App_c1_fmmed1_testapplication on system node1 with the reason 4
2015/06/16 17:03:35 VCS INFO V-16-10031-504 (node1) Application:Res_App_c1_fmmed1_testapplication:clean:Executed /bin/false as user root

From my understanding of the VCS administrator guide section titled 'VCS behavior when an online resource faults', resnotoff should be triggered. However, it is not, and the resource goes to an ADMIN WAIT state:

group resource system message
--------------- -------------------- --------------- --------------------
Res_App_c1_fmmed1_testapplication node1 |ADMIN WAIT|

Is it possible to get resnotoff triggered for a cluster in this state, or do I need to use the resadminwait trigger (contrary to the documentation)?

Thanks,
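To follow the sequence in engine_A.log (clean called, clean failed, ADMIN_WAIT set about a minute later), a short sketch that extracts just those events can help. The message IDs are the ones in the excerpt above; the labels and the `timeline` helper are hypothetical, and the line format is assumed to match the excerpt.

```python
import re

# Message IDs taken from the log excerpt in the post; labels are informal.
EVENTS = {
    "V-16-2-13067": "clean called (unexpected offline)",
    "V-16-2-13069": "clean failed",
    "V-16-1-50148": "ADMIN_WAIT set",
}

# Timestamp, the literal "VCS", a severity word, then the message ID.
LINE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS \w+ (V-\d+-\d+-\d+)"
)


def timeline(lines):
    """Return (timestamp, label) pairs for the events of interest, in order."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) in EVENTS:
            found.append((match.group(1), EVENTS[match.group(2)]))
    return found


sample = [
    "2015/06/16 17:02:33 VCS ERROR V-16-2-13067 (node1) Agent is calling clean for resource(Res_App_c1_fmmed1_testapplication) because the resource became OFFLINE unexpectedly, on its own.",
    "2015/06/16 17:02:34 VCS INFO V-16-10031-504 (node1) Application:Res_App_c1_fmmed1_testapplication:clean:Executed /bin/false as user root",
    "2015/06/16 17:02:35 VCS ERROR V-16-2-13069 (node1) Resource(Res_App_c1_fmmed1_testapplication) - clean failed.",
    "2015/06/16 17:03:35 VCS ERROR V-16-1-50148 ADMIN_WAIT flag set for resource Res_App_c1_fmmed1_testapplication on system node1 with the reason 4",
]

for stamp, label in timeline(sample):
    print(stamp, label)
```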
Solved
justinfay
9 years ago Place Cluster Server
1.2K Views
0 likes
3 Comments
Integrating SAP with VCS 6.2 (on Oracle Linux 6.5)
Hi,

I was wondering if someone has additional information on how to set up my cluster. I have both VCS (including Storage Foundation) and Linux knowledge, but no background in SAP. And as SAP is a very complex product, I can't see the forest for the trees.

Setup
- 2-node (active-passive) cluster of Oracle Linux 6.5 nodes.
- Veritas Storage Foundation HA (= VxVM + DMP + VCS).
- Oracle 11.2 as database.
- SAP ECC 6.0

Apart from the Installation & Configuration guide for the SAP NetWeaver Agent, I found little information about implementing SAP in VCS.
Source: "Symantec™ High Availability Agent for SAP NetWeaver Installation and Configuration Guide for Linux 6.2".

Unfortunately I cannot find a how-to, guide, or anything similar from Symantec, nor through the usual Google attempts. My customer is not very knowledgeable about SAP either. From what I understand it is a very basic SAP setup, if not the simplest. They are using SAP ECC 6.0 and an Oracle 11.2 database, so I assume they just have a Central Instance and the database.

After some Google research, I found out that SAP ECC 6.0 is technically SAP NetWeaver 7.0. On Symantec SORT, I found 3 versions of SAP NetWeaver. I downloaded the first one, as the description says:

SAP NetWeaver SAP NetWeaver 7.1, 7.2, 7.3, 7.4 SAP NetWeaver 7.1, 7.3
Agent: SAPNW04 5.0.16.0
Application version(s): SAP R/3 4.6, R/3 Enterprise 4.7, NW04, NW04s, ERP/ECC 5.0/6.0, SCM/APO 4.1/5.0/5.1/7.0, SRM 4.0/5.0/7.0, CRM 4.0/5.0/7.0
Source: https://sort.symantec.com/agents/detail/1077

SAP ERP 2005 = SAP NetWeaver 2004s (BASIS 7.00) = ECC 6.0
Source: http://itknowledgeexchange.techtarget.com/itanswers/difference-bet-ecc-60-sap-r3-47/
Source: http://www.fasttrackph.com/sap-ecc-6-0/
Source: http://wulibi.blogspot.be/2010/03/what-is-sap-ecc-60-in-brief.html

Currently my setup is unfinished:
- Installed and configured Storage Foundation HA on both nodes.
- Installed the ACC Libraries on both nodes. See: https://sort.symantec.com/agents/detail/1183
- Installed the SAP NetWeaver Agent on both nodes. See: https://sort.symantec.com/agents/detail/1077
- Configured, next to the ClusterServiceGroup, 3 service groups:
  - SG_sap: the shared storage resources (DiskGroup + Volumes + Mount) and the SAPNW Agent resource.
  - SG_oracle: the shared storage resources (DiskGroup + Volumes + Mount) and the Oracle Agent resource.
  - SG_nfs: still empty.

SAPNW Agent

SAP instance type
The SAPNW Agent documentation states: The agent supports the following SAP instance types:
- Central Services Instance
- Application Server Instance
- Enqueue Replication Server Instance
Source: "Symantec™ High Availability Agent for SAP NetWeaver Installation and Configuration Guide for Linux 6.2"
But I guess SAP ECC 6.0 has them all in one central instance, right? So I only need one SAPNW Agent.

How is the SAP installed:
- only ABAP
- only Java
- add-in (both ABAP and Java)
Source: "Symantec™ High Availability Agent for SAP NetWeaver Installation and Configuration Guide for Linux 6.2"
I have no idea. How can I find this out?

InstName Attribute
Another thing is the InstName attribute. This also does not correspond with the information I have. My SAP instance is T30. So the syntax is more or less correct, but it isn't listed below. This is also important for deciding on the value of the ProcMon attribute.

The SAPSID and InstName form a unique identifier that can identify the processes running for a particular instance. Some examples of SAP instances are given as follows:

InstName = InstType
DVEBMGS00 = SAP Application Server - ABAP (Primary)
D01 = SAP Application Server - ABAP (Additional)
ASCS02 = SAP Central Services - ABAP
J03 = SAP Application Server - Java
SCS04 = SAP Central Services - Java
ERS05 = SAP Enqueue Replication Server
SMDA97 = Solution Manager Diagnostics Agent
Source: "Symantec™ High Availability Agent for SAP NetWeaver Installation and Configuration Guide for Linux 6.2"

In the listing of the required attributes it is also stated. However, the default value is CENTRAL. I guess this is correct in my case?

InstName Attribute: An identifier that classifies and describes the SAP server instance type. Valid values are:
APPSERV: SAP Application Server
ENQUEUE: SAP Central Services
ENQREP: Enqueue Replication Server
SMDAGENT: Solution Manager Diagnostics Agent
SAPSTARTSRV: SAPSTARTSRV Process
Note: The value of this attribute is not case-sensitive.
Type and dimension: string-scalar
Default: APPSERV
Example: ENQUEUE

EnqSrvResName Attribute
A required attribute is the EnqSrvResName attribute. The documentation says this should be the resource name for the SAP Central Instance. But I am assuming I only have a SAP Central Instance, so I guess I should use the name of my SAP Agent resource from my SAP service group?

EnqSrvResName Attribute: The name of the VCS resource for SAP Central Services (A)SCS Instance. This attribute is used by Enqueue and Enqueue Replication Server. Using this attribute the Enqueue server queries the Enqueue Replication Server resource state while determining the fail over target and vice a versa.
Type and dimension: string-scalar
Default: No default value
Example: SAP71-PI1SCS_sap
Source: "Symantec™ High Availability Agent for SAP NetWeaver Installation and Configuration Guide for Linux 6.2"

Is anyone able to help me out? Thanks in advance.
Solved
sanderfiers
10 years ago Place Cluster Server
2.4K Views
2 likes
9 Comments
ForceImport
Hi all,

I need help to properly understand the ForceImport attribute. There is a table "Failure situations" in the Symantec Storage Foundation and High Availability Solutions Solutions Guide (p. 221). I don't understand situation number 7, "Split-brain situation and storage interconnect lost":

ForceImport set to 0 - No interruption of service. Can’t import with only 50% of disks available. Disks on the same node are functioning. Mirroring is not working.
ForceImport set to 1 - Automatically imports disks on secondary site. Now disks are online in both locations—data can be kept from only one.

Can anyone answer this question: what is the purpose of the last statement (disks online in both locations, data kept from only one)? What will we get if the first 50% of the disks are imported on site 1 and the other 50% are imported on site 2?
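One way to read the two table rows is as an import rule that needs more than half of the disks unless ForceImport is set. The sketch below models only that reading; the strict-majority threshold is an assumption drawn from the "can't import with only 50%" wording, and `can_import` is a hypothetical function, not a product API.

```python
def can_import(disks_available, disks_total, force_import):
    """Model of the quoted table row: strict majority unless ForceImport is set."""
    if disks_total <= 0 or disks_available <= 0:
        return False
    # Exactly half is not a majority, so 2 of 4 disks fails this check.
    has_majority = 2 * disks_available > disks_total
    return has_majority or bool(force_import)


# Split-brain example: each site sees its own 2 of the 4 mirrored disks.
for site in ("site 1", "site 2"):
    print(site, "ForceImport=0:", can_import(2, 4, False),
          "ForceImport=1:", can_import(2, 4, True))
```

Under this reading, ForceImport = 1 lets both sites import their own half at the same time. The two halves then diverge, which is why the table says data can be kept from only one location.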
Solved
KANSTANTSIN
10 years ago Place Storage Foundation for Windows
1.3K Views
0 likes
5 Comments
VCS - Resource to mount CIFS shares
Hi, I want to manage a CIFS mount within a service group and cannot find the appropriate resource. I just want to mount a CIFS share in my VCS cluster, acting as a client. Can anyone let me know which resource type I should use? Thanks & Regards, JL
Solved
Jose_Luis_B
10 years ago Place Cluster Server
2.6K Views
2 likes
8 Comments
Service Monitor
Hi,

I have a problem with the ServiceMonitor agent. I created a script (www.bat):

@echo off
ping 172.16.11.1>nul
if %errorlevel% == 0 goto A
:A
exit 110
if %errorlevel% == 1 goto B
:B
exit 100

Script location: C:\Program Files\Veritas\cluster server\bin\www.bat

I added a ServiceMonitor resource:

ServiceOrScriptName: C:\Program Files\Veritas\cluster server\bin\www.bat
user: admin@jambo.ga
password: XXXXXXXX
domain: jamdo.ga

The agent reports status UNKNOWN on all servers in the cluster. ServiceMonitor logs:

2015/06/24 07:15:26 VCS ERROR V-16-10051-7506 ServiceMonitor:ping_def:monitor:Failed to open the service 'C:\Program Files\Veritas\cluster server\bin\www.bat'. Error = 1060.
2015/06/24 07:15:32 VCS ERROR V-16-10051-7506 ServiceMonitor:ping_def:monitor:Failed to open the service 'C:\Program Files\Veritas\cluster server\bin\www.bat'. Error = 1060.
2015/06/24 07:16:27 VCS ERROR V-16-10051-7506 ServiceMonitor:ping_def:monitor:Failed to open the service 'C:\Program Files\Veritas\cluster server\bin\www.bat'. Error = 1060.
2015/06/24 07:17:00 VCS ERROR V-16-10051-7506 ServiceMonitor:ping_def:monitor:Failed to open the service 'C:\Program Files\Veritas\cluster server\bin\www.bat'. Error = 1060.

Does anyone know how I can resolve this?
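Separately from the "Failed to open the service" error, the batch script as posted reaches label A on either branch: when the first `if` does not jump, execution simply falls through to `:A`, so the script always exits with 110. A sketch of the mapping the script appears to intend, written in Python purely to illustrate the logic; the constant and function names are hypothetical, and the 110/100 values are the ones used in the post.

```python
# Exit codes as used in the posted script: 110 on success, 100 on failure.
ONLINE = 110
OFFLINE = 100


def monitor_exit_code(ping_return_code):
    """Map a ping return code to the exit code the monitor script should use."""
    # Only a zero return code counts as reachable; anything else is offline.
    return ONLINE if ping_return_code == 0 else OFFLINE


for rc in (0, 1):
    print("ping returned", rc, "-> exit", monitor_exit_code(rc))
```

In the batch file, the equivalent fix is to make the two exits mutually exclusive, so that a failed ping can no longer fall through into the success branch.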
Solved
KANSTANTSIN
10 years ago Place Storage Foundation for Windows
1.2K Views
0 likes
3 Comments
Failed to get the MSDTC Security configuration for VCS from registry
Hi,

I created a campus cluster with 2 servers (operating system: Windows Server 2012 R2; SQL Server 2012; Symantec Storage Foundation 6.1). The service group comes online only every other attempt. The first time, it faulted. I cleared the fault on this server and tried to bring the service group up, and it came online. I took it offline, tried to bring it up again, and it faulted. Then online, then faulted, and so on. The resource that faults is the SQL agent. I didn't find any errors in the SQL Server logs. In C:\Program Files\Veritas\cluster server\log\SQLserver_A.txt I got these errors:

2015/05/16 20:41:33 VCS NOTICE V-16-20093-75 SQLServer:SQLServer-SQL1:online:Failed to get the MSDTC Security configuration for VCS from registry.Error : [2, 2]
2015/05/16 20:47:50 VCS ERROR V-16-20093-11 SQLServer:SQLServer-SQL1:online:Failed to wait for the service 'MSSQL$SQL1' to start. Error = [2 ,258]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:*** Start of debug information dump for troubleshooting *** LibLogger.cpp:VLibThreadLogQueue::Dump[206]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) CRegKey::Open failed for Software\Veritas\VCS\EnterpriseAgents\SQLServer\SQLServer-SQL1. LibVcsHive.cpp:VLibVcsHive::_GetDWORDValue[435]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) CRegKey::Open failed for Software\Veritas\VCS\EnterpriseAgents\SQLServer\__Global__. LibVcsHive.cpp:VLibVcsHive::_GetDWORDValue[435]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) _GetDWORDValue failed. Subkey = Software\Veritas\VCS\EnterpriseAgents\SQLServer\__Global__, Name = IgnoreMSDTCSecurity LibVcsHive.cpp:VLibVcsHive::GetValue[401]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:Wait timed out for service MSSQL$SQL1 LibService.cpp:VLibService::WaitForServiceStatus[275]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:*** End of debug information dump for troubleshooting *** LibLogger.cpp:VLibThreadLogQueue::Dump[217]
2015/05/16 20:47:50 VCS WARNING V-16-2-13140 Thread(10516) Could not find timer entry with id 274
2015/05/16 20:47:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:48:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:49:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:49:50 VCS ERROR V-16-2-13066 Thread(9380) Agent is calling clean for resource(SQLServer-SQL1) because the resource is not up even after online completed.
2015/05/16 20:49:50 VCS WARNING V-16-20093-55 SQLServer:SQLServer-SQL1:clean:The service 'MSSQL$SQL1' is not in running state. Attempt to stop it might be unsuccessful.
2015/05/16 20:53:32 VCS WARNING V-16-2-13140 Thread(9380) Could not find timer entry with id 279
2015/05/16 20:53:32 VCS ERROR V-16-2-13068 Thread(9380) Resource(SQLServer-SQL1) - clean completed successfully.
2015/05/16 20:53:32 VCS ERROR V-16-2-13071 Thread(9380) Resource(SQLServer-SQL1): reached OnlineRetryLimit(0).
2015/05/16 20:56:41 VCS NOTICE V-16-20093-75 SQLServer:SQLServer-SQL1:online:Failed to get the MSDTC Security configuration for VCS from registry.Error : [2, 2]

There are two messages when the service group comes online:

2015/05/16 20:58:58 VCS INFO V-16-20093-30002 SQLServer:SQLServer-SQL1:imf_register:Registering with IMF for online monitoring
2015/05/16 21:08:28 VCS INFO V-16-20093-30001 SQLServer:SQLServer-SQL1:imf_register:Registering with IMF for offline monitoring

Can you help me resolve this issue? Thanks in advance.
KANSTANTSIN
10 years ago Place Storage Foundation for Windows
1.3K Views
2 likes
4 Comments
VCS-clustered Enterprise Vault: migrate index service to new node
Hi,

We have a three-node cluster for Enterprise Vault in Veritas Cluster Server. The node details are as follows:

Node 1: EV server with all services
Node 2: SQL server
Node 3: Spare server for EV service failover and SQL service failover

GCO and VVR are also configured for the SQL server, and the EV server volumes are replicating using VVR but are not under VCS control. Now we want to remove the index service from the EV server (node 1) and add it to a new server, and also add the new server to the cluster. Can anybody share the steps we should follow on the VCS side to perform these changes?
r1_abhinav
10 years ago Place Cluster Server
1.1K Views
0 likes
5 Comments