Tag:"failover"

Can LLT heartbeats communicate between NICs with different device names?
We have a two-node VCS cluster in which the heartbeat NICs are eth2 and eth3 on each node. If eth2 goes down on node1 and eth3 goes down on node2, does this mean that both heartbeat links are down and the cluster is in a split-brain situation? Can LLT heartbeats communicate between NIC eth2 and NIC eth3? The VCS Installation Guide requires the two heartbeat links to be in different networks, so we should put eth2 of both nodes in one VLAN (VLAN1) and eth3 of both nodes in another VLAN (VLAN2). In that configuration, heartbeats cannot communicate between eth2 and eth3. However, in a production cluster we found that all four NICs (eth2 and eth3 on both nodes) are in the same VLAN, which led me to post this question: if eth2 goes down on node1 and eth3 goes down on node2, what will happen to the cluster (which is in active-standby mode)? Thanks!
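The two layouts in the question can be compared with a small reachability sketch. It only models which surviving NIC pairs share a VLAN (layer-2 reachability). It does not establish whether LLT would actually use such a cross-device path. The function name `live_paths`, the NIC keys, and the VLAN numbers are illustrative assumptions, not VCS or LLT APIs.

```python
def live_paths(vlan_of, down):
    """List (node1 NIC, node2 NIC) pairs that are both up and in the same VLAN."""
    up = {nic: vlan for nic, vlan in vlan_of.items() if nic not in down}
    return sorted(
        (a, b)
        for a in up
        for b in up
        if a[0] == "node1" and b[0] == "node2" and up[a] == up[b]
    )

# NICs are keyed as (node, device); the VLAN numbers are illustrative.
separate_vlans = {
    ("node1", "eth2"): 1, ("node2", "eth2"): 1,
    ("node1", "eth3"): 2, ("node2", "eth3"): 2,
}
shared_vlan = {nic: 1 for nic in separate_vlans}

# The failure from the question: eth2 down on node1, eth3 down on node2.
failed = {("node1", "eth2"), ("node2", "eth3")}

print(live_paths(separate_vlans, failed))  # no surviving pair shares a VLAN
print(live_paths(shared_vlan, failed))     # node1 eth3 and node2 eth2 share VLAN 1
```

With separate VLANs the sketch finds no surviving pair in a common VLAN. With all four NICs in one VLAN, node1's eth3 and node2's eth2 remain in the same broadcast domain, which is exactly the case the question asks about.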
Solved
zhangchao
12 years ago Place Cluster Server
1.6K Views
5 likes
5 Comments
Business Resilience That Supports Digital Initiatives
Discover the correlation between business resilience and successful digital transformation initiatives.
jostarke
6 years ago Place Availability
5.4K Views
3 likes
0 Comments
SmartIO blueprint and deployment guide for Solaris platform
SmartIO for Solaris was introduced in Storage Foundation HA 6.2. SmartIO enables data efficiency on your SSDs through I/O caching. Using SmartIO to improve efficiency, you can optimize the cost per Input/Output Operations Per Second (IOPS). SmartIO supports both read and write-back caching for the VxFS file systems that are mounted on VxVM volumes, in multiple caching modes and configurations. SmartIO also supports block-level read caching for applications running on VxVM volumes. The SmartIO Blueprint for Solaris gives an overview of the benefits of using SmartIO technology, the underlying technology, and the essential configuration steps to configure it. The SmartIO Deployment Guide for Solaris covers multiple SmartIO deployment scenarios, and how to manage them, in detail. Let us know if you have any questions or feedback!
Moiz_A
10 years ago Place Storage and Clustering
456 Views
3 likes
0 Comments
NFS share doesn't failover due to being busy
Hello! We are trying to implement a failover cluster that hosts a database and files on a clustered NFS share. The files are used by the clustered application itself and by several other hosts. The problem is that when the active node fails (I mean an ungraceful server shutdown or a clustered service stopping), the other hosts continue to use files on our cluster-hosted NFS share. That leads to the NFS share "hanging": it no longer works on the first node, yet it cannot be brought online on the second node. Requests from the other hosts to that NFS share also hang. Later, I will attach logs in which the problem can be observed. The only corrective action we have found is a total shutdown followed by a sequential start of all cluster nodes and other hosts. Please recommend best-practice actions for using an NFS share on Veritas Cluster Server (perhaps start/stop/clean scripts included as a cluster resource, or additional cluster configuration options). Thank you in advance! Best regards, Maxim Semenov.
Solved
semenov_m_o
11 years ago Place Cluster Server
4.4K Views
3 likes
13 Comments
Disable AutoFailOver from stopping services
I have several services that I am monitoring that are set to automatically fail over to a second system. It has come up that I occasionally need to restart a service without HA failing over to another system. I can disable the failover with the following command:
hagrp -modify App_Cluster AutoFailOver 0
However, if the service is stopped, HA still shuts down all the other services that are up. While researching, I came across disabling Evacuate, but even with it disabled, HA still shuts down the other services:
hagrp -modify App_Cluster Evacuate 0
I want the other services to continue to run even if one goes down for some reason. What is the best way to accomplish this?
Solved
mkruer
13 years ago Place ApplicationHA
1.9K Views
3 likes
4 Comments
Creating Oracle HA on a Budget
Creating highly available Oracle databases with immediate failover is expensive, though sometimes justifiable. However, organizations whose SLA includes near-minute failover can consider a Veritas Clustered File System (CFS) solution. CFS is an option within Storage Foundation (SF); SF users need a simple license key to turn it on.
phil_samg
16 years ago Place Availability
542 Views
2 likes
0 Comments
Failed to get the MSDTC Security configuration for VCS from registry
Hi, I created a campus cluster with 2 servers (operating system: Windows Server 2012 R2; SQL Server 2012; Symantec Storage Foundation 6.1). The service group comes online only every other time. The first time, it faulted. I cleared the fault on this server and brought the service group up again, and it came online. I took it offline, tried to bring it up, and it faulted. Then online, then faulted, and so on. It is the SQL agent that faults. I didn't find any errors in the SQL Server logs. In C:\Program Files\Veritas\cluster server\log\SQLserver_A.txt I found these errors:
2015/05/16 20:41:33 VCS NOTICE V-16-20093-75 SQLServer:SQLServer-SQL1:online:Failed to get the MSDTC Security configuration for VCS from registry.Error : [2, 2]
2015/05/16 20:47:50 VCS ERROR V-16-20093-11 SQLServer:SQLServer-SQL1:online:Failed to wait for the service 'MSSQL$SQL1' to start. Error = [2 ,258]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:*** Start of debug information dump for troubleshooting *** LibLogger.cpp:VLibThreadLogQueue::Dump[206]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) CRegKey::Open failed for Software\Veritas\VCS\EnterpriseAgents\SQLServer\SQLServer-SQL1. LibVcsHive.cpp:VLibVcsHive::_GetDWORDValue[435]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) CRegKey::Open failed for Software\Veritas\VCS\EnterpriseAgents\SQLServer\__Global__. LibVcsHive.cpp:VLibVcsHive::_GetDWORDValue[435]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) _GetDWORDValue failed. Subkey = Software\Veritas\VCS\EnterpriseAgents\SQLServer\__Global__, Name = IgnoreMSDTCSecurity LibVcsHive.cpp:VLibVcsHive::GetValue[401]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:Wait timed out for service MSSQL$SQL1 LibService.cpp:VLibService::WaitForServiceStatus[275]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:*** End of debug information dump for troubleshooting *** LibLogger.cpp:VLibThreadLogQueue::Dump[217]
2015/05/16 20:47:50 VCS WARNING V-16-2-13140 Thread(10516) Could not find timer entry with id 274
2015/05/16 20:47:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:48:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:49:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:49:50 VCS ERROR V-16-2-13066 Thread(9380) Agent is calling clean for resource(SQLServer-SQL1) because the resource is not up even after online completed.
2015/05/16 20:49:50 VCS WARNING V-16-20093-55 SQLServer:SQLServer-SQL1:clean:The service 'MSSQL$SQL1' is not in running state. Attempt to stop it might be unsuccessful.
2015/05/16 20:53:32 VCS WARNING V-16-2-13140 Thread(9380) Could not find timer entry with id 279
2015/05/16 20:53:32 VCS ERROR V-16-2-13068 Thread(9380) Resource(SQLServer-SQL1) - clean completed successfully.
2015/05/16 20:53:32 VCS ERROR V-16-2-13071 Thread(9380) Resource(SQLServer-SQL1): reached OnlineRetryLimit(0).
2015/05/16 20:56:41 VCS NOTICE V-16-20093-75 SQLServer:SQLServer-SQL1:online:Failed to get the MSDTC Security configuration for VCS from registry.Error : [2, 2]
There are two messages when the service group comes online:
2015/05/16 20:58:58 VCS INFO V-16-20093-30002 SQLServer:SQLServer-SQL1:imf_register:Registering with IMF for online monitoring
2015/05/16 21:08:28 VCS INFO V-16-20093-30001 SQLServer:SQLServer-SQL1:imf_register:Registering with IMF for offline monitoring
Can you help me resolve this issue? Thanks in advance.
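When reading an agent log like the one quoted in this post, it can help to pull out only the ERROR entries. The following is a minimal sketch that assumes each entry follows the layout visible above (timestamp, the literal `VCS`, a severity, a `V-` message code, then the message). The regular expression and the names `LOG_LINE` and `parse` are illustrative and are not part of any VCS tooling.

```python
import re

# Layout assumed from the quoted log: "<date> <time> VCS <severity> <V-code> <message>"
LOG_LINE = re.compile(
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS "
    r"(?P<severity>\S+) (?P<code>V-\d+-\d+-\d+) (?P<message>.*)"
)

def parse(lines):
    """Return one dict per line that matches the assumed layout."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            entries.append(match.groupdict())
    return entries

# Three entries copied from the post.
sample = [
    "2015/05/16 20:41:33 VCS NOTICE V-16-20093-75 SQLServer:SQLServer-SQL1:online:"
    "Failed to get the MSDTC Security configuration for VCS from registry.Error : [2, 2]",
    "2015/05/16 20:47:50 VCS ERROR V-16-20093-11 SQLServer:SQLServer-SQL1:online:"
    "Failed to wait for the service 'MSSQL$SQL1' to start. Error = [2 ,258]",
    "2015/05/16 20:53:32 VCS ERROR V-16-2-13071 Thread(9380) "
    "Resource(SQLServer-SQL1): reached OnlineRetryLimit(0).",
]

errors = [e for e in parse(sample) if e["severity"] == "ERROR"]
for e in errors:
    print(e["ts"], e["code"], e["message"])
```

Filtering this way shows the sequence in the post more plainly: the NOTICE about the MSDTC registry value is followed by an ERROR for the service start timeout, and later by the OnlineRetryLimit ERROR.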
KANSTANTSIN
10 years ago Place Storage Foundation for Windows
1.3K Views
2 likes
4 Comments
Symantec ApplicationHA 6.2: Monitoring applications with Intelligent Monitoring Framework
The Intelligent Monitoring Framework (IMF) feature improves ApplicationHA efficiency with faster detection of application faults and the ability to monitor a large number of application components with minimal effect on performance. IMF is automatically enabled if you use the Symantec High Availability wizard to configure an application for monitoring. The feature was introduced in ApplicationHA 6.1 for Windows. In ApplicationHA 6.2, it is extended to AIX, Linux, and Solaris. For details, see the following topics:
How intelligent monitoring works: AIX, Linux (KVM), Linux (VMware), and Solaris.
Enabling debug logs for IMF: AIX, Linux (KVM), Linux (VMware), and Solaris.
Gathering IMF information for support analysis: AIX, Linux (KVM), Linux (VMware), and Solaris.
This release introduces IMF support for the following ApplicationHA agents: Apache HTTP Server, DB2 Database (not applicable to Oracle VM Server for SPARC environment), Oracle Database, and generic (custom) applications. The following topics describe how to use the Symantec High Availability wizard to configure each supported application for IMF-enabled monitoring:
Configuring application monitoring for Apache: AIX, Linux (KVM), Linux (VMware), and Solaris.
Configuring application monitoring for DB2: AIX, Linux (KVM), and Linux (VMware).
Configuring application monitoring for Oracle: AIX, Linux (KVM), Linux (VMware), and Solaris.
Configuring application monitoring for generic applications: AIX, Linux (KVM), Linux (VMware), and Solaris.
You can use Symantec Cluster Server (VCS) commands to perform more advanced IMF actions. ApplicationHA and VCS documentation is available on the SORT website.
Sanjay_Pendse
10 years ago Place Storage and Clustering
467 Views
2 likes
0 Comments
SFHA Solutions 6.1: Using AdaptiveHA to select the largest system for failover
Symantec Cluster Server (VCS) service groups are virtual containers that manage groups of resources required to run a managed application. The FailOverPolicy service group attribute governs how VCS determines the target system for failover. For more information, see About service groups, Service group attributes, Cluster attributes, and About defining failover policies. When you set FailOverPolicy to BiggestAvailable, AdaptiveHA enables VCS to dynamically select the cluster node with the most available resources to fail over an application. VCS monitors and forecasts the unused capacity of systems in terms of CPU, Memory, and Swap to select the largest available system. If you set FailOverPolicy to BiggestAvailable for a service group, you must specify the load values, in terms such as 1 CPU, 1 GB RAM, and 1 GB Swap, in the Load service group attribute. You only need to specify those resources that are used by the service group. For example, if the service group does not use the Swap resource, only specify the CPU and Memory resources in the Load attribute. Note: The Load FailOverPolicy is being deprecated after this release. Symantec recommends that you change to the BiggestAvailable FailOverPolicy for enabling AdaptiveHA. For more information, see About AdaptiveHA and Enabling AdaptiveHA for a service group. If you upgrade VCS manually, ensure that you update the VCS configuration file (main.cf) to enable AdaptiveHA. When you upgrade from an older version of VCS using the installer, the main.cf file gets automatically upgraded. For more information, see Manually upgrading the VCS configuration file to the latest version. VCS documentation for other platforms and releases can be found on the SORT website.
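The idea of choosing the largest available system can be illustrated with a simplified sketch. This is not the VCS algorithm, which works on monitored and forecasted capacity. The sketch only ranks candidate systems by their tightest headroom across the resources that the group declares. The function `pick_target`, the system names, the resource keys, and all the numbers are assumptions made for illustration.

```python
def pick_target(systems, load):
    """Pick the system with the most headroom that can still hold `load`.

    systems: {name: {resource: available amount}}
    load:    {resource: amount the service group needs}; unused resources are omitted.
    """
    def fits(avail):
        return all(avail.get(res, 0) >= need for res, need in load.items())

    def headroom(avail):
        # Tightest ratio of available to required across the declared resources.
        return min(avail[res] / need for res, need in load.items())

    candidates = {name: avail for name, avail in systems.items() if fits(avail)}
    if not candidates:
        return None
    # Sort names first so that ties are broken deterministically.
    return max(sorted(candidates), key=lambda name: headroom(candidates[name]))

# Hypothetical available capacity per system (CPUs, GB of memory, GB of swap).
systems = {
    "sysA": {"CPU": 2, "Mem": 4, "Swap": 8},
    "sysB": {"CPU": 6, "Mem": 16, "Swap": 2},
    "sysC": {"CPU": 1, "Mem": 32, "Swap": 8},
}

# The group does not use swap, so only CPU and memory are listed in its load.
load = {"CPU": 1, "Mem": 2}

print(pick_target(systems, load))
```

Because the group's load omits Swap, the low swap figure on sysB does not count against it, which mirrors the advice above to list only the resources that the service group actually uses.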
Moiz_A
11 years ago Place Storage and Clustering
489 Views
2 likes
0 Comments
vxio: Cluster software communication timeout. Reservation refresh has been suspended
Hi, We are experiencing this error on one of our clusters. It's a two-node campus cluster with the following specifications:
Site A: Node1 is a Windows Server 2008 R2 virtual machine residing on an ESXi 5.1 host in this site. Disk 1 and 3 are LUNs in an enclosure in this site.
Site B: Node2 is a Windows Server 2008 R2 virtual machine residing on an ESXi 5.1 host in this site. Disk 2 and 4 are LUNs in an enclosure in this site.
We have created two VMDGs: one contains Disk 1 and 2, while the other contains Disk 3 and 4. On these VMDGs, we have created mirrored dynamic volumes. The VMDGs are then presented to the failover cluster. The quorum type on the failover cluster is a file share witness on another server. We are also running Microsoft System Center Configuration Manager to install updates and patches on Node 1 and 2. Whenever patches are installed on a node, it gets restarted. Whenever that occurs, the cluster resource group fails over from Node 1 to Node 2. Everything seems to fail over just fine, and the VMDG is imported successfully (according to the log). But 10 minutes after the VMDG has been imported, the following error is logged on Node 2: http://s28.postimg.org/ubh8skfh9/vmdg2.png
If I check the status of the VMDGs in VEA, it's Deported for both VMDGs: http://s3.postimg.org/72ort9683/vmdg3.png
But even though the disks and VMDGs seem to be offline on the active node, failover does not occur, because in Failover Cluster Manager the VMDG is online, but there are no volumes enumerated on it: http://s12.postimg.org/p31vncct9/vmdg1.png
Has anyone else experienced the same, and does anyone know why the status of the disks changes to Deported without failover occurring?
Solved
Balthier35
10 years ago Place Storage Foundation for Windows
2.6K Views
2 likes
6 Comments