Failover

137 Topics

MSMQ Outgoing Queues referencing failed EV server - using Building Blocks for High Availability
Hi Everyone,
We use Building Blocks to provide High Availability in our EV environment. We have come across a peculiar problem and want to know whether anyone here has seen a similar issue.
Scenario: We are in the archiving schedule and there are (say) 5 EV Mailbox Archiving Servers archiving 5 Exchange Servers. There are cross-references across EV and Exchange Servers, i.e. the Storage Service for an archive is on an EV server other than the one running the Archival Task.
Problem: If an EV Mailbox Archiving Server fails, we use Building Blocks to quickly bring up services on a standby server by running Update Service Locations. However, we notice that items remain in archive pending state in user mailboxes being archived by the other EV Servers. New items also go into archive pending state across all user mailboxes.
Analysis: We have already run a flushdns (or Clear-DNSClientCache in PowerShell) across all EV servers. In the MSMQ Outgoing Queues, we noticed queues that still resolve to the old failed servers and are in Failed to Connect or Waiting to Connect state, since MSMQ resolves to the EV server hostnames rather than the EV aliases.
Solution: To resolve the issue, we need to add a hosts file entry resolving the hostname of the failed server to the IP address of the now active server and restart the storage/taskcontroller service.
Has anyone come across a similar situation? We wish to avoid making hosts file entries and make the failover process seamless.
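A minimal sketch of the hosts-file workaround described in the post, in Python. It only builds the text of a hosts file in memory; the hostname `evserver03` and both IP addresses are made-up illustrations, and writing the result to the real hosts file and restarting the services is left to the operator.

```python
import ipaddress


def hosts_entry(failed_hostname: str, active_ip: str) -> str:
    """Return one hosts-file line mapping the failed server's name to the active server's IP."""
    ipaddress.ip_address(active_ip)  # raises ValueError on a malformed address
    name = failed_hostname.strip()
    if not name or any(c.isspace() for c in name):
        raise ValueError("hostname must be a single token")
    return f"{active_ip}\t{name}"


def add_entry(hosts_text: str, failed_hostname: str, active_ip: str) -> str:
    """Return hosts_text with the new mapping appended.

    Any existing line that already names failed_hostname is dropped first
    (the whole line, including other aliases on it), so the name resolves
    to exactly one address afterwards.
    """
    wanted = failed_hostname.strip().lower()
    kept = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(fields) >= 2 and wanted in (n.lower() for n in fields[1:]):
            continue
        kept.append(line)
    kept.append(hosts_entry(failed_hostname, active_ip))
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical names and addresses, for illustration only.
    sample = "127.0.0.1 localhost\n10.0.0.5 evserver03\n"
    print(add_entry(sample, "evserver03", "10.0.0.9"), end="")
```

Removing the entry again once the failed server is repaired would be the reverse step; the sketch does not cover that.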
arnoldmathias
8 years ago Place Enterprise Vault
848Views
0likes
4Comments
Automatic Failover
I would like to use Veritas Cluster Server to achieve high availability and automatic failover for my applications. These are Java applications with some services, backed by Sybase databases. The Java applications can be broken into 2 parts: a web layer and an application layer. The underlying infrastructure will be RHEL Linux for both the Java applications and the database (Sybase).
My question is: does VCS support seamless automatic failover for the Java services, including the database services, without requiring manual intervention?
What I want to achieve is: after I set up active-passive for the application layer, I expect the active node to fail over automatically to the passive node, and the passive node to become the active node immediately.
Solved
haditeo
9 years ago Place Cluster Server
998Views
0likes
1Comment
Understanding RestartLimit for non critical ressource
Hello,
we have some trouble with our Oracle listener process. Sometimes the listener is killed by VCS, and we don't know why.
xxx VCS ERROR V-16-2-13027 (node1) Resource(lsnr-ORADB1) - monitor procedure did not complete within the expected time.
xxx VCS ERROR V-16-2-13210 (node1) Agent is calling clean for resource(lsnr-ORADB1) because 4 successive invocations of the monitor procedure did not complete within the expected time.
xxx VCS NOTICE V-16-20002-42 (node1) Netlsnr:lsnr-ORADB1:clean:Listener(LISTENER) kill TERM 2342
xxx VCS INFO V-16-2-13068 (node1) Resource(lsnr-ORADB1) - clean completed successfully.
xxx VCS INFO V-16-2-13026 (node1) Resource(lsnr-ORADB1) - monitor procedure finished successfully after failing to complete within the expected time for (4) consecutive times.
xxx VCS INFO V-16-1-10307 Resource lsnr-ORADB1 (Owner: unknown, Group: ORADB1) is offline on node1 (Not initiated by VCS)
However, Resource(lsnr-ORADB1) is set to non-critical to prevent a failover. I will now set a RestartLimit for Resource(lsnr-ORADB1) so that the cluster tries to restart the listener, but what happens if the restart fails? Will the resource stay offline, or will the cluster initiate a failover of the whole resource group?
Thanks in advance for any help!
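When chasing an intermittent monitor timeout like this, it can help to tally the engine log by severity and message ID. The sketch below is an assumption-laden illustration in Python: the regular expression is derived only from the log lines quoted in the post (`VCS <SEVERITY> <V-x-y-z> <text>`), and the function names `parse` and `summarize` are invented for this example.

```python
import re
from collections import Counter

# Pattern inferred from the quoted lines only; other VCS log formats may differ.
LINE = re.compile(
    r"VCS (?P<severity>[A-Z][A-Z0-9_]*) (?P<msgid>V-\d+-\d+-\d+) (?P<text>.*)"
)


def parse(line: str):
    """Return a dict with severity, msgid and text, or None if the line does not match."""
    m = LINE.search(line)
    return m.groupdict() if m else None


def summarize(lines):
    """Count matching log lines per (severity, msgid) pair."""
    counts = Counter()
    for line in lines:
        parsed = parse(line)
        if parsed:
            counts[(parsed["severity"], parsed["msgid"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "xxx VCS ERROR V-16-2-13027 (node1) Resource(lsnr-ORADB1) - monitor "
        "procedure did not complete within the expected time.",
        "xxx VCS INFO V-16-2-13068 (node1) Resource(lsnr-ORADB1) - clean "
        "completed successfully.",
    ]
    for key, n in summarize(sample).items():
        print(key, n)
```

Counting how often V-16-2-13027 appears before each V-16-2-13210 would show whether the monitor is timing out regularly or only in bursts.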
Foo_Bar_BLN
9 years ago Place Cluster Server
1.9KViews
0likes
3Comments
deleting rlink that's having "secondary_config_err" flag
Hello,
in my VCS global cluster, my ORAGrp resource group is partially online because my rvgres is offline. I suspect the issue is in the rlink below. I am trying to dissociate this rlink (below: rlk_sys1-DB-rep_DB_r) and detach it in order to delete it, but I am not able to succeed. Below is some output from the system.
root@sys2# vxprint -P
Disk group: DBhrDG
TY NAME                 ASSOC  KSTATE  LENGTH PLOFFS STATE  TUTIL0 PUTIL0
rl rlk_sys1-DB-rep_DB_r DB_rvg CONNECT -      -      ACTIVE -      -
rl rlk_sys1-rep_DB-rvg  DB-rvg ENABLED -      -      PAUSE  -      -
root@sys2# vxrlink -g DBhrDG dis rlk_sys1-rep_DB-rvg
VxVM VVR vxrlink ERROR V-5-1-3520 Rlink rlk_sys1-rep_DB-rvg can not be dissociated if it is attached
root@sys2# vxrlink -g DBhrDG det rlk_sys1-rep_DB-rvg
VxVM VVR vxrlink ERROR V-5-1-10128 Operation not allowed with attached rlinks
root@sys2# vxedit -g DBhrDG rm rlk_sys1-rep_DB-rvg
VxVM vxedit ERROR V-5-1-3540 Rlink rlk_sys1-rep_DB-rvg is not disabled, use -f flag
root@sys2# vxedit -g DBhrDG -f rm rlk_sys1-rep_DB-rvg
VxVM vxedit ERROR V-5-1-3541 Rlink rlk_sys1-rep_DB-rvg is not dissociated
root@sys2# vxprint -Vl
Disk group: DBhrDG
Rvg: DB-rvg
info: rid=0.1317 version=0 rvg_version=41 last_tag=11
state: state=CLEAN kernel=DISABLED
assoc: datavols=(none) srl=(none) rlinks=rlk_sys1-rep_DB-rvg exports=(none) vsets=(none)
att: rlinks=rlk_sys1-rep_DB-rvg
flags: closed secondary disabled detached passthru logging
device: minor=26012 bdev=343/26012 cdev=343/26012 path=/dev/vx/dsk/DBhrDG/DB-rvg
perms: user=root group=root mode=0600
Rvg: DB_rvg
info: rid=0.1386 version=13 rvg_version=41 last_tag=12
state: state=ACTIVE kernel=ENABLED
assoc: datavols=sys1_DB_Process,sys1_DB_Script,... srl=sys1_DB_SRL rlinks=rlk_sys1-DB-rep_DB_r exports=(none) vsets=(none)
att: rlinks=rlk_sys1-DB-rep_DB_r
flags: closed secondary enabled attached logging
device: minor=26014 bdev=343/26014 cdev=343/26014 path=/dev/vx/dsk/DBhrDG/DB_rvg
perms: user=root group=root mode=0600
Please advise.
Regards
grigori
9 years ago Place Storage Foundation
1.4KViews
0likes
4Comments
Storage Foundation
Hi, I'm new to the product and the field, so any advice would be great. Where exactly do you install VOM in a SAN environment, and does the SF software conflict with the storage vendor's management software? How do you manage heterogeneous storage with SF? Also, can someone explain the licensing?
Solved
Amal_Jacob_John
9 years ago Place Storage Foundation for Windows
2KViews
0likes
7Comments
adding new volumes to a DG that has a RVG under VCS cluster
Hi,
I have a VCS cluster with GCO and VVR. On each node of the cluster I have a DG with an associated RVG. This RVG contains 11 data volumes for an Oracle database. These volumes are getting full, so I am going to add new disks to the DG and create new volumes and mount points to be used by the Oracle database.
My question: can I add the disks to the DG and the volumes to the RVG while the database is up and replication is on? If the answer is no, please let me know what should be performed on the RVG and rlink to add these volumes, and what to perform on the database resource group so that it does not fail over.
Thanks in advance.
Solved
grigori
9 years ago Place Storage and Clustering
4.4KViews
0likes
14Comments
Windows 2012 R2 Failover cluster and SF Volume Manager Disk Group
Hi,
I just found this article and it describes pretty much the problem I'm facing right now: https://www-secure.symantec.com/connect/forums/one-cooperation-issue-about-windows-2012-r2-failover-cluster-and-sf-cluster-shared-disk-group
I created a new Windows Server 2012 R2 Failover Cluster with several mirrored disks in dynamic disk groups in Storage Foundation. As a next step, I wanted to add the disk groups from SF as a resource to my failover roles.
Actual scenario:
Cluster Role Name: C0002Z
DiskGroupName in SF: DG_C0002Z_1
I added a resource to the cluster role via Add Resource -> More Resources -> Volume Manager Disk Group. I changed the name of the VMDG in "General" to "DG_C0002Z_1", then Apply, OK. Then I wanted to set the name of the disk group in Properties -> Properties -> DiskGroupName of the VMDG to "DG_C0002Z_1". That is where I can't get any further. No matter which name I enter into DiskGroupName, I always get this error:
There was an error saving properties for 'DG_C0002Z_1'.
Failed to execute control code '20971654'.
Error Code: 0x800700a0
One or more arguments are not correct
I also tried entering names without special characters, but that doesn't help either. Do you have any idea how to fix or bypass this issue?
Thanks
CHPost
10 years ago Place Storage Foundation for Windows
2.4KViews
0likes
1Comment
ForceImport
Hi all,
I need help understanding the ForceImport attribute. There is a table "Failure situations" in the Symantec Storage Foundation and High Availability Solutions Solutions Guide (p. 221). I don't understand situation number 7, "Split-brain situation and storage interconnect lost":
ForceImport set to 0 - No interruption of service. Can’t import with only 50% of disks available. Disks on the same node are functioning. Mirroring is not working.
ForceImport set to 1 - Automatically imports disks on secondary site. Now disks are online in both locations - data can be kept from only one.
Can anyone answer this question about "Now disks are online in both locations - data can be kept from only one": what is the purpose? What do we get if the first 50% of the disks are imported on site 1 and the second 50% are imported on site 2?
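The two table rows quoted in the post can be restated as a toy model in Python. This is only a reading of the quoted text, not the agent's actual logic: it assumes an import needs a strict majority of the disk group's disks unless ForceImport is set, and that the primary site keeps running on the half it already has. The function names are invented for the example.

```python
def can_import(visible: int, total: int, force_import: bool) -> bool:
    """Toy rule: import needs more than 50% of the disks, unless forced."""
    if total <= 0 or not 0 <= visible <= total:
        raise ValueError("visible must be between 0 and total")
    if visible == 0:
        return False  # nothing to import, forced or not
    return force_import or visible * 2 > total


def sites_online(total: int, visible_primary: int, visible_secondary: int,
                 force_import: bool):
    """Return (primary_online, secondary_online) after the interconnect is lost.

    Assumes the primary already had the disk group imported and keeps
    serving from the disks it can still see.
    """
    primary = visible_primary > 0
    secondary = can_import(visible_secondary, total, force_import)
    return primary, secondary


if __name__ == "__main__":
    # A mirrored disk group of 4 disks, 2 per site, with the sites cut off.
    print(sites_online(4, 2, 2, force_import=False))
    print(sites_online(4, 2, 2, force_import=True))
```

Under this model, ForceImport set to 1 leaves both halves of the mirror imported and writable at the same time. Each site then changes its own copy independently, which is why the guide says data can be kept from only one location: when the link returns, one half has to be overwritten by a resynchronization from the other.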
Solved
KANSTANTSIN
10 years ago Place Storage Foundation for Windows
1.3KViews
0likes
5Comments
how LLT heartbeat setting MAC address ?
I was able to set the MAC address in LLT, but it changes after every server reboot.
Node A
# lltstat -nvv | head -10
LLT node information:
Node State Link Status Address
* 0 node a OPEN
vnet2 UP 00:14:4F:F8:3B:B7
vnet3 UP 00:14:4F:F9:73:DC --
vnet1 UP 00:14:4F:F8:F5:AC ---
1 node b OPEN
vnet2 UP 00:14:4F:FA:2B:77
vnet3 UP 00:14:4F:FB:5E:07 --
vnet1 UP 00:14:4F:F9:E0:17 --
Node B
# lltstat -nvv | head -10
LLT node information:
Node State Link Status Address
0 node a OPEN
vnet2 UP 00:14:4F:F8:3B:B7
vnet3 UP 00:14:4F:F9:73:DC --
vnet1 UP 00:14:4F:F8:F5:AC ---
* 1 node b OPEN
vnet2 UP 00:14:4F:FA:2B:77
vnet3 UP 00:14:4F:F9:E0:17 ---
vnet1 UP 00:14:4F:FB:5E:07 ----
I got the above after the VCS installation, so I followed the steps below to change the MAC address for vnet1 and vnet3:
gabconfig -U
svcadm disable svc:/system/gab:default
lltconfig -k disable
svcadm disable svc:/system/llt:default
svcadm enable svc:/system/llt:default
lltconfig -k enable
svcadm enable svc:/system/gab:default
/sbin/gabconfig -c -n2
It changed here:
Node A
# lltstat -nvv | head -10
LLT node information:
Node State Link Status Address
* 0 node a OPEN
vnet2 UP 00:14:4F:F8:3B:B7
vnet3 UP 00:14:4F:F9:73:DC
vnet1 UP 00:14:4F:F8:F5:AC
1 node b OPEN
vnet2 UP 00:14:4F:FA:2B:77
vnet3 UP 00:14:4F:FB:5E:07
vnet1 UP 00:14:4F:F9:E0:17
Node B
# lltstat -nvv | head -10
LLT node information:
Node State Link Status Address
0 node a OPEN
vnet2 UP 00:14:4F:F8:3B:B7
vnet3 UP 00:14:4F:F9:73:DC
vnet1 UP 00:14:4F:F8:F5:AC
* 1 node b OPEN
vnet2 UP 00:14:4F:FA:2B:77
vnet3 UP 00:14:4F:FB:5E:07
vnet1 UP 00:14:4F:F9:E0:17
But after a server reboot it went back to the original, out-of-sync state, where the MACs for vnet1 and vnet3 are swapped. Does anybody know how this setting is determined? From where is it fetched, and how can I make it permanent?
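The mismatch in the first pair of outputs is easy to miss by eye. Below is a small Python sketch that compares the two nodes' views and lists the links where they disagree. The dictionaries are typed in by hand from the output quoted in the post; the function name `mismatches` is made up for this example, and no parsing of real `lltstat` output is attempted.

```python
def mismatches(view_a: dict, view_b: dict):
    """Return (node, link, mac_seen_by_a, mac_seen_by_b) for every disagreement."""
    out = []
    for key in sorted(view_a.keys() & view_b.keys()):
        if view_a[key] != view_b[key]:
            out.append((*key, view_a[key], view_b[key]))
    return out


# Addresses as reported on Node A before the fix (copied from the post).
node_a_view = {
    ("node a", "vnet1"): "00:14:4F:F8:F5:AC",
    ("node a", "vnet2"): "00:14:4F:F8:3B:B7",
    ("node a", "vnet3"): "00:14:4F:F9:73:DC",
    ("node b", "vnet1"): "00:14:4F:F9:E0:17",
    ("node b", "vnet2"): "00:14:4F:FA:2B:77",
    ("node b", "vnet3"): "00:14:4F:FB:5E:07",
}

# Addresses as reported on Node B before the fix (copied from the post).
node_b_view = dict(node_a_view)
node_b_view[("node b", "vnet1")] = "00:14:4F:FB:5E:07"
node_b_view[("node b", "vnet3")] = "00:14:4F:F9:E0:17"

if __name__ == "__main__":
    for node, link, mac_a, mac_b in mismatches(node_a_view, node_b_view):
        print(f"{node} {link}: Node A sees {mac_a}, Node B sees {mac_b}")
```

For the data above, the only disagreements are node b's vnet1 and vnet3, whose addresses appear swapped between the two views.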
Solved
skmohanbe
10 years ago Place Cluster Server
1.4KViews
0likes
2Comments
Failed to get the MSDTC Security configuration for VCS from registry
Hi,
I created a campus cluster with 2 servers (operating system: Windows Server 2012 R2; SQL Server 2012; Symantec Storage Foundation 6.1). The service group comes online only every other attempt. The first time, it faulted. I cleared the fault on this server and tried to bring the service group up, and it came online. I took it offline, tried to bring it up again, and it faulted. Then online, then faulted, and so on. The faulted resource is the SQL agent. I didn't find any errors in the SQL Server logs. In C:\Program Files\Veritas\cluster server\log\SQLserver_A.txt I got these errors:
2015/05/16 20:41:33 VCS NOTICE V-16-20093-75 SQLServer:SQLServer-SQL1:online:Failed to get the MSDTC Security configuration for VCS from registry.Error : [2, 2]
2015/05/16 20:47:50 VCS ERROR V-16-20093-11 SQLServer:SQLServer-SQL1:online:Failed to wait for the service 'MSSQL$SQL1' to start. Error = [2 ,258]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:*** Start of debug information dump for troubleshooting *** LibLogger.cpp:VLibThreadLogQueue::Dump[206]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) CRegKey::Open failed for Software\Veritas\VCS\EnterpriseAgents\SQLServer\SQLServer-SQL1. LibVcsHive.cpp:VLibVcsHive::_GetDWORDValue[435]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) CRegKey::Open failed for Software\Veritas\VCS\EnterpriseAgents\SQLServer\__Global__. LibVcsHive.cpp:VLibVcsHive::_GetDWORDValue[435]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:(2) _GetDWORDValue failed. Subkey = Software\Veritas\VCS\EnterpriseAgents\SQLServer\__Global__, Name = IgnoreMSDTCSecurity LibVcsHive.cpp:VLibVcsHive::GetValue[401]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:Wait timed out for service MSSQL$SQL1 LibService.cpp:VLibService::WaitForServiceStatus[275]
2015/05/16 20:47:50 VCS DBG_21 V-16-50-0 SQLServer:SQLServer-SQL1:online:*** End of debug information dump for troubleshooting *** LibLogger.cpp:VLibThreadLogQueue::Dump[217]
2015/05/16 20:47:50 VCS WARNING V-16-2-13140 Thread(10516) Could not find timer entry with id 274
2015/05/16 20:47:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:48:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:49:50 VCS INFO V-16-20093-29 SQLServer:SQLServer-SQL1:monitor:The 'MSSQL$SQL1' service is not in stopped or running state. State = 2.
2015/05/16 20:49:50 VCS ERROR V-16-2-13066 Thread(9380) Agent is calling clean for resource(SQLServer-SQL1) because the resource is not up even after online completed.
2015/05/16 20:49:50 VCS WARNING V-16-20093-55 SQLServer:SQLServer-SQL1:clean:The service 'MSSQL$SQL1' is not in running state. Attempt to stop it might be unsuccessful.
2015/05/16 20:53:32 VCS WARNING V-16-2-13140 Thread(9380) Could not find timer entry with id 279
2015/05/16 20:53:32 VCS ERROR V-16-2-13068 Thread(9380) Resource(SQLServer-SQL1) - clean completed successfully.
2015/05/16 20:53:32 VCS ERROR V-16-2-13071 Thread(9380) Resource(SQLServer-SQL1): reached OnlineRetryLimit(0).
2015/05/16 20:56:41 VCS NOTICE V-16-20093-75 SQLServer:SQLServer-SQL1:online:Failed to get the MSDTC Security configuration for VCS from registry.Error : [2, 2]
There are two messages when the service group came online:
2015/05/16 20:58:58 VCS INFO V-16-20093-30002 SQLServer:SQLServer-SQL1:imf_register:Registering with IMF for online monitoring
2015/05/16 21:08:28 VCS INFO V-16-20093-30001 SQLServer:SQLServer-SQL1:imf_register:Registering with IMF for offline monitoring
Can you help me resolve this issue? Thanks in advance.
KANSTANTSIN
10 years ago Place Storage Foundation for Windows
1.3KViews
2likes
4Comments