11-06-2013 02:41 AM
When using SFCFS for fast failover, so that only one node reads and writes to a filesystem at any one time and SFCFS is used purely to speed up failover by avoiding having to import diskgroups and mount filesystems, there is a requirement to move the CFS primary with the failover application service group, so that the application doing the writes is on the same system as the CFS primary.
For me this is standard functionality, as SFCFS is specifically sold as a fast failover solution, i.e. as an alternative to Oracle RAC (in addition, of course, to being able to write to filesystems simultaneously), but I can't find any reference to standard tools for moving the CFS primary in VCS - does anyone know if any exist?
To me the most logical approach would be a trigger script or resource that looks at the CFS service group it depends on and promotes the CFSMounts in that child service group. I could write this myself, but I would have thought that a standard script/agent already exists.
Thanks
Mike
11-07-2013 04:34 AM
Hi Mike,
in case the file system or the whole node fails on the current CFS primary, the remaining cluster nodes will elect a new primary. If the filesystem is still OK on a node and only the application fails, VCS will fail over the application but won't touch the CFS resource, i.e. the primary won't change.
You could theoretically move the primary to the new node using a postonline or preonline script, but I think this could cause problems, for example if CFS is electing a new primary at the moment you issue the command.
Also keep in mind that during election of a new primary, I/O might get stuck until the logs have been replayed.
What SF version are you using? There were some changes in CVM/CFS in later versions that make CFS depend less on primary node.
regards,
Dan
11-07-2013 05:24 AM
The SFCFS version would be 5.1SP1, with 3 nodes in the cluster and several database service groups, so in effect SFCFS + single-instance Oracle is being used as an alternative to RAC, which is a Symantec use case. DR is also being used with hardware replication, so each database will have its own associated CFS SG, allowing each CFS SG and database SG pair to be failed over independently of the other databases across sites.
Let's consider cluster startup first:
Where the CFS primaries end up will depend on the order in which the CFS SGs start, and this is independent of where the database SGs start. It doesn't make sense for the CFS primary to be on one node and the database on another, as ONLY the database SG is writing to the CFSMount, so it is pointless to send all the metadata across the heartbeats.
Then consider node failover:
Suppose Node 1 fails: the CFS primaries will move to node 2 or 3, and again the databases will move independently, so once more you really want your CFS primary to be on the same node as the database SG.
Then consider database SG failure:
Here the database will move to another node, but the CFS primary will not, so really you want the CFS primary to move with it.
It sounds like there are no standard tools to align the CFS primary with the failover SG that is actually writing the data - is this correct, even for cluster startup and node failure? Obviously the solution will work without moving the CFS primaries, but what is the performance impact? Are you saying it is negligible and that Symantec do not recommend trying to align the CFS primary with the node running the application that is writing the data?
Thanks
Mike
11-13-2013 02:43 PM
Can anyone answer this - in summary:
When using SFCFS for fast failover, should you have the CFS primary on the same node as the failover SG that is actually writing the data, and if not, what is the impact of writes to a CFS secondary (e.g. will writes to a CFS secondary be 10% slower than writes to the CFS primary)?
Mike
11-15-2013 04:53 AM
Hi Mike,
having the CFS primary on the same node as the application is not required.
The performance impact if the primary is on a different node should actually be below 1%.
You can test this by doing an I/O test while the primary is on another node, then switching the primary to the node on which your database is running and performing the I/O test again.
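A quick way to run that comparison is a metadata-heavy micro-benchmark, since metadata operations are what get shipped to the CFS primary. This is only a rough sketch; the mount point and file count in the usage comment are placeholders to adjust for your environment:

```shell
# meta_io_test DIR COUNT: create and then remove COUNT empty files under DIR
# and print the elapsed seconds. Run it once with the CFS primary local and
# once with the primary on another node, then compare the two timings.
meta_io_test() {
    dir="$1"
    count="$2"
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        : > "$dir/iotest.$i"      # file creation is a metadata operation
        i=$((i + 1))
    done
    rm -f "$dir"/iotest.*
    end=$(date +%s)
    echo $((end - start))
}

# Example (mount point is an assumption, use your own CFS mount):
# meta_io_test /oradata 5000
```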
For the cluster startup you can set a policy so that CFS master would always start on your preferred node if available.
http://www.symantec.com/business/support/index?page=content&id=TECH39266
If you want the primary to always switch along with the database you will need to write your own trigger script.
There is no general feature to do that, because you might have more than one service group depending on the CFS file system. As an example, let's say you have a 3-node cluster: node A fails, SG1 fails over to node B and SG2 fails over to node C. With which SG should the primary switch to the remaining node?
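For anyone who does go the trigger route, a minimal postonline sketch might look like the following. It assumes `fsclustadm` is in the trigger's PATH, that the mount points belonging to the service group are known, and that the function name and mount points are placeholders; you would also need to decide yourself how to handle the multiple-SG case above:

```shell
# move_cfs_primary NODE MOUNT...: for each CFS mount point, check which node
# is currently the primary and, if it is not NODE, promote NODE to primary.
# fsclustadm setprimary takes effect on the node where it is run, which is
# why this belongs in a postonline trigger on the node the SG came online on.
move_cfs_primary() {
    node="$1"
    shift
    for mp in "$@"; do
        primary=$(fsclustadm -v showprimary "$mp") || continue
        if [ "$primary" != "$node" ]; then
            fsclustadm setprimary "$mp"
        fi
    done
}

# VCS invokes a postonline trigger as: postonline <system> <group> ...
# so the call would look something like (mount points are placeholders):
# move_cfs_primary "$1" /oradata /oraarch
```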
regards,
Dan
11-15-2013 12:07 PM
Thanks Dan - this is really useful and covers my two scenarios (cluster startup and node failover), so that the CFS primary will follow the VCS service group. As I expected, it doesn't cover SG failure, but if the write impact is only 1% then this is OK, and I could always use postonline triggers to do this if necessary.
In answer to your last point: if SG1 fails over to node B and SG2 fails over to node C, then as long as SG1 and SG2 write data to different filesystems this does not cause an issue, and this would normally be the case. It has to be the case for non-CFS (i.e. each service group has to have its own set of filesystems in its own diskgroup); normally CFS would follow this, and has to follow it if you have DR where SG1 and SG2 can fail over between sites independently.
Just to explain how this would work if anyone else is interested:
Suppose you have (with default FailOverPolicy = Priority):
SG1: SystemList = { Node1 = 1, Node2 = 2, Node3 = 3 }
SG2: SystemList = { Node1 = 1, Node3 = 2, Node2 = 3 }
SG3: SystemList = { Node2 = 1, Node1 = 2, Node3 = 3 }