Forum Discussion

Marianne's avatar
Marianne
Level 6
15 years ago

Dynamic mirrored quorum in campus cluster

Scenario:

Customer wants to create 2-node Microsoft Cluster in a campus cluster setup:

Array1 at building A and Array2 at building B with all volumes mirrored across arrays.

My problem is with this section in the Admin Guide:

"It is strongly recommended that a dynamic mirrored quorum contain three disks because a cluster disk resource cannot be brought online unless a majority of disks are available. With the quorum volume in a two-disk group, loss of one disk will prevent the quorum volume from coming online and make the cluster unavailable. "

Any advice about how to overcome this problem in a campus cluster? The quorum dg will consist of 2 disks (one per array).

Does anyone know if there is an attribute to force-import the disk group in a Microsoft Cluster should there be a site/array failure at one of the buildings?

SFW 5.1 SP1 on W2008 R2.

  • Hi Marianne,

     

    Microsoft does not support dynamic disks in their clusters.  To use dynamic disks in a cluster you need SFW, and Symantec will then support the dynamic disks.  When using dynamic disks in a cluster, Symantec should be the first point of contact, so that the dynamic disks and related cluster resources can be checked to determine whether the problem is related to the use of dynamic disks.

     

    As for your issue with the mirrored quorum (witness disk in Windows 2008): if you have an even number of disks spread evenly between two arrays and you lose access to one array, you will not be able to online the VMDg resource, because you do not have a majority of the disks in the disk group.

     

    This only affects online/import of the VMDg resource/clustered dynamic disks.  If the disk group is already imported, you can lose up to 50% of the disks (everything from a single array) and the disk group will remain imported.  However, the disk group will fail and be deported when more than 50% of the disks are lost.  This is done so that, after a SAN failure that cuts access to the remote array, the disk groups remain accessible to the node that had them imported, while the remote node cannot bring them online because it does not have a majority of the disks.

     

    The use of a third array is recommended to add another layer of redundancy to the mix.  If you lose access between your two sites, the site that still sees the third array will always have a majority of the disks and can online the disk groups as needed.

     

    It will still work if you only have two arrays, but you will not be able to online the service groups or quorum if they are accidentally taken offline while the SAN link between the two sites is down.  We do have a lot of customers running this type of configuration.  It is just a matter of what the customer's concerns are and how much infrastructure they have, or are willing to get, to cover situations like this.  If they are fine with two arrays and a manual process to online the disk groups should they be deported while the inter-array link is down, that is OK.  If they want a more automated approach and have (or can afford) a third site, that is also OK.

     

    Thanks,

    Wally

  • Hi M,

     

    This is from the SFW Exchange QR and MSCS guide.

     

    Creating a dynamic cluster disk group for the quorum, mirrored
    If you have not already done so, use SFW to create a dynamic disk group for the quorum. The minimum number of disks for the mirrored quorum is two disks. Symantec recommends using four small disks for the mirrored quorum for additional redundancy.
    If possible, use small disks, because the disk group will only be used for the quorum volume, which Microsoft recommends to be 500 MB. To create a four-way mirrored volume in the New Volume wizard, select the Concatenated layout, click the Mirrored checkbox, and specify four mirrors. For full details on creating cluster disk groups and volumes, see:
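
    The wizard steps above can also be sketched from the SFW command line. This is a hypothetical sketch: "QuorumDG" and "QuorumVol" are placeholder names, and the exact vxassist option casing and syntax vary by SFW version, so check the SFW CLI reference before relying on it.

    ```powershell
    # Assumes the cluster dynamic disk group "QuorumDG" was already
    # created (e.g. via the New Dynamic Disk Group wizard with the
    # cluster option checked) from four small disks, two per array.

    # ~500 MB concatenated volume with four mirrors:
    vxassist -gQuorumDG make QuorumVol 500 Mirror=4
    ```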

     

    R

  • Thanks Riaan!

    Will the effect be any different from having a 2-way mirror? With 4 small disks it will use 2 from one array and 2 from the other array. (The customer does not have a 3rd site...) An array/site loss will still leave us with half of the disks (no majority).

    As always - I appreciate your opinion!

  • Hi,

     

    One thing I'm not sure about: when using a dynamic quorum, does it still look at the underlying disks? The DG is added to the cluster configuration, which obviously contains the mirrored Q drive, which contains the clusDB. A site failure on either side won't affect our mirror. I think that might be the trick, as MS can't use mirrors, because they don't support dynamic disks in MSCS.

     

    Another thing to evaluate is File Share Witness.

     

    http://www.windowsitpro.com/article/john-savills-windows-faqs/q-what-are-the-windows-server-2008-quorum-models-.aspx

     

    This makes the quorum dependent on access to a file share. Maybe you should just build a VCS cluster to host the share... and keep MSCS up.
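
    For what it's worth, on Windows Server 2008 R2 the File Share Witness model can be set from the FailoverClusters PowerShell module. The share path here is a placeholder; as suggested above, it could live on a VCS-clustered file share.

    ```powershell
    # Switch the cluster to Node and File Share Majority quorum.
    # "\\witness-server\ClusterFSW" is a placeholder share name.
    Import-Module FailoverClusters
    Set-ClusterQuorum -NodeAndFileShareMajority "\\witness-server\ClusterFSW"
    ```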

  • I'm confused....

    "they don't support dynamic disks in MSCS" - Do you have any documentation about that?

    Why would the SFW Admin Guide (chapter 13) go into details about converting Basic Disk to Dynamic Mirrored disk? And then show Microsoft Cluster screendumps detailing the steps?

    If Microsoft does not support that, then maybe we should stop bothering.....

     

    BTW - I just found the following in the Troubleshooting section of the SFW Admin Guide:

    - Bring a cluster online that has a minority of the disks in the cluster

  • Thanks for lots of valuable information, Wally!

    I now have enough info to present to the customer. Just wish everyone was using VCS instead...

  • Valuable info Wally, thanks.

     

    Tell me, if you lose 50% of the disks and have to manually import, do you import with VEA, or do you use force quorum to get it to come online, or import with VEA and then force quorum?

     

    R

  • Hi Riaan,

     

    There are several ways to go about it.  Here is a tech note that explains them.

     

    http://verisearch.ges.symantec.com/tnotes/all_docs/336624.htm

     

    If the link were going to be down for a while, I would run the command listed in solution #2 (in the above tech note) on the node where I want the quorum and other service groups to be online.  Then you can start the cluster service on that node and online the service groups there.  If using this method, don't forget to run the command again to disable the option afterwards, as it will cause unexpected behavior once the SAN link is back up.
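
    As a hedged sketch (not necessarily the exact command from the tech note, which should be followed as written): SFW's vxclus utility can allow a cluster disk group to come online with a minority of disks, combined with a forced quorum start of the cluster service. Disk group names are placeholders; verify the syntax against your SFW version.

    ```powershell
    # On the surviving node, allow the cluster disk group to be brought
    # online with only a minority of its disks (SFW vxclus utility):
    vxclus enable -gQuorumDG

    # Start the cluster service on this node, forcing quorum:
    net start clussvc /forcequorum

    # ...online the service groups, then IMPORTANT: turn the
    # minority-disk override back off once the SAN link is restored,
    # or behavior will be unexpected when both arrays are visible:
    vxclus disable -gQuorumDG
    ```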

     

    Thanks,

    Wally