I found a KB document detailing the steps needed to rebuild a cluster node when using SFW HA for Windows, but I need to rebuild a Win2003 cluster node that is running SFW 5.0 with MSCS. If the cluster weren't using SFW, I would simply evict the node from the cluster, rebuild the OS on it, and add it back to the cluster. What is the correct procedure for SFW with MSCS? The cluster quorum is a Volume Manager Disk Group resource, so I can't add the node back to the cluster before installing SFW with the MSCS option, which is the usual installation order when creating a new cluster from scratch. Is the rebuild procedure:
1. Evict the node
2. Install Win2003 on node
3. Install SFW with MSCS option
4. Add node to cluster
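For reference, the steps above can be sketched as commands run from the surviving node. NODE2 is a placeholder for the node being rebuilt; `cluster node /evict` is standard Win2003 cluster.exe syntax, and the re-add can be done through the Add Nodes wizard:

```shell
:: Run from the surviving node; NODE2 is the node being rebuilt (placeholder name).

:: Step 1: evict the failed node from the cluster.
cluster node NODE2 /evict

:: Steps 2-3: on NODE2, reinstall Windows Server 2003, then install
:: SFW 5.0 with the MSCS (Microsoft Cluster Server) option selected.

:: Step 4: add NODE2 back to the cluster, e.g. via the Add Nodes
:: wizard in Cluster Administrator on the surviving node.
```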
I followed your procedure above (no problems encountered during the rebuild), but after step 4 and a reboot, when I start VEA on the rebuilt node, all the SAN disks are visible, yet none of the disk groups that exist in VEA on the node that remained in the cluster are visible there. Trying to fail the cluster over to the rebuilt node just results in the cluster going offline and then coming back online on the same node. If I move the quorum back to a basic disk, I can move the cluster group to the rebuilt node, but none of the dynamic disk groups will import on the rebuilt node.
So it looks as though there are more configuration steps needed when recovering a node on a Windows cluster that is using SFW.
The newly rebuilt node does not know anything about the disk groups in its ISIS database. This is because SCSI reservations held by the active node prevented the rebuilt node from reading the disks during boot.
To resolve this, take the resource group offline on the active node, then perform a rescan on the rebuilt node. You should then be able to see the disk groups and bring the resource groups online on the rebuilt node.
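Sketched as commands, assuming a resource group named "DataGroup" (a placeholder) and SFW's command-line tools on the rebuilt node:

```shell
:: On the active node: take the group holding the disk resources offline,
:: which releases the SCSI reservations ("DataGroup" is a placeholder name).
cluster group "DataGroup" /offline

:: On the rebuilt node: rescan so SFW re-reads the shared disks.
vxassist rescan

:: The dynamic disk groups should now appear in the listing.
vxdg list
```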
Once this has been done the first time, you won't have any more problems.
The only other issue you might see is that the drive letters do not get assigned correctly during the first import of the disk groups. You may need to import the disk groups and then manually assign the drive letters to the volumes. Again, this is a one-time step to set up the Mount Manager information correctly.
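A minimal sketch of that one-time fix, assuming a disk group named DG1 (a placeholder) and using the built-in mountvol tool to assign the letter; the volume GUID shown is a placeholder to be copied from the mountvol listing:

```shell
:: Import the disk group on the rebuilt node (DG1 is a placeholder name).
vxdg -gDG1 import

:: List volume GUID paths and their current mount points.
mountvol

:: Assign the expected drive letter to a volume; the GUID below is a
:: placeholder -- substitute the real one from the mountvol listing above.
mountvol X: \\?\Volume{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}\
```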
Actually, I found that the problem is that when the newly built node (without SFW installed) is added to the cluster, MSCS doesn't add it as a possible owner of any of the Volume Manager Disk Group resources, although it does add it as a possible owner of all the other resources.
After installing SFW and rebooting the node, you have to use Cluster Administrator to add the node as a possible owner of each Volume Manager Disk Group resource. After this, failover to the newly built node works correctly.
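The same fix can be done from the command line with cluster.exe; the resource, group, and node names below are placeholders:

```shell
:: Check which nodes are currently possible owners of a VMDG resource.
cluster resource "Disk Group 1" /listowners

:: Add the rebuilt node as a possible owner (repeat for each
:: Volume Manager Disk Group resource).
cluster resource "Disk Group 1" /addowner:NODE2

:: Verify failover now works by moving a group to the rebuilt node.
cluster group "Cluster Group" /moveto:NODE2
```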