VCS Setup - "Install A Clustered Master Server" Greyed Out

I originally posted this in the NetBackup forum but it seems to be a VCS issue that is the root of the problem: https://www-secure.symantec.com/connect/forums/install-clustered-master-server-greyed-out

I am trying to set up a test NBU 7.6.0.1 master server cluster with VCS on Windows Server 2008 R2 but the option to "install a clustered master server" is greyed out.

I have built 2 new servers, installed SFHA / VCS 6.0.1, and configured disks / volumes / disk groups / VVR replication. The VCS installation has not been modified or customised in any way, as from my understanding the NBU installation will create its own service group. I have carried out the initial configuration and added the remote cluster node. Within Cluster Explorer, the default ClusterService service group is up and running, the global cluster option is enabled, and the remote cluster status shows that the other cluster is up.

Marianne has already asked me to run two commands with the following output:

'gabconfig -a' returned: "GAB gabconfig ERROR V-15-2-25022"

'lltconfig -nvv' returned: "LLT lltconfig ERROR V-14-2-15000 open \\.\llt failed: error number: 0x2"

I've probably just made a silly mistake somewhere but I can't figure out what piece of the puzzle I have missed - any ideas?

3 Solutions

Accepted Solutions
Accepted Solution!

We're testing this in our lab quickly. The cluster doesn't like it when the disks are not truly shared; I would suggest changing them to RDMs. I will update you later, but so far we are seeing the same thing: with no disk group imported, the option is greyed out.


Accepted Solution!

Hi Shaun,

In a single-node cluster, "Install a Clustered Master Server" is greyed out if LLT (the heartbeat) has not been configured. Please configure LLT and try the NBU installation again; the "Install a Clustered Master Server" option should then be visible.

 


Accepted Solution!

To configure LLT manually:

1. Enable llt and gab. These services are hidden and usually disabled by default, so enable them by changing the "Start" value from 4 (disabled) to 2 (automatic) in the following registry keys:

HKLM\SYSTEM\CurrentControlSet\Services\llt

HKLM\SYSTEM\CurrentControlSet\Services\gab
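If you prefer the command line to regedit, the same change can be made with the standard reg utility from an elevated prompt (a sketch using the registry paths above):

```
:: Set the hidden llt and gab services to start automatically (Start=2)
reg add HKLM\SYSTEM\CurrentControlSet\Services\llt /v Start /t REG_DWORD /d 2 /f
reg add HKLM\SYSTEM\CurrentControlSet\Services\gab /v Start /t REG_DWORD /d 2 /f

:: Confirm the new values
reg query HKLM\SYSTEM\CurrentControlSet\Services\llt /v Start
reg query HKLM\SYSTEM\CurrentControlSet\Services\gab /v Start
```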

 

2. Create llthosts.txt and llttab.txt files in \Program Files\VERITAS\comms\llt as follows:

llthosts.txt

0 host_name_of_server

 

llttab.txt

set-node host_name_of_server
set-cluster 11
link Adapter0 MAC-ADDR_with_colon_separator - ether - -
start

 

The cluster ID set by "set-cluster" must be unique, so check whether any other clusters exist on the same network. You can get the MAC address from "ipconfig /all".
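As a concrete sketch of the two files (the host name NODE1, cluster ID 11, and MAC address 00:50:56:AB:CD:EF are made-up examples; substitute your own host name and the MAC address from "ipconfig /all"):

llthosts.txt:

```
0 NODE1
```

llttab.txt:

```
set-node NODE1
set-cluster 11
link Adapter0 00:50:56:AB:CD:EF - ether - -
start
```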

3. Create a gabtab.txt file in \Program Files\VERITAS\comms\gab as follows:

gabtab.txt

gabconfig -c -n 1

 

Stop VCS:

hastop -all

Start llt, gab and VCSComm:

net start llt
net start gab
net start VCSComm

 

Check gab is running

gabconfig -a
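If GAB has started correctly, gabconfig -a should report port memberships rather than an error - something like the following (the generation number is illustrative; on a seeded one-node cluster you should see at least port a with membership 0):

```
GAB Port Memberships
===============================================================
Port a gen   4a1c0001 membership 0
```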

 

Then start VCS:

hastart

 

Hopefully the NBU wizard only checks that llt and gab are running, not the number of heartbeats. If the option is still greyed out, you can create two heartbeats on the same interface by adding a line to llttab.txt like:
 

link Adapter1 SAME-MAC-ADDR_with_colon_separator - ether - -

 

And then restart all services.
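With the second link added, the full llttab.txt would look something like this (the MAC address is a made-up example, repeated on both link lines, and the extra link line goes before "start"):

```
set-node host_name_of_server
set-cluster 11
link Adapter0 00:50:56:AB:CD:EF - ether - -
link Adapter1 00:50:56:AB:CD:EF - ether - -
start
```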

To undo the above, reverse the steps, though you can leave LLT and GAB running if you want.

 

Mike


37 Replies
If you have a 2-node cluster, you need 2 heartbeats.

Please have a look at this URL:  https://sort.symantec.com/public/documents/vcs/6.1/windows/productguides/html/vcs_netapp-oracle/ch02...

Re-run the Config Wizard (VCW) if you have not configured heartbeats.

llt and gab (the VCSComm service) must be running before VCS (had) will start.

Can you provide the output of "hastatus -sum"?

Mike

Looking at your other post on the NBU forum, you have two 1-node clusters, which I suspected (hence the request for hastatus -sum). This means it is fine that you have one NIC, and that LLT and GAB are not loaded, as they are not required for 1-node clusters.

It may be that the NBU wizard does not support a 1-node cluster, as it may not have been coded to take a global cluster into account. The cluster coding for NBU is done by NBU developers rather than VCS developers, and my experience so far of NBU clustering is that it was written by someone with only very basic VCS skills, as several VCS agent conventions are not adhered to in the NBU agent.

Mike

You could try tricking the wizard into thinking there are 2 nodes in the local cluster, as follows:

Remove the remote cluster and heartbeats, and then add the remote node as a fake second node by running:

hasys -add node2

 

Surprisingly, this command works for a one-node cluster (where VCS is started as "hastart -onenode" in the had service). It works on UNIX, so it should work on Windows too. What will happen is that VCS has no communication with node 2, so it will report node 2 as "NOT RUNNING".

This may work, depending on whether the NBU wizard checks that the second system is in the RUNNING state, but any commands that actually run on the second node (to install the NBU software) should still work fine.
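If you try this, a quick sanity check would be (a sketch; node2 is the placeholder name from the hasys command above):

```
:: node2 should now appear in the system list
hasys -list

:: node2 should show as not running, since VCS has no communication with it
hastatus -sum
```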

If this does work, then I can post details of the next part, which is setting up the global cluster.

Mike

 

NBU is fine with a 1-node cluster. The only requirement is that had must be up and running. So, for a 1-node cluster, hastart -onenode should do the trick.

This is not a VCS issue; your cluster is up and running. It's an issue with the NetBackup installer not detecting the VCS installation. Please post the logs from %ALLUSERSPROFILE%\Symantec\NetBackup\InstallLogs\ either here or in your other post.

Will there be any logs, given that the installation has not been started because the option is greyed out?

Could this be a license issue? There is no license on the VCS side, but does NBU require a license for it to be clustered?

Mike

The GUI/wizard creates some logs as it proceeds with the install. I'm not a hundred percent sure it's in that location (my memory is a bit foggy, but I faced that same issue before... the NBU installer was driving me up the wall).

hastatus output looks fine and, as suspected, there is not much in the logs. On UNIX there is little documentation on the NBU agent - is Windows any different? Is there a prerequisites section for VCS when installing NBU?

I would guess that you need to create a disk group first, with a volume containing a filesystem, where the disk group can be seen from both nodes and is imported on one node with the filesystem available. The installer MAY check that a disk group and filesystem are available, and you said this is done, but can you confirm that the VVR primary is on the host you are running the install on? Please post the output of:

vxprint -VPv

vxprint -Vl

 

Mike

Thanks for the quick response Mike.

I am certain that the disk groups were in place and successfully replicating when I last checked, but it would appear that the two disks in the cluster disk group are now offline, and when I try to import the disk group on the primary I get the following error:

Error V-76-58645-585: Failed to reserve a majority of disks in cluster dynamic disk group.

I'm not having much luck with this! The test environment is just using virtual disks, so I will try removing and re-adding them first before testing again.


UPDATE:

Having removed the old virtual disks and added new ones to replace them, I got "Error V-76-58645-542: Unable to reserve a majority of dynamic disk group members. Failed to start SCSI reservation thread" when creating the disk group. Interestingly, though, the cluster disk group was successfully created and imported as far as I can see. Having said that, deporting the cluster disk group and attempting to import it again put me back at the start, with offline disks and Error V-76-58645-585.

Virtual on ESXi 5.5.

Are you running physical or virtual?

 

Sorry, just read that it's virtual. Please describe the environment and how you've connected the disks. This sounds like an issue you'd see when using passthrough disks with SFHA.

Also, are you using a disk for VVR which is on the same controller as the O/S disk? If so, have you set usesystembus using vxclus?

Mike

Apologies for the delay in getting back to you all; it has been a bank holiday weekend here, so I have not been in the office.

Riaan - I have found the installation logs in the location that you mentioned, but the information in them is very limited, as the greyed-out option appears at the very start of the installation:

05-26-2015,09:47:19 : Starting Setup
05-26-2015,09:47:19 : Performing installation on host: HOST1
05-26-2015,09:47:19 : Checking for 32-bit processor on a 64-bit system.
05-26-2015,09:47:19 : UAC is supported on this OS, and it is currently disabled.

Marianne / Mike - the hastatus output is below; does it look as you would expect? I have also tried "hastop -all" / "hastart -onenode" on both sides, but this has not made any difference. SFHA licenses have already been added on both nodes.

C:\> hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen              

A  NODE1        RUNNING              0                    

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State          

B  ClusterService  NODE1        Y          N               ONLINE         

-- WAN HEARTBEAT STATE
-- Heartbeat       To                   State     

M  Icmp            NODE2NB      ALIVE     

-- REMOTE CLUSTER STATE
-- Cluster         State     

N  NODE2NB RUNNING   

-- REMOTE SYSTEM STATE
-- cluster:system       State                Frozen              

O  NODE2NB:NODE2 RUNNING              0                   

C:\> hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen              

A  NODE2        RUNNING              0                    

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State          

B  ClusterService  NODE2        Y          N               ONLINE         

-- WAN HEARTBEAT STATE
-- Heartbeat       To                   State     

M  Icmp            NODE1NB      ALIVE     

-- REMOTE CLUSTER STATE
-- Cluster         State     

N  NODE1NB RUNNING   

-- REMOTE SYSTEM STATE
-- cluster:system       State                Frozen              

O  NODE1NB:NODE1 RUNNING              0                   

 

Apologies Riaan, I posted before I saw your updated post - the disks are normal thin-provisioned virtual disks on their own virtual SCSI controller. They are in a VMFS5 datastore on our EMC Clariion SAN. I must admit I hadn't thought to check whether this was specifically supported.

Mike - I have already enabled that option, as I knew we had to do it for our live build all those years ago. Thanks for the links as well - I may end up having to contact Symantec about any SFHA patches / service packs that might resolve this, as our ESXi hosts are kept up to date with patches / drivers / firmware etc.


Thanks Riaan, much appreciated. I guess because the initial import worked I assumed it wouldn't be an issue, but it has certainly turned out to be a problem a bit further down the line!

Keep me updated with your testing as I'm curious to know how you get on, and in the meantime I'll start arranging some LUNs to replace the virtual disks.

Thanks again for everyone's help so far.

We're still looking at it. So far we're seeing what you're seeing on a single-node cluster.

 

One funny thing that I can recall from a previous install was a dependency on the ClusterService group being online on the same node. I had never seen this being an issue in previous installs, but in this case it made a difference. Having said that, it might have been at the next stage of the install, where it's supposed to check the cluster configuration.