Forum Discussion

Amit_Mane
Level 3
11 years ago

How to delete a non-existent system "-i" from the cluster

Hello Experts,

I am facing a simple but slightly tricky issue in VCS.

A non-existent system with the hostname "-i" has been added to the cluster. I am not aware how this system got added to the list. Please find the system list below.


bash-3.00# hasys -list
-i
MMINDIA01
MMINDIA02
MMINDIA03
MMINDIA04

I tried hasys -delete -i and hasys -delete "/i", but with no success.

Kindly help on priority.

 

Regards,

Amit Mane

 

  • Hi Amit,

     

    I would run hastop -all -force on node 1 (the one with the correct main.cf).

    Once all are down, run hastart on node 1. Then once it's up, run hastart on the remaining nodes.

    They should then pull the correct main.cf and rebuild their own.
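
    Roughly, the sequence would look like this. It is only a sketch, assuming MHOGW01 (node 1) is the one holding the good main.cf; the -force flag stops VCS without taking the service groups offline:

    # On the node with the correct main.cf (MHOGW01 here):
    hastop -all -force    # stops VCS on every node, applications keep running
    hastart               # local build: loads MHOGW01's main.cf into memory

    # Once node 1 is RUNNING again, on MHOGW02, MHOGW03 and MHOGW04:
    hastart               # remote build: pulls the running config and rewrites the local main.cf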

     

    It is weird though for nodes 2, 3 and 4 to see something different from node 1. And they haven't partitioned into mini-clusters, as the IDs are all the same.

  • I don't understand why 3 of the 4 cluster nodes have a different main.cf?

    The 1st node to start up in a multi-node cluster will do a 'local build': it reads the local main.cf and loads that config into memory.

    Subsequent nodes that start up should find a currently running config, do a 'remote build' and update their local main.cf. This ensures that all nodes have the same config.
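
    A quick way to confirm which node has drifted is to checksum main.cf on every node and run the offline config check against it. A minimal sketch, assuming passwordless root ssh between the nodes and the default config directory:

    # Compare the on-disk config across the four nodes (sketch only)
    for node in MHOGW01 MHOGW02 MHOGW03 MHOGW04; do
        echo "== $node =="
        ssh $node 'cksum /etc/VRTSvcs/conf/config/main.cf; hacf -verify /etc/VRTSvcs/conf/config'
    done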

    Since the incorrect entry only appears in main.cf on the 3 nodes, it will be easy to fix without bringing down any applications, but it is more important now to see the state of cluster membership.

    Please show us the output of 'gabconfig -a' on all nodes.

     

    **** EDIT *****

    I found the output of gabconfig in your quarantined post and published it.
    The fact that all nodes show correct cluster membership makes the different output on node 1 even stranger...

  • Hello Mike,

    Thanks for the help!!

    Please find the requested details below.

    bash-3.00# hostname
    MHOGW01
    bash-3.00# cat /etc/llthosts
    0 MHOGW01
    1 MHOGW02
    2 MHOGW03
    3 MHOGW04
    bash-3.00#

    # hostname
    MHOGW02
    # cat /etc/llthosts
    0 MHOGW01
    1 MHOGW02
    2 MHOGW03
    3 MHOGW04

    # hostname
    MHOGW03
    # cat /etc/llthosts
    0 MHOGW01
    1 MHOGW02
    2 MHOGW03
    3 MHOGW04
    #

    # cat /etc/llthosts
    0 MHOGW01
    1 MHOGW02
    2 MHOGW03
    3 MHOGW04
    #

     

    main.cf

    Node 1

    bash-3.00# more /etc/VRTSvcs/conf/config/main.cf
    include "types.cf"
    include "mediationtypes.cf"
    include "CFSTypes.cf"
    include "CVMTypes.cf"

    cluster MultiMediation (
            UserNames = { admin = INOkNPnMNsNM }
            Administrators = { admin }
            UseFence = SCSI3
            HacliUserLevel = COMMANDROOT
            )

    system MHOGW01 (
            )

    system MHOGW02 (
            )

    system MHOGW03 (
            )

    system MHOGW04 (
            )

     

    Node 2,3 & 4.

     

    bash-3.00# more /etc/VRTSvcs/conf/config/main.cf
    include "types.cf"
    include "mediationtypes.cf"
    include "CFSTypes.cf"
    include "CVMTypes.cf"

    cluster MultiMediation (
            UserNames = { admin = INOkNPnMNsNM }
            Administrators = { admin }
            UseFence = SCSI3
            HacliUserLevel = COMMANDROOT
            )

    system "-i" (
            )

    system MHOGW01 (
            )

    system MHOGW02 (
            )

    system MHOGW03 (
            )

    system MHOGW04 (
            )

     

    I tried to verify the config on nodes 2, 3 & 4 but was not able to verify it.

    hasys -list

    Node1

    bash-3.00# hasys -list
    MHOGW01
    MHOGW02
    MHOGW03
    MHOGW04

     

    Node 2,3 & 4

    # hasys -list
    -i
    MHOGW01
    MHOGW02
    MHOGW03
    MHOGW04
    #

     

    Regards,

    Amit Mane

     

     

  • Dear Naveen,

     

    Thanks for the reply!!

    As I mentioned in the forum, it is a non-existent system. We have a 4-node cluster, and someone added -i to the config, which does not exist on the network.

     

    Still, please find the required details below.

    Node1:

    # cat /etc/hosts
    GW02
    #
    # Internet host table
    #
    ::1 localhost
    127.0.0.1 localhost
    IP MHOGW02
    IP MHOGW01 loghost
    IP MHOGW03
    IP MHOGW04

    Node2,3,4

    # cat /etc/hosts
    GW02
    #
    # Internet host table
    #
    ::1     localhost
    127.0.0.1       localhost
    IP   MHOGW02 loghost
    IP   MHOGW01
    IP MHOGW03
    IP MHOGW04

     

    Regards,

    Amit Mane
     

  • Please find the requested details below.

    # gabconfig -a
    GAB Port Memberships
    ===============================================================
    Port a gen  26eef6e membership 0123
    Port b gen  26eef6f membership 0123
    Port d gen  26eef71 membership 0123
    Port f gen  26ef015 membership 0123
    Port h gen  26ef013 membership 0123
    Port u gen  26ef013 membership 0123
    Port v gen  26ef00f membership 0123
    Port w gen  26ef011 membership 0123
    # hostname
    MHOGW04
    # Connection to MHOGW04 closed.
    # hostname
    MHOGW03
    # gabconfig -a
    GAB Port Memberships
    ===============================================================
    Port a gen  26eef6e membership 0123
    Port b gen  26eef6f membership 0123
    Port d gen  26eef71 membership 0123
    Port f gen  26ef015 membership 0123
    Port h gen  26ef013 membership 0123
    Port u gen  26ef013 membership 0123
    Port v gen  26ef00f membership 0123
    Port w gen  26ef011 membership 0123
    # Connection to MHOGW03 closed.
    # hostname
    MHOGW02
    # gabconfig -a
    GAB Port Memberships
    ===============================================================
    Port a gen  26eef6e membership 0123
    Port b gen  26eef6f membership 0123
    Port d gen  26eef71 membership 0123
    Port f gen  26ef015 membership 0123
    Port h gen  26ef013 membership 0123
    Port u gen  26ef013 membership 0123
    Port v gen  26ef00f membership 0123
    Port w gen  26ef011 membership 0123
    # Connection to MHOGW02 closed.
    bash-3.00# hostname
    MHOGW01
    bash-3.00# gabconfig -a
    GAB Port Memberships
    ===============================================================
    Port a gen  26eef6e membership 0123
    Port b gen  26eef6f membership 0123
    Port d gen  26eef71 membership 0123
    Port f gen  26ef015 membership 0123
    Port h gen  26ef013 membership 0123
    Port u gen  26ef013 membership 0123
    Port v gen  26ef00f membership 0123
    Port w gen  26ef011 membership 0123
    bash-3.00#

     

    # hastatus -summ | more

    -- SYSTEM STATE
    -- System               State                Frozen            

    A  -i                   FAULTED              0                   
    A  MHOGW01              RUNNING              0                   
    A  MHOGW02              RUNNING              0                   
    A  MHOGW03              RUNNING              0                   
    A  MHOGW04              RUNNING              0                  

    -- GROUP STATE
    -- Group           System               Probed     AutoDisabled    State        

     

    main.cf & hasys

    Node1:

    bash-3.00# more /etc/VRTSvcs/conf/config/main.cf
    include "types.cf"
    include "mediationtypes.cf"
    include "CFSTypes.cf"
    include "CVMTypes.cf"

    cluster MultiMediation (
            UserNames = { admin = INOkNPnMNsNM }
            Administrators = { admin }
            UseFence = SCSI3
            HacliUserLevel = COMMANDROOT
            )

    system MHOGW01 (
            )

    system MHOGW02 (
            )

    system MHOGW03 (
            )

    system MHOGW04 (
            )

     

    ---------------------------

    bash-3.00# hasys -list
    MHOGW01
    MHOGW02
    MHOGW03
    MHOGW04
    bash-3.00#

    ---------------------------

    Node2,3,4:

     

    bash-3.00# more /etc/VRTSvcs/conf/config/main.cf
    include "types.cf"
    include "mediationtypes.cf"
    include "CFSTypes.cf"
    include "CVMTypes.cf"

    cluster MultiMediation (
            UserNames = { admin = INOkNPnMNsNM }
            Administrators = { admin }
            UseFence = SCSI3
            HacliUserLevel = COMMANDROOT
            )

    system "-i" (
            )

    system MHOGW01 (
            )

    system MHOGW02 (
            )

    system MHOGW03 (
            )

    system MHOGW04 (
            )

     

    ---------------------------

    bash-3.00# hasys -list
    -i
    MHOGW01
    MHOGW02
    MHOGW03
    MHOGW04
    bash-3.00#

    ---------------------------

     

    Also note that I am not able to verify the main.cf with haconf -verify on nodes 2, 3 & 4.

     

    Regards,

    Amit Mane

     

     

  • Please post /etc/llthosts and the "system" section of main.cf.

    I tried to add a host "-i" on a test system. I could add it to llthosts, but I could not add it to main.cf, either by editing (the syntax is rejected) or by using the hasys command.
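
    Roughly, the commands involved looked like this (sketched from memory, exact error text omitted):

    haconf -makerw
    hasys -add -i                            # rejected: "-i" is taken as an option, not a system name
    haconf -dump -makero

    # Hand-editing main.cf to add a 'system "-i"' block and then validating also fails:
    hacf -verify /etc/VRTSvcs/conf/config    # rejects the edited file as a syntax error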

    Mike

  • Hi Amit,

     

    First check the /etc/hosts file, as somebody may have changed the hostname. Also provide the hastatus -sum output.

     

    Before you remove any node from the cluster, please verify by running a few commands to check how many nodes the cluster is configured with.

     

    cat /etc/gabtab

    cat /etc/hosts

    cat /etc/VRTSvcs/conf/config/main.cf|grep -i system

    Please verify first that nobody changed the hostname:

    If you think node "-i" is not part of the cluster, then below is the procedure to remove a node from the cluster (in the example, the leaving node is C).

     

    1. Switch failover service groups off the node that is leaving:
       # hagrp -switch group -to <another node>

    2. Check for any dependencies involving any service groups that run on the leaving node; for example, grp4 runs only on the leaving node:
       # hagrp -dep

    3. If a service group on the leaving node requires other service groups, that is, if it is a parent to service groups on other nodes, then unlink the service groups:
       # haconf -makerw
       # hagrp -unlink <parent group> <child group>
       These commands enable you to edit the configuration and to remove the requirement grp4 has for grp1.

    4. Stop VCS on the leaving node:
       # hastop -sys C

    5. Check the status again. The state of the leaving node should be EXITED. Also, any service groups set up for failover should be online on other nodes:
       # hastatus -summary
       -- SYSTEM STATE
       -- System State Frozen
       A  A  RUNNING  0
       A  B  RUNNING  0
       A  C  EXITED   0
       -- GROUP STATE
       -- Group System Probed AutoDisabled State
       B  grp1  A  Y  N  ONLINE
       B  grp1  B  Y  N  OFFLINE
       B  grp2  A  Y  N  ONLINE
       B  grp3  B  Y  N  ONLINE
       B  grp3  C  Y  Y  OFFLINE
       B  grp4  C  Y  N  OFFLINE

    6. Delete the leaving node from the SystemList of service groups grp3 and grp4:
       # hagrp -modify <groupA> SystemList -delete C
       # hagrp -modify <groupB> SystemList -delete C

    7. For service groups that run only on the leaving node, delete the resources from the group before deleting the group:
       # hagrp -resources <groupB>
       processx_grp4
       processy_grp4
       # hares -delete <dependent resources>

    8. Delete the service group configured to run on the leaving node:
       # hagrp -delete groupA

    9. Check the status:
       # hastatus -summary
       -- SYSTEM STATE
       -- System State Frozen
       A  A  RUNNING  0
       A  B  RUNNING  0
       A  C  EXITED   0
       -- GROUP STATE
       -- Group System Probed AutoDisabled State
       B  grp1  A  Y  N  ONLINE
       B  grp1  B  Y  N  OFFLINE
       B  grp2  A  Y  N  ONLINE
       B  grp3  B  Y  N  ONLINE

    10. Delete the node from the cluster:
        # hasys -delete C

    11. Save the configuration, making it read-only:
        # haconf -dump -makero

    Modifying configuration files on each remaining node:

    1. If necessary, modify the /etc/gabtab file. No change is required if the /sbin/gabconfig command has only the argument -c, although Symantec recommends using the -nN option, where N is the number of cluster systems. If the command has the form /sbin/gabconfig -c -nN, make sure that N is not greater than the actual number of nodes in the cluster, or GAB does not automatically seed.
       Note: Symantec does not recommend the use of the -c -x option for /sbin/gabconfig. The Gigabit Ethernet controller does not support the use of -c -x.

    2. Modify the /etc/llthosts file on each remaining node to remove the entry of the leaving node. For example, change:
       0 A
       1 B
       2 C
       to:
       0 A
       1 B

    Unloading LLT and GAB and removing VCS on the leaving node (perform these tasks on the node that is leaving the cluster):

    1. Stop GAB and LLT:
       # /etc/init.d/gab stop
       # /etc/init.d/llt stop

    2. To determine the RPMs to remove, enter:
       # rpm -qa | grep VRTS
  • Hi,

     

    Please post the gabconfig -a output to see if this entry is present there as well. If it is not, then post or PM /etc/VRTSvcs/conf/config/main.cf. If it is only in main.cf, it should be easily correctable.
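
    Something like this, run on each node, should show whether the bogus entry exists anywhere outside main.cf (just a sketch; the "--" stops grep from treating -i as an option):

    gabconfig -a
    grep -n '"-i"' /etc/VRTSvcs/conf/config/main.cf
    grep -n -- '-i' /etc/llthosts /etc/gabtab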