Forum Discussion

Amit_Mane
Level 3
10 years ago

How to delete the non-existent system "-i" from the cluster

Hello Experts,

I am facing a simple but slightly tricky issue in VCS.

A non-existent system with the hostname "-i" has been added to the cluster. I do not know how this system got into the list. The system list is below.


bash-3.00# hasys -list
-i
MMINDIA01
MMINDIA02
MMINDIA03
MMINDIA04

I tried 'hasys -delete -i' and 'hasys -delete "/i"', but neither worked.

Kindly help on priority.

 

Regards,

Amit Mane

 

  • Hi Amit,

     

    I would run hastop -all -force on node 1 (the one with the correct main.cf).

    Once all nodes are down, run hastart on node 1. Then, once it is up, run hastart on the remaining nodes.

    They should then pull the correct main.cf and rebuild their own.

     

    It is odd, though, for nodes 2, 3 and 4 to see something different from node 1. And they have not partitioned into mini-clusters, as their IDs are all the same.

17 Replies

  • Dear Mike,

    This problem occurred because the hostname of one of the nodes was changed: someone ran the command 'hostname -i'.

    If you have a test system, could you please check by changing the hostname of the server with "hostname -i"?

    Please share your observations.

     

    Thanks in advance.

     

    Regards,

    Amit Mane

  • I managed to replicate your situation by putting "-i" in /etc/VRTSvcs/conf/sysname. When you do this and start VCS, it adds the host "-i" to main.cf.

    So to fix it I would do the following:

    Stop VCS on all nodes, but leave the applications running:

    hastop -all -force

    Correct /etc/VRTSvcs/conf/sysname

    Start VCS on the node with the correct main.cf (node 1)

    Start VCS on the other nodes (they will build main.cf from node 1, so there is no need to edit main.cf on the other nodes)

    Mike
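
Before correcting the sysname file by hand, it can help to check what each node actually has in it. The following is a minimal sketch, not a VCS tool: `check_sysname` is a hypothetical helper name, and the rule it applies (flag an empty name or one that starts with "-") is an assumption made for illustration, not the product's own validation.

```shell
#!/bin/sh
# Hypothetical helper: inspect a sysname file and flag a name that is empty
# or looks like a command-line option (for example "-i").
# The default path is the one mentioned in this thread.
check_sysname() {
    file="${1:-/etc/VRTSvcs/conf/sysname}"
    if [ ! -r "$file" ]; then
        printf 'cannot read %s\n' "$file"
        return 2
    fi
    name=""
    # read strips leading/trailing whitespace; "|| true" covers a file
    # with no trailing newline or no content at all.
    read -r name < "$file" || true
    case "$name" in
        "") printf 'empty sysname in %s\n' "$file"; return 1 ;;
        -*) printf 'suspicious sysname "%s" in %s\n' "$name" "$file"; return 1 ;;
        *)  printf 'sysname "%s" looks plausible\n' "$name"; return 0 ;;
    esac
}
```

Run `check_sysname` with no argument on each node to inspect the default file, or pass a path to inspect a copy.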

  • Hi Mike,

    I checked /etc/VRTSvcs/conf/sysname on all 4 nodes, but none of them contains such an entry.

    As you may be aware, main.cf on node 1 is correct, but main.cf on the other 3 nodes has the "-i" entry.

    Kindly confirm that "hastop -all -force" will not affect the services running in the cluster, as all the nodes have running services.

    bash-3.00# hastatus -summ | grep ONLINE
    B  DG1     MHOGW01              Y          N               ONLINE        
    B  DG1     MHOGW02              Y          N               ONLINE        
    B  DG1     MHOGW03              Y          N               ONLINE        
    B  DG1     MHOGW04              Y          N               ONLINE        
    B  Network         MHOGW01              Y          N               ONLINE        
    B  Network         MHOGW02              Y          N               ONLINE        
    B  Network         MHOGW03              Y          N               ONLINE        
    B  Network         MHOGW04              Y          N               ONLINE        
    B  ALARM     MHOGW01              Y          N               ONLINE        
    B  MANAGER   MHOGW01              Y          N               ONLINE        
    B  SERVER    MHOGW01              Y          N               ONLINE        
    B  SERVER10  MHOGW04              Y          N               ONLINE        
    B  SERVER2   MHOGW03              Y          N               ONLINE        
    B  SERVER3   MHOGW03              Y          N               ONLINE        
    B  SERVER4   MHOGW01              Y          N               ONLINE        
    B  SERVER5   MHOGW04              Y          N               ONLINE        
    B  SERVER6   MHOGW02              Y          N               ONLINE        
    B  SERVER7   MHOGW04              Y          N               ONLINE        
    B  SERVER8   MHOGW01              Y          N               ONLINE        
    B  SERVER9   MHOGW04              Y          N               ONLINE        
    B  TRACER    MHOGW01              Y          N               ONLINE        
    B  Sentinel        MHOGW01              Y          N               ONLINE        
    B  cvm             MHOGW01              Y          N               ONLINE        
    B  cvm             MHOGW02              Y          N               ONLINE        
    B  cvm             MHOGW03              Y          N               ONLINE        
    B  cvm             MHOGW04              Y          N               ONLINE

    Nodes 2, 3, 4:

    bash-3.00# hastatus -summ

    -- SYSTEM STATE
    -- System               State                Frozen            

    A  -i                   FAULTED              0                   
    A  MHOGW01              RUNNING              0                   
    A  MHOGW02              RUNNING              0                   
    A  MHOGW03              RUNNING              0                   
    A  MHOGW04              RUNNING              0

    Node 1:

    bash-3.00# hastatus -summ | more

    -- SYSTEM STATE
    -- System               State                Frozen            

    A  MHOGW01              RUNNING              0                   
    A  MHOGW02              RUNNING              0                   
    A  MHOGW03              RUNNING              0                   
    A  MHOGW04              RUNNING              0
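
The difference between the two views above can be spotted mechanically. This is a small sketch under the assumption, taken only from the output pasted in this thread, that system lines in `hastatus -summ` start with "A" followed by the system name and its state. `non_running_systems` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `hastatus -summ` output on stdin and print
# "<system> <state>" for every system line (first field "A") whose state
# is not RUNNING. Group lines (first field "B") are ignored.
non_running_systems() {
    awk '$1 == "A" && $3 != "RUNNING" { print $2 " " $3 }'
}
```

Used as `hastatus -summ | non_running_systems` on each node, it prints nothing on a node with a clean view and a line for "-i" on a node that still carries the stray entry.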

     

  • For your information, I can see that node 3 is currently the master node.

    bash-3.00# /etc/vx/bin/vxclustadm nidmap
    Name                             CVM Nid    CM Nid     State              
    MHOGW01                          0          0          Joined: Slave      
    MHOGW02                          1          1          Joined: Slave      
    MHOGW03                          2          2          Joined: Master     
    MHOGW04                          3          3          Joined: Slave

     

  • As per Mike's previous post, 'hastop -all -force' will stop VCS on all nodes, but leave apps running.

    So, we want node 1, which has the correct main.cf, to start 'had' and load the correct config into memory.
    Run hastatus in one window to view continuous progress. (It will at first say 'cannot connect...' while had is down on all nodes.)
    After stopping VCS (had) on all nodes, run 'hastart' on node 1.

    Wait for 'hastatus' to show node 1 in 'LOCAL BUILD', then wait for the RUNNING state.
    Run hastart on the remaining nodes. They should all do a 'Remote Build' from node 1.

    All nodes should now share the correct main.cf with no '-i' entry.
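
The sequence above can be sketched as a few shell functions. This is only an illustration of the order of operations: the function names and the DRY_RUN switch are hypothetical, and it assumes that `hasys -state <system>` prints the state of the named system. With DRY_RUN=1 (the default here) nothing is executed; the commands are only printed.

```shell
#!/bin/sh
# Dry-run by default: print commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: stop had on all nodes, leaving the applications running.
stop_vcs_everywhere() {
    run hastop -all -force
}

# Step 2: start had on the local node (run this on the node whose
# main.cf is correct, node 1 in this thread).
start_vcs_locally() {
    run hastart
}

# Step 3: poll until the given system reports RUNNING, or give up after
# the given number of tries (default 60, five seconds apart).
wait_for_running() {
    sys="$1"
    tries="${2:-60}"
    if [ "$DRY_RUN" = "1" ]; then
        echo "would poll: hasys -state $sys until RUNNING"
        return 0
    fi
    while [ "$tries" -gt 0 ]; do
        state=$(hasys -state "$sys" 2>/dev/null)
        case "$state" in
            *RUNNING*) return 0 ;;
        esac
        sleep 5
        tries=$((tries - 1))
    done
    return 1
}
```

On node 1 the order would be `stop_vcs_everywhere`, then `start_vcs_locally`, then `wait_for_running MHOGW01`. Only after that returns successfully would `hastart` be run on each remaining node.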

  • I agree with Marianne: you still need to run "hastop -all -force". In essence, "-i" is an invalid string for a host name. It is rejected both by the check of main.cf when VCS starts and by the check in the hasys command.

    There is an edge case, however. If the VCS node name (normally taken from /etc/VRTSvcs/conf/sysname) is not in main.cf when VCS starts, VCS adds it, and it looks as though the checks do not take place here.

    Once VCS is running, changing /etc/VRTSvcs/conf/sysname (or, if that file is not there, I think VCS uses the hostname) does not affect VCS, as the name is only determined at VCS startup.

    So, as hasys rejects "-i" as a host, you need to run "hastop -all -force" and then restart VCS. It will only start on the first node if main.cf does NOT contain the invalid "-i" hostname (all other systems will build main.cf from the first node).

    Mike
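
The edge case described above (a node name that is not yet in main.cf gets added at startup) can be checked for before running hastart. The sketch below is an illustration only: both function names are hypothetical, and it assumes that each system definition in main.cf is a line beginning with the keyword `system` followed by the name, and that a POSIX awk and grep are available.

```shell
#!/bin/sh
# Print the system names defined in a main.cf file, one per line.
# Assumes definitions look like:  system NAME (
main_cf_systems() {
    awk '$1 == "system" { gsub(/"/, "", $2); print $2 }' "$1"
}

# Return 0 if the name stored in the sysname file is one of the systems
# defined in main.cf, and 1 otherwise (including an empty sysname).
sysname_in_main_cf() {
    sysname_file="$1"
    main_cf="$2"
    name=""
    read -r name < "$sysname_file" || true
    [ -n "$name" ] || return 1
    # "--" keeps a name such as "-i" from being parsed as a grep option.
    main_cf_systems "$main_cf" | grep -Fqx -- "$name"
}
```

For example, `sysname_in_main_cf /etc/VRTSvcs/conf/sysname /etc/VRTSvcs/conf/config/main.cf || echo "node name not in main.cf"` would warn before a start that might add an unexpected system. The main.cf path shown is the usual default location and should be adjusted if the configuration lives elsewhere.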