Forum Discussion

Home_224
Level 6
5 years ago

VCS cannot startup

Hi All ,

The environment is a two-node active/passive cluster. I had maintenance on the active node and switched the cluster over to the passive node, but the service group status shows PARTIAL. I have no idea what caused this. Can you please advise how to fix it?

root@devuaebms42 # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 63750d membership 01
Port h gen 63750b membership ;1
Port h gen 63750b visible 0

^Croot@devuaebms42 # hastatus -sum

-- SYSTEM STATE
-- System          State      Frozen

A  devuaebms41     EXITED     0
A  devuaebms42     RUNNING    0

-- GROUP STATE
-- Group           System          Probed   AutoDisabled   State

B  cf_bms_sg_01    devuaebms41     Y        Y              OFFLINE
B  cf_bms_sg_01    devuaebms42     Y        N              PARTIAL

Many thanks,

Hong

  • The NBU dg is visible on devuaebms42 (in deported state) - check the bottom of the list:

    EMC1_30 auto - (cf_bms_dg_01) online emcpower2c

    If this is only an NBU master, you need to find out why so many LUNs have been zoned to this environment.
    This looks like a disaster waiting to happen.

  • Home_224 

    I have moved your post from the NetBackup forum to the Cluster forum.

    Even if this is a NetBackup Cluster, you appear to have a cluster-related issue.

    Resource(cf_bms_ser_01) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 2 of 2) the resource.

    Usually when we see 'the resource became OFFLINE unexpectedly, on its own', it is because 'someone' is manually stopping or restarting processes outside of VCS. 

    Can you share exactly which steps were followed before one node was switched off for maintenance, as well as afterwards?

    It seems that 'someone' manually started NetBackup on node devuaebms42, instead of properly onlining it with VCS: 

    2020/02/11 20:47:09 VCS INFO V-16-1-10299 Resource cf_bms_ser_01 (Owner: unknown, Group: cf_bms_sg_01) is online on devuaebms42 (Not initiated by VCS) 

    and then stopped it again?

    2020/02/11 20:49:26 VCS ERROR V-16-2-13067 (devuaebms42) Agent is calling clean for resource(cf_bms_ser_01) because the resource became OFFLINE unexpectedly, on its own.
    2020/02/11 20:50:28 VCS ERROR V-16-2-13006 (devuaebms42) Resource(cf_bms_ser_01): clean procedure did not complete within the expected time.
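
    Entries like the ones quoted above can be pulled out of a longer engine log with a small filter. This is only a sketch: `resource_timeline` is a hypothetical helper name, and it assumes the line layout seen in this thread (date, time, `VCS`, severity, message ID, text).

    ```shell
    # Hypothetical helper: read VCS engine log text on stdin and print
    # "date time severity message-id" for each line that mentions the resource.
    # Assumes the line layout quoted in this thread.
    resource_timeline() {
        awk -v res="$1" 'index($0, res) { print $1, $2, $4, $5 }'
    }
    ```

    On a cluster node it could be run as `resource_timeline cf_bms_ser_01 < /var/VRTSvcs/log/engine_A.log`, assuming the engine log is in its usual default location.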

    Can you show us full 'hastatus' output? 

    This will show which resources in the SG are online and which ones offline (causing the Partial status). 
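
    Until the full output is available, the group rows of `hastatus -sum` can at least be narrowed down to the groups that are neither cleanly ONLINE nor OFFLINE. A minimal sketch, assuming the `B <group> <system> <probed> <autodisabled> <state>` row layout shown in the original post; `unsettled_groups` is a hypothetical helper name.

    ```shell
    # Hypothetical helper: read `hastatus -sum` text on stdin and print
    # "group system state" for every group row (rows starting with "B")
    # whose state is neither ONLINE nor OFFLINE.
    unsettled_groups() {
        awk '$1 == "B" && $NF != "ONLINE" && $NF != "OFFLINE" { print $2, $3, $NF }'
    }
    ```

    Usage on a cluster node would be `hastatus -sum | unsettled_groups`. This only reports group-level state; the per-resource state still has to come from the full `hastatus` output, or from `hagrp -resources cf_bms_sg_01` followed by `hares -state` on each resource.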

    • Home_224
      Level 6

      Hi Marianne ,

      Thank you for your comment.

      I will post the hastatus output here when I am back in the office tomorrow.

      • Marianne
        Level 6

        I am curious to see how this SG is configured.

        I see no attempt in the engine_A log to import the dg, mount the volumes or online the virtual IP.

        I only see the NBU resource going online.

        So it is important to know what other resources are in the SG and what their status is.