
Plan for DIMM replacement (both cluster nodes must be shut down)

allaboutunix
Level 6

Hi Team,

 

We have to replace a DIMM on the passive node, on which no VCS services are currently running. The LLT crossover cables are routed in such a way that we need to move both boxes (active and passive nodes). This requires downtime because we will also have to shut down the active node.

 

We are now asking the apps team for downtime for this.

 

Could you please suggest a plan for this activity, given that we have to shut down both nodes for the DIMM replacement?

Currently, the SG is running on the polar server.

 


-- SYSTEM STATE
-- System               State                Frozen

A  polar                RUNNING              0
A  summer               RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  polar                Y          N               ONLINE
B  ClusterService  summer               Y          N               OFFLINE
B  ORA_SG_Group    polar                Y          N               ONLINE
B  ORA_SG_Group    summer               Y          N               OFFLINE
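
For reference, the status summary above can be reduced to just the groups that are currently online, which is a useful record to capture before the outage. The following is only a sketch: the function name `online_groups` and the variable `sample_status` are made up for illustration, and it assumes the group lines keep the `B  <group>  <system> ... <state>` layout shown above.

```shell
# Print "group system" for every service group whose state is exactly ONLINE.
# Reads `hastatus -sum` style text on stdin; group lines start with "B".
online_groups() {
    awk '$1 == "B" && $NF == "ONLINE" { print $2, $3 }'
}

# Sample input copied from the group state lines in the post.
sample_status='B  ClusterService  polar   Y  N  ONLINE
B  ClusterService  summer  Y  N  OFFLINE
B  ORA_SG_Group    polar   Y  N  ONLINE
B  ORA_SG_Group    summer  Y  N  OFFLINE'

printf '%s\n' "$sample_status" | online_groups
```

On a live node the same filter would presumably be fed directly, as in `hastatus -sum | online_groups`.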

 

This is a bit urgent, so we would appreciate your suggestions as soon as possible.

 

Accepted Solutions

Gaurav_S
Moderator
   VIP    Certified

Hello,

If you have to shut down both systems because the cables are stuck, then I would suggest a full graceful shutdown of VCS and then of the nodes.

High-level steps:

1. Offline all groups on the polar server.

2. Stop VCS on all nodes (hastop -all).

3. Shut down all your nodes.

4. Do the DIMM replacement.

5. Make sure the cabling is sorted out so that next time you don't need an outage on both nodes.

6. Start the polar server first.

7. Start the summer server.

8. All groups should come online as defined in SystemList and AutoStartList; if they don't, bring them online manually.
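
The VCS part of the shutdown phase (steps 1 and 2) could be sketched as a dry-run script like the one below. This is an illustration, not a tested runbook: the helper names `run` and `shutdown_plan` and the `RUN` switch are invented here, and the operating-system shutdown in step 3 is left out because the command is platform-specific. By default it only prints the commands.

```shell
# Dry-run helper: prints a command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Usage: shutdown_plan <active_node> <group>...
shutdown_plan() {
    active_node=$1
    shift
    # Step 1: offline each listed service group on the active node.
    for group in "$@"; do
        run hagrp -offline "$group" -sys "$active_node"
    done
    # Step 2: stop VCS on every node in the cluster.
    run hastop -all
}

# Node and group names taken from the status output in the question.
shutdown_plan polar ORA_SG_Group
```

Reviewing the printed commands with the apps team before setting `RUN=1` keeps the actual outage window free of surprises.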

 

G


Sunil_Yadav
Level 4
Employee

Just minor corrections to the steps provided:

    1. Stop VCS on all nodes in the cluster by executing “hastop -all” from a running node in the cluster.

A graceful VCS stop (hastop without the -force option) takes care of offlining active service groups in the correct dependency order, so there is no need to offline service groups explicitly.

OR

Freeze the nodes in the cluster (“hasys -freeze”). Active service groups will eventually be taken offline when the nodes are shut down.

    2. Shut down the nodes for the maintenance activity (in this case, the DIMM replacement and fixing the cabling issue).

    3. Start the nodes. The order in which they are started isn’t strict.

    4. When all nodes have rejoined the cluster, the AutoStart logic will kick in. Based on AutoStartList and SystemList, it will auto-start service groups on the appropriate nodes.
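
After the nodes are back, one way to confirm that AutoStart did its job is to look for any service group that is not ONLINE on any system. The sketch below assumes the `hastatus -sum` layout shown in the question; the function name `groups_not_online` and the sample text are hypothetical and are not output from this cluster.

```shell
# Reads `hastatus -sum` style text on stdin and prints each service group
# that is not in the exact state ONLINE on any system.
groups_not_online() {
    awk '
        $1 == "B" {
            seen[$2] = 1
            if ($NF == "ONLINE") online[$2] = 1
        }
        END {
            for (g in seen) if (!(g in online)) print g
        }
    ' | sort
}

# Hypothetical post-boot state in which ORA_SG_Group did not auto-start.
after_boot='B  ClusterService  polar   Y  N  ONLINE
B  ClusterService  summer  Y  N  OFFLINE
B  ORA_SG_Group    polar   Y  N  OFFLINE
B  ORA_SG_Group    summer  Y  N  OFFLINE'

printf '%s\n' "$after_boot" | groups_not_online
```

Any group this prints would need to be brought online by hand, for example with `hagrp -online ORA_SG_Group -sys polar`.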

Thanks & Regards,

Sunil Y


Marianne
Moderator
Partner    VIP    Accredited Certified

"The cross over cables LLT  is hanged in such a way that we need to move both boxes"

This defeats the purpose of having a cluster.
You need to look carefully at each hardware component to eliminate single points of failure.

