Forum Discussion

javierrv
10 years ago

Trigger after failed cleanup script

Hi there, I have a system where the cleanup script can fail or time out, and I want to execute another script if this happens. I was wondering what the best way of doing this would be. In the ver...
  • Sunil_Yadav
    10 years ago

    Hi,

    The cleanup script (also known as the clean entry point) is invoked in several scenarios, and the resource goes through different state transitions depending on whether the clean entry point succeeds or fails. Since you specifically mentioned "I have a system where the cleanup script can fail/timeout", we will cover only the failure scenarios of the clean entry point.

    Scenario # 1

    Resource ONLINE --> Resource attempting OFFLINE --> Resource fails to go OFFLINE --> Clean entry point invoked --> Clean entry point fails --> Resource moves to ONLINE|UNABLE TO OFFLINE

    Scenario # 2

    Resource ONLINE --> Resource unexpectedly went OFFLINE --> Clean entry point invoked --> Clean entry point fails --> If Type::CleanRetryLimit == 0, clean entry point is retried indefinitely --> Until clean entry point succeeds, resource remains ONLINE

    Scenario # 3

    Resource ONLINE --> Resource unexpectedly went OFFLINE --> Clean entry point invoked --> Clean entry point fails --> If Type::CleanRetryLimit != 0, clean entry point is retried CleanRetryLimit times --> If it still fails, resource moves to ONLINE|ADMIN_WAIT

    Scenario # 4

    Resource OFFLINE --> Resource attempting ONLINE --> Resource fails to go ONLINE --> Clean entry point invoked --> Clean entry point fails --> Resource moves to OFFLINE|ADMIN_WAIT
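
    The four scenarios above can be summarised as a small lookup. The sketch below is only a model of the transitions described in this post, not VCS code; the names `Cause` and `state_after_clean_failure` are made up for illustration.

```python
from enum import Enum


class Cause(Enum):
    """Why the clean entry point was invoked (hypothetical names)."""
    OFFLINE_FAILED = "resource failed to go offline"            # scenario 1
    UNEXPECTED_OFFLINE = "resource went offline unexpectedly"   # scenarios 2 and 3
    ONLINE_FAILED = "resource failed to go online"              # scenario 4


def state_after_clean_failure(cause: Cause, clean_retry_limit: int = 0) -> str:
    """Return the resource state after the clean entry point has failed.

    clean_retry_limit models the type-level CleanRetryLimit attribute and
    only matters when the resource went offline unexpectedly.
    """
    if cause is Cause.OFFLINE_FAILED:
        return "ONLINE|UNABLE TO OFFLINE"
    if cause is Cause.UNEXPECTED_OFFLINE:
        if clean_retry_limit == 0:
            # Clean is retried indefinitely; the resource stays ONLINE meanwhile.
            return "ONLINE"
        # Clean failed on every one of the clean_retry_limit retries.
        return "ONLINE|ADMIN_WAIT"
    if cause is Cause.ONLINE_FAILED:
        return "OFFLINE|ADMIN_WAIT"
    raise ValueError(f"unknown cause: {cause!r}")
```

    For example, `state_after_clean_failure(Cause.UNEXPECTED_OFFLINE, clean_retry_limit=3)` gives the scenario # 3 result, while the default limit of 0 gives the scenario # 2 result.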

     

    RESNOTOFF is invoked on the system if a resource in a service group does not go offline even after the offline command has been issued to the resource. This event trigger covers only scenario # 1, which you have also verified in your test environment.

    As per your description, you are hitting either scenario # 2 or # 3. RESNOTOFF won’t be executed in these scenarios. This is expected behavior. You needn’t worry about scenario # 2: the clean entry point is retried indefinitely, and at some point it will eventually succeed.

    For scenarios # 3 and # 4, you can use the RESADMINWAIT trigger, which is invoked when a resource enters the ADMIN_WAIT state.

     

    To cover all possible failure scenarios of the clean entry point, you should use both the RESNOTOFF and RESADMINWAIT triggers.
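
    As a quick way to check which trigger to rely on for a given outcome, the mapping can be written down as follows. This is a sketch of the reasoning in this post only; `trigger_for_state` is a hypothetical helper, not part of VCS, and it takes the state strings exactly as written in the scenarios.

```python
from typing import Optional


def trigger_for_state(state: str) -> Optional[str]:
    """Name the event trigger this post associates with a resource state."""
    # A state such as "ONLINE|ADMIN_WAIT" is split into its individual flags.
    flags = [part.strip() for part in state.split("|")]
    if "UNABLE TO OFFLINE" in flags:
        return "RESNOTOFF"       # scenario 1
    if "ADMIN_WAIT" in flags:
        return "RESADMINWAIT"    # scenarios 3 and 4
    # Scenario 2: clean is still being retried, so neither trigger applies.
    return None
```

    Calling `trigger_for_state("OFFLINE|ADMIN_WAIT")` returns `"RESADMINWAIT"`, and a plain `"ONLINE"` state returns `None`, which corresponds to scenario # 2 where no trigger is expected.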

     

    Thanks & Regards,
    Sunil Y