
allaboutunix
10 years ago

mnt_app resource failover

Hi,

 

Below are the resource dependencies in my environment. Somebody unmounted the /app filesystem, so the mnt_app resource went into a faulted state.

When I checked the resource criticality, mnt_app shows as a non-critical resource (0), and vol_app and app_dg are set as critical (1).

The dependencies below show a parent-child relationship, with mnt_app as the parent and vol_app as the child.

If the parent is set as critical and the child is non-critical (0), does it fail over to another node?

Or

If the child is set as critical or the parent as non-critical (0), does it fail over?

Please assist as soon as possible.

 

root@lyle# hares -dep |grep app


PDM_PRD_MG   APP_aphelion             mnt_app
PDM_PRD_MG   APP_tibjmsd              mnt_app
PDM_PRD_MG   Blind_check_stopDB       mnt_app
PDM_PRD_MG   IPMultB_pdmprdappdb      MNicB_DB
PDM_PRD_MG   appdg                    SRDF_app
PDM_PRD_MG   mnt_app                  vol_app
PDM_PRD_MG   vol_activelogs           appdg
PDM_PRD_MG   vol_app                  appdg
PDM_PRD_MG   vol_archivelogs          appdg
PDM_PRD_MG   vol_index                appdg

  • Hi,

    Before digging into this particular configuration, let's understand the relationship between “criticality of resources” and “failover of the service group (SG)”.

    Critical resource: a fault of the resource will always initiate failover of the service group.

    Non-critical resource: a fault of the resource will trigger failover of the service group only if there is a “critical” AND “online” parent resource above it in the dependency tree. The criticality of child resources isn’t considered when deciding whether or not to fail over.

    1st query: If the parent is set as critical and the child is non-critical (0), does it fail over to another node?

    If the critical parent resource faults, failover will undoubtedly be initiated. If the critical parent resource is online and the non-critical child resource faults, failover will be initiated. If the critical parent resource is offline and the non-critical child resource faults, failover will not be initiated.

    2nd query: If the child is set as critical or the parent as non-critical (0), does it fail over?

    If the critical child resource faults, failover will undoubtedly be initiated. If the non-critical parent resource faults, failover will not be initiated.

    Based on the "hares -dep" output snippet, the resource dependency tree looks like this:

    APP_aphelion          APP_tibjmsd          Blind_check_stopDB
         |                     |                       |
         +---------------------+-----------------------+
                               |
                    mnt_app (non-critical)
                               |
                      vol_app (critical)
                               |
                       app_dg (critical)

    Since their criticality is not stated, we assume that APP_aphelion, APP_tibjmsd, and Blind_check_stopDB are critical resources.

    In this configuration, a fault of app_dg, vol_app, APP_aphelion, APP_tibjmsd, or Blind_check_stopDB will initiate failover of the service group. A fault of mnt_app will trigger failover ONLY IF at least one of APP_aphelion, APP_tibjmsd, or Blind_check_stopDB is online. In all other cases, a fault of mnt_app will not trigger failover of the service group.
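
    The rule above can be sketched as a small shell model. This is only an illustration of the rule as stated in this reply, not how the VCS engine is implemented, and it ignores group-level attributes and agent behaviour. The names DEPS, STATUS, and would_fail_over are hypothetical, the resource names are taken from the "hares -dep" output (where the disk group appears as appdg), and the Critical values of the three top-level resources are assumed, as in the text.

```shell
#!/bin/sh
# Toy model of the failover rule described in this reply (not the VCS engine).

# Dependency pairs as "parent child", taken from the hares -dep output.
DEPS='APP_aphelion mnt_app
APP_tibjmsd mnt_app
Blind_check_stopDB mnt_app
mnt_app vol_app
vol_app appdg'

# Hypothetical status table: "resource critical(0/1) state".
# Critical=1 for the three top-level resources is an assumption.
STATUS='APP_aphelion 1 ONLINE
APP_tibjmsd 1 ONLINE
Blind_check_stopDB 1 ONLINE
mnt_app 0 ONLINE
vol_app 1 ONLINE
appdg 1 ONLINE'

# would_fail_over RESOURCE
# Prints "yes" if a fault of RESOURCE triggers failover under the rule,
# otherwise "no".
would_fail_over() {
    printf '%s\n---\n%s\n' "$DEPS" "$STATUS" | awk -v faulted="$1" '
        $0 == "---"  { section = 2; next }
        section != 2 { parents[$2] = parents[$2] " " $1; next }  # child -> parents
                     { critical[$1] = $2; state[$1] = $3 }
        END {
            # A critical resource triggers failover by itself.
            if (critical[faulted] == 1) { print "yes"; exit }
            # Otherwise walk up the tree looking for a critical AND online ancestor.
            n = split(parents[faulted], queue, " ")
            for (i = 1; i <= n; i++) {
                r = queue[i]
                if (critical[r] == 1 && state[r] == "ONLINE") { print "yes"; exit }
                m = split(parents[r], more, " ")
                for (j = 1; j <= m; j++) queue[++n] = more[j]
            }
            print "no"
        }'
}
```

    With the table as written, "would_fail_over mnt_app" prints yes. Changing the three top-level resources to OFFLINE in STATUS makes it print no, which matches the two mnt_app cases described above.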

  • root@lyle# hares -dep |grep app

    PDM_PRD_MG   APP_aphelion             mnt_app
    PDM_PRD_MG   APP_tibjmsd              mnt_app
    PDM_PRD_MG   Blind_check_stopDB       mnt_app
    PDM_PRD_MG   IPMultB_pdmprdappdb      MNicB_DB
    PDM_PRD_MG   appdg                    SRDF_app
    PDM_PRD_MG   mnt_app                  vol_app
    PDM_PRD_MG   vol_activelogs           appdg
    PDM_PRD_MG   vol_app                  appdg
    PDM_PRD_MG   vol_archivelogs          appdg
    PDM_PRD_MG   vol_index                appdg

     

    Does this command show mnt_app as the parent or the child?

    If the child is set as critical or the parent as non-critical (0), does it fail over?

  • You say: "If the parent is set as critical and the child is non-critical (0), does it fail over to another node?"

    Short answer: NO, not necessarily...

    If the child faults, the parent is likely to fault too, especially when the child resource is a mount resource. If the file system goes away, the parent, IF it was ONLINE at the time, would almost certainly fault soon afterwards, as it would no longer be able to operate without its required file system.

    However, if the parent was OFFLINE and the child faulted, there would be no failover, as the child (the mnt resource) is marked as non-critical.

    Also, if the parent somehow only needs the child file system to start up and never needs it again, then it might not fault if the file system went away. That would be unusual, but I guess it is possible.

     

  • Yes, your command shows mnt_app as the parent of vol_app. I was confused in my first response and initially thought it was the other way around.

    The answer is the same, however. You will only get a failover if a critical resource faults.

    You say: "If the parent is set as critical and the child is non-critical (0), does it fail over to another node?" I guess I need to rephrase this to make a more precise example:

    Question 1: If a parent is set as critical and it faults, will it cause a failover even if the child is non-critical?    ANS: YES, a failover would occur.

    Question 2: If a parent is set as critical and its child is set to non-critical, and then the child faults, will it cause a failover?    ANS: Maybe. Most of the time the failure of a child would eventually lead to the parent faulting, but I can imagine cases where a parent would be capable of working fine even though its child has already faulted, and in that case there would be no failover.

    Hope that helps...

     

  • The critical settings of children have no bearing on whether the fault of a non-critical resource causes a failover, so the critical settings of the volume and diskgroup resources are not relevant. What IS relevant is the critical settings of the parent resources. APP_aphelion and APP_tibjmsd both depend on mnt_app, so if mnt_app fails, VCS will take the APP_aphelion and APP_tibjmsd resources offline. If APP_aphelion and APP_tibjmsd are both non-critical, the group will not fail over and the APP resources will remain down. If either of them is critical, VCS will fail the group over.

    So if your group failed over when mnt_app faulted, then I suspect that one or both of APP_aphelion and APP_tibjmsd are critical.
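
    One way to check that suspicion is to list the parents of mnt_app together with their Critical values. The sketch below is a rough helper, not a tested procedure: critical_parents is a hypothetical function name, and it assumes that "hares -dep" prints rows of group, parent, child (as in the output posted above, with any header line starting with '#') and that "hares -value" prints the value of the named attribute. Please verify both against your VCS version.

```shell
#!/bin/sh
# critical_parents RESOURCE
# Lists the direct parents of RESOURCE with their Critical attribute.
critical_parents() {
    # Keep only rows whose third column (child) is the resource we ask about;
    # the second column is then the parent.
    hares -dep "$1" |
    awk -v res="$1" '$1 !~ /^#/ && $3 == res { print $2 }' |
    while read -r parent; do
        # </dev/null stops the inner command from consuming the loop input.
        value=$(hares -value "$parent" Critical </dev/null)
        printf '%s Critical=%s\n' "$parent" "$value"
    done
}
```

    Running "critical_parents mnt_app" on a cluster node would then print one line per parent resource; any line ending in Critical=1 is a candidate for having caused the failover.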

    If APP_aphelion and APP_tibjmsd can continue without mnt_app, and so should not be brought down if mnt_app fails, then you should not make them dependent on the mount.

    Mike
