04-07-2015 10:42 AM
Hi,
Below are the resource dependencies in my environment. Somebody unmounted the /app filesystem, so the resource went into a faulted state.
When I checked the criticality of the resources, mnt_app is set as non-critical (0), and vol_app and app_dg are set as critical (1).
The dependencies below show a parent-child relationship, with mnt_app as the parent and vol_app as the child.
If the parent is set as critical and the child is non-critical (0), does it fail over to another node?
Or
If the child is set as critical and the parent as non-critical (0), does it fail over?
Please assist as soon as possible.
root@lyle# hares -dep |grep app
PDM_PRD_MG APP_aphelion mnt_app
PDM_PRD_MG APP_tibjmsd mnt_app
PDM_PRD_MG Blind_check_stopDB mnt_app
PDM_PRD_MG IPMultB_pdmprdappdb MNicB_DB
PDM_PRD_MG appdg SRDF_app
PDM_PRD_MG mnt_app vol_app
PDM_PRD_MG vol_activelogs appdg
PDM_PRD_MG vol_app appdg
PDM_PRD_MG vol_archivelogs appdg
PDM_PRD_MG vol_index appdg
05-04-2015 01:44 AM
Hi,
Before digging into this particular configuration, let's understand the relationship between “criticality of resources” and “failover of the service group (SG)”.
Critical resource: A fault of the resource will always initiate failover of the service group.
Non-critical resource: A fault of the resource will trigger failover of the service group only if there is a “critical” AND “online” parent resource above it in the dependency tree. The criticality of child resources isn’t considered when deciding whether or not to fail over.
1st query: If the parent is set as critical and the child is non-critical (0), does it fail over to another node?
If the critical parent resource faults, failover will undoubtedly be initiated. If the critical parent resource is online and the non-critical child resource faults, failover will be initiated. If the critical parent resource is offline and the non-critical child resource faults, failover will not be initiated.
2nd query: If the child is set as critical and the parent as non-critical (0), does it fail over?
If the critical child resource faults, failover will undoubtedly be initiated. If the non-critical parent resource faults, failover will not be initiated.
Based on the "hares -dep" output snippet, the resource dependency tree looks like this:
APP_aphelion      APP_tibjmsd      Blind_check_stopDB
     |                 |                   |
     +-----------------+-------------------+
                       |
             mnt_app (non-critical)
                       |
               vol_app (critical)
                       |
               app_dg (critical)
Since their criticality was not mentioned, I am assuming that APP_aphelion, APP_tibjmsd, and Blind_check_stopDB are critical resources.
In this configuration, a fault of app_dg, vol_app, APP_aphelion, APP_tibjmsd, or Blind_check_stopDB will initiate failover of the service group. A fault of mnt_app will trigger failover ONLY IF any of APP_aphelion, APP_tibjmsd, or Blind_check_stopDB is online. In all other cases, a fault of mnt_app will not trigger failover of the service group.
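To make the rule above easier to check against other configurations, here is a minimal Python sketch of it. It only models the rule as stated in this post (critical resource: always; non-critical resource: only with a critical AND online parent somewhere above it); it is not how the VCS engine is implemented. The function names and the DEPENDS_ON and CRITICAL tables are hypothetical, the edges are copied from the "hares -dep" output in this thread (where the disk group appears as appdg), and the three application resources are assumed to be critical, as above.

```python
# Parent -> children edges, copied from the `hares -dep` output in this thread.
# The disk group is spelled "appdg" there ("app_dg" in the post text).
DEPENDS_ON = {
    "APP_aphelion": ["mnt_app"],
    "APP_tibjmsd": ["mnt_app"],
    "Blind_check_stopDB": ["mnt_app"],
    "mnt_app": ["vol_app"],
    "vol_app": ["appdg"],
}

# Critical flags. mnt_app, vol_app and appdg come from the question;
# the three application resources are ASSUMED critical (not confirmed).
CRITICAL = {
    "APP_aphelion": True,
    "APP_tibjmsd": True,
    "Blind_check_stopDB": True,
    "mnt_app": False,
    "vol_app": True,
    "appdg": True,
}


def ancestors(resource, depends_on):
    """Return every resource that depends on `resource`, directly or transitively."""
    found = set()
    frontier = [resource]
    while frontier:
        current = frontier.pop()
        for parent, children in depends_on.items():
            if current in children and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def fault_triggers_failover(resource, online, depends_on=DEPENDS_ON, critical=CRITICAL):
    """Apply the rule from the post to a fault of `resource`.

    `online` is the set of resources that were online when the fault occurred.
    """
    if critical[resource]:
        return True
    # Non-critical: fail over only if some parent above is critical AND online.
    return any(critical[p] and p in online for p in ancestors(resource, depends_on))


print(fault_triggers_failover("mnt_app", online={"APP_tibjmsd"}))  # True
print(fault_triggers_failover("mnt_app", online=set()))            # False
```

Setting the three application entries in CRITICAL to False shows the other case: a fault of mnt_app then never triggers failover under this rule, whatever is online.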
04-07-2015 11:45 AM
root@lyle# hares -dep |grep app
PDM_PRD_MG APP_aphelion mnt_app
PDM_PRD_MG APP_tibjmsd mnt_app
PDM_PRD_MG Blind_check_stopDB mnt_app
PDM_PRD_MG IPMultB_pdmprdappdb MNicB_DB
PDM_PRD_MG appdg SRDF_app
PDM_PRD_MG mnt_app vol_app
PDM_PRD_MG vol_activelogs appdg
PDM_PRD_MG vol_app appdg
PDM_PRD_MG vol_archivelogs appdg
PDM_PRD_MG vol_index appdg
Does this command show mnt_app as parent or child?
If the child is set as critical and the parent as non-critical (0), does it fail over?
04-07-2015 12:54 PM
You say: "If parent is set as critical and child is non-critical 0, do it failover to another node"?
Short answer: NO, not necessarily...
If the child faults, the parent is likely to fault too, especially when the child resource is a mount resource. If the file system goes away, the parent, IF it was ONLINE at the time, would almost certainly fault soon afterwards, as it would no longer be able to operate without its required file system.
However, if the parent was OFFLINE and the child faulted, then there would be no failover, as the child (the mnt resource) is marked as non-critical.
Also, if somehow the parent only needs the child file system to start up and then never needs it again, then maybe it would not fault if the file system went away. That would be unusual, but I guess it is possible.
04-07-2015 01:10 PM
Yes your command shows mnt_app as parent to vol_app -- I was confused and thought it was the other way around initially in my first response.
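Since the column order tripped me up, here is a small Python sketch that reads "hares -dep" output as three columns (group, parent, child) and builds a parent-to-children map. The function name parse_dependencies and the SAMPLE variable are hypothetical; the sample lines are a subset of the output posted in this thread.

```python
# A few lines of the `hares -dep | grep app` output posted above.
SAMPLE = """\
PDM_PRD_MG APP_aphelion mnt_app
PDM_PRD_MG APP_tibjmsd mnt_app
PDM_PRD_MG Blind_check_stopDB mnt_app
PDM_PRD_MG appdg SRDF_app
PDM_PRD_MG mnt_app vol_app
PDM_PRD_MG vol_app appdg
"""


def parse_dependencies(text):
    """Map each parent resource to the list of children it depends on."""
    parents = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and a "#Group Parent Child" style header line.
        if len(fields) != 3 or fields[0].startswith("#"):
            continue
        _group, parent, child = fields
        parents.setdefault(parent, []).append(child)
    return parents


deps = parse_dependencies(SAMPLE)
print(deps["mnt_app"])  # ['vol_app']
```

Read this way, mnt_app sits in the parent column on the line with vol_app, and in the child column on the three application lines.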
The answer is the same, however. You will only get a failover if a critical resource faults.
You say: "If parent is set as critical and child is non-critical 0, do it failover to another node". I need to rephrase this as two more precise questions:
Question 1: If a parent is set as critical and it faults, will it cause a failover even if the child is non-critical? ANS: YES, a failover would occur.
Question 2: If a parent is set as critical and its child is set to non-critical, and then the child faults, will it cause a failover? ANS: Maybe. Most of the time the failure of a child would eventually lead to the parent faulting, but I can imagine cases where a parent would be capable of working fine even though its child has already faulted, and in that case there would be no failover.
Hope that helps...
04-09-2015 11:55 AM
The critical settings of child resources have no bearing on whether the group fails over when a non-critical resource faults, so the critical settings of the volume and diskgroup resources are not relevant. What IS relevant is the critical settings of the parent resources. APP_aphelion and APP_tibjmsd both depend on mnt_app, so if mnt_app faults, VCS will take the APP_aphelion and APP_tibjmsd resources offline. If APP_aphelion and APP_tibjmsd are both non-critical, the group will not fail over and the APP resources will remain down; if either of them is critical, VCS will fail the group over.
So if your group failed over when mnt_app faulted, I suspect that one or both of APP_aphelion and APP_tibjmsd are critical.
If APP_aphelion and APP_tibjmsd can continue without mnt_app, and so should not be brought down when mnt_app faults, then you should not make them dependent on the mount.
Mike