The OCF specification strictly defines the exit codes that an action must return. The cluster always checks the return code against the expected result. If the two do not match, the operation is considered to have failed and a recovery action is initiated. There are three types of failure recovery:
Table 8-1 Failure Recovery Types
| Recovery Type | Description | Action Taken by the Cluster |
|---|---|---|
| soft | A transient error occurred. | Restart the resource or move it to a new location. |
| hard | A non-transient error occurred. The error may be specific to the current node. | Move the resource elsewhere and prevent it from being retried on the current node. |
| fatal | A non-transient error occurred that will be common to all cluster nodes. This means a bad configuration was specified. | Stop the resource and prevent it from being started on any cluster node. |
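
The effect of each recovery type on resource placement can be illustrated with a small Python sketch. This is only a model of the table above, not cluster code: the names `Recovery` and `eligible_nodes` and the node names are hypothetical, and the real cluster also takes constraints and scores into account.

```python
from enum import Enum


class Recovery(Enum):
    """The three failure recovery types (hypothetical helper type)."""
    SOFT = "soft"
    HARD = "hard"
    FATAL = "fatal"


def eligible_nodes(recovery, current_node, all_nodes):
    """Return the nodes on which the resource may still be started.

    Simplified model: ignores constraints, scores, and failure counts.
    """
    if recovery is Recovery.SOFT:
        # Transient error: restart in place or move; no node is excluded.
        return list(all_nodes)
    if recovery is Recovery.HARD:
        # Possibly node-specific error: do not retry on the current node.
        return [node for node in all_nodes if node != current_node]
    # FATAL: the configuration is bad everywhere, so no node is eligible.
    return []


nodes = ["alice", "bob", "charlie"]  # illustrative node names
print(eligible_nodes(Recovery.HARD, "alice", nodes))  # ['bob', 'charlie']
```

In this model, a fatal failure leaves an empty list, which corresponds to the resource being stopped and not started anywhere.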
Assuming an action is considered to have failed, the following table outlines the different OCF return codes and the type of recovery the cluster will initiate when the respective error code is received.
Table 8-2 OCF Return Codes
| OCF Return Code | OCF Alias | Description | Recovery Type |
|---|---|---|---|
| 0 | OCF_SUCCESS | Success. The command completed successfully. This is the expected result for all start, stop, promote and demote commands. | soft |
| 1 | OCF_ERR_GENERIC | Generic error. | soft |
| 2 | OCF_ERR_ARGS | The resource’s configuration is not valid on this machine (for example, it refers to a location/tool not found on the node). | hard |
| 3 | OCF_ERR_UNIMPLEMENTED | The requested action is not implemented. | hard |
| 4 | OCF_ERR_PERM | The resource agent does not have sufficient privileges to complete the task. | hard |
| 5 | OCF_ERR_INSTALLED | The tools required by the resource are not installed on this machine. | hard |
| 6 | OCF_ERR_CONFIGURED | The resource’s configuration is invalid (for example, required parameters are missing). | fatal |
| 7 | OCF_NOT_RUNNING | The resource is not running. The cluster will not attempt to stop a resource that returns this for any action. Whether this return code requires recovery depends on the expected resource status. If it is unexpected, soft recovery is initiated. | N/A |
| 8 | OCF_RUNNING_MASTER | The resource is running in Master mode. | soft |
| 9 | OCF_FAILED_MASTER | The resource is in Master mode but has failed. The resource will be demoted, stopped and then started (and possibly promoted) again. | soft |
| other | N/A | Custom error code. | soft |
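
The lookup described by Table 8-2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the cluster's implementation: the names `OCF` and `recovery_for` are hypothetical, and the function only encodes the rule that recovery is initiated when the return code differs from the expected one.

```python
from enum import IntEnum


class OCF(IntEnum):
    """OCF return codes from Table 8-2 (aliases without the OCF_ prefix)."""
    SUCCESS = 0
    ERR_GENERIC = 1
    ERR_ARGS = 2
    ERR_UNIMPLEMENTED = 3
    ERR_PERM = 4
    ERR_INSTALLED = 5
    ERR_CONFIGURED = 6
    NOT_RUNNING = 7
    RUNNING_MASTER = 8
    FAILED_MASTER = 9


# Codes whose recovery type is not "soft" according to the table.
HARD_CODES = {OCF.ERR_ARGS, OCF.ERR_UNIMPLEMENTED, OCF.ERR_PERM, OCF.ERR_INSTALLED}
FATAL_CODES = {OCF.ERR_CONFIGURED}


def recovery_for(actual, expected):
    """Return None if the result was expected, else the recovery type."""
    if actual == expected:
        return None  # the operation did not fail, so nothing to recover
    if actual in HARD_CODES:
        return "hard"
    if actual in FATAL_CODES:
        return "fatal"
    # Everything else, including custom codes and an unexpected
    # OCF_NOT_RUNNING, leads to soft recovery.
    return "soft"


print(recovery_for(OCF.ERR_INSTALLED, OCF.SUCCESS))    # hard
print(recovery_for(OCF.NOT_RUNNING, OCF.NOT_RUNNING))  # None
print(recovery_for(OCF.NOT_RUNNING, OCF.SUCCESS))      # soft
```

The second and third calls show why OCF_NOT_RUNNING has no fixed recovery type in the table: the same code is a failure only when a different result was expected.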