ocfs2 on SLES10 NTS sanity check (OCFS2 HEARTBEAT)

This document (7001469) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 10 Service Pack 2
SUSE Linux Enterprise Server 10 Service Pack 1

Situation

OCFS2 on SLES10 is to be checked for whether it is configured correctly. The following is best practice from the WSS Linux NUE team. This does not mean that ocfs2 cannot be configured differently; especially the timings can differ depending on the situation. This is not supposed to be a blueprint, but a step-by-step check of whether the settings are basically correct.

Resolution

OCFS2 is a cluster file system. This means all checks have to be done on all participating cluster nodes and all settings have to be the same on all nodes.

On each and every node, check the following either directly on the system or by checking the appropriate output from supportconfig.sh

1. Kernel version

The kernel should be the latest kernel of the Service Pack. Caveat: the version of ocfs2 changed between SLES10 SP1 and SLES10 SP2.

Checking on the command line via
uname -a
or in the supportconfig in basic-environment.txt
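If in doubt whether the running kernel is also the latest one installed, the two can be compared directly. This is only a minimal sketch; kernel-default is just an example flavour, the system may use kernel-smp, kernel-xen or another flavour instead:

    uname -r                 # version of the running kernel
    rpm -q kernel-default    # latest installed kernel package of that flavour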

2. Check that the appropriate module is loaded. Make sure this module belongs to the kernel identified in step 1 and is not a weak-updates module or anything else.

Checking on the command line via
modinfo ocfs2
or in the supportconfig in modules.txt
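A quick way to confirm that the loaded module really belongs to the running kernel is to compare the module path with the kernel version. A minimal sketch; the path is only an example:

    modinfo -F filename ocfs2    # should point into /lib/modules/<running kernel>/...
    uname -r                     # running kernel version for comparison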

3. Check that ocfs2 is activated during the boot process. Caveat: in this case you are looking for o2cb, not ocfs2, on the system.

Checking on the command line via
chkconfig o2cb --list
or in the supportconfig in ocfs2.txt
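On a correctly configured node the output might look like the following. This is only an example; the exact runlevels can differ depending on the setup:

    o2cb      0:off  1:off  2:on   3:on   4:off  5:on   6:off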

4. Check the ocfs2 settings themselves. Caveat: the output of /etc/init.d/o2cb status differs between SLES10 SP1 and SLES10 SP2, so it might be best to rely on the configuration file in /etc/sysconfig.

Checking on the command line by looking at / editing
/etc/sysconfig/o2cb
or in the supportconfig in ocfs2.txt

The recommended values are
 
O2CB_HEARTBEAT_THRESHOLD=31
O2CB_IDLE_TIMEOUT_MS=30000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
O2CB_HEARTBEAT_MODE="user"

Special attention should be given to O2CB_HEARTBEAT_THRESHOLD, which defaults to 7 in older versions of SLES10; this might be acceptable for testing, but not for production.
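To verify the values quickly on each node, the relevant variables can simply be filtered out of the file. A minimal sketch; as a rule of thumb the heartbeat dead time is roughly (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds, so about 60 seconds with the recommended value of 31, compared to only about 12 seconds with the old default of 7:

    grep -E '^O2CB_' /etc/sysconfig/o2cb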

Caveat: if O2CB_HEARTBEAT_MODE="user" is used, then Heartbeat and STONITH have to be configured to get ocfs2 running. The reason is the following: with the mode set to "user", ocfs2 relies on the cluster communication from Heartbeat, and the only way Heartbeat can send a notify is if STONITH tells the cluster that a node is gone. So without STONITH, the ocfs2 settings do not work in "user" mode. We recommend the use of "user" mode.

5. Check the ocfs2 configuration file

Checking on the command line by looking at / editing
/etc/ocfs2/cluster.conf
or in the supportconfig in ocfs2.txt

The syntax of this file is explained by the following example. Caveat: this is only an example.

node:
        ip_port = 7777
        ip_address = 149.44.174.137
        number = 0
        name = power720-1
        cluster = rumburak

node:
        ip_port = 7777
        ip_address = 149.44.174.138
        number = 1
        name = power720-2
        cluster = rumburak

cluster:
        node_count = 2
        name = rumburak

The file breaks down into two areas: node, which has to contain an entry for each and every node in the ocfs2 cluster, and cluster, which is only a summary of the cluster name and the node count. Caveat: the numbering of the nodes in the node sections starts at 0, while node_count in the cluster section counts from 1, so a two-node cluster has the nodes 0 and 1 and a node_count of 2.
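Since the file has to be identical on all nodes, one simple way to verify this is to compare checksums across the nodes. A minimal sketch; the node name is taken from the example above:

    md5sum /etc/ocfs2/cluster.conf
    ssh power720-2 md5sum /etc/ocfs2/cluster.conf    # repeat for every other node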

If these checks are done, the mode is set to "user" and everything seems to be all right, but there are still problems with ocfs2, then the next step should be to check the Heartbeat settings.
If the mode is "kernel" and there are still problems with ocfs2, then the next step should be to contact NTS.

The Heartbeat settings are explained by an example; you get them by issuing
 
    cibadmin -Q > /tmp/suse.xml

on one node. As the Heartbeat settings are the same on all nodes, it is not necessary to collect the cibadmin output from every node.

Relevant entries are in the section crm_config, for example:

    <crm_config>
    ...
        <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="true"/>
    ...
    </crm_config>

As stated above, without STONITH activated, ocfs2 in "user" mode will not work.
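A quick check against the dump obtained above can be done with a simple grep. A minimal sketch; /tmp/suse.xml is the file created with cibadmin in the previous step:

    grep stonith-enabled /tmp/suse.xml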

Further relevant entries are in the section resources, for example:

<resources>
    ...
       <clone id="ocfs2_fs">
         <meta_attributes id="ocfs2_fs_meta_attrs">
           <attributes>
             <nvpair id="ocfs2_fs_metaattr_clone_max" name="clone_max" value="2"/>
             <nvpair id="ocfs2_fs_metaattr_clone_node_max" name="clone_node_max" value="1"/>
             <nvpair id="ocfs2_fs_metaattr_notify" name="notify" value="true"/>
             <nvpair id="ocfs2_fs_metaattr_globally_unique" name="globally_unique" value="false"/>
           </attributes>
         </meta_attributes>
         <primitive id="resource_fs" class="ocf" type="Filesystem" provider="heartbeat">
           <instance_attributes id="resource_fs_instance_attrs">
             <attributes>
               <nvpair id="4ca301b0-142a-4664-b197-9c7385c59f46" name="device" value="/dev/disk/by-id/scsi-1494554000000000000000000030000005e2b00000d000000"/>
               <nvpair id="9cc32e5e-f53b-4bf0-a065-74efc0b4e252" name="directory" value="/mnt/t1"/>
               <nvpair id="cd560808-810d-4637-b0fb-d9f8636c9a1e" name="fstype" value="ocfs2"/>
             </attributes>
           </instance_attributes>
           <operations>
             <op id="493029d4-e225-4f81-89a2-bb7f2f076672" name="monitor" interval="20" timeout="40" start_delay="10" on_fail="fence" disabled="false" role="Started"/>
           </operations>
         </primitive>
       </clone>
    ...
</resources>

The most common errors here are
    - notify true not set
    - globally_unique false not set
    - device not set to a /dev/disk/by-id/ value, which can result in trouble if the numbering of the devices changes; iSCSI or SAN devices are possible culprits here
    - no monitor action set, which can result in nasty behaviour if an administrator does an umount on the command line, bypassing Heartbeat, so that Heartbeat does not realize that the ocfs2 node is gone
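These entries can be checked quickly against the dump obtained above. A minimal sketch; /tmp/suse.xml is the cibadmin dump from before, and the attribute names are the ones from the example:

    grep -E 'notify|globally_unique|by-id|monitor' /tmp/suse.xml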
 

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7001469
  • Creation Date: 02-Oct-2008
  • Modified Date: 25-Feb-2021
    • SUSE Linux Enterprise Server
