How To Configure Bonding on Novell Linux products

  • 3929220
  • 11-Jan-2008
  • 30-Apr-2012

Environment

Novell SUSE Linux Enterprise Server 9
Novell SUSE Linux Enterprise Server 10
Novell Open Enterprise Server (Linux based)

Situation

You want to combine several physical network cards into a virtual one. This article does not describe all of the possible options and features of bonding, but simply explains how to set it up on Novell Linux products. For additional information on bonding itself, please refer to the file /usr/src/linux/Documentation/networking/bonding.txt, provided the kernel sources are installed on your system, or visit the home page of the bonding project.

Resolution

NOTE: Before beginning, please refer to the additional notes section.

In this sample scenario, two network cards will be combined by way of bonding mode=1 (hot-standby).  As of SLE10 SP1, YaST2 can set up a bonding device (as documented in TID 3815448 - How to set up bonding with YaST2); versions prior to this cannot, so we must do this manually by editing the config files created by YaST2.

1. Configure the physical devices with YaST2.  Set up the devices to use DHCP, as this will save some typing.
    * If using SLE10 SP1, continue with the configuration in YaST2.  However, before creating the bond interface, make sure the interfaces to be included in the bond are configured without any IP addresses (None Configuration in YaST2).
    * Pay particular attention to the bonding module options.  If using active-backup (synonymous with hot-standby) mode, be sure to add the miimon=100 use_carrier=1 parameters in the options dialog, as in the example below.
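    For example, the complete string entered in the module options dialog might read as follows (an illustration matching the scenario in this article):
        mode=1 miimon=100 use_carrier=1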

2. Create a new configuration file in /etc/sysconfig/network that contains the configuration for the bonding device.  Name this file ifcfg-bond0.  The configuration will be unique to each network environment and the desired mode.  As an example:
    BOOTPROTO='static'
    BROADCAST='192.168.1.255'
    IPADDR='192.168.1.1'
    NETMASK='255.255.255.0'
    NETWORK='192.168.1.0'
    STARTMODE='onboot'
    BONDING_MASTER='yes'      # this line indicates that the device is a bonding master device
    BONDING_MODULE_OPTS='mode=1 miimon=100 use_carrier=0'
    BONDING_SLAVE0='bus-pci-0000:06:08.1'
    BONDING_SLAVE1='bus-pci-0000:06:09.1'
        * Supply one BONDING_SLAVEn='slave_device' entry for each slave.  The 'slave_device' is either an interface name, e.g., 'eth0', or a device specifier for the network device.  The interface name is easier to find, but the ethN names are subject to change at boot time if a device earlier in the sequence has failed.  A device specifier such as 'bus-pci-0000:06:08.1' identifies the physical network device and will not change unless the device's bus location changes (for example, because it is physically moved from one PCI slot to another).
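        If you are unsure of an interface's PCI bus location, sysfs can reveal it.  For example (a sketch, assuming the interface in question is eth0):
            ls -l /sys/class/net/eth0/device
        The final component of the symlink target (e.g., 0000:06:08.1) is the value to append to the 'bus-pci-' prefix.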

3. The contents of BONDING_MODULE_OPTS are supplied to the instance of the bonding module for this device.  Specify the options for the bonding mode, link monitoring, and so on here.  The above is an example of the MII link detection method.  Other methods include:
    * netif_carrier method
        BONDING_MODULE_OPTS='miimon=100 mode=1 use_carrier=1'
    * ARP monitoring method
        BONDING_MODULE_OPTS='arp_interval=2500 arp_ip_target=192.168.1.1 mode=1'

        Note: due to code changes in SLE10, this parameter should be specified with a "+" (to add an ARP target) or "-" (to remove an ARP target) before the IP address.  The example would then read: arp_ip_target=+192.168.1.1.
        Note: due to code changes in SLES9 (.283 kernel), the bonding module (v3.1.0) has been updated and now requires the arp_validate option when using ARP monitoring.  Please refer to the Linux kernel documentation for details.
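        Combining both notes, an ARP monitoring configuration on a newer kernel might look like this (illustrative values; consult the kernel documentation for the possible arp_validate settings):
            BONDING_MODULE_OPTS='mode=1 arp_interval=2500 arp_ip_target=+192.168.1.1 arp_validate=active'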

4. Edit the created configuration files /etc/sysconfig/network/ifcfg-eth-id-xx:xx:xx:xx:xx:xx, and change:

    BOOTPROTO='none'
    STARTMODE='off'
    * Note the pci-ids from these configuration files.  The pci-ids look like this:
        _nm_name='bus-pci-0000:06:08.1'
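    After these edits, a slave configuration file might look similar to the following (a sketch; the pci-id will differ on your system):
        BOOTPROTO='none'
        STARTMODE='off'
        _nm_name='bus-pci-0000:06:08.1'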

5. Restart the network:
    rcnetwork restart
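    Once the network is up again, the current state of the bond can be inspected through the proc interface, for example:
        cat /proc/net/bonding/bond0
    The output shows the bonding mode, the MII status of each slave, and (in active-backup mode) the currently active slave.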

To verify the configuration, perform the following:

   1. Watch the system log file.  Open a terminal and enter:
      tail -f /var/log/messages
   2. Open another terminal and ping a host:
      ping 192.168.1.2
   3. Unplug the network cable of the primary interface on the Linux host.

If properly configured, you should see the ping continue regardless of which interface is active.  In the log file, there should be a notice that the link of one device has gone down and the other device has been activated.  Repeat the unplug/replug cycle (also test the other cable, or both) to see which log messages appear and to understand what is going on.

Status

Top Issue

Additional Information

To avoid problems, it is suggested that all network cards use the same driver.  If they use different drivers, please take the following into consideration:
  • There are three driver-dependent methods for checking whether a network card has a link or a network connection:
    • MII link status detection,
    • the netif_carrier register in the driver, and
    • ARP monitoring.
It is very important that the drivers used support the same method.  If this is not the case, the only solution is to replace one network card in order to use a different driver.

To find out what method is supported by your driver, proceed as follows:

MII link status can be determined with the tools mii-tool or ethtool.
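For example (assuming the first interface is eth0):
    mii-tool eth0
    ethtool eth0
If the driver supports MII, mii-tool reports the negotiated link state, and the ethtool output ends with a "Link detected:" line.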

In the case of netif_carrier and ARP monitoring, refer to the driver's source code to find out whether these methods are supported or not.  The corresponding kernel sources must be installed for this purpose.
  • Regarding netif_carrier, search for this string in the driver's source code:
    grep netif_carrier driver_name.c
  • As for the ARP monitoring method, the driver must support either the register last_rx or trans_start.  Thus, you can search in the driver's source code for:
    grep "last_rx\|trans_start"driver_name.c
Start with the setup only after having verified this information.

The Linux bonding driver provides a method for aggregating multiple network interfaces into a single logical "bonded" interface.  The behavior of the bonded interfaces depends upon the mode.  Generally speaking, modes provide either hot-standby or load balancing services.  Additionally, link integrity monitoring may be performed.

What can bonding do?  With bonding, you can mainly do two things:

  • combine two or more network devices and balance the network traffic over this virtual device.  Several balancing algorithms are available for distributing traffic.
  • set up two or more network devices that fail over if a network device goes down (hot-standby).
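For reference, the numeric values of the bonding driver's mode parameter map to the following policies (mode=1 is the hot-standby mode used in this article):
    mode=0  balance-rr      (round-robin load balancing)
    mode=1  active-backup   (hot-standby/failover)
    mode=2  balance-xor     (XOR hash load balancing)
    mode=3  broadcast       (transmit on all slaves)
    mode=4  802.3ad         (IEEE 802.3ad dynamic link aggregation)
    mode=5  balance-tlb     (adaptive transmit load balancing)
    mode=6  balance-alb     (adaptive load balancing)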
The bonding device will work like any other normal network device from the user's point-of-view.  So, for example, you can:
  • set up 802.1q VLANs on top of a bonding device.
  • sniff network traffic.
  • and so on.
What can't bonding do?  Bonding is based on dead-link detection; therefore, it cannot detect all possible network failures.  For example, bonding will fail to balance or fail over when:
  • the link detection does not work properly.
  • a network card driver is buggy and stops working.
  • a switch has malfunctioned but still reports the link is up.
  • the used service is out of order.
  • the routing is wrong.

Occasionally, not all network interfaces come up after a system reboot. To prevent this, the network driver modules should be loaded earlier in the boot process. The following procedure is helpful in this case:
  1. Edit the file /etc/sysconfig/kernel and add this line: MODULES_LOADED_ON_BOOT="driver_name".
  2. Reboot the server and check the status of all network interfaces, using the commands lspci and ifconfig.
  3. If this method is not successful, edit the file /etc/sysconfig/kernel again and remove the line inserted in step 1. Instead, modify the line containing the INITRD_MODULES statement and add the driver_name to it. It should read similar to this:
    INITRD_MODULES="cdrom scsi_mod ide-cd ehci-hcd reiserfs driver_name"
  4. Run the command mkinitrd.
  5. Reboot the server as in step 2.
Another method is to delay the starting of the network interfaces after loading the modules:
  1. Edit the file /etc/sysconfig/network/config and change the variable WAIT_FOR_INTERFACES to the desired delay in seconds. To delay the interfaces by 3 seconds, enter:
    WAIT_FOR_INTERFACES="3"
  2. Reboot the server to verify the success of this measure.

Presently, in SLE10, when using the bonding option "primary=ethX", the fallback to the primary interface does not work.  This is because the driver options have moved to /sysfs and can only be set when there is at least one slave.  This will be fixed in SLE10 SP1.

A workaround is:
  1. Add the line POST_UP_SCRIPT="enable-primary" to /etc/sysconfig/network/ifcfg-bond0.
  2. Create the file /etc/sysconfig/network/scripts/enable-primary, which should contain:
    #!/bin/sh
    # Wait briefly for the bond to finish coming up,
    # then declare eth0 the primary slave via sysfs.
    sleep 1
    echo "eth0" > /sys/class/net/bond0/bonding/primary
    exit 0
  3. Make /etc/sysconfig/network/scripts/enable-primary executable, i.e., chmod +x /etc/sysconfig/network/scripts/enable-primary
The primary should now be set up correctly on boot or network restart.
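To confirm that the workaround took effect, the currently configured primary can be read back from sysfs:
    cat /sys/class/net/bond0/bonding/primary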

Occasionally the /etc/sysconfig/network/routes file does not get populated with the routing table as configured through YaST2.  If this is the case, the file can be modified manually with the proper routing table.
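The format of that file is one route per line: destination, gateway, netmask, and interface, with "-" as a placeholder for unused fields. For example (illustrative addresses):
    default 192.168.1.254 - -
    10.10.0.0 192.168.1.254 255.255.0.0 bond0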