XEN: Common problems with network bridges

This document (7001989) is provided subject to the disclaimer at the end of this document.

Environment

Novell SUSE Linux Enterprise Server 10 Service Pack 2
Novell SUSE Linux Enterprise Server 10 Service Pack 1

Situation

This TID addresses some common issues with network bridges:
  • intermittent connectivity on bonded bridges
  • routing issues within Dom0
  • DomU's lack network access because of switch security
  • DomU's have no network connectivity after the network subsystem has been restarted
  • Bridge performance
  • DomU's do not see all network traffic
  • DomU's connect to the wrong bridge when started
  • Dom0 does not have a network bridge
  • Changes using YaST break Xen Networking
  • Interfaces on multiple networks

Resolution

This document is not intended to be a catch-all solution document. Rather, it details common causes and solutions for network issues when running the Xen kernel. There are many other solutions to these problems; only the common ones are shown. In some cases, this TID will only identify the source of the problem.

intermittent connectivity on bonded bridges
It has been observed that on some low-end and mid-range switches, network connectivity is intermittent and sporadic when bonding mode 0 ("round-robin") is used. Some symptoms are:
  • DomU's may not have any connectivity
  • Dom0 may not have any connectivity to the network
  • DomU's and Dom0 are connected
cause: Round-robin and a few other modes work by manipulating the ARP table of the switch. Some switches do not work very well with this setting.

solution: Change the bonding mode to "active-backup" (mode 1) or change the ARP settings on the switch. If the switch supports "802.3ad" or "Dynamic Link Aggregation," mode 4 would also work.
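
As an illustration of the "active-backup" setting, a bonding configuration file might look roughly like the sketch below. The file name, IP address, slave device names and the miimon value are assumptions chosen for the example, not values taken from this TID; adjust them to the actual system.

```shell
# /etc/sysconfig/network/ifcfg-bond0 (sketch; all values are placeholders)
BOOTPROTO='static'
IPADDR='192.168.0.10'
NETMASK='255.255.255.0'
STARTMODE='onboot'
BONDING_MASTER='yes'
# mode=active-backup is bonding mode 1; miimon=100 checks link state every 100 ms
BONDING_MODULE_OPTS='mode=active-backup miimon=100'
BONDING_SLAVE0='eth0'
BONDING_SLAVE1='eth1'
```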

routing: dom0 has network connectivity, but domU's do not

cause: The most common cause of this is that the routing table does not contain a default route. This issue is commonly seen where there are multiple network bridges, where third-party or custom scripts are used, or where a static bridging configuration has been set up.

test: To test, run "ip route show". For example:
192.168.0.0/24 dev br0  proto kernel  scope link  src 192.168.0.10
172.16.142.0/24 dev br1  proto kernel  scope link  src 172.16.142.1
127.0.0.0/8 dev lo  scope link
default via 192.168.0.1 dev br0

If you do not see a "default via" line in the output, then the default route has not been set properly. To test a new default route you can use "ip route add". For example, to add a default route using br0 as the interface and 192.168.0.1 as the gateway, one would use:
ip route add default via 192.168.0.1 dev br0
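
The check for a missing default route can also be scripted. The following is a minimal sketch; has_default_route is a hypothetical helper name, and the sample text is taken from the "ip route show" output above rather than from a live system.

```shell
# Hypothetical helper: succeeds if the routing table text on stdin
# contains a line starting with "default via".
has_default_route() {
    grep -q '^default via '
}

# Sample routing table; on a live system use instead:
#   ip route show | has_default_route
sample='192.168.0.0/24 dev br0  proto kernel  scope link  src 192.168.0.10
127.0.0.0/8 dev lo  scope link
default via 192.168.0.1 dev br0'

if printf '%s\n' "$sample" | has_default_route; then
    echo "default route present"
else
    echo "default route missing"
fi
```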

solution: Alter your scripts to set the default route. If you are using a static bridge configuration, define the default route through the bridge in /etc/sysconfig/network/routes. For example:
default 10.0.0.1 - br0
Note, however, that setting this value will break networking when booting into a non-Xen kernel.

switch issue: dom0 has network connectivity but domU's do not

cause: Some mid-range and high-end enterprise-class switches support "port security" or "port mode" settings, which define the role of individual ports. Some of these switches restrict the number of MAC (hardware) addresses that are allowed on a single switch port. In these cases only a single MAC address, usually the MAC address of Dom0, is able to communicate with the greater network. Each DomU has its own MAC address, which is seen on the switch port that the bridge is plugged into, so its traffic is blocked.

solution: Disable port security on the switch or on the individual port. Alternatively, change the port to a "bridge" or a "switch" port. Consult your switch vendor's documentation or support to establish whether or not this is a match.

network restart: domU's have no connectivity after a network restart

cause: The architecture of network bridges is such that restarting the network subsystem after DomU's have been started will "unplug" their connections to the bridge.

solution: See TID 700986 Xen: reattaching network devices after the network bridge is restarted
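
For orientation only, reattaching uses the "brctl addif" command listed under Additional Information. The sketch below prints the commands instead of running them; reattach_commands is a hypothetical helper, and the names br0, vif1.0 and vif2.0 are assumptions (Xen backend interfaces are usually named vif<domain id>.<device number>). The referenced TID remains the authoritative procedure.

```shell
# Hypothetical helper: print the brctl commands that would reattach the
# given DomU backend interfaces to a bridge. Nothing is executed.
# $1 = bridge name, remaining arguments = interface names
reattach_commands() {
    bridge=$1
    shift
    for vif in "$@"; do
        echo "brctl addif $bridge $vif"
    done
}

reattach_commands br0 vif1.0 vif2.0
# prints:
#   brctl addif br0 vif1.0
#   brctl addif br0 vif2.0
```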

bridge does not forward all traffic through the bridge

cause: Network bridges will not forward all traffic across the bridge. By definition, a bridge will forward broadcast traffic. Other network traffic is only forwarded when the target (MAC address) is on the other side of the bridge; if the MAC address is not on the other side of the bridge, then the traffic will not be forwarded.

solution: You will need to set up forwarding rules in iptables to forward all traffic through the bridge. Unfortunately, there are too many variables for this document to detail how to do it. If you need to implement this solution, you may need to contact a Novell Linux partner.

solution: An alternative solution is to use PCI pass through, which is well documented in the Xen documentation. The caveat, however, is that it is currently only available for para-virtual domains. Newer chips and motherboards which support the Intel VT-d technology will allow you to use PCI pass through with fully virtual domains.

utilization: slow bridges

cause: Software bridges are able to handle significant loads. However, if CPU utilization tops 85%, there is a significant hit to the efficiency of the bridge. Software bridges are fastest when CPU utilization is less than 56%, but will not show noticeable performance degradation until utilization reaches 85% or more.

solution: Reduce the load so that total utilization (Dom0 + DomU's) is less than 85%. Alternatively, use CPU pinning to reduce the load, and pin at least two CPUs exclusively to Dom0.
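
A rough sketch of the pinning step follows. It only prints the "xm vcpu-pin" commands for Dom0; dom0_pin_commands is a hypothetical helper, and the one-to-one mapping of Dom0 VCPUs onto physical CPUs 0 and 1 is an assumption made for the example. To make those CPUs exclusive to Dom0, the DomU's would also have to be restricted to the remaining CPUs in their own configuration, which is not shown here.

```shell
# Hypothetical helper: print xm commands that pin the first N Dom0 VCPUs
# one-to-one onto physical CPUs 0..N-1. Nothing is executed.
dom0_pin_commands() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        # Syntax: xm vcpu-pin <domain> <vcpu> <physical cpu>
        echo "xm vcpu-pin Domain-0 $i $i"
        i=$(( i + 1 ))
    done
}

dom0_pin_commands 2
# prints:
#   xm vcpu-pin Domain-0 0 0
#   xm vcpu-pin Domain-0 1 1
```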

network topology: slow bridges or slow lookup times

cause: The network bridge is only capable of retaining 4096 MAC addresses. On some networks, the number of MAC addresses can exceed 4096. On these networks, the bridge has to purge one address from the table and then determine which side of the bridge the target MAC address is on. Once this has been done, the MAC address is added to the table and the traffic is forwarded.

solution: Generally speaking, this is not an issue. If, however, there is time-sensitive communication, use a smaller subnet than /22.
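
As a rough aid for comparing subnet sizes with the 4096-entry table described above, the number of addresses in an IPv4 network follows directly from its prefix length. A minimal sketch; addresses_in_prefix is a hypothetical helper name, not a system command:

```shell
# Number of addresses in an IPv4 network with the given prefix length:
# 2 to the power of (32 - prefix).
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}

for prefix in 20 22 24; do
    echo "/$prefix: $(addresses_in_prefix "$prefix") addresses"
done
# prints:
#   /20: 4096 addresses
#   /22: 1024 addresses
#   /24: 256 addresses
```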

domu's connect to the wrong bridge

cause: The default behavior of Xen is to attach DomU's to the first bridge or the default bridge. Often this will be the "eth0" bridge.

solution: Xen DomU's can be configured to attach to a specific bridge interface. See TID 7000175, Changing network bridges for DomU's

network bridges are not present

cause: At installation time, "network manager" was chosen over the "traditional ifup method" during the network installation. This setting allows the user to dynamically configure networks. However, Network-Manager does not allow for multiple IP addresses, bridges or any configuration beyond what a typical desktop user would need. Network-Manager is designed more for mobile users than for servers.

solution: Disable network-manager. See TID 35882110 Xen boot problems when the network manager is enabled

dom0 services such as DHCP or LinuxHA do not work after xend starts

cause: If the default Xen configuration is chosen, then "xend", the Xen management daemon, is started after the network and just about every service that runs in Dom0. Part of this process is tearing down the network, creating network bridges and then bringing the new network up.

solution: There is no "easy" solution to this problem. In general, Novell advises against using Dom0 to host network services. Some notable exceptions are services that are required to run the DomU's or lightweight services that are used to support the DomU's. Some services like DHCP and LinuxHA use simple device names like "eth0" to start their services. If you use a different device name, like "br0" or "xenbr0", you need to configure the services to use the bridge name. Further, for those using the default Xen configuration, booting into the Xen kernel and the non-Xen kernel will create two different networking environments. If this is a concern, then it is recommended that you create static bridges that are present in both the Xen and non-Xen environments.
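
As an example of pointing a service at the bridge, the DHCP server on SUSE reads its listening interface from /etc/sysconfig/dhcpd. The sketch below assumes the bridge is named br0; substitute the actual bridge name.

```shell
# /etc/sysconfig/dhcpd (sketch): listen on the bridge instead of eth0.
# "br0" is an assumed bridge name.
DHCPD_INTERFACE="br0"
```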

modifications to bonding or network configuration in non-Xen kernel results in broken networking in Xen kernel

cause: This error is caused by the Xen networking script not parsing the modified files. At installation time, the configuration file for ethernet devices uses the "ifcfg-eth-id-...." file name format. In some cases YaST will change the configuration file name to "ifcfg-bus-id...". If this happens, then the Xen scripts will overlook the ethernet devices and there will be no members of a bonded interface.

solution: Rename any "ifcfg-bus-id" files to "ifcfg-eth-id-" (followed by the MAC address of the card) or "ifcfg-ethX" (replace X with the pseudo number, for example ifcfg-eth0).
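
The rename can be sketched as a small helper. rename_ifcfg is a hypothetical function name, and the bus ID and MAC address below are placeholders; the demonstration runs in a scratch directory rather than in /etc/sysconfig/network.

```shell
# Hypothetical helper: rename one ifcfg-bus-id-* file to ifcfg-eth-id-<MAC>.
# $1 = config directory, $2 = existing file name, $3 = MAC address of the card
rename_ifcfg() {
    dir=$1
    old=$2
    mac=$3
    if [ ! -f "$dir/$old" ]; then
        echo "no such file: $dir/$old" >&2
        return 1
    fi
    mv "$dir/$old" "$dir/ifcfg-eth-id-$mac"
}

# Demonstration with placeholder names in a temporary directory.
demo=$(mktemp -d)
touch "$demo/ifcfg-bus-id-pci-0000:0b:00.0"
rename_ifcfg "$demo" "ifcfg-bus-id-pci-0000:0b:00.0" "00:11:22:33:44:55"
ls "$demo"
# prints: ifcfg-eth-id-00:11:22:33:44:55
```

On a real system, make a backup copy of the original file before renaming it.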

multiple networks on the same interface: domU's have no connectivity

match: Multiple IP networks are connected to the same physical or logical connection. For example, 172.16.10.1/22 and 192.168.0.1/24 are bound to logical interface bond0.

cause: This issue is commonly seen in networks where IP space is limited. In order to add more IP space, system administrators will simply add another IP network on top of an existing network. While this is technically possible for IP networks, experience has shown that it does not work on bridges where the bridge interface has an IP address assigned.

solution: Make sure that IP addresses assigned to the interfaces on Dom0 and DomU are on the same subnet.
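
Whether two addresses share a subnet can be checked with a little arithmetic. The following is a minimal sketch for IPv4; ip_to_int and same_subnet are hypothetical helper names, and the DomU addresses are invented for the example (the Dom0 address and prefix come from the match above).

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if both addresses are in the same network.
# $1, $2 = IPv4 addresses, $3 = prefix length
same_subnet() {
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( a & mask )) -eq $(( b & mask )) ]
}

# Dom0 at 172.16.10.1/22 compared with two assumed DomU addresses.
for domu in 172.16.9.20 192.168.0.5; do
    if same_subnet 172.16.10.1 "$domu" 22; then
        echo "$domu: same subnet as Dom0"
    else
        echo "$domu: different subnet"
    fi
done
# prints:
#   172.16.9.20: same subnet as Dom0
#   192.168.0.5: different subnet
```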


Additional Information

Below are some useful tips for troubleshooting bridge and networking related issues.

debug mode for network scripts
In /etc/sysconfig/network/config, set "DEBUG=YES". This will display debug information when the scripts are run.

bridge commands
The following list shows useful bridge commands:
  • brctl show: shows all bridges on the system
  • brctl showstp <bridge>: shows the state information and the path costs
  • brctl addif  <bridge> <interface>: adds an interface to a bridge
  • brctl delif <bridge> <interface>: removes an interface from a bridge

arp commands
Sometimes it is useful to manipulate and see the arp table on a box. The following is a list of useful ARP/MAC address related commands:
  • arping <IP Address>: Sends out an arp request for the MAC address associated with the IP Address. Useful in establishing layer 2 connectivity.
  • ip neigh show: Shows ARP table and state for MAC addresses
  • ip neigh help: Shows help about how to manipulate the ARP table on a box. Use with caution

routing commands
There are several ways to manipulate and show the routing table on Linux. The following is a list of useful related commands for basic manipulation:
  • ip route show: shows the routing table in the new format
  • route -n: shows the routing table in traditional format as known to the kernel
  • route: route without options will hang if there are routing problems
  • ip route replace default via <gateway> dev <ethernet device>: replaces the default route with a new route
  • ip route del <route>: deletes the specified route
  • ip route add default via <gateway> dev <ethernet device>: Adds default route

Dom0 is reachable, domU's are intermittently not reachable
Customers have reported that this issue can be fixed by disabling ICMP redirects, which prevents the intermittent network interruptions. To do so, add the following entries to /etc/sysctl.conf:
net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.eth0.send_redirects=0
net.ipv4.conf.br0.send_redirects=0
net.ipv4.conf.default.send_redirects=0

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID:7001989
  • Creation Date:21-NOV-08
  • Modified Date:11-AUG-14
    • SUSE Linux Enterprise Server