SLES11SP2 with OES11SP1 with kernel 3.0.42-0.7-default hangs at loading novell-nss

  • 7011339
  • 09-Nov-2012
  • 19-Nov-2012

Environment

SUSE Linux Enterprise Server 11 Service Pack 2
Novell Open Enterprise Server 11 (OES 11) Linux Support Pack 1

Situation

On Novell Open Enterprise Server 11 Support Pack 1 (OES11 SP1) servers with NSS pools and software mirroring active (NSS RAID 1, NSS RAID 5, or SBD mirroring), applying the latest patches from the public patch catalogs installs a new SLES11 SP2 kernel ('kernel 3.0.42'). A change in this kernel prevents the OES11 SP1 server from booting beyond the point where 'novell-nss' is started.

This problem exists on all servers with the given kernel version using NSS mirroring (RAID 1).

In addition, performing an NSS 'pool move' action on a server with the problematic kernel leads to a kernel crash.
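To see whether a server is currently running the kernel described above, the running kernel release can be compared against the 3.0.42 version. The following is a minimal sketch only: the function name is_affected_kernel is hypothetical, and treating every 3.0.42 build as affected is an assumption based on the version named in this TID.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given kernel release string
# belongs to the 3.0.42 kernel named in this TID.
# Assumption: every 3.0.42 build (for example 3.0.42-0.7-default) is affected.
is_affected_kernel() {
    case "$1" in
        3.0.42|3.0.42-*) return 0 ;;
        *)               return 1 ;;
    esac
}

# Check the kernel that is currently running.
if is_affected_kernel "$(uname -r)"; then
    echo "Running kernel $(uname -r) matches the problematic 3.0.42 kernel."
fi
```

Note that this only inspects the running kernel. A newly installed kernel that has not been booted yet is not detected this way.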

Resolution

Resolving this issue is a top priority for Novell, and a solution is currently in progress.

As a workaround, the following temporary options are available:
- If NSS RAID 1 exists on the server and there is no absolute requirement to apply patches, hold off on applying any updates from the public patch catalogs.
If patches must be applied to the server but the problematic kernel is not needed, see the 'Additional Information' section below on how to configure zypper so that it does not install the kernel that exposes this problem.

- If the server has already been patched and no longer boots, it is possible to downgrade from the problematic kernel 3.0.42 to kernel 3.0.38, which is not affected by this problem. This may, however, cause errors or side effects with other kernel-related modules.

- Log a Service Request with Novell Technical Support, refer to this TID, and request the FTF that has resolved the problem in testing environments.
This FTF has just been delivered and has not yet been released to the public. It is currently undergoing QA testing, but is confirmed to resolve the problem in problem duplication environments.

Cause

The following kernel change is causing the problem for NSS:

- patches.fixes/block-introduce-blk_set_stacking_limits-function:
block: Introduce blk_set_stacking_limits function.

Additional Information

In this scenario, where servers do not boot properly after the update, the best advice is currently to hold off on patching the SLES11 SP2 / OES11 SP1 server if the OES11 SP1 server has any form of NSS or SBD mirroring, or NSS RAID 5, active.

If this applies to the configuration, please *do not* update the machine until this problem has been resolved.
To make sure no accidental kernel upgrade happens when other administrators also operate the server, the kernel upgrade can be excluded using zypper.

'man zypper' has a help section on how to prevent certain packages from being upgraded.

<snip/>
Package Locks Management

Package  locks  serve  the  purpose  of  preventing changes to the set of installed packages on the system. The locks are stored in form of a query in /etc/zypp/locks file (see also locks(5)).  Packages matching this query are then forbidden to change their installed status; an installed package can't be removed, not installed package can't be installed.   When requesting to install or remove such locked package, you will get a dependency problem dialog.


locks (ll)
              List currently active package locks.


addlock (al) [options] <package-name> ...
              Add a package lock. Specify packages to lock by exact name or by a glob pattern using '*' and '?'  wildcard characters.


       -r, --repo <alias|name|#|URI>
              Restrict the lock to the specified repository.

       -t, --type <type>
              Lock only packages of specified type (default: package).  See section Package Types for list of available package types.


removelock (rl) [options] <lock-number|package-name> ...
              Remove specified package lock. Specify the lock to remove by its number obtained with zypper locks or by the package name.


       -r, --repo <alias|name|#|URI>
              Restrict the lock to the specified repository.

       -t, --type <type>
              Restrict the lock to packages of specified type (default: package).  See section Package Types for list of available package types.



cleanlocks (cl)
              Remove unused locks.

This command looks for locks that do not currently (with regard to repositories used) lock any package and for each such lock it asks user whether to remove it.
</snip>

As such, once an exclusion for the kernel is configured using 'zypper addlock kernel*', the kernel update is no longer listed as an available patch by the 'zypper lu' command. When an exclusion is made, it is recorded in '/etc/zypp/locks'. Please make sure to clean this up afterwards and do not leave entries there, as they will influence future patching behavior.
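The lock workflow described above could be wrapped as follows. This is a sketch under stated assumptions, not a supported procedure: the function names and the ZYPPER override variable are hypothetical and serve only to allow a dry run. The zypper subcommands themselves (addlock, locks, removelock) are the ones from the man page excerpt. Whether removelock accepts the same glob used for addlock is an assumption here; if it does not, remove the lock by its number as shown by 'zypper locks'.

```shell
#!/bin/sh
# Hypothetical override: set ZYPPER=echo to preview the commands
# instead of executing them. Defaults to the real zypper binary.
ZYPPER="${ZYPPER:-zypper}"

# Add a lock for all packages whose name starts with "kernel".
# The glob is quoted so the shell passes it to zypper unexpanded.
lock_kernel() {
    $ZYPPER addlock 'kernel*'
}

# List the currently active package locks (stored in /etc/zypp/locks).
show_locks() {
    $ZYPPER locks
}

# Remove the kernel lock again once the fix is publicly available.
unlock_kernel() {
    $ZYPPER removelock 'kernel*'
}
```

For example, 'ZYPPER=echo lock_kernel' only prints the arguments that would be passed, while running 'lock_kernel' as root with the default setting adds the lock. After the fixed kernel has been released, 'unlock_kernel' followed by 'show_locks' confirms that no kernel lock remains.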

Once the solution has been made available in the public patch catalogs, this self-imposed restriction can be removed, and the server can be patched with the latest kernel.