Configure kernel core dump capture

This document (3374462) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 10
SUSE Linux Enterprise Server 11
SUSE Linux Enterprise Desktop 10
SUSE Linux Enterprise Desktop 11

Situation

The kernel is crashing or otherwise misbehaving and a kernel core dump needs to be captured for analysis.

Resolution

Prerequisites

Kdump stores kernel core dumps under /var.
The partition that /var is on must have enough available disk space for the vmcore file, which will be approximately the size of the system's physical memory. By default, the system will attempt to keep 5 vmcore files. The Kdump facility is available for SLES10 and SLES11 on x86, x86-64, ppc64 and ia64 architectures, and starting with SLES11 SP3, for IBM System z. See Limitations under Additional Information below for more information.
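The space requirement above can be checked before relying on kdump. The following is a minimal sketch, assuming that one vmcore is about as large as physical memory (as stated above); the function names are made up for this example, and the number of copies to budget for is a parameter.

```shell
# Sketch only: compare free space in a dump directory with physical memory.
# Assumption: one vmcore is roughly as large as RAM, as described above.

# space_verdict MEM_KB AVAIL_KB [COPIES]
# Print "ok" if AVAIL_KB can hold COPIES dumps of MEM_KB each, else "short".
space_verdict() {
    local mem_kb="$1" avail_kb="$2" copies="${3:-1}"
    if [ "$avail_kb" -ge $(( mem_kb * copies )) ]; then
        echo ok
    else
        echo short
    fi
}

# check_dump_space [DIR] [COPIES]
# Read the real numbers from a running Linux system (read-only, no root needed).
check_dump_space() {
    local dir="${1:-/var}" copies="${2:-1}" mem_kb avail_kb
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    echo "RAM: ${mem_kb} kB, free in ${dir}: ${avail_kb} kB -> $(space_verdict "$mem_kb" "$avail_kb" "$copies")"
}

# Example: budget for the default of 5 retained vmcore files
#   check_dump_space /var 5
```

Passing 5 as the second argument budgets for the default number of retained vmcore files; pass 1 to check whether a single dump fits.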

Check the taint status of the kernel (recommended)

Whenever possible, kernel crashes should be reproduced using untainted kernels. Refer to TID 3582750, Tainted Kernel, for details on kernel tainting, the impact of tainting on supportability and for recommendations on how to avoid tainting the kernel.

Set up magic SysRq (recommended)

For kernel problems other than a kernel oops or panic, a kernel core dump is not triggered automatically. If the system still responds to keyboard input to some degree, a kernel core dump can be triggered manually through a "magic SysRq" keyboard combination, if this feature has been enabled. Typically, this means holding down three keys simultaneously: the left Alt key, the Print Screen / SysRq key, and a letter key indicating the command ('s' for sync, 'c' for core dump). On System z, a SYSTEM/PSW RESTART is used instead.


For general documentation of the "magic SysRq" feature, please refer to the Documentation/sysrq.txt file in the Linux kernel source.

To enable the magic SysRq feature permanently, edit /etc/sysconfig/sysctl and change the ENABLE_SYSRQ line to ENABLE_SYSRQ="176". This change becomes active after a reboot. To enable the feature for the running kernel, write the value to /proc/sys/kernel/sysrq.
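The command for the running kernel is not shown in this copy of the document. The following is a minimal sketch, assuming the standard /proc/sys/kernel/sysrq control file described in the kernel's sysrq documentation; the function names are made up for this example, and the file argument exists only so that the functions can be tried against a scratch file.

```shell
# Sketch only: set and read the SysRq value of the running kernel.
# Assumption: the control file is /proc/sys/kernel/sysrq (standard on Linux).

# write_sysrq VALUE [FILE]
# Write VALUE to the SysRq control file (needs root for the real file).
write_sysrq() {
    local value="$1" file="${2:-/proc/sys/kernel/sysrq}"
    echo "$value" > "$file"
}

# read_sysrq [FILE]
# Print the current SysRq value.
read_sysrq() {
    cat "${1:-/proc/sys/kernel/sysrq}"
}

# Example (as root), using the same value as in /etc/sysconfig/sysctl:
#   write_sysrq 176 && read_sysrq
```

According to the kernel's sysrq documentation, a value of 1 enables all SysRq functions, while larger values are a bitmask of allowed functions.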

Set up serial console (recommended)

For some kernel problems, a kernel core dump is not triggered and the system does not respond to the keyboard anymore. For those situations, it may still be possible to trigger a kernel core dump through magic SysRq sequences from a serial console.

Please refer to TID 3456486, Configuring a Remote Serial Console for SLES, for the procedure.

Configure the system for capturing kernel core dumps (SLES 10)
  1. Install the packages kernel-kdump, kdump, and kexec-tools.

    The kernel-kdump package contains a "crash" or "capture" kernel that is started when the primary kernel has crashed and which provides an environment in which the primary kernel's state can be captured. The version of the kernel-kdump package needs to be identical to that of the kernel whose state needs to be captured.

    The kexec-tools package contains the tools that make it possible to start the capture kernel from the primary kernel.
  2. Reserve memory for the capture kernel by passing appropriate parameters to the primary kernel.
    For the x86 and x86_64 architecture use the table below based upon how much memory you have.

    Memory          crashkernel=
    0 - 12 GB       64M@16M
    13 - 48 GB      128M@16M
    49 - 128 GB     256M@16M
    129 - 256 GB    512M@16M

    For the PPC64 architecture: crashkernel=128M@32M
    Note: for Xen installations, this parameter needs to be passed to the GRUB line for the Xen hypervisor, not the module line for the Dom0 kernel.

    This can be done as follows: Start YaST, under System, select Boot Loader. On the tab Section Management, select the default section and select Edit. Add the settings to the field labeled Optional Kernel Command Line Parameter, then select Ok and Finish to save the settings.
  3. Activate the kdump system service.

    Run
    chkconfig kdump on

    or in YaST: under System, select System Services (Runlevel), select kdump, then select Enable and Finish.
  4. Reboot the system for the settings to take effect.
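The x86 and x86_64 table from step 2 can be expressed as a small lookup. This is a sketch only: the function names are made up for this example, memory is rounded up to whole gigabytes, and sizes above 256 GB are rejected because the table does not cover them.

```shell
# Sketch only: look up the SLES 10 crashkernel= value for x86 / x86_64.

# crashkernel_sles10 MEM_GB
# Print the table value for a system with MEM_GB gigabytes of memory.
crashkernel_sles10() {
    local gb="$1"
    if   [ "$gb" -le 12 ];  then echo "64M@16M"
    elif [ "$gb" -le 48 ];  then echo "128M@16M"
    elif [ "$gb" -le 128 ]; then echo "256M@16M"
    elif [ "$gb" -le 256 ]; then echo "512M@16M"
    else
        echo "no table entry for ${gb} GB" >&2
        return 1
    fi
}

# mem_gb
# Print the memory seen by the running kernel, rounded up to whole GB.
mem_gb() {
    awk '/^MemTotal:/ {printf "%d\n", ($2 + 1048575) / 1048576}' /proc/meminfo
}

# Example: print the parameter to add to the kernel command line
#   echo "crashkernel=$(crashkernel_sles10 "$(mem_gb)")"
```

After the reboot in step 4, the active setting can be read back with grep -o 'crashkernel=[^ ]*' /proc/cmdline.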
Configure the system for capturing kernel core dumps (SLES 11)
  1. Install the packages kdump, kexec-tools, and makedumpfile.

    Unlike earlier versions of SUSE Linux Enterprise, SLES 11 does not require the kernel-kdump package. The technical reason is that the normal kernel is now relocatable and can be used as the kdump kernel: it can be loaded and executed at any address, not only at the compiled-in address as before.

    The kexec-tools package contains the tools that make it possible to start the capture kernel from the primary kernel.
  2. Reserve memory for the capture kernel by passing appropriate parameters to the primary kernel.
    For the x86 and x86_64 architecture use the table below based upon how much memory you have.

    Memory          crashkernel=
    0 - 12 GB       128M
    13 - 48 GB      256M
    49 - 128 GB     512M
    129 - 256 GB    1G * (896M, 768M or 512M)

    Note: the crashkernel no longer needs the offset of 16M on SLES 11 for x86 and x86_64 architecture.
    Note: For SLES11 SP2, double the memory values in the table. The minimum needed is 256M.

    For the PPC64 architecture: crashkernel=128M@64M

    Note: for Xen installations, this parameter needs to be passed to the GRUB line for the Xen hypervisor, not the module line for the Dom0 kernel.

    Note (*): Hard-coded limits in the kernel and kexec-tools can prevent the allocation of 1 GB.
    If loading kdump with crashkernel=1G fails, change the crashkernel size to something smaller: 896M, 768M or 512M.
    The minimum workable crashkernel size is hardware specific, so it is best to test which size allows kdump to load.


    Reserving memory can be done as follows: Start YaST, under System, select Boot Loader. On the tab Section Management, select the default section and select Edit. Add the settings to the field labeled Optional Kernel Command Line Parameter, then select Ok and Finish to save the settings.
  3. Activate the kdump system service.

    Run
    chkconfig boot.kdump on

    or in YaST: under System, select System Services (Runlevel), select boot.kdump, then select Enable and Finish.
  4. Reboot the system for the settings to take effect.
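After the reboot, it can be useful to confirm that the memory reservation and the capture kernel are in place. The following is a minimal sketch, assuming that the running kernel exposes /proc/cmdline and /sys/kernel/kexec_crash_loaded (the latter is an assumption for the kernel in use); the function names are made up for this example, and the file arguments exist only so that the functions can be tried against scratch files.

```shell
# Sketch only: verify the kdump setup after a reboot.

# crashkernel_param [CMDLINE_FILE]
# Print the crashkernel= setting from the kernel command line;
# fails if the parameter is absent.
crashkernel_param() {
    grep -o 'crashkernel=[^ ]*' "${1:-/proc/cmdline}"
}

# kdump_loaded [FLAG_FILE]
# Succeed if the kernel reports a loaded capture kernel (file contains 1).
# Assumption: /sys/kernel/kexec_crash_loaded exists on this kernel.
kdump_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}

# Example:
#   crashkernel_param || echo "crashkernel= is missing from the command line"
#   kdump_loaded && echo "capture kernel is loaded"
```

If the parameter is present but the capture kernel is not loaded, the reserved size is a likely candidate to revisit, as described in Note (*) above.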

Capturing kdump on a target using devicemapper (lvm or multipath) devices


If the root device is not using devicemapper devices, but the dump is to be captured on a devicemapper device, you need to set:

KDUMP_COMMANDLINE_APPEND="root_no_dm=1 root_no_mpath=1"

in /etc/sysconfig/kdump.
If you use devicemapper devices for both root and kdump, these options must not be added.

Test local kernel core dump capture

To test the local kernel core dump capture, follow these steps.
If magic SysRq has been configured:
  1. Magic-SysRq-S to sync (flush out pending writes)
  2. Magic-SysRq-C to trigger the kernel core dump
On IBM System z, a kernel core dump can be manually triggered:
  • For SLES in an LPAR by executing the PSW RESTART task on the HMC
  • For SLES in z/VM by issuing #CP SYSTEM RESTART from the 3270 terminal
Please note that the RESTART mechanism does not provide a way to flush write buffers and bears the risk of data loss. It should be used only if the SLES system is completely unresponsive and can't be shut down properly.

Alternatively, without magic SysRq:
  1. Open a shell or terminal
  2. Run sync
  3. Run echo c >/proc/sysrq-trigger
Please note that the 'c' must be lower case! Also, the system will not be responsive while the capture is being prepared and made as the capture kernel environment is a limited, non-interactive environment.

Once the system becomes responsive again, verify that a capture file was created as /var/log/dump/date-time/vmcore. On SLES 11, look in /var/crash/date.
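The verification can be scripted. This is a minimal sketch, assuming that the capture file is named vmcore and that the dump paths contain no whitespace (true for date-named directories); the function name is made up for this example.

```shell
# Sketch only: find the most recent vmcore under one or more dump directories.
# Assumptions: the file is named "vmcore"; paths contain no whitespace.

# latest_vmcore DIR...
# Print the most recently modified vmcore; fails if none is found.
latest_vmcore() {
    local files
    files=$(find "$@" -type f -name 'vmcore' 2>/dev/null)
    [ -n "$files" ] || return 1
    # Unquoted on purpose: one path per word, newest first.
    ls -t $files | head -n 1
}

# Example covering both the SLES 10 and the SLES 11 location:
#   latest_vmcore /var/log/dump /var/crash || echo "no vmcore found"
```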


Setup for network dump captures - prepare for non-interactive data transfers

1. for SLES10

The scp command (part of OpenSSH) will be used to transfer the dump over the network.

As the capture environment on the dumping system is completely non-interactive, all authorization for the data transfer needs to be set up in advance, so the system that is to receive the dump needs to accept SSH connections from the dumping server without requiring passwords. This can be done as follows:
  • On the sending system, as the root user, generate a keypair for SSH, unprotected by a passphrase:
    ssh-keygen -N '' -C 'passthrough key' -t dsa
  • From the sending system, add the public key from this keypair to the list of authorized keys for the root user on the receiving system:
    ssh root@receiving.system 'cat >>/root/.ssh/authorized_keys' < /root/.ssh/id_dsa.pub
On the receiving system, as the root user, create a directory in which to receive the dump, say /dump:
install -m 700 -o root -g root -d /dump
Make sure this directory resides on a filesystem with sufficient free space.

On the dumping system, the following settings need to be configured in /etc/sysconfig/kdump:
KDUMP_RUNLEVEL=3
KDUMP_TRANSFER="scp /proc/vmcore ReceivingSystemNameOrIP:/dump/"

This will make kdump act in a manner similar to the older netdump mechanism: the capture environment will go up to runlevel 3 (where network connectivity is enabled) and will use the secure copy command scp to transfer the kernel core dump to a separate system.
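Because the capture environment cannot answer prompts, it helps to confirm the passwordless login before a real crash occurs. The following is a minimal sketch; the function names are made up for this example, the host name is a placeholder, and BatchMode and ConnectTimeout are standard OpenSSH client options that make ssh fail instead of prompting.

```shell
# Sketch only: helpers for the SLES 10 scp-based network dump setup.

# can_login HOST
# Succeed if root can log in to HOST with a key and without any prompt.
can_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}

# kdump_scp_settings HOST [DIR]
# Print the two /etc/sysconfig/kdump lines described above.
kdump_scp_settings() {
    local host="$1" dir="${2:-/dump}"
    printf 'KDUMP_RUNLEVEL=3\n'
    printf 'KDUMP_TRANSFER="scp /proc/vmcore %s:%s/"\n' "$host" "${dir%/}"
}

# Example (receiving.system is a placeholder):
#   can_login receiving.system && kdump_scp_settings receiving.system /dump
```

The printed lines are meant to be reviewed and then placed in /etc/sysconfig/kdump by hand.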


2. for SLES11

Add the network device to be used to the variable KDUMP_NETCONFIG in /etc/sysconfig/kdump.

       To set up a network device automatically, pass the option "auto". This is also the default.
       For a custom setup, pass a string that contains the network device and the mode (dhcp or static), separated by
       a colon, for example: "eth0:static" or "eth1:dhcp".
       If you use "static", you have to set the IP address with ip=ipspec as a boot parameter, where ipspec is
       <client>:<server>:<gateway>:<netmask>:<hostname>:<device>:<proto>. See mkinitrd(8) for details.
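A typo in KDUMP_NETCONFIG is only noticed when a dump fails, so a quick format check can help. The following is a minimal sketch that accepts only the forms described above ("auto", or a device and mode separated by a colon); the function name is made up for this example, and it does not check whether the device exists.

```shell
# Sketch only: check the format of a KDUMP_NETCONFIG value.

# valid_netconfig VALUE
# Succeed for "auto" or "<device>:dhcp" / "<device>:static".
valid_netconfig() {
    case "$1" in
        auto)              return 0 ;;
        ?*:dhcp|?*:static) return 0 ;;
        *)                 return 1 ;;
    esac
}

# Example:
#   valid_netconfig "eth0:static" && echo "format looks fine"
```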

Set the dumping method and the destination directory in the parameter KDUMP_SAVEDIR in /etc/sysconfig/kdump.
Supported methods are:

       FTP, for example "ftp://user:password@host/var/log/dump"
       SSH, for example "ssh://user:password@host/var/log/dump"
       NFS, for example "nfs://server/export/var/log/dump"
       CIFS (SMB), for example "cifs://user:password@host/share/var/log/dump"

See also: kdump(5) which contains an exact specification for the URL format.
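The method part of a KDUMP_SAVEDIR value can be checked in the same spirit. This is a minimal sketch that recognizes only the four network methods listed above; the function name is made up for this example, and kdump(5) remains the authority on the full URL format.

```shell
# Sketch only: extract the network method from a KDUMP_SAVEDIR URL.

# savedir_method URL
# Print ftp, ssh, nfs or cifs; fail for anything else.
savedir_method() {
    case "$1" in
        ftp://*|ssh://*|nfs://*|cifs://*) echo "${1%%://*}" ;;
        *) return 1 ;;
    esac
}

# Example:
#   savedir_method "nfs://server/export/var/log/dump"
```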

NOTE:

  • When calculating the needed value for crashkernel, the number of dm-devices is important. For each dm-device attached to the server, an extra 4 MB is needed.
  • A value that is too low will cause the server to hang when booting into the crash kernel. See also TID 7010542.
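The 4 MB rule can be turned into simple arithmetic. The following is a minimal sketch, assuming that dm-devices appear as dm-N entries under /sys/block; the function names are made up for this example, and the base size is whatever the table for the installed release suggests.

```shell
# Sketch only: add 4 MB per dm-device to a base crashkernel size.

# count_dm_devices
# Print the number of dm-N entries under /sys/block (assumption: these
# correspond to the dm-devices attached to the server).
count_dm_devices() {
    ls /sys/block 2>/dev/null | grep -c '^dm-'
}

# crashkernel_with_dm BASE_MB DM_COUNT
# Print the adjusted size in the form used by crashkernel=, for example 296M.
crashkernel_with_dm() {
    local base_mb="$1" dm_count="$2"
    echo "$(( base_mb + 4 * dm_count ))M"
}

# Example: base of 256 MB plus the dm-devices of this system
#   crashkernel_with_dm 256 "$(count_dm_devices)"
```

For example, a base of 256 MB with 10 dm-devices gives 256 + 4 × 10 = 296 MB.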

Additional Information

Limitations

Kdump is not supported for Xen kernels prior to SLES 11 Service Pack 2.
On IBM System z, kdump is not supported prior to SLES11 Service Pack 3.
On the IA-64 architecture, kdump is only supported as of SLES 10 Service Pack 1.

Related documentation

Configuring an IBM System z Linux server for a coredump

Before SLES11 SP3, a different methodology has to be used for capturing kernel crash dumps on IBM System z.
It is documented in IBM's documentation Linux on System z.

SLES 10
http://www.ibm.com/developerworks/linux/linux390/october2005_documentation.html
Using the Dump Tools
http://public.dhe.ibm.com/software/dw/linux390/docu/l26cdt02.pdf

SLES 11
http://www.ibm.com/developerworks/linux/linux390/documentation_novell_suse.html
Using the Dump Tools
http://public.dhe.ibm.com/software/dw/linux390/docu/l3n5dt11.pdf


(or via http://www.ibm.com/developerworks/linux/linux390/history.html). This method needs the s390-tools package to be installed.

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 3374462
  • Creation Date: 10-JAN-08
  • Modified Date: 08-OCT-14
  • SUSE
      SUSE Linux Enterprise Desktop
      SUSE Linux Enterprise Server