Configure crashkernel memory for kernel core dump analysis

This document (3374462) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11
SUSE Linux Enterprise Server 10
SUSE Linux Enterprise Desktop 15
SUSE Linux Enterprise Desktop 12
SUSE Linux Enterprise Desktop 11
SUSE Linux Enterprise Desktop 10

Situation

The kernel is crashing or otherwise misbehaving and a kernel core dump needs to be captured for analysis.

Resolution

Prerequisites

Kdump stores kernel core dumps under /var. The partition that holds /var must have enough free disk space for the vmcore file. The size of that file varies widely, depending on the KDUMP_DUMPLEVEL parameter set in /etc/sysconfig/kdump and on the amount of physical memory in the system. The default KDUMP_DUMPLEVEL used to be 0, which dumps all physical memory. Since SLES 11 SP3 the default is 31, which produces a significantly smaller dump that focuses mainly on kernel-space memory.
(See man kdump for more detail on KDUMP_DUMPLEVEL.)

The system will by default attempt to keep 5 vmcore files. The Kdump facility is available for SLES 10 and SLES 11 on x86, x86-64, ppc64 and ia64 architectures and, starting with SLES 11 SP3, for IBM System z.

See Limitations under Additional Information below for more information.
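As a rough check of the disk space requirement described above, the following sketch compares the free space on the filesystem holding /var with the size of physical memory. It assumes the worst case of KDUMP_DUMPLEVEL=0, where the dump is about as large as physical memory. The function name var_can_hold_full_dump is hypothetical and not part of kdump.

```shell
# Sketch: succeed if the free space (in KB) is at least the size of
# physical memory (in KB). Worst-case assumption: KDUMP_DUMPLEVEL=0.
# var_can_hold_full_dump is a hypothetical helper, not a kdump tool.
var_can_hold_full_dump() {
    free_kb=$1
    mem_kb=$2
    [ "$free_kb" -ge "$mem_kb" ]
}

# Example usage on a live system (commented out so the block has no side effects):
# free_kb=$(df -Pk /var | awk 'NR==2 {print $4}')
# mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# var_can_hold_full_dump "$free_kb" "$mem_kb" || echo "/var may be too small for a full dump"
```

With the default dump level of 31 the vmcore is usually much smaller than this worst case. Because several dumps are kept by default, more headroom may still be needed.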

Check the taint status of the kernel (recommended)

Whenever possible, kernel crashes should be reproduced using untainted kernels. Refer to TID 3582750, Tainted Kernel, for details on kernel tainting, the impact of tainting on supportability and for recommendations on how to avoid tainting the kernel.

Set up magic SysRq (recommended)

For kernel problems other than a kernel oops or panic, a kernel core dump is not triggered automatically. If the system still responds to keyboard input to some degree and the feature has been enabled, a kernel core dump can be triggered manually through a "magic SysRq" key combination. Typically this means holding down three keys simultaneously: the left Alt key, the Print Screen / SysRq key and a letter key indicating the command ('s' for sync, 'c' for core dump). On System z, a SYSTEM/PSW RESTART is used instead.

The magic SysRq key is a key combination that gives you some control over the system even when it has crashed. The complete documentation can be found at /usr/src/linux/Documentation/sysrq.txt (requires installation of the kernel-source package).

To send the magic keys to a KVM virtual machine using virsh, do the following:


virsh send-key kvm_guest KEY_LEFTALT KEY_SYSRQ KEY_S
virsh send-key kvm_guest KEY_LEFTALT KEY_SYSRQ KEY_C

SLE 10/11:

To enable the magic SysRq feature permanently, edit /etc/sysconfig/sysctl and change the ENABLE_SYSRQ line to ENABLE_SYSRQ="1". This change becomes active after a reboot. To enable the feature for the running kernel, run

# echo 1 > /proc/sys/kernel/sysrq

SLE 12:

Enabled by default. To verify, run
# cat /proc/sys/kernel/sysrq
184
To change the default value, edit /etc/sysctl.conf and add "kernel.sysrq=<new value>". To activate this change without rebooting the system, run
# sysctl -p
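The value 184 shown above is a bitmask. The sketch below decodes such a value into the groups of SysRq functions it permits, using the bit assignments from the kernel's sysrq documentation. The function name and the short group labels are illustrative only.

```shell
# Sketch: decode a kernel.sysrq value into the function groups it enables.
# sysrq_decode and the labels are hypothetical names chosen for this example.
# Bit meanings: 2 console log level, 4 keyboard control, 8 debugging dumps,
# 16 sync, 32 remount read-only, 64 signalling processes,
# 128 reboot/poweroff, 256 nicing of real-time tasks.
sysrq_decode() {
    value=$1
    case $value in
        0) echo "disabled"; return 0 ;;
        1) echo "all"; return 0 ;;
    esac
    for entry in 2:loglevel 4:keyboard 8:dump 16:sync 32:remount-ro 64:signal 128:reboot 256:rt-nice; do
        bit=${entry%%:*}
        label=${entry#*:}
        if [ $(( value & bit )) -ne 0 ]; then
            echo "$label"
        fi
    done
}

# Example: sysrq_decode "$(cat /proc/sys/kernel/sysrq)"
```

For 184 this lists the dump, sync, remount-ro and reboot groups, since 184 = 128 + 32 + 16 + 8.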

Set up serial console (recommended)

For some kernel problems, a kernel core dump is not triggered and the system does not respond to the keyboard anymore. For those situations, it may still be possible to trigger a kernel core dump through magic SysRq sequences from a serial console.

Please refer to TID 3456486, Configuring a Remote Serial Console for SLES, for the procedure.



Configure the system for capturing kernel core dumps (SLES 10)
  1. Install the packages kernel-kdump, kdump, and kexec-tools.

    The kernel-kdump package contains a "crash" or "capture" kernel that is started when the primary kernel has crashed and which provides an environment in which the primary kernel's state can be captured. The version of the kernel-kdump package needs to be identical to that of the kernel whose state needs to be captured.

    The kexec-tools package contains the tools that make it possible to start the capture kernel from the primary kernel.

  2. Reserve memory for the capture kernel by passing appropriate parameters to the primary kernel. For the x86 and x86_64 architectures, use the table below, based on how much memory the system has.
 

Memory          crashkernel=
0 - 12 GB       64M@16M
13 - 48 GB      128M@16M
49 - 128 GB     256M@16M
129 - 256 GB    512M@16M









For the PPC64 architecture: crashkernel=128M@32M

Note: for Xen installations, this parameter needs to be passed to the GRUB line for the Xen hypervisor, not the module line for the Dom0 kernel.

Reserving memory can be done as follows: Start YaST, under System, select Boot Loader. On the tab Section Management, select the default section and select Edit. Add the settings to the field labeled Optional Kernel Command Line Parameter, then select Ok and Finish to save the settings.

  3. Activate the kdump system service.
    Run
    chkconfig kdump on

    or in YaST: under System, select System Services (Runlevel), select kdump, then select Enable and Finish.

  4. Reboot the system for the settings to take effect.


Configure the system for capturing kernel core dumps (SLES 11)
  1. Install the packages kdump, kexec-tools, and makedumpfile.

    Unlike earlier versions of SUSE Linux Enterprise, SLES 11 does not require the kernel-kdump package. The technical reason is that the normal kernel is now relocatable and can be used as the kdump kernel, i.e. it is possible to load and execute the normal kernel at any address, not only the compiled-in address as before.

    The kexec-tools package contains the tools that make it possible to start the capture kernel from the primary kernel.

  2. Reserve memory for the capture kernel by passing appropriate parameters to the primary kernel. For the x86 and x86_64 architectures, use the table below, based on how much memory the system has.
 

Memory          crashkernel=
0 - 12 GB       128M
13 - 48 GB      256M
49 - 128 GB     512M
129 - 256 GB    1G * (896M, 768M or 512M)
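The x86 and x86_64 table above can be expressed as a small lookup. This is only a sketch: the function name is made up for illustration, memory is given in whole GB, and systems with more than 256 GB are outside the table, so the function reports nothing for them and returns a failure status.

```shell
# Sketch: map installed memory (whole GB) to the SLES 11 x86/x86_64
# crashkernel= value from the table. crashkernel_sles11_x86 is a
# hypothetical helper name. Values above 256 GB are not covered.
crashkernel_sles11_x86() {
    mem_gb=$1
    if   [ "$mem_gb" -le 12 ];  then echo "128M"
    elif [ "$mem_gb" -le 48 ];  then echo "256M"
    elif [ "$mem_gb" -le 128 ]; then echo "512M"
    elif [ "$mem_gb" -le 256 ]; then echo "1G"
    else return 1
    fi
}

# Example usage on a live system (rounds MemTotal up to whole GB):
# mem_gb=$(awk '/^MemTotal:/ {print int(($2 + 1048575) / 1048576)}' /proc/meminfo)
# echo "crashkernel=$(crashkernel_sles11_x86 "$mem_gb")"
```

The result is a starting point only. It does not apply the adjustments for specific service packs, filesystems or hardware, and if loading kdump with 1G fails, one of the smaller sizes listed in the table applies.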






 

 

 


For the IBM PPC64 architecture, use the table below, based on how much memory the system has:

Memory            crashkernel=
2 - 4 GB          320M
5 - 32 GB         512M
33 - 64 GB        1024M
65 - 128 GB       2048M
129 GB & above    4096M (See note 7.)



Reserving memory can be done as follows: Start YaST, under System, select Boot Loader. On the tab Section Management, select the default section and select Edit. Add the settings to the field labeled Optional Kernel Command Line Parameter, then select Ok and Finish to save the settings.
  • Note 1: On SLES 11, the crashkernel parameter no longer needs the 16M offset on the x86 and x86_64 architectures.
  • Note 2: For SLES 11 SP2, double the values for the needed memory. The minimum is 256M.
  • Note 3: More memory needs to be reserved if the btrfs filesystem is used for root rather than the ext3 filesystem.
  • Note 4: For Xen installations, this parameter needs to be passed to the GRUB line for the Xen hypervisor, not the module line for the Dom0 kernel.
  • Note 5: Hard-coded limits in the kernel and kexec-tools can prevent allocating 1 GB. If loading kdump with crashkernel=1G fails, change the crashkernel size to something smaller: 896M, 768M or 512M.
  • Note 6: The minimum size of the crashkernel reservation is hardware and machine specific, so it is best to test that the chosen size allows the kdump kernel to boot. If the kdump kernel fails to boot with "Out of memory" or other error messages, increase the memory reserved for kdump and try again (for example, encrypted disks take more memory).
  • Note 7: For servers with 129 GB or more memory where the average load is expected to be high, the recommended "4096M" may be insufficient. Consider increasing the crash kernel size to a minimum of "6144M" instead.
  3. Activate the kdump system service.

    Run
    chkconfig boot.kdump on

    or in YaST: under System, select System Services (Runlevel), select boot.kdump, then select Enable and Finish.
  4. Reboot the system for the settings to take effect.
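After the reboot, it is worth checking that the reservation took effect. The sketch below extracts the crashkernel= value from a kernel command line string. The function name is illustrative. On a running system the command line can be read from /proc/cmdline.

```shell
# Sketch: print the value of the first crashkernel= parameter found in a
# kernel command line string, or fail if there is none.
# crashkernel_from_cmdline is a hypothetical helper name.
crashkernel_from_cmdline() {
    for word in $1; do
        case $word in
            crashkernel=*)
                echo "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Example usage on a live system:
# crashkernel_from_cmdline "$(cat /proc/cmdline)" || echo "no crashkernel= parameter set"
# On kernels that provide it, /sys/kernel/kexec_crash_loaded contains 1
# when a capture kernel has been loaded:
# cat /sys/kernel/kexec_crash_loaded
```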

Configure the system for capturing kernel core dumps (SLES 12)

For SUSE Linux Enterprise Server 12, this is documented in the System Analysis and Tuning Guide, chapter 'Kexec and Kdump'.


Memory ranges for kdump and fadump for ppc64le (SLES 15)
 
System Memory  - crashkernel setting (crashkernel=)
   4GB  -   32GB :  512MB
  32GB  -   64GB : 1024MB
  64GB  -  128GB : 2048MB
 128GB  - 1024GB :    4GB
1024GB  - 2048GB :    6GB
2048GB  - 4096GB :   12GB
4096GB  - 8192GB :   20GB
   8TB  - 16TB  :   36GB
  16TB & above  :   64GB
 
Note 1: To configure fadump (Firmware Assisted Dump) on SLE 12 / SLE 15 on IBM Power, refer to TID 7023277.
Note 2: The minimum size of the crashkernel reservation is hardware and machine specific, so it is best to test that the chosen size allows the kdump kernel to boot. If the kdump kernel fails to boot with "Out of memory" or other error messages, increase the memory reserved for kdump and try again (for example, encrypted disks take more memory).
 
 
Capturing kdump on a target using devicemapper (lvm or multipath) devices

If the root device does not use devicemapper devices, but the dump is to be captured on a devicemapper device, set the following in /etc/sysconfig/kdump:
KDUMP_COMMANDLINE_APPEND="root_no_dm=1 root_no_mpath=1"

If devicemapper devices are used for both root and kdump, these options must *not* be added.


Test local kernel core dump capture

To test the local kernel core dump capture, follow these steps.
If magic SysRq has been configured:
  1. Magic-SysRq-S to sync (flush out pending writes)
  2. Magic-SysRq-C to trigger the kernel core dump
On IBM System z, a kernel core dump can be manually triggered:
  • For SLES in an LPAR by executing the PSW RESTART task on the HMC
  • For SLES in z/VM by issuing #CP SYSTEM RESTART from the 3270 terminal
Please note that the RESTART mechanism does not provide a way to flush write buffers and bears the risk of data loss. It should be used only if the SLES system is completely unresponsive and can't be shut down properly.

Alternatively, without magic SysRq:
  1. Open a shell or terminal
  2. Run sync
  3. Run echo c >/proc/sysrq-trigger
Please note that the 'c' must be lower case! Also, the system will not be responsive while the capture is being prepared and made as the capture kernel environment is a limited, non-interactive environment.

Once the system becomes responsive again, verify that a capture file was created as /var/log/dump/date-time/vmcore. On SLES 11, look in /var/crash/date.
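To locate the newest dump, a helper like the following can search a dump directory for vmcore files and print the most recently modified one. The function name is hypothetical, and the sketch assumes the capture file is named exactly vmcore as described above.

```shell
# Sketch: print the most recently modified file named "vmcore" below the
# given directory. latest_vmcore is a hypothetical helper name.
latest_vmcore() {
    find "$1" -type f -name vmcore -exec ls -t {} + 2>/dev/null | head -n 1
}

# Example usage:
# latest_vmcore /var/crash       # SLES 11
# latest_vmcore /var/log/dump    # SLES 10
```

If nothing is printed, no vmcore file was found below that directory.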



Setup for network dump captures - prepare for non-interactive data transfers

1. for SLES10

The scp command (part of OpenSSH) will be used to transfer the dump over the network.

As the capture environment on the dumping system is completely non-interactive, all authorization for the data transfer needs to be set up in advance, so the system that is to receive the dump needs to accept SSH connections from the dumping server without requiring passwords. This can be done as follows:
  • On the sending system, as the root user, generate a keypair for SSH, unprotected by a passphrase:
    ssh-keygen -N '' -C 'passthrough key' -t dsa
  • From the sending system, add the public key from this keypair to the list of authorized keys for the root user on the receiving system:
    ssh root@receiving.system 'cat >>/root/.ssh/authorized_keys' < /root/.ssh/id_dsa.pub
  • On the receiving system, as the root user, create a directory in which to receive the dump, say /dump:
    install -m 700 -o root -g root -d /dump
    Make sure this directory resides on a filesystem with sufficient free space.

On the dumping system, configure the following settings in /etc/sysconfig/kdump:
KDUMP_RUNLEVEL=3
KDUMP_TRANSFER="scp /proc/vmcore ReceivingSystemNameOrIP:/dump/"

This will make kdump act in a manner similar to the older netdump mechanism: the capture environment will go up to runlevel 3 (where network connectivity is enabled) and will use the secure copy command scp to transfer the kernel core dump to a separate system.


2. for SLES11

Add the network device to be used to the KDUMP_NETCONFIG variable in /etc/sysconfig/kdump.

       To set up a network device automatically, pass the option "auto". This is also the default.
       For a custom setup, pass a string that contains the network device and the mode (dhcp, static), separated by
       a colon, for example: "eth0:static" or "eth1:dhcp".
       If you use "static", you have to set the IP address with ip=ipspec as a boot parameter, where ipspec is
       <client>:<server>:<gateway>:<netmask>:<hostname>:<device>:<proto>. See mkinitrd(8) for details.

Set the dumping method and the destination directory in the KDUMP_SAVEDIR parameter in /etc/sysconfig/kdump.
Supported methods are:

       FTP, for example "ftp://user:password@host/var/log/dump"
       SSH, for example "ssh://user:password@host/var/log/dump"
       NFS, for example "nfs://server/export/var/log/dump"
       CIFS (SMB) , for example "cifs://user:password@host/share/var/log/dump"

See also: kdump(5) which contains an exact specification for the URL format.
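Put together, a network dump configuration in /etc/sysconfig/kdump might look like the fragment below. The values are taken from the examples above. The interface name and the NFS server name are placeholders that need to be adapted.

```shell
# /etc/sysconfig/kdump (fragment); placeholder values, adapt to your site.
# Bring up eth1 via DHCP in the capture environment.
KDUMP_NETCONFIG="eth1:dhcp"
# Save the dump to an NFS export; "server" is a placeholder host name.
KDUMP_SAVEDIR="nfs://server/export/var/log/dump"
```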



3. for SLES12

For SUSE Linux Enterprise Server 12 this is documented in the System Analysis and Tuning Guide chapter:

17. Kexec and Kdump




NOTE:
When calculating the needed value for crashkernel, the number of dm-devices is important.
For each dm-device attached to the server, an extra 4 MB is needed.
A value that is too low will cause the server to hang when booting into the crash kernel. See also TID 7010542.
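The rule of an extra 4 MB per dm-device can be applied as simple arithmetic. The sketch below adds that overhead to a base reservation given in MB. The function name is illustrative, and counting dm-devices through /sys/block is an assumption about the running system rather than something this note prescribes.

```shell
# Sketch: add 4 MB per dm-device to a base crashkernel size (in MB).
# crashkernel_with_dm is a hypothetical helper name.
crashkernel_with_dm() {
    base_mb=$1
    dm_count=$2
    echo "$(( base_mb + 4 * dm_count ))M"
}

# Example usage on a live system (assumes dm-devices appear as /sys/block/dm-*):
# dm_count=$(ls -d /sys/block/dm-* 2>/dev/null | wc -l)
# crashkernel_with_dm 256 "$dm_count"
```

For example, a base value of 256M with 10 dm-devices gives 256 + 4 × 10 = 296M.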

Additional Information

Limitations

Kdump is not supported for Xen kernels prior to SLES 11 Service Pack 2.

On IBM System z, kdump is not supported prior to SLES11 Service Pack 3.
On the IA-64 architecture, kdump is only supported as of SLES 10 Service Pack 1.

Related documentation
Configuring an IBM System z Linux server for a coredump

Before SLES11 SP3, a different methodology has to be used for capturing kernel crash dumps on IBM System z.
It is documented in IBM's documentation Linux on System z.

SLES 10
Using the Dump Tools
http://public.dhe.ibm.com/software/dw/linux390/docu/l26cdt02.pdf


SLES 11
Using the Dump Tools
http://public.dhe.ibm.com/software/dw/linux390/docu/l3n5dt11.pdf


(This method requires the s390-tools package to be installed.)

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID:3374462
  • Creation Date: 10-Jan-2008
  • Modified Date:11-Aug-2022
    • SUSE Linux Enterprise Desktop
    • SUSE Linux Enterprise Server

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com
