Capturing an application core for a segfaulting process in Filr.

  • 7012489
  • 28-May-2013
  • 03-Jun-2013

Environment

In the default configuration of Novell Filr, a segfault in a crashing process is logged, but the process is not captured in an application core. Such a core is required to analyze the crash.

The method for capturing a crashing process on the Novell Filr appliance does not differ from that on a regular SLES11 server, but some manual steps are required on the appliance.

Situation

A problem of a crashing process on the Filr appliance was reported, and Novell Technical Services is analyzing what is causing the crash.

Since no application core file is written by default, these are the steps to set up the system for capturing an application core.

Resolution

Enable SSH

As a first step, SSH needs to be enabled on the Filr appliance.

Navigate a browser to https://<filr_ip_address>:9443 and log in as 'vaadmin' with the correct credentials. Under 'Novell Appliance System Configuration', open 'System Services' and enable SSH there. After the service has started, connect to the appliance over SSH.

Preparing the system.

The following steps should be taken to prepare for capturing a core dump :
  • Check for and disable the limit for the maximum size of a core dump file.
  • Configure a fixed location for storing core dumps.
  • Enable core dumps for setuid and setgid processes.

Applying the requirements

  1. In a terminal window, type 'ulimit -c' and confirm that the output is 'unlimited'. This is the default setting for a Filr 1.0 appliance. If the output is different, enter 'ulimit -c unlimited'.
  2. Type 'install -m 1777 -d /var/local/dumps'.
  3. Type 'echo "/var/local/dumps/core.%e.%p" > /proc/sys/kernel/core_pattern'.
  4. Type 'sysctl -w kernel.suid_dumpable=2'.
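
To confirm that these settings took effect, a small check along the following lines may help. This is a minimal sketch, not a Novell-supplied tool: the function name show_core_settings and its optional proc-root argument are illustrative assumptions. It reads the kernel/suid_dumpable location that matches the sysctl name used in this document and falls back to fs/suid_dumpable, the name under which upstream kernels expose the same setting.

```shell
#!/bin/sh
# Sketch: print the three settings that control core dump capture.
# show_core_settings is a hypothetical helper name, not part of Filr or SLES.
# The optional argument (default /proc) exists only to make the function testable.
show_core_settings() {
    procroot=${1:-/proc}

    # Core size limit of the current shell; 'unlimited' is the desired value.
    echo "core size limit: $(ulimit -c)"

    # Location and naming pattern for core files.
    if [ -r "$procroot/sys/kernel/core_pattern" ]; then
        echo "core_pattern: $(cat "$procroot/sys/kernel/core_pattern")"
    else
        echo "core_pattern: not readable"
    fi

    # suid_dumpable: try the kernel.* name used in this document first,
    # then the fs.* name used by upstream kernels.
    for f in "$procroot/sys/kernel/suid_dumpable" "$procroot/sys/fs/suid_dumpable"; do
        if [ -r "$f" ]; then
            echo "suid_dumpable: $(cat "$f")"
            return 0
        fi
    done
    echo "suid_dumpable: not readable"
    return 1
}

# Usage on the appliance:
# show_core_settings
```

Note that 'ulimit -c' reports the limit of the current shell only. The limit that applies to an already running process can be read from /proc/<pid>/limits.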

To additionally make this configuration change persistent over reboots, add a line
kernel.core_pattern=/var/local/dumps/core.%e.%p
to the /etc/sysctl.conf configuration file.
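
The edit to /etc/sysctl.conf can also be scripted so that repeated runs do not add duplicate lines. The following is a minimal sketch under that assumption; the helper name persist_sysctl and its optional file argument are hypothetical and not part of the appliance.

```shell
#!/bin/sh
# Sketch: append a sysctl setting to a config file only if the key is not set there yet.
# persist_sysctl is a hypothetical helper; the file argument defaults to /etc/sysctl.conf.
persist_sysctl() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}

    # Skip when a non-commented line already sets this key.
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$conf" 2>/dev/null; then
        echo "$key already present in $conf, leaving it unchanged"
        return 0
    fi
    echo "$key=$value" >> "$conf"
}

# Usage as root on the appliance (commented out so the sketch has no side effects):
# persist_sysctl kernel.core_pattern '/var/local/dumps/core.%e.%p'
# sysctl -p    # reload /etc/sysctl.conf without rebooting
```

The same helper could be used for kernel.suid_dumpable if that setting should also survive a reboot; this document only lists the core_pattern line, so treat that as an optional extension.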

When writing a core file, the kernel expands %e to the name of the crashing executable and %p to its PID, so both become part of the file name.
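
As an illustration of that expansion, the sketch below mimics it in the shell for the two specifiers used here. The helper name expand_core_pattern is hypothetical, and the kernel itself supports further specifiers that this sketch ignores.

```shell
#!/bin/sh
# Sketch: mimic how %e and %p in a core pattern are filled in for a given process.
# expand_core_pattern is a hypothetical helper for illustration only;
# it handles just the two specifiers used in this document.
expand_core_pattern() {
    pattern=$1
    exe=$2
    pid=$3
    printf '%s\n' "$pattern" | sed -e "s|%e|$exe|g" -e "s|%p|$pid|g"
}

expand_core_pattern '/var/local/dumps/core.%e.%p' famtd 2334
# prints: /var/local/dumps/core.famtd.2334
```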

In some troubleshooting scenarios, it may be necessary to trigger the creation of a core dump manually. This can be done by sending the process a signal that generates a core dump, for example SIGABRT.

To send SIGABRT to process 1234, run
kill -ABRT 1234

or to send SIGABRT to all running novell-xregd processes, run
kill -ABRT $(pidof novell-xregd)
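
When no matching process is running, 'pidof' returns nothing and 'kill' is called without a PID. A guarded variant might look like the following sketch; the function name abort_by_name is a hypothetical helper, not an existing command.

```shell
#!/bin/sh
# Sketch: send SIGABRT to every process with a given name, but only if one is running.
# abort_by_name is a hypothetical helper; it relies on pidof, as used above.
abort_by_name() {
    name=$1
    pids=$(pidof "$name" 2>/dev/null || true)
    if [ -z "$pids" ]; then
        echo "no running process named $name" >&2
        return 1
    fi
    echo "sending SIGABRT to $name (PID(s): $pids)"
    # $pids is intentionally unquoted so that each PID becomes its own argument.
    kill -ABRT $pids
}

# Usage (commented out because it terminates the target process):
# abort_by_name novell-xregd
```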

These are the minimal requirements to capture a core from crashing processes.

However, to open a core for inspection directly on the Filr appliance, the gdb package needs to be installed.

If you prefer not to configure a single location for application core dumps, they are by default written to the path where the faulty binary resides :
Component             Default core location
Apache-Tomcat         /opt/Novell/jetty8
FAMT                  /opt/novell/filr
Lucene (Standalone)   /opt/novell/search/indexserver
Jetty                 /opt/novell/jetty8
MySQL                 does not produce a core by default
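
With a relative core pattern, such as the kernel default 'core', Linux writes the core file into the current working directory of the crashing process. To see which directory that is for one of the services listed above, a sketch like the following can be used; proc_cwd is a hypothetical helper name, and reading the link for another user's process requires root.

```shell
#!/bin/sh
# Sketch: show the current working directory of a running process.
# proc_cwd is a hypothetical helper name; $1 is the PID to inspect.
proc_cwd() {
    readlink "/proc/$1/cwd"
}

# Usage, with the famtd PID from the example session in this document:
# proc_cwd 2334
```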

Additional Information

Additional steps :

The Novell Filr 1.0 appliance does not have the gdb package installed by default. In a scenario where multiple crashes occur, however, it is useful to verify whether the faulty process crashes in the same code path each time or whether the crashes have different root causes. The gdb package is required to open the core directly on the appliance.
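
Once gdb is available, comparing code paths across several cores is easier with a non-interactive backtrace. The following is a minimal sketch under that assumption; core_backtrace is a hypothetical helper name, and the output file name in the usage line is only an example.

```shell
#!/bin/sh
# Sketch: print a backtrace of all threads from a core file without an interactive session.
# core_backtrace is a hypothetical helper; it needs gdb to be installed.
core_backtrace() {
    binary=$1
    core=$2
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: core_backtrace BINARY COREFILE (both must be readable)" >&2
        return 1
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found in PATH" >&2
        return 1
    fi
    # -batch makes gdb exit after running the commands given with -ex.
    gdb -batch -ex "thread apply all bt" "$binary" "$core"
}

# Usage on the appliance:
# core_backtrace /opt/novell/filr/bin/famtd /var/local/dumps/core.famtd.2334 > famtd.bt.txt
```

Without the matching debuginfo packages, the backtrace is likely to show few or no symbol names, but it can still indicate whether two cores stopped in the same library.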

It is possible to extract the gdb-7.3-0.6.1.x86_64.rpm package from a SLES11 SP2 ISO image, and SCP that to the /var/local/dumps/ location on the Filr appliance (future releases of Novell Filr will ship with the gdb package).

There are two methods to create a bundle containing the core and the relevant libraries. Both tools require gdb to be installed.

1. Using novell-getcore utility (separate download).
Download the novell-getcore 1.2.0.8 script and SCP it to the /var/local/dumps/ location on the Filr appliance.
Once the appliance has been prepared with the steps mentioned above, check for the existence of a core dump. In the example below, the SLES11 SP2 gdb package and the novell-getcore script have been uploaded to the /var/local/dumps/ location on the appliance.

When the novell-getcore script has finished executing, a bundle with the core and all other required libraries is created in that directory.

2. Using getappcore script (ships with the appliance).

The script is located in /sbin/getappcore. Because the script natively supports uploading the coredump bundle to a Novell FTP server, running it also checks for the existence of /usr/bin/ftp by default.

FTP is not provided with the Novell Filr appliance, so the FTP binary does not exist. To skip the FTP binary check, edit the /sbin/getappcore script, navigate to line 83, which reads 'FTP_BIN=/usr/bin/ftp', and place a '#' in front of it.
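
The same edit can be made non-interactively. The sketch below keeps a backup copy and matches the assignment by its content rather than by line number, in case the line number differs between script versions; disable_ftp_check is a hypothetical helper name, and the '.orig' backup suffix is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: comment out the FTP_BIN assignment in a script, keeping a backup copy.
# disable_ftp_check is a hypothetical helper; pass the script path as $1.
disable_ftp_check() {
    script=$1
    # Keep a backup so the original can be restored.
    cp -p "$script" "$script.orig" || return 1
    # Prefix the assignment with '#'; writing back into the existing file keeps its mode.
    sed 's|^\([[:space:]]*\)FTP_BIN=/usr/bin/ftp|\1#FTP_BIN=/usr/bin/ftp|' "$script.orig" > "$script"
}

# Usage as root on the appliance:
# disable_ftp_check /sbin/getappcore
```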


Note : It is important to monitor the /var/local/dumps location for core files, as core files (especially several of them) can quickly occupy a lot of disk space. Therefore, after creating a core bundle, copy the files to another location for safekeeping. When asked to do so, send them to Novell Technical Support for further analysis, as a valuable step in finding the root cause of any crashes.
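
A quick way to keep an eye on that location is sketched below. It is an illustrative helper, not an appliance tool: the name report_dumps is hypothetical, and it assumes the core.<name>.<pid> naming configured earlier in this document.

```shell
#!/bin/sh
# Sketch: list core files in a directory and show how much space is in use.
# report_dumps is a hypothetical helper; the directory defaults to /var/local/dumps.
report_dumps() {
    dir=${1:-/var/local/dumps}
    if [ ! -d "$dir" ]; then
        echo "$dir does not exist" >&2
        return 1
    fi
    # Core files named by the configured core_pattern (core.<name>.<pid>).
    find "$dir" -maxdepth 1 -type f -name 'core.*' -exec ls -l {} +
    # Total size of the directory and free space on its file system.
    du -sh "$dir"
    df -h "$dir"
}

# Usage on the appliance:
# report_dumps
```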

Example for setting up the pre-requisites :

vmfilr:~ # uname -a
Linux vmfilr 3.0.58-0.6.6-default #1 SMP Tue Feb 19 11:07:00 UTC 2013 (1576ecd) x86_64 x86_64 x86_64 GNU/Linux
vmfilr:~ # cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2
vmfilr:/var/local # install -m 1777 -d /var/local/dumps
vmfilr:/var/local # echo "/var/local/dumps/core.%e.%p" > /proc/sys/kernel/core_pattern
vmfilr:/var/local # vi /etc/sysctl.conf
vmfilr:/var/local # sysctl -w kernel.suid_dumpable=2
kernel.suid_dumpable = 2
vmfilr:/var/local # ps aux | grep famt
root      2334  0.3  0.3 733932  6752 ?        Sl   19:47   0:15 /opt/novell/filr/bin/famtd
root      5642  0.0  0.0   5716   808 pts/0    S+   20:50   0:00 grep famt
vmfilr:/var/local # kill -ABRT 2334
vmfilr:/var/local # ls /var/local/dumps/
core.famtd.2334
vmfilr:/var/local # cd /var/local/dumps/
vmfilr:/var/local/dumps # ll
total 3020
-rw------- 1 root root 623546368 May 29 20:50 core.famtd.2334
vmfilr:/var/local/dumps # file core.famtd.2334
core.famtd.2334: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/opt/novell/filr/bin/famtd'
vmfilr:/var/local/dumps # gdb
-bash: gdb: command not found
vmfilr:/var/local/dumps # ll
total 3020
-rw------- 1 root root 623546368 May 29 20:50 core.famtd.2334
vmfilr:/var/local/dumps # ll
total 3056
-rw------- 1 root root 623546368 May 29 20:50 core.famtd.2334
-rw-r--r-- 1 root root     33082 May 29 21:12 novell-getcore
vmfilr:/var/local/dumps # chmod 755 novell-getcore
vmfilr:/var/local/dumps # ll
total 3056
-rw------- 1 root root 623546368 May 29 20:50 core.famtd.2334
-rwxr-xr-x 1 root root     33082 May 29 21:12 novell-getcore
vmfilr:/var/local/dumps #


Example 1 (obtaining a core using novell-getcore) :

vmfilr:/var/local/dumps # file core.famtd.2334
core.famtd.2334: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/opt/novell/filr/bin/famtd'
vmfilr:/var/local/dumps # ./novell-getcore -b core.famtd.2334 /opt/novell/filr/bin/famtd
Novell GetCore Utility 1.2.08 [Linux]
Copyright (C) 2006-2013 Novell, Inc. All rights reserved.

Error: Required utility 'gdb' not in path
vmfilr:/var/local/dumps # rpm -ihv gdb-7.3-0.6.1.x86_64.rpm
Preparing...                ########################################### [100%]
   1:gdb                    ########################################### [100%]
vmfilr:/var/local/dumps # ./novell-getcore -b core.famtd.2334 /opt/novell/filr/bin/famtd
Novell GetCore Utility 1.2.08 [Linux]
Copyright (C) 2006-2013 Novell, Inc. All rights reserved.

[*] User specified binary that generated core: /opt/novell/filr/bin/famtd
[*] Processing 'core.famtd.2334' with GDB...
[*] PreProcessing GDB output...
[*] Parsing GDB output...
[*] Core file core.famtd.2334 is a valid Linux core
[*] Core generated by: /opt/novell/filr/bin/famtd
[*] Obtaining names of shared libraries listed in core...
[*] Counting number of shared libraries listed in core...
[*] Total number of shared libraries listed in core:  53
[*] Corefile bundle: core_20130529_212012_linux_famtd_vmfilr
[*] Generating GDBINIT commands to open core remotely...
[*] Generating ./opencore.sh...
[*] Gathering package info...
[*] Including file: /etc/SuSE-release
[*] Creating core_20130529_212012_linux_famtd_vmfilr.tar...
[*] GZipping ./core_20130529_212012_linux_famtd_vmfilr.tar...
[*] Done. Corefile bundle is ./core_20130529_212012_linux_famtd_vmfilr.tar.gz

vmfilr:/var/local/dumps # ll
total 17992
-rw-r--r-- 1 root root  12929330 May 29 21:20 core_20130529_212012_linux_famtd_vmfilr.tar.gz
-rw------- 1 root root 623546368 May 29 20:50 core.famtd.2334
-r--r--r-- 1 root root   2336994 May 29 21:19 gdb-7.3-0.6.1.x86_64.rpm
-rwxr-xr-x 1 root root     33082 May 29 21:12 novell-getcore
vmfilr:/var/local/dumps #


Example 2 (obtaining a core using getappcore) :

vmfilr:/var/local/dumps # getappcore core
####################################################################
Get Application Core Tool, v1.27
Date:   06/06/13, 00:50:20
Server: vmfilr
OS: SUSE Linux Enterprise Server 11  - SP2
Kernel: 3.0.58-0.6.6-default  (x86_64)
Corefile: core
####################################################################

Binary file not provided, trying to determine source binary using gdb... Done (/opt/novell/filr/bin/famtd)
Checking Source Binary with chkbin... Done
Building list of required libraries with gdb... Done
Building list of required RPMs... Done
Building list of debuginfo RPMs...

     cyrus-sasl-2.1.22-182.20.1.x86_64.rpm   -->   cyrus-sasl-debuginfo-2.1.22-182.20.1.x86_64.rpm
     glibc-2.11.3-17.45.45.1.x86_64.rpm   -->   glibc-debuginfo-2.11.3-17.45.45.1.x86_64.rpm
     glibc-locale-2.11.3-17.45.45.1.x86_64.rpm   -->   glibc-debuginfo-2.11.3-17.45.45.1.x86_64.rpm
     keyutils-libs-1.2-107.22.x86_64.rpm   -->   keyutils-debuginfo-1.2-107.22.x86_64.rpm
     krb5-1.6.3-133.49.54.1.x86_64.rpm   -->   krb5-debuginfo-1.6.3-133.49.54.1.x86_64.rpm
     libcom_err2-1.41.9-2.9.1.x86_64.rpm   -->   e2fsprogs-debuginfo-1.41.9-2.9.1.x86_64.rpm
     libgcc46-4.6.1_20110701-0.13.9.x86_64.rpm   -->   gcc46-debuginfo-4.6.1_20110701-0.13.9.x86_64.rpm
     libldap-2_4-2-2.4.26-0.16.1.x86_64.rpm   -->   openldap2-client-debuginfo-2.4.26-0.16.1.x86_64.rpm
     libnscd-2.0.2-73.18.x86_64.rpm   -->   libnscd-debuginfo-2.0.2-73.18.x86_64.rpm
     libopenssl0_9_8-0.9.8j-0.50.1.x86_64.rpm   -->   openssl-debuginfo-0.9.8j-0.50.1.x86_64.rpm
     libsmbclient0-3.6.3-0.30.1.x86_64.rpm   -->   samba-debuginfo-3.6.3-0.30.1.x86_64.rpm
     libstdc++46-4.6.1_20110701-0.13.9.x86_64.rpm   -->   gcc46-debuginfo-4.6.1_20110701-0.13.9.x86_64.rpm
     libxml2-2.7.6-0.21.1.x86_64.rpm   -->   libxml2-debuginfo-2.7.6-0.21.1.x86_64.rpm
     novell-filr-famtd-1.0.0-13.1.x86_64.rpm   -->   novell-filr-famtd-debuginfo-1.0.0-13.1.x86_64.rpm
     novell-filr-famtmbase-1.0.0-10.1.x86_64.rpm   -->   novell-filr-famtmbase-debuginfo-1.0.0-10.1.x86_64.rpm
     novell-filr-famtmcifs-1.0.0-19.1.x86_64.rpm   -->   novell-filr-famtmcifs-debuginfo-1.0.0-19.1.x86_64.rpm
     novell-filr-famtmncp-1.0.0-23.1.x86_64.rpm   -->   novell-filr-famtmncp-debuginfo-1.0.0-23.1.x86_64.rpm
     novell-oes-samba-libtalloc2-3.6.3-17.18.9.x86_64.rpm   -->   novell-oes-samba-debuginfo-3.6.3-17.18.9.x86_64.rpm
     novell-oes-samba-libtdb1-3.6.3-17.18.9.x86_64.rpm   -->   novell-oes-samba-debuginfo-3.6.3-17.18.9.x86_64.rpm
     novell-oes-samba-libwbclient0-3.6.3-17.18.9.x86_64.rpm   -->   novell-oes-samba-debuginfo-3.6.3-17.18.9.x86_64.rpm
     novell-xplatlib-1.0.7-23.4.x86_64.rpm   -->   novell-xplatlib-debuginfo-1.0.7-23.4.x86_64.rpm
     novell-xtier-base-3.1.10-11.52.1.x86_64.rpm   -->   novell-xtier-base-debuginfo-3.1.10-11.52.1.x86_64.rpm
     novell-xtier-core-3.1.10-11.52.1.x86_64.rpm   -->   novell-xtier-base-debuginfo-3.1.10-11.52.1.x86_64.rpm
     novell-xtier-xplat-3.1.10-11.52.1.x86_64.rpm   -->   novell-xtier-base-debuginfo-3.1.10-11.52.1.x86_64.rpm
     zlib-1.2.3-106.34.x86_64.rpm   -->   zlib-debuginfo-1.2.3-106.34.x86_64.rpm

                               ... Done
Setting gdb environment variables... Done
Creating gdb startup files... Done
Creating core archive... Done
        Created archive as:  /var/log/nts_vmfilr_famtd_130606_0050_appcore.tbz
Removing required files and directories ... Done

Finished!
vmfilr:/var/local/dumps #