Handling ndsd (eDirectory) core files on Linux and Solaris

  • 3078409
  • 09-Jan-2008
  • 17-Mar-2020

Environment

Novell eDirectory 9 for Linux
Novell eDirectory 8.8 for Linux
Novell eDirectory 8.7.3 for Linux
Novell eDirectory 8.7.3 for Solaris
Novell eDirectory 8.8 for Solaris

Situation

When ndsd crashes, a core file is generated only if ulimit -c is set to a value greater than 0. If the limit is 0, no core is written; if it is set too low, the core may be truncated and lack the memory dump required for analysis. Cores are generated in the following locations:

Systemd - SLES 12 / Redhat 7 or later.
Default corefile directory: /var/lib/systemd/coredump/

Note: By default, the cores are compressed in .xz format and must be decompressed before running novell-getcore. To decompress a .xz core file, execute:
unxz -d <path to corefile/corefilename>
For example:
unxz -d /var/lib/systemd/coredump/core.ndsd.....xz


For more information on cores in systemd, please refer to the following:
TID 7022632 - Ndsd cores on systemd are truncated.
TID 7017137 - How to obtain systemd service core dumps.
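
On systemd servers, locating the most recent ndsd core in the coredump directory can be scripted. The following is a minimal sketch and not part of novell-getcore: the function name latest_ndsd_core is hypothetical, and it assumes core file names begin with "core.ndsd" as in the example above.

```shell
#!/bin/bash
# latest_ndsd_core [DIR]: print the newest file in DIR whose name starts
# with "core.ndsd". DIR defaults to the systemd coredump directory.
# Returns non-zero when no matching file exists.
latest_ndsd_core() {
    local dir=${1:-/var/lib/systemd/coredump} newest="" f
    for f in "$dir"/core.ndsd*; do
        [ -f "$f" ] || continue          # skip the unexpanded glob and non-files
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Usage: decompress the newest core before running novell-getcore.
#   core=$(latest_ndsd_core) && unxz -d "$core"
```

The newest file is chosen by modification time, so an older core that was touched later would be picked instead; check the name before decompressing.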

SysVinit - SLES 11 / RedHat 6 or earlier.
Note: Cores are created in the dib directory, which by default is located at:
eDirectory 8.7.3          /var/nds/dib
eDirectory 8.8.x / 9.X    /var/opt/novell/eDirectory/data/dib

If ndsd crashes and the reason is not apparent, check for a core file in the above directories. If there is no core file present, change the ulimit -c setting to unlimited.

Many Linux distributions set the ulimit value to '0' in /etc/profile or use 'ulimit -Sc 0' to prevent core files.

In order for ndsd to use this setting it is necessary to add it to the ndsd script and then restart ndsd.
Note: It could also be added to the pre_ndsd_start script as this script is sourced when ndsd loads.

Modify the /etc/init.d/ndsd script and add the following on the second line, directly underneath "#!/bin/bash":

ulimit -c unlimited
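
The limit that matters is the one applied to the running ndsd process, not the one in the login shell, so it can help to confirm it after the restart. On Linux, /proc/<pid>/limits lists the limits of a running process. A minimal sketch; the function name core_soft_limit is hypothetical:

```shell
#!/bin/bash
# core_soft_limit FILE: print the soft core-size limit recorded in a
# /proc/<pid>/limits file. "0" means no core file will be written;
# "unlimited" is the value this article recommends.
core_soft_limit() {
    awk '/^Max core file size/ { print $5 }' "$1"
}

# Usage, on a server with a single ndsd process:
#   core_soft_limit "/proc/$(pgrep ndsd)/limits"
```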

Resolution

To understand capturing systemd service core files on SLE12, see TID 7017137 - How to obtain systemd service core dumps.


Novell-getcore

Novell-getcore is a script that gathers and bundles the ndsd core file and all associated libraries needed to analyze it. Novell-getcore is installed as part of the NDSserv package, beginning with eDirectory 8.7.3.9 and eDirectory 8.8.2. If you have an earlier eDirectory version, first update eDirectory to the latest available version, as the current version most likely contains the fix. Alternatively, the novell-getcore script can be downloaded from https://download.novell.com: enter "novell-getcore" in the keyword field and click Search.

Using novell-getcore to bundle core and necessary libraries:

  1. Verify that GDB is installed on the eDirectory server by typing "gdb -version". GDB must be installed before novell-getcore is used.

  2. Create a bundle with novell-getcore to send to Novell Technical Support:

    NOTE: Servers running systemd use a different procedure.  By default, the cores are compressed in a .xz format and in a different location. These compressed core files need to be decompressed prior to using the novell-getcore utility: unxz -d <path to compressed core file>

    eDirectory 8.7.3 example:
    novell-getcore -b /var/nds/dib/core.#### /usr/sbin/ndsd

    eDirectory 8.8 and 9.x example:
    novell-getcore -b /var/opt/novell/eDirectory/data/dib/core.#### /opt/novell/eDirectory/sbin/ndsd

    (where #### is the PID of ndsd when it cored)
    Note: eDirectory 8.8.7 and newer may not have a PID at the end of the core name.

    This will generate a gzip'd tar bundle in the same directory as the core file with a name like the following:

    core_YYYYMMDD_162243_linux_ndsd_hostname.tar.gz

  3. Grab a supportconfig file from the server that cored.

    On Linux use supportconfig/supportutils. If needed, it can be downloaded from the following page:
    https://www.novell.com/communities/node/2332/supportconfig-linux

    On Solaris: Use unixinfo to create a unixinfo.log.  See TID 10075466 - How to create a UNIX configuration file.

    On Solaris:  Use pstack to get the stack of the core.  Example:  pstack core > ndsd.pstack

  4. Upload the supportconfig or unixinfo.log and novell-getcore bundles to ftp://ftp.novell.com/incoming

    NOTE:  Currently novell-getcore isn't functioning on Solaris.  Please gather the core file, the pstack output and a unixinfo.log, tar them together with the SR# and upload them to the ftp server (ftp.novell.com:/incoming)
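
The prerequisites from steps 1 and 2 can be checked before novell-getcore is run. The following is a minimal sketch rather than a Novell-supplied tool: the function name preflight and its messages are hypothetical, and it only checks what the steps above describe (the core exists and is decompressed, the ndsd binary is present, GDB is installed).

```shell
#!/bin/bash
# preflight CORE BINARY: check the prerequisites for novell-getcore.
# Returns 0 when everything looks usable, non-zero (with a message on
# stderr) at the first problem found.
preflight() {
    local core=$1 binary=$2
    if [ ! -f "$core" ]; then
        echo "core file not found: $core" >&2; return 1
    fi
    case $core in
        *.xz) echo "core is still compressed; run unxz -d first: $core" >&2
              return 1 ;;
    esac
    if [ ! -x "$binary" ]; then
        echo "ndsd binary not found or not executable: $binary" >&2; return 1
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb is not installed" >&2; return 1
    fi
}

# Usage (eDirectory 8.8 / 9.x paths from step 2; core.#### is a placeholder):
#   preflight /var/opt/novell/eDirectory/data/dib/core.#### /opt/novell/eDirectory/sbin/ndsd &&
#       novell-getcore -b /var/opt/novell/eDirectory/data/dib/core.#### /opt/novell/eDirectory/sbin/ndsd
```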


Additional Information

Sometimes ndsd crashes because of memory corruption.  In this case, it is necessary to add a variable setting to the ndsd environment that puts the memory manager into a debug state. This helps ensure that ndsd generates a core at the time the corruption occurs, so the module that caused the corruption can be identified more easily in the core.

If ndsd cores due to stack corruption, Novell Technical Support will request that you add the appropriate memory manager setting, wait for another core, and submit the new core.

Linux

To set the necessary memory checking variable on Linux:

Systemd - SLES 12 / Redhat 7 or later:   Add the following line to the "env" file located in the /etc/opt/novell/eDirectory/conf directory, then restart the eDirectory instance.  (See the second bullet under "Please refer to the following notes:" for details.)

MALLOC_CHECK_=3


SysVinit - SLES 11 / RedHat 6 or earlier:  Modify the pre_ndsd_start script and add the following at the very top, then restart the eDirectory instance.

MALLOC_CHECK_=3
export MALLOC_CHECK_

 
Please refer to the following notes:

  • The contents of the pre_ndsd_start script are sourced into ndsd at the time ndsd loads.  Be aware that any settings placed in the ndsd script itself will be overwritten the next time an eDirectory patch is applied, whereas the pre_ndsd_start script is not modified by patches.  For this reason, changes should not be made to the 'ndsd' script itself; this is the purpose of the pre/post_ndsd_start scripts.

  • eDirectory on SLES 12 or RHEL 7:  You must add all environment variables required for the eDirectory service in the env file located in the /etc/opt/novell/eDirectory/conf directory.

  • MALLOC_CHECK_=3 should NOT be left in place permanently.  Once the cores have been gathered, remove this setting from the modified script and restart ndsd. This environment variable can have a performance impact on some systems due to the increased memory checking.  In eDirectory 8.8, it causes ndsd to revert to using malloc instead of tcmalloc_minimal, which was added to enhance performance.

    Another side effect of using MALLOC_CHECK_=3 is the possibility of increased coring.  Malloc will cause ndsd to core whenever a memory violation is detected whether or not it would have caused ndsd to crash under normal running conditions.

    To verify this ndsd environment variable is set properly while ndsd is running, do the following as the user running the eDirectory instance ('root' most of the time):
    strings /proc/`pgrep ndsd`/environ | grep -i MALLOC_CHECK_

    The command above will not work on a server with multiple eDirectory instances (or ndsd processes).  To check a particular instance, find the PID of that instance's ndsd process and use it directly.  For PID 12345 the command would be the following:
    strings /proc/12345/environ | grep -i MALLOC_CHECK_

    After ndsd has cored, to verify the core file had the ndsd environment variable set, do the following:
    strings core.#### | grep -i MALLOC_CHECK_

    Bundle the core with MALLOC_CHECK_=3 set as in step 2.
    For more information on Malloc check see: TID 3113982 - Diagnosing Memory Heap Corruption in glibc with MALLOC_CHECK_

  • Beginning with eDirectory 8.8.5 ftf2 (patch 2), the location of the pre_ndsd_start script has moved from /etc/init.d to /opt/novell/eDirectory/sbin/
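
On a server with several ndsd processes, the running-process check described in the notes above can be repeated for each PID. A minimal sketch; proc_env_var is a hypothetical helper name, and it matches the exact variable name at the start of an entry instead of using strings and grep -i:

```shell
#!/bin/bash
# proc_env_var FILE NAME: print NAME=value from a NUL-separated
# environment file such as /proc/<pid>/environ.
# Returns non-zero when NAME is not set in that environment.
proc_env_var() {
    tr '\0' '\n' < "$1" | grep -- "^$2="
}

# Usage: report the setting for every running ndsd process
# (run as the user that owns the eDirectory instances).
#   for pid in $(pgrep ndsd); do
#       printf '%s: ' "$pid"
#       proc_env_var "/proc/$pid/environ" MALLOC_CHECK_ || echo "not set"
#   done
```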

Solaris

In current code, eDirectory uses libumem as the memory manager.
To configure libumem for debugging add the following to the pre_ndsd_start script at the top and restart ndsd:

UMEM_DEBUG=default
UMEM_LOGGING=transaction
export UMEM_DEBUG UMEM_LOGGING

Submit a new core with these settings in place.

Changing the location where core files are generated

In certain situations it may be desirable to change the location where core files are generated.  By default ndsd core files are placed in the dib directory.  If space in this directory is limited or if another location is desired, the following can be done:

mkdir /tmp/cores
chmod 777 /tmp/cores
echo "/tmp/cores/core"> /proc/sys/kernel/core_pattern


This example would now generate the core.<pid> file in /tmp/cores.

To revert to placing cores in the default location:
echo core > /proc/sys/kernel/core_pattern
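
Reverting to a literal "core" assumes that was the previous value. On systemd-based servers the kernel core_pattern is often a pipe to systemd-coredump instead, so it may be safer to record the existing value before changing it. A minimal sketch under that assumption; the function names and the core_pattern.orig backup file are hypothetical, and the optional second argument exists only so the functions can be tried against an ordinary file:

```shell
#!/bin/bash
# set_core_dir DIR [PATTERN_FILE]: save the current core pattern to
# DIR/core_pattern.orig (first call only), then point the pattern at DIR/core.
set_core_dir() {
    local dir=$1 pattern_file=${2:-/proc/sys/kernel/core_pattern}
    local backup=$dir/core_pattern.orig
    mkdir -p "$dir" && chmod 777 "$dir" || return 1
    if [ ! -e "$backup" ]; then
        # Keep the original value so it can be restored exactly
        cat "$pattern_file" > "$backup" || { rm -f "$backup"; return 1; }
    fi
    printf '%s\n' "$dir/core" > "$pattern_file"
}

# restore_core_pattern DIR [PATTERN_FILE]: write the saved value back.
restore_core_pattern() {
    local dir=$1 pattern_file=${2:-/proc/sys/kernel/core_pattern}
    cat "$dir/core_pattern.orig" > "$pattern_file"
}

# Usage (as root):
#   set_core_dir /tmp/cores
#   ... reproduce the crash and collect the core ...
#   restore_core_pattern /tmp/cores
```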

Symbol build of ndsd libraries

In some cases, analysis requires a core file that was generated while running libraries built with symbols.
This is particularly true when analyzing cores generated by the 64-bit version of ndsd, since function parameters are not stored at a fixed location.
The symbol versions of the libraries can be obtained from Novell eDirectory backline support.