7.6 Troubleshooting Crashes and Hangs

7.6.1 The Access Gateway Hangs When the Audit Server Comes Back Online

When the Platform Agent loses its connection to the audit server, it enters caching mode. By default, the size of the audit cache file is unlimited, so if the connection is down for a long time and traffic is high, the cache file can become quite large.

When the connection to the audit server is re-established, the Platform Agent must upload the cached events to the audit server while it continues to process new events. During this time, the Platform Agent appears unresponsive, both because it is so busy and because it holds the application threads that are logging new events for a long time. If it holds too many threads, the system can appear to hang. You can minimize the effects of this scenario by configuring the following two parameters in the logevent file.

Table 7-5 Parameters for the logevent File

Parameter             Description

LogMaxCacheSize       Sets a limit on the amount of cache that the Platform
                      Agent can consume to log events when the audit server
                      is unreachable. The default is unlimited.

LogCacheLimitAction   Specifies what the Platform Agent should do with
                      incoming events when the maximum cache size limit is
                      reached. You can select one of the following actions:

                        • Delete the current cache file and start logging
                          events in a new cache file.

                        • Stop logging, which preserves all entries in the
                          cache and stops the collection of new events.

Setting a finite cache file size limits the number of events that must be uploaded to the audit server when caching mode ends, which keeps the Platform Agent responsive to new audit events.

For more information about the logevent file and these parameters, see Logevent.
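Before you rely on the cache limit, it can help to confirm that both parameters are actually present in the logevent file. The following is a minimal sketch, not a Novell-supplied tool. The function name check_cache_params is hypothetical, the file path is supplied by the caller, and the sketch assumes that each parameter appears at the start of its own line followed by an equals sign or white space. It does not validate the values.

```shell
# Report whether the two cache parameters are set in a logevent file.
# Assumption: a parameter counts as "set" when an uncommented line begins
# with its name, followed by "=" or white space.
check_cache_params() {
    conf_file=$1
    status=0
    for param in LogMaxCacheSize LogCacheLimitAction; do
        if grep -Eq "^[[:space:]]*${param}([[:space:]]|=)" "$conf_file"; then
            echo "$param: set"
        else
            echo "$param: not set"
            status=1
        fi
    done
    # Returns 0 only when both parameters were found.
    return $status
}

# Example (the path is a placeholder, not a documented location):
# check_cache_params /path/to/logevent
```

A nonzero return status means that at least one parameter is missing and the default unlimited cache still applies to it.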

7.6.2 Access Gateway Crashes When the Log Files Are Removed

If you have enabled the debug level of logging for the laghttpheaders and the lagsoapmessages log files and these files grow larger than 200 MB, deleting them manually can cause the Access Gateway to crash.

To solve the problem, restart the Access Gateway after manually deleting the files.

7.6.3 Troubleshooting a Failed Linux Access Gateway Configuration

If the IP address and other network configuration settings are not reflected in the installed Linux Access Gateway, log in as the root user and run the following commands:

rm /opt/novell/legacy/etc/proxy/.novell_lag_lock
/etc/init.d/novell-vmc stop
/etc/init.d/novell-vmc start

7.6.4 Troubleshooting a Linux Access Gateway Crash

The Linux Access Gateway might have crashed for one of the following reasons:

  • SIGSEGV

  • ASSERT (for a debug build only)

The following sections explain how to gather the files that need to be sent to Novell for a resolution of the problem.

Linux Access Gateway Logs

  1. Enter the following command from the bash shell to collect the debug log files that are generated:

    /chroot/lag/opt/novell/bin/getlaglogs.sh
    
  2. Locate the laglogs.tgz tar file that the script creates in the /var/log directory.

  3. Send this tar file to Novell® Support.

Event Log

By default, the event log size is 15 MB. To change the size, specify the required event log size, in MB, in the eventlogsize.cfg file, located in the /chroot/lag/etc/opt/novell directory. For example, specifying 350 in the file configures an event log of 350 MB. The file must contain only the size value, with no other characters or new lines.
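Because the file must not contain a trailing new line, writing the value with echo or with most text editors can leave an unwanted character in it. The following sketch shows one way to write only the digits. The function name write_event_log_size is hypothetical, and the sketch does not cover whether the gateway must be restarted for the new size to take effect.

```shell
# Write an event log size (in MB) to a file with no trailing new line.
# Usage: write_event_log_size SIZE_MB FILE
write_event_log_size() {
    size_mb=$1
    cfg_file=$2
    # Reject anything that is not a plain string of digits.
    case $size_mb in
        ''|*[!0-9]*)
            echo "size must be a whole number of MB" >&2
            return 1
            ;;
    esac
    # printf '%s' emits the digits only; echo would append a new line.
    printf '%s' "$size_mb" > "$cfg_file"
}

# Example (run as the root user on the gateway):
# write_event_log_size 350 /chroot/lag/etc/opt/novell/eventlogsize.cfg
```

You can confirm the result with wc -c, which should report a byte count equal to the number of digits you wrote.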

The procedure for obtaining the event log depends upon the build type:

Event Log for a Production Build

To get the event log for the production build:

  1. Log in as the root user.

  2. To disconnect all instances of Linux Access Gateway, enter the following command:

    /etc/init.d/novell-vmc stop

  3. Enter the following command to change the root environment:

    chroot /chroot/lag

  4. To start the process, enter the following command:

    gdb /opt/novell/bin/ics_dyn 2>/var/log/ics_dyn.log

  5. At the GDB prompt, run the following command:

    run -m <memory>

    Where <memory> is the percentage of total memory to be used by the ics_dyn process. A value in the range of 20-30 percent is recommended.

  6. Repeat the scenarios to reproduce the issue.

    1. If you are trying to reproduce the proxy crash, you see the GDB prompt as soon as the crash is reproduced.

    2. If you are trying to reproduce a functionality issue, press Ctrl+C to enter the GDB prompt as soon as the issue is reproduced.

      For a list of commands that can be entered in the debugger, see Useful Debugger Commands.

  7. To save event logs to a file, enter the following command:

    d ,save 1
    

    This stores all the events in the /chroot/lag/opt/novell/debug/<pid>all_events.0.txt file.

  8. Tar or Zip this file and send it to Novell Support.

Event Log for a Debug Build

To get the event log:

  1. Log in as the root user.

  2. To stop all instances of Linux Access Gateway, enter the following command:

    /etc/init.d/novell-vmc stop

  3. To start the Novell Linux Access Gateway in debugging mode, enter the following command:

    /etc/init.d/novell-vmc gdb

  4. To run the Linux Access Gateway process, enter the following command at the GDB prompt:

    run -m <memory> 2>/var/log/ics_dyn.log

    Where <memory> is the percentage of total memory to be used by the ics_dyn process. A value in the range of 20-30 percent is recommended.

  5. Repeat the scenarios to reproduce the issue.

    1. If you are trying to reproduce the proxy crash, you will enter the GDB prompt as soon as the crash is reproduced.

    2. If you are trying to reproduce a functionality issue, press the following key combination to enter the GDB prompt as soon as the issue is reproduced:

      Ctrl+C

      NOTE: For a list of commands that can be entered in the debugger, see Useful Debugger Commands.

  6. To save all event logs to a file, enter the following command:

    d ,save 1
    

    This stores all the events in the /chroot/lag-debug/opt/novell/debug/<pid>all_events.0.txt file.

  7. Tar or zip this file and send it to Novell Support.

Useful Debugger Commands

Table 7-6 GDB Commands

Command

Function

gcore

Generate core file

k

Kill process

q

Quit GDB prompt

bt

Print the back trace

Core Dump

Before you begin, make sure that there is enough free space in root to hold the core file. The free space must be at least equal to the RAM size.
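The free-space requirement can be checked from the bash prompt before you start. The following is a rough sketch that assumes a Linux system where /proc/meminfo is available. The function names are hypothetical.

```shell
# Succeeds when free space (KB) is at least the RAM size (KB).
# Usage: has_room FREE_KB RAM_KB
has_room() {
    [ "$1" -ge "$2" ]
}

# Free space on / in KB (POSIX df output, fourth column of the second line).
root_free_kb() {
    df -Pk / | awk 'NR == 2 { print $4 }'
}

# Total RAM in KB as reported by the Linux kernel.
ram_total_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

if has_room "$(root_free_kb)" "$(ram_total_kb)"; then
    echo "Enough free space in / for a core file"
else
    echo "Free space in / is smaller than RAM"
fi
```

The comparison treats equal values as sufficient, which matches the "at least equal to the RAM size" requirement.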

To collect a core dump:

  1. Log in as the root user.

  2. To disconnect all instances of the Linux Access Gateway, enter the following command:

    /etc/init.d/novell-vmc stop

  3. At the bash prompt, specify the following command:

    touch /tmp/.dumpcore

  4. Enter the following command to start the Linux Access Gateway:

    /etc/init.d/novell-vmc start

  5. Repeat the scenarios to reproduce the issue.

    The core is dumped to the core.<pid> file in the /chroot/lag directory.

    Where <pid> is the process ID of the ics_dyn process.

    After the core is dumped, the Linux Access Gateway restarts.

  6. Tar or zip the core dump and send it to Novell Support.

Proxy Hang Core

To analyze the proxy hang and create a core file:

  1. Enter the following command to change the root environment:

    chroot /chroot/lag

  2. Enter the following command to attach the ics_dyn process to the debugger:

    gdb /opt/novell/bin/ics_dyn <pid>

    Where <pid> refers to the process ID of the ics_dyn process.

  3. At the GDB prompt, enter the following command:

    set logging on <filename>

    Where <filename> specifies the name of the file that will store the output of the executed debugger commands.

  4. Enter the following command to collect a stack trace of all threads:

    thread apply all bt

  5. Enter the following command to turn off logging:

    set logging off

  6. Enter the following command to save the core dump in the /chroot/lag directory:

    gcore

    The core dump is saved as core.<pid>.

  7. Tar or zip this file and send it to Novell Support.

Packet Capture

The tcpdump utility allows you to capture a trace of network packets.

  1. Log in as the root user.

  2. Enter the following command:

    tcpdump -s0 -n -t -p -i 'any' -w filename.cap

  3. Tar or zip this file and send it to Novell Support.
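The "tar or zip" step that closes this procedure and the earlier ones can be sketched as a small helper. The function name bundle_for_support and the archive name support_bundle.tgz are hypothetical. Any archive name works.

```shell
# Pack diagnostic files into one gzip-compressed tar archive.
# Usage: bundle_for_support ARCHIVE FILE...
bundle_for_support() {
    if [ "$#" -lt 2 ]; then
        echo "usage: bundle_for_support ARCHIVE FILE..." >&2
        return 1
    fi
    archive=$1
    shift
    # -c create, -z gzip, -f archive name; remaining arguments are the files.
    tar -czf "$archive" "$@"
}

# Example: archive the capture file from step 2.
# bundle_for_support support_bundle.tgz filename.cap
```

Running tar -tzf on the resulting archive lists its contents, so you can check that every file is included before you send it.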

7.6.5 Linux Access Gateway Not Responding

If the Linux Access Gateway is not responding, do the following:

  1. Enter the following command to change the root environment:

    chroot /chroot/lag

  2. Enter the following command to attach the ics_dyn process to the debugger:

    gdb /opt/novell/bin/ics_dyn <pid>

    Where <pid> refers to the process ID of the ics_dyn process. You can get the process ID by entering the following command:

    pgrep ics_dyn

  3. At the GDB prompt, enter the following command:

    set logging file <filename>

    Where <filename> specifies the name of the file that will store the output of the executed debugger commands.

  4. Enter the following command to start logging:

    set logging on

  5. Enter the following command to collect a stack trace of all threads:

    thread apply all bt full

  6. Enter the following command to turn off logging:

    set logging off

  7. Enter the following command to save the core dump in the /chroot/lag directory:

    gcore

    The core dump is saved as core.<pid>.

  8. Tar or zip this file and send it to Novell Support.
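When the gateway is unresponsive, typing each debugger command by hand can be slow. The following sketch prints a single non-interactive gdb command line that issues the same debugger commands as steps 2 through 7. It assumes that the gdb installed on the gateway supports the -batch and -ex options, which the procedure above does not confirm. The function name hang_capture_cmd and the log file path in the example are hypothetical. The function only prints the command, so you can review it before you run it inside the chroot environment.

```shell
# Print a gdb command line that collects a full back trace of all threads
# and a core file from a running ics_dyn process without interaction.
# Usage: hang_capture_cmd PID LOG_FILE
hang_capture_cmd() {
    pid=$1
    log_file=$2
    printf 'gdb -batch'
    printf ' -ex "set logging file %s"' "$log_file"
    printf ' -ex "set logging on"'
    printf ' -ex "thread apply all bt full"'
    printf ' -ex "set logging off"'
    printf ' -ex "gcore"'
    printf ' /opt/novell/bin/ics_dyn %s\n' "$pid"
}

# Example: print the command for the running process, then run it as the
# root user after entering chroot /chroot/lag.
# hang_capture_cmd "$(pgrep ics_dyn)" /tmp/ics_dyn_bt.log
```

In batch mode, gdb exits after the last command, so the debugger does not stay attached to the process.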