NetIQ Access Gateway stops accepting new client requests

  • 7016023
  • 30-Dec-2014
  • 19-Feb-2015

Environment

NetIQ Access Manager 3.2
NetIQ Access Manager 3.2 Access Gateway Appliance
NetIQ Access Manager 3.2 Access Gateway Service on Linux

NetIQ Access Manager 4.0
NetIQ Access Manager 4.0 Access Gateway Appliance
NetIQ Access Manager 4.0 Access Gateway Service on Linux

Situation

  • Access Gateway stops servicing new client requests
  • no configuration changes were applied before the problem appeared
  • no service pack or hotfix was installed before the problem appeared
  • restarting the proxy with "/etc/init.d/novell-apache2 restart" does not solve the problem
  • "/var/opt/novell/nam/logs/mag/apache2/error_log" reports the following httpd error:
    [DATE] gbllidmagwp12 httpd[PID]: [error] (103)Software caused connection abort:
    cache: error returned while trying to return disk cached data
  • the disk cache size was set to 1024 MB during the Access Gateway installation
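
To confirm that a system matches the symptoms listed above, the cache usage and the logged errors can be checked from a shell. The following is only a sketch: the function names are made up for illustration, the paths are the defaults named in this article, and LIMIT_MB simply mirrors the 1024 MB set at installation time.

```shell
#!/bin/sh
# Sketch: compare on-disk cache usage with the configured limit and
# count the disk cache errors in the Apache error log.

cache_usage_mb() {
    # Print the disk usage of directory $1 in whole megabytes.
    du -sm "$1" | awk '{print $1}'
}

count_cache_errors() {
    # Print how many lines of log file $1 contain the disk cache error.
    grep -c 'error returned while trying to return disk cached data' "$1"
}

# Defaults taken from this article; override through the environment.
CACHE_DIR=${CACHE_DIR:-/var/cache/novell-apache2}
ERROR_LOG=${ERROR_LOG:-/var/opt/novell/nam/logs/mag/apache2/error_log}
LIMIT_MB=${LIMIT_MB:-1024}

# Only report when both paths exist on this machine.
if [ -d "$CACHE_DIR" ] && [ -r "$ERROR_LOG" ]; then
    echo "Cache usage: $(cache_usage_mb "$CACHE_DIR") MB (limit: $LIMIT_MB MB)"
    echo "Disk cache errors logged: $(count_cache_errors "$ERROR_LOG")"
fi
```

A usage figure at or above the limit, together with a non-zero error count, is consistent with the situation described here.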

Resolution

  • adjust the disk cache size defined in "/etc/opt/novell/ag/mod_disk_cache_monitor.conf"

  • turn off the proxy cache size monitoring process using the following Advanced Option:
    "DiskCacheMonitorStats off"

  • use the Apache "htcacheclean" tool to make sure the cache size is kept within a defined limit.

  • Example of running htcacheclean every 30 minutes as a cron job (system crontab format with a user field, as used in "/etc/crontab" or "/etc/cron.d"):

    ################################################################################
     */30  * * * *   root /opt/novell/apache2/sbin/htcacheclean -v -t -p/var/cache/novell-apache2 -l10240M >> /var/log/messages 2>&1
    ################################################################################



  • Example of running htcacheclean as a daemon using an init script for SLES (the cleanup interval is set in minutes through INTERVAL, 5 in this example):

    ################################################################################
    #! /bin/sh

    ### BEGIN INIT INFO
    # Provides:          htcacheclean
    # Required-Start:    $network
    # Required-Stop:
    # Default-Start:     3 5
    # Default-Stop:      0 1 2 6
    # Short-Description: htcacheclean daemon, keeps the Apache disk cache within its size limit
    # Description:       Runs htcacheclean as a daemon that periodically
    #    cleans the Apache disk cache. We want it to be active in
    #    runlevels 3 and 5, as these are the runlevels with the
    #    network available.
    ### END INIT INFO

    # Binary location, cache size limit, and cleanup interval in minutes
    HTCACHECLEAN_BIN=/opt/novell/apache2/sbin/htcacheclean
    CACHESIZE="10G"
    INTERVAL="5"

    # Check for missing binaries
    test -x $HTCACHECLEAN_BIN || { echo "$HTCACHECLEAN_BIN not installed";
            if [ "$1" = "stop" ]; then exit 0;
            else exit 5; fi; }


    # Load the rc.status script for this service.
    . /etc/rc.status

    # Reset status of this service
    rc_reset

    case "$1" in
        start)
            echo -n "Starting htcacheclean cache cleanup daemon "
            ## Start daemon with startproc(8). If this fails
            ## the return value is set appropriately by startproc.
            startproc $HTCACHECLEAN_BIN -p/var/cache/novell-apache2/ -l$CACHESIZE -d$INTERVAL -n -t
            # Remember status and be verbose
            rc_status -v
            ;;
        stop)
            echo -n "Shutting down htcacheclean cache cleanup daemon "
            ## Stop daemon with killproc(8) and if this fails
            ## killproc sets the return value according to LSB.

            killproc -TERM $HTCACHECLEAN_BIN

            # Remember status and be verbose
            rc_status -v
            ;;
        restart)
            ## Stop the service and regardless of whether it was
            ## running or not, start it again.
            $0 stop
            $0 start

            # Remember status and be quiet
            rc_status
            ;;
        status)
            echo -n "Checking for service htcacheclean "
            ## Check status with checkproc(8), if process is running
            ## checkproc will return with exit status 0.

            # Return value is slightly different for the status command:
            # 0 - service up and running
            # 1 - service dead, but /var/run/  pid  file exists
            # 2 - service dead, but /var/lock/ lock file exists
            # 3 - service not running (unused)
            # 4 - service status unknown :-(
            # 5--199 reserved (5--99 LSB, 100--149 distro, 150--199 appl.)

            # NOTE: checkproc returns LSB compliant status values.
            checkproc $HTCACHECLEAN_BIN
            # NOTE: rc_status knows that we called this init script with
            # "status" option and adapts its messages accordingly.
            rc_status -v
            ;;
        *)
            ## If no parameters are given, print which are available.
            echo "Usage: $0 {start|stop|status|restart}"
            exit 1
            ;;
    esac
    ################################################################################
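
The init script above still has to be installed and registered before it runs at boot. The following is a minimal sketch of that step, assuming the script was saved to a local file and should be installed under the name htcacheclean. The function name install_init_script and the installed file name are illustrative and not taken from the product documentation. insserv is the SLES tool that reads the INIT INFO header.

```shell
#!/bin/sh
# Sketch: install an init script and register it for its default runlevels.
# Usage: install_init_script SOURCE_FILE [TARGET_DIR]
install_init_script() {
    src=$1
    target_dir=${2:-/etc/init.d}
    name=htcacheclean            # assumed name, matching the "Provides:" tag

    # Copy the script into place and make it executable.
    cp "$src" "$target_dir/$name" || return 1
    chmod 755 "$target_dir/$name" || return 1

    # Register with the boot system only when installing into the real
    # init directory and insserv is available on this machine.
    if [ "$target_dir" = /etc/init.d ] && command -v insserv >/dev/null 2>&1; then
        insserv "$name"
    fi
}
```

Run as root, a call such as "install_init_script ./htcacheclean.init" followed by "/etc/init.d/htcacheclean start" would install and start the daemon. The source file name here is again only an example.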

Cause

The Apache disk cache filled up because it was configured with a very small default size of 1 GB.

Additional Information

  • The Access Gateway Service and the Access Gateway Appliance use the Apache mod_disk_cache module for web object caching. The cache size is in general limited by the partition size of the configured cache location, which by default is "/var/cache/novell-apache2".

  • each proxy service gets its own sub-directory, named after the DNS name of the proxy service, which by default is created below the "/var/cache/novell-apache2/" directory

  • the directive defining the size is "DiskCacheMonitorCacheStoreSize", which sets the size of the cache store in megabytes. It is stored at:
    --------------------------------------------------------------------------------
    Linux:  "/etc/opt/novell/ag/mod_disk_cache_monitor.conf"
    Windows: "C:\Program Files\Novell\ag\ac"
    --------------------------------------------------------------------------------

  • The Access Gateway runs a background cleaning process (an htcacheclean variant)
    that checks the cache against the defined size limit.

  • The Advanced Option directive which can be used to switch this monitor process off is: "DiskCacheMonitorStats off"

    Note: Caching is enabled by default for your Access Gateway, and the proxy will use
    the HTTP cache control headers provided by the web server.
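
To see how the configured limit relates to what each proxy service actually stores, the directive value and the per-service sub-directories can be inspected together. The following is a sketch only. It assumes the configuration file uses the usual Apache "Directive value" syntax, which this article does not show, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: report the configured cache store size and per-proxy-service usage.
CONF=${CONF:-/etc/opt/novell/ag/mod_disk_cache_monitor.conf}
CACHE_DIR=${CACHE_DIR:-/var/cache/novell-apache2}

configured_size_mb() {
    # Print the value of the last DiskCacheMonitorCacheStoreSize line in $1.
    # Assumes "Directive value" syntax; commented-out lines are ignored.
    awk 'tolower($1) == "diskcachemonitorcachestoresize" { v = $2 }
         END { if (v != "") print v }' "$1"
}

usage_per_service_mb() {
    # Print "<MB><TAB><dir>" for each sub-directory of cache directory $1.
    for d in "$1"/*/; do
        [ -d "$d" ] && du -sm "$d"
    done
    return 0
}

# Only report when the default paths exist on this machine.
if [ -r "$CONF" ] && [ -d "$CACHE_DIR" ]; then
    echo "Configured limit (MB): $(configured_size_mb "$CONF")"
    usage_per_service_mb "$CACHE_DIR"
fi
```

If the sum of the per-service figures approaches the configured limit, the size can be raised or the cache cleaned with htcacheclean.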