Access Gateway Web server healthcheck fails to check status of all back end web servers

  • 7012561
  • 06-Jun-2013
  • 06-Jun-2013

Environment

NetIQ Access Manager 3.2
NetIQ Access Manager 3.2 Access Gateway

Situation

Access Manager is set up with the Access Gateway (AG) accelerating a large number of back-end web servers. The health check reports that the health of many of these back-end web servers remains unchecked, and the following message is shown for each affected back end:

"Worker connectivity not checked"

The end result is that some web servers that are actually down are not reported as down in the iManager health check.

The health check appears to test each resource in turn; after two or three of them fail, it stops checking the remaining resources further down the list, even though most of those services are up and serving users.

Resolution

Fixed in Access Manager 3.2 SP2.

Cause

The health check made a blocking call with a 2-second timeout for each TCP connection to a back-end web server, while the total timeout assigned to each web server health operation was 8 seconds. If the AG could not connect to all the web servers within those 8 seconds, it did not check the rest of the back-end servers.
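
The arithmetic behind this can be illustrated with a small simulation. This is only a sketch of the behavior described above, not the Access Gateway's actual code: the function name, the "up"/"down" labels, and the simulated connect times are assumptions made for illustration. Only the 2-second and 8-second values and the "Worker connectivity not checked" message come from this article.

```python
from typing import Dict, List, Optional

CONNECT_TIMEOUT_S = 2.0  # per-connection timeout (from the article)
TOTAL_BUDGET_S = 8.0     # total time for one health operation (from the article)
NOT_CHECKED = "Worker connectivity not checked"


def simulate_health_check(
    servers: List[str],
    connect_time: Dict[str, Optional[float]],
) -> Dict[str, str]:
    """Simulate sequential, blocking connection checks under a shared budget.

    connect_time maps each server to the seconds its TCP connect would take,
    or None if the server never answers (hypothetical input, no real sockets).
    """
    elapsed = 0.0
    results: Dict[str, str] = {}
    for name in servers:
        # Once the shared budget is spent, remaining servers are skipped.
        if elapsed >= TOTAL_BUDGET_S:
            results[name] = NOT_CHECKED
            continue
        cost = connect_time[name]
        if cost is None or cost > CONNECT_TIMEOUT_S:
            # A dead server blocks for the full per-connection timeout.
            elapsed += CONNECT_TIMEOUT_S
            results[name] = "down"
        else:
            elapsed += cost
            results[name] = "up"
    return results


if __name__ == "__main__":
    # Four unreachable servers consume the whole 8-second budget (4 x 2 s),
    # so the servers listed after them are never checked.
    times = {"ws1": None, "ws2": None, "ws3": None, "ws4": None,
             "ws5": 0.01, "ws6": None}
    for server, status in simulate_health_check(list(times), times).items():
        print(server, status)
```

In this model, a handful of unreachable servers near the top of the list is enough to exhaust the budget, so later servers are left unchecked whether they are up (ws5) or down (ws6). That is consistent with the symptom described under Situation.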