GET request to any Access Gateway Service protected resource just shows "Server error!" message

  • 7017003
  • 19-Nov-2015
  • 19-Nov-2015

Environment

NetIQ Access Manager 4.1
Access Gateway Service on Red Hat Enterprise Linux version 6.7
Problem only exists with authentication enabled - public resources work fine

Situation

NAM 4.1 was installed with the Access Gateway Service running on RHEL 6.7. A simple reverse proxy accelerating a sample application was created, and everything worked fine in public mode. As soon as a contract was enabled for the protected resource, users would get a "Server error!" message in the browser.

Looking at the error_log file in debug mode, we would see many references to "mkusr:fail". This occurs at the point where the user's session cookie should be created (the cookie is set in the corresponding 302 redirect to the ESP), but the creation is failing.

Nov 17 17:16:42 dvlnamag01 httpd[5984]: [info] Initial (No.1) HTTPS request received for child 21 (server dvlnamag01.novell.com:443)
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [debug] ../mod_auth_liberty.c(707): AMEVENTID#180: Host Header is dvlnamag01.novell.com
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [debug] ../mod_auth_liberty.c(714): AMEVENTID#180: total no of Host are 1, r->hostname = dvlnamag01.novell.com, r->server->server_hostname = dvlnamag01.novell.com
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [debug] ../prerror.cpp(637): AM#604600000 AMDEVICEID#ag-0DB39C50EE09059E: AMAUTHID#: AMEVENTID#180: Requ: GET https://dvlnamag01.novell.com/prweb/?a  service:Base (10.229.10.36:59125->10.27.73.48:443)
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [error] (-1)Unknown error 18446744073709551615: AMEVENTID#180: mkusr:fail:03000300000000000000000000000000611b7ac0
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [info] AM#504600000 AMDEVICEID#ag-0DB39C50EE09059E: AMAUTHID#: AMEVENTID#180: validateCookie:local user.
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [info] AM#504600100 AMDEVICEID#ag-0DB39C50EE09059E: AMAUTHID#: AMEVENTID#180: Restricted URL
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [info] AM#504600000 AMDEVICEID#ag-0DB39C50EE09059E: AMAUTHID#: AMEVENTID#180: matched PR:root
Nov 17 17:16:42 dvlnamag01 httpd[5984]: [info] AM#504600401 AMDEVICEID#ag-0DB39C50EE09059E: AMAUTHID#: AMEVENTID#180: lredir1:https://dvlnamag01.novell.com:443/nesp/app/plogin?c=name/password/uri&%22https://dvlnamag01.hban.us/prweb/?a%22



The version of RHEL is custom-built and locked down.

Resolution

It was a permissions issue. The "wwwrun" user had been pre-created, but without membership of the novlag group. Adding the wwwrun user to the novlag group (in /etc/group) fixed the issue:

novlag:!:1001:wwwrun,novlwww
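
As a rough illustration of checking for the group entry above, the following shell sketch tests whether a user is listed as a member of a group in a group(5)-format file. The function name group_has_member is hypothetical and not part of Access Manager. The usermod line is one possible way to add the membership and is shown only as a comment, since it has to be run as root.

```shell
#!/bin/sh
# group_has_member FILE GROUP USER   (hypothetical helper, for illustration only)
# Exit status 0 if USER appears in the member list (4th field) of GROUP
# in a group(5)-format FILE, otherwise 1.
group_has_member() {
    awk -F: -v g="$2" -v u="$3" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Read-only check against the live file:
#   group_has_member /etc/group novlag wwwrun || echo "wwwrun is not in novlag"
# One possible fix, as root (-a appends to the existing supplementary groups):
#   usermod -a -G novlag wwwrun
```

A running process keeps the group list it started with, so the Apache processes would most likely need a restart before a new membership takes effect.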

The RHEL server was heavily locked down for security reasons, with all users created on the system having minimal rights. Some of these users are needed by the AG-specific services and were impacted as a result.

Cause

The issue was identified using the Linux strace command. Running it against the Apache process when the issue occurred showed the following errors over and over again, where the process fails to connect to the AGSCD socket in /var/opt/novell/ag/sc.

26007 connect(13, {sa_family=AF_LOCAL, sun_path="/var/opt/novell/ag/sc/socket"}, 30) = -1 EACCES (Permission denied)
26007 shutdown(13, SHUT_RDWR)           = 0
26007 setsockopt(13, SOL_SOCKET, SO_LINGER, {onoff=1, linger=15}, 8) = 0
26007 close(13)                         = 0
26007 socket(PF_LOCAL, SOCK_STREAM, 0)  = 13
26007 connect(13, {sa_family=AF_LOCAL, sun_path="/var/opt/novell/ag/sc/socket"}, 30) = -1 EACCES (Permission denied)
26007 shutdown(13, SHUT_RDWR)
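
A trace like the one above can be captured by attaching strace to a running process and writing the output to a file. The sketch below then summarizes which UNIX socket paths are being refused with EACCES. The function name count_eacces_connects, the PID placeholder and the output path /tmp/httpd.strace are assumptions made for this example only.

```shell
#!/bin/sh
# Capture step (as root); <PID> is the Apache process to inspect:
#   strace -f -p <PID> -o /tmp/httpd.strace
#   -f follows child processes, -p attaches to a PID, -o writes to a file.

# count_eacces_connects FILE   (hypothetical helper, for illustration only)
# Prints a count and the socket path for every connect() call in a saved
# strace output FILE that failed with EACCES.
count_eacces_connects() {
    grep 'connect(.*EACCES' "$1" \
        | sed -n 's/.*sun_path="\([^"]*\)".*/\1/p' \
        | sort | uniq -c
}

# Usage:
#   count_eacces_connects /tmp/httpd.strace
```

A single path with a high count, as in the trace above, suggests that the process is retrying the same socket and being refused each time.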

It would appear that we were running into a UID/GID problem with write access to the socket. Running "chmod -R 777 /var/opt/novell/ag/sc" on the above directory initially seemed to fix the problem, but drilling down into the users, groups and rights allowed us to fix it with a better solution (the group membership change described under Resolution).
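
As a less intrusive alternative to opening up the directory, the ownership and mode of the socket path and each of its parent directories can be inspected first. The following is a minimal sketch, assuming GNU stat (as shipped with RHEL). The function name show_path_perms is hypothetical.

```shell
#!/bin/sh
# show_path_perms PATH   (hypothetical helper, for illustration only)
# Prints "owner:group octal-mode path" for PATH and each parent directory,
# stopping before "/" (or "." for a relative path).
show_path_perms() {
    p=$1
    while [ -n "$p" ] && [ "$p" != "/" ] && [ "$p" != "." ]; do
        stat -c '%U:%G %a %n' "$p"
        p=$(dirname "$p")
    done
}

# Usage:
#   show_path_perms /var/opt/novell/ag/sc/socket
```

Comparing the owner and group printed here with the output of "id wwwrun" would show whether the Apache user can reach the socket through group rights.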