Applying configuration change to Linux Access Gateway cluster shows command status in 'Pending' for up to 15 minutes

  • 7005580
  • 01-Apr-2010
  • 26-Apr-2012

Environment

Novell Access Manager 3.1 Linux Access Gateway
Novell Access Manager 3.1 Support Pack 1 applied
Novell Access Manager 3.1 Linux Novell Identity Server
Novell Access Manager 3.1 Windows Novell Identity Server
Novell Access Manager 3.1 SSLVPN Server

Situation

When an update is applied to push a configuration change to a two-node Linux Access Gateway (LAG), Identity Server (IDP) or SSLVPN ESP cluster, the Status and/or Command Status on the Access Gateways page shows "Pending" for up to 15 minutes. During this time, clicking the "Pending" link in the Command column shows that the outstanding command is "<device_name> Service Provider Refresh". The problem cannot be reproduced consistently and seems to occur with certain updates and not others. The following list shows the events that can trigger a Tomcat restart:

Installation:
- When re-importing IDP/ESP on a Windows machine.

Configuration Changes (that implicitly restart Tomcat, without user intervention):
- Making a change to the /etc/hosts entry for the LAG
- Making changes to the Connector keystore for the SSL VPN ESP
- Making changes to the Connector, Consumer, Provider keystores of the IDP

Configuration Changes (that will prompt the user to restart):
- Some changes to the SSL VPN configuration (timeout, encryption algorithm, policies, etc.)
- Changes to the SSL VPN ESP (redirect option)
- Policy > Extensions > Distribute Jars
- Creating IDP cluster
- When IDP protocol configuration is enabled/disabled

Resolution

Engineering is working on the issue. For now, schedule changes in the following areas for times when the impact of the delay is minimal:

- changes to /etc/hosts entry for LAG
- changes to Connector keystore for SSL VPN ESP
- changes to Connector, Consumer, Provider keystores of IDP server

The delay occurs because Tomcat is restarted while the JCC process (itself a Tomcat application) is still processing the reconfigure command. If JCC command processing exits abruptly, whether because of a Tomcat restart or for some other reason, the Admin Console effectively waits 12 minutes (10-minute JCC command send timeout plus 2-minute reschedule time) before retrying.
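When planning a maintenance window, it can help to estimate when the Admin Console would retry an interrupted command. The following is a minimal Python sketch of that arithmetic only. The 10-minute and 2-minute values come from the paragraph above; the constant and function names are hypothetical and do not correspond to anything in the product.

```python
from datetime import datetime, timedelta

# Values quoted in this document; the names are illustrative only.
JCC_SEND_TIMEOUT = timedelta(minutes=10)  # JCC command send timeout
RESCHEDULE_DELAY = timedelta(minutes=2)   # reschedule time before the retry


def earliest_retry(command_sent_at: datetime) -> datetime:
    """Return the earliest time the Admin Console would retry the command,
    assuming the command was interrupted right after it was sent."""
    return command_sent_at + JCC_SEND_TIMEOUT + RESCHEDULE_DELAY


if __name__ == "__main__":
    sent = datetime(2010, 4, 1, 9, 0)
    print(earliest_retry(sent))  # 2010-04-01 09:12:00
```

This covers only the 12-minute wait described above. It does not account for any additional time the device needs to finish applying the configuration.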

Additional Information

The app_sc log file shows the following exception when the error occurs:

com.volera.vcdn.application.sc.core.DeviceInfo(M)isConfigPending
(E)<b>UNKNOWN ERROR</b><br/>An unknown error has occurred.<br/>No specific information was received about this error.<br/>Please check the logs for more information.<!-- y:1514 VCDNException::ErrorCode=Scheduler-100.General,ResourceBundle=resources.application.sc.Resource,MessageKey=Scheduler-100.General --> ::
Detailed Exception:<b>UNKNOWN ERROR</b><br/>An unknown error has occurred.<br/>No specific information was received about this error.<br/>Please check the logs for more information.<!-- y:639 VCDNException::ErrorCode=Scheduler-100.General,ResourceBundle=resources.application.sc.Resource,MessageKey=Scheduler-100.General --> ::
Detailed Exception:javax.naming.NamingException: com.volera.vcdn.platform.storage.core.SException:
com.volera.vcdn.platform.storage.protocol.ldap.SLdapExceptionDispatcher.login (ldaps://10.251.202.151:636/o=novell/cn=admin, com.volera.vcdn.platform.storage.core.SPasswordCredentials@8697e0)(): login (ldaps://10.251.202.151:636/o=novell/cn=admin, com.volera.vcdn.platform.storage.core.SPasswordCredentials@8697e0) failed
at com.volera.vcdn.application.sc.scheduler.A$_A.A(y:1514)
at com.volera.vcdn.application.sc.scheduler.A$_A.access$000(y:2875)
at com.volera.vcdn.application.sc.scheduler.A.A(y:3294)

An exception is thrown in the Config Update work path. The Config Update work must be rescheduled after this event, but the rescheduling is not done until 10 minutes after the second retry, which causes the delay.
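To confirm that a delay matches this issue, the app_sc log can be searched for the Scheduler-100.General error code shown in the excerpt above. The following Python sketch is one way to do that. The function name is hypothetical, and the log file path is passed on the command line because its location is not stated in this document.

```python
import re
import sys
from typing import Iterable, List

# Error code taken from the log excerpt in this document.
ERROR_CODE = re.compile(r"ErrorCode=Scheduler-100\.General")


def find_scheduler_errors(lines: Iterable[str]) -> List[int]:
    """Return the 1-based numbers of lines containing the error code."""
    return [n for n, line in enumerate(lines, start=1)
            if ERROR_CODE.search(line)]


if __name__ == "__main__":
    # Usage: python find_errors.py <path to the app_sc log file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for number in find_scheduler_errors(log):
            print(f"Scheduler-100.General found on line {number}")
```

A match only indicates that the same exception was logged. Compare the timestamps of the matching entries with the time the configuration change was applied before concluding that this issue caused the delay.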