RADIUS users are timing out when authenticating.
(Last modified: 05Sep2005)
This document (10061640) is provided subject to the disclaimer at the end of this document.
Novell BorderManager Authentication Services 3.5
Novell BorderManager Authentication Services 3.6
Using RADIUS with or without RADIUS Proxy
3rd Party Network Access Server (NAS)
RADIUS users are timing out when authenticating.
Users eventually receive "Access Accepted"; however, the connection times out and they are unable to log in.
The RADIUS debug log (SYS:ETC\RADIUS\DEBUG\RADDBG.log) shows the following messages:
- "Special Q Handling, Message dropped"
- "Inserting into RespQ , code(2) id(184)"
- "Reusing previous message in queue (Q is full)"
RADIUS has a built-in mechanism for handling duplicate requests for the same user (specifically, intelligent handling of retries), so that it does not attempt to authenticate the same user multiple times. If RADIUS receives a duplicate packet, it drops the duplicate and increments the "Special Q Handler" counter. A large number of dropped duplicates is symptomatic of either NDS name resolution problems or settings specific to the NAS.
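The duplicate handling described above can be illustrated with a small sketch. It is not the BMAS implementation, whose internals are not documented here. It assumes the convention from the RADIUS specification that a retransmit is recognized by its source IP address, source UDP port and Identifier. The class name `RetryFilter` and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RetryFilter:
    """Drops retransmitted requests while the original is still in progress.

    Hypothetical sketch; the key used by the real server is an assumption.
    """

    in_progress: set = field(default_factory=set)
    dropped: int = 0  # plays the role of the "Special Q Handling" counter

    def accept(self, source_ip: str, source_port: int, identifier: int) -> bool:
        """Return True for a new request, False for a duplicate."""
        key = (source_ip, source_port, identifier)
        if key in self.in_progress:
            # Duplicate: drop it immediately and count it, do not queue it.
            self.dropped += 1
            return False
        self.in_progress.add(key)
        return True

    def finish(self, source_ip: str, source_port: int, identifier: int) -> None:
        """Forget a request once its response has been sent."""
        self.in_progress.discard((source_ip, source_port, identifier))
```

In this model, a NAS that retransmits every 3 seconds while the server is still resolving the user name would increment `dropped` once per retransmit, which matches the pattern of repeated "Special Q Handling" messages.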
To resolve this, we did the following:
1- Increased the Accounting retry interval (on the NAS) from 20 to 60 seconds and the Authentication retry interval from 3 to 15 seconds.
2- Loaded RADIUS with the AFFINITY=<preferred replica servername> option.
NOTE: This feature, new in BMAS 3.5 and later, allows you to specify a "preferred replica server" which holds replicas of most/all user partitions. If the RADIUS server does not have a local R/W replica for a user or service, it will first try to resolve the request on the server specified in the Affinity setting before walking the tree to find it.
3- Increased the THREADS parameter from 5 to 20 and increased the AUTHENTICATION THREADS from 3 per socket to 10 per socket.
Note: The authentication threads parameter is new to BM 3.6 and is included in RADIUS v3.20 and later.
NOTE: Details on the threads and authentication threads:
RADIUS v3.20 and later contain a performance enhancement that, by default, raises the number of authentications per second from approximately 7 to approximately 33. This enhancement is provided by a new LOAD RADIUS 'authentication threads' option:
- AUTHTHREADS=<number of threads>
The default setting is 3, but this can be set as high as the server is capable of handling. In comparison, the default authentication threads setting in BMAS 3.5 (RADIUS v3.13 and earlier) was 1. This option defines the number of threads listening on the authentication socket.
This setting does not affect the accounting socket and is different from the THREADS=<number of threads> LOAD RADIUS parameter. The THREADS parameter sets the number of threads that actually process the authentication requests; its default is 5.
Specifically, focus on increasing the retry interval settings (NAS) and try raising the authentication threads to 10. (The "Special Q Handling" messages relate directly to the authentication threads.)
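To see why a longer retry interval reduces duplicate packets, the following sketch counts how many retransmits a NAS would send while waiting for a reply. It assumes a fixed retry interval and no retry limit, which is a simplification. The function name and the 10-second reply delay used below are hypothetical values chosen for illustration, not measurements.

```python
import math


def retransmits_before_reply(reply_delay_s: float, retry_interval_s: float) -> int:
    """Count retransmits sent strictly before the reply arrives.

    Assumes a fixed retry interval and no retry limit (a simplification).
    """
    if retry_interval_s <= 0:
        raise ValueError("retry interval must be positive")
    if reply_delay_s <= 0:
        return 0
    # A retransmit fires at each multiple of the interval that elapses
    # before the reply is received.
    return math.ceil(reply_delay_s / retry_interval_s) - 1
```

With a hypothetical 10-second reply delay, `retransmits_before_reply(10, 3)` gives 3 duplicates for the original 3-second interval, while `retransmits_before_reply(10, 15)` gives 0 for the 15-second interval used in the fix.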
Q1- If the same request comes in (a retransmit) from either the original proxy server or another proxy server for the same user account, does BMAS 3.5 drop the additional requests immediately, or does it let them accumulate in the queue (assuming all threads are already busy processing something)?
A1- RADIUS has "intelligent handling of retries" built in which actually drops the duplicate requests immediately. In other words, the duplicate requests are not stored in a queue.
Q2- If BMAS 3.5 drops those requests, is that why I am seeing this RADIUS debug message:
"Special Q Handling, message dropped (total=14162 or some other number in here)"
A2- Correct. If RADIUS receives a duplicate request, it increments the "duplicate packet counter" which is reflected in the "Special Q Handling" messages in RADDBG.log.
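To track the duplicate packet counter over time, a copy of RADDBG.log can be scanned for these messages. The sketch below assumes the message text quoted above, with the counter appearing as `total=<number>` on the same line. The exact log layout may differ between RADIUS versions, so treat the pattern as an assumption. The function name is hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed message shape, based on the text quoted in this document.
DROP_RE = re.compile(
    r"Special Q Handling, message dropped(?:.*?total=(\d+))?",
    re.IGNORECASE,
)


def summarize_drops(lines: Iterable[str]) -> Tuple[int, Optional[int]]:
    """Return (number of drop messages seen, last reported total or None)."""
    seen = 0
    last_total = None
    for line in lines:
        match = DROP_RE.search(line)
        if match is None:
            continue
        seen += 1
        if match.group(1) is not None:
            last_total = int(match.group(1))
    return seen, last_total
```

To use it, copy the log to a workstation and pass the open file object, for example `summarize_drops(open("RADDBG.log", errors="replace"))`. Comparing the result before and after changing the NAS retry intervals shows whether the duplicates are decreasing.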
Q3- If those Access-Request messages are not dropped but rather queued, how big is the RADIUS queue and at what point does it become full? I have also noticed messages in the BMAS debug log saying "(Q is full)".
A3- Not applicable, as the messages are not actually queued.
Q4- With regards to user authentication, I know Novell has stated that a full context along with the username results in faster resolving of the name and authentication process. If lookup contexts are used (configured in the Dial Access System attribute) and knowing that they are cached according to Novell's documentation on BMAS by the radius NLM, should there be that much of a difference in name resolution/authentication when I use just the username compared to a full context that is typed in along with the username for resolution?
A4- Using the full distinguished username is always going to be faster. When NDS knows exactly which context the user is in, it can resolve the name more quickly. The Lookup Context is more of a convenience feature, so that users can authenticate with their common name and are not required to enter a long context.
As you may have noticed in the RADDBG.log (and have mentioned from the documentation), RADIUS does indeed cache and prioritize lookup contexts based on which context receives the most requests for authentication. RADIUS will then search the tree beginning with the "top" container in the Lookup Context cache. If it does not find it there, it moves to the next container in cache, and so on... So, if your user is in the second container in cache, this process is essentially twice as slow. (Yet, keep in mind that we are measuring this in milliseconds, so it is not a huge factor.)
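The prioritized search described above can be sketched as follows. It is a simplified model, not the RADIUS NLM's actual code: the contexts are ordered by how many authentication requests each has received, and the search stops at the first context that contains the user. The function name, the `exists` callback and the context names in the usage note are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple


def find_user(
    common_name: str,
    hit_counts: Dict[str, int],
    exists: Callable[[str, str], bool],
) -> Tuple[Optional[str], int]:
    """Search contexts from most- to least-requested.

    Returns (matching context or None, number of contexts searched).
    """
    ordered = sorted(hit_counts, key=hit_counts.get, reverse=True)
    for lookups, context in enumerate(ordered, start=1):
        if exists(common_name, context):
            # A successful lookup raises this context's priority.
            hit_counts[context] += 1
            return context, lookups
    return None, len(ordered)
```

For example, with hit counts of 40 for a hypothetical `ou=sales.o=acme` and 10 for `ou=eng.o=acme`, a user in `ou=eng.o=acme` is found on the second lookup, which is the "twice as slow" case mentioned above.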
Q5- When a user logs in for the first time (and is able to log in) using just the username (without the full context), does the RADIUS process cache any of this user's information (such as its context or RADIUS attributes) so that it is readily available for a period of time if the user logs in again, or is there a new search for this user's context and properties every time the user requests authentication?
A5- There is no caching of user information or attributes. RADIUS performs a new search each time the user requests authentication.
Q6- I have noticed that if a GroupWise external object and a user have the same name (but are under different OUs), and the RADIUS server finds the external GroupWise object first when it walks the tree, it tries to use that object to validate the authentication. This ultimately results in a -603 error because the external GroupWise object does not have dial access attributes associated with it. Why would the RADIUS server consider a GroupWise external object for possible resolution of an authentication request? Is this because the RADIUS server does not care what object class it is but simply looks for a matching name?
A6- You are absolutely correct... RADIUS does not care which object class the matching object belongs to. It merely performs a common name search for the user.
Q7- How does the RADIUS server walk the tree to get authentication information? For example, I have a user and a GroupWise external object under different OUs. The RADIUS server holds the master replica of the partition that contains the user (this partition is not the [root] partition). The GroupWise external object is under the [root] partition, which does not have a replica on the RADIUS server itself. I have noticed that when the same name is used for an authentication request, the RADIUS server seems to walk the tree and find the GroupWise external object first, even though it has a local replica of the user object. I have to assume then that RADIUS walks from the [root] partition down, but which partition would it hit next? Related to this question: once RADIUS finds a matching user for the authentication request, does it search the other contexts specified for authentication for the same object name, or does it stop after the first match?
A7- Actually, RADIUS looks first in the partition holding the DAS object before it begins to walk the tree. It "should" find the user in the local replica (if one is present); then it performs a DS Resolve Name (query) request over NCP (TCP, UDP and IPX). Any servers holding a replica with this username then respond. RADIUS ACKs the first response it receives and verifies that the user is enabled for RADIUS. RADIUS then tries to authenticate the user with the information it received from the remote replica. Although I wouldn't recommend having duplicate usernames in your tree, I believe RADIUS will continue its search for the valid user. (The Affinity setting mentioned above has proved to be very useful. Rather than randomly querying for a remote replica holding the user's partitions, you can put the partitions where the RADIUS users reside on the replica server and point Affinity to this replica server. RADIUS treats this server as a "preferred replica server" when making NDS Resolve Name requests.)
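The effect of the Affinity setting on the resolution order can be modeled with a short sketch. It is a simplification under stated assumptions: the tree walk is represented as a plain list of servers tried in sequence, whereas the real server sends a query and accepts the first response. The function names, the `holds` callback and the server names in the usage note are hypothetical.

```python
from typing import Callable, List, Optional


def resolution_order(
    local_server: str,
    affinity_server: Optional[str],
    other_servers: List[str],
) -> List[str]:
    """Local replica first, then the preferred replica server, then the rest."""
    order = [local_server]
    if affinity_server and affinity_server not in order:
        order.append(affinity_server)
    for server in other_servers:
        if server not in order:
            order.append(server)
    return order


def resolve(
    name: str,
    order: List[str],
    holds: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the first server in the order that holds the name, or None."""
    for server in order:
        if holds(server, name):
            return server
    return None
```

With a hypothetical RADIUS server `RAD1`, an Affinity server `REP1` and other servers `SRV2` and `SRV3`, `resolution_order("RAD1", "REP1", ["SRV2", "REP1", "SRV3"])` yields `RAD1`, `REP1`, `SRV2`, `SRV3`, so a user held on both `REP1` and `SRV3` is resolved on `REP1`.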
The origin of this information may be internal or external to Novell. Novell makes all reasonable efforts to verify this information. However, the information provided in this document is for your information only. Novell makes no explicit or implied claims to the validity of this information.
Any trademarks referenced in this document are the property of their respective owners. Consult your product manuals for complete trademark information.
- Document ID:
- Solution ID: NOVL44952
- Creation Date: 09Apr2001
- Modified Date: 05Sep2005
- Novell BorderManager Services