Environment
Novell Client for Windows 2000/XP/2003 4.91 Support Pack 2
Novell Client for Windows 2000/XP/2003 4.91 Support Pack 1
Novell Client for Windows 2000/XP/2003 4.91
Novell Client for Windows 2000/XP/2003 4.90 Support Pack 2
Novell Client for Windows 2000/XP/2003 4.91 Support Pack 1a
Microsoft Windows XP Professional Support Pack 2
Situation
The workstations are all joined to a Microsoft Active Directory (AD)
domain. With the Novell Client32 installed, login takes
2.5 to 3 minutes. Without Client32 installed, login takes
only 1 to 2 seconds.
Watching the logon traffic with Ethereal:
- On login, the MS client makes a request to a Domain Controller for a Kerberos TGT (ticket-granting ticket)
- The client gets an answer back from the Domain Controller
- If the Novell Client32 is installed, it waits for 15 to 30 seconds with no network traffic at all, then repeats the request/response/pause pattern several more times.
- If the Novell client is not installed, it continues on with the login process immediately.
This happens both with and without "workstation only" checked,
using domain authentication.
The machine tested is a new Dell workstation, configured with
a Broadcom gigabit NIC. It is plugged into a D-Link Ethernet
10/100 desktop switch, which connects to a 100 Mbit Cisco
switch.
The problem is in the MS Kerberos client and involves the way
it uses UDP by default. This conflicts with a seemingly
unrelated setting in Client32. The MS Kerberos reply packet
contains the usual Kerberos data, plus MS-specific
information appended to the end, such as group memberships.
This tends to balloon the reply packet.
Without the Novell client installed, the reply packet seems to
be around 1661 bytes for this particular user in this
domain. It comes in one UDP packet.
With the Novell client installed, the reply packet seems to
come in at 1400 bytes, with the fragmentation bit set in the
headers. No following fragment is seen in the trace.
This leads to a packet that Kerberos can't use, so it backs off
5/10/20/30 seconds and tries again, over and over, until finally
giving up.
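The size arithmetic behind this can be sketched in a few lines. The following is a simplified model, not a reproduction of the trace: it assumes the 1661-byte figure is the UDP payload, that the SLP MTU acts as the effective limit for UDP datagrams on the machine, and that the IPv4 header carries no options. The function name count_fragments and its parameters are hypothetical.

```python
# Simplified model of IPv4 fragmentation for one UDP datagram.
# Assumptions (not from the trace): 20-byte IPv4 header without options,
# 8-byte UDP header, and the given MTU applied to the whole IP packet.
IPV4_HEADER = 20
UDP_HEADER = 8


def count_fragments(udp_payload: int, mtu: int) -> int:
    """Return how many IP fragments are needed to carry one UDP datagram."""
    if mtu <= IPV4_HEADER + UDP_HEADER:
        raise ValueError("MTU too small to carry any UDP payload")
    ip_payload = udp_payload + UDP_HEADER
    if ip_payload + IPV4_HEADER <= mtu:
        return 1  # fits in a single, unfragmented packet
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return -(-ip_payload // per_fragment)  # ceiling division


if __name__ == "__main__":
    for mtu in (1400, 1700):
        print(mtu, count_fragments(1661, mtu))
```

Under these assumptions, a 1661-byte reply needs two fragments at an MTU of 1400 and fits in a single packet at 1700. If the second fragment never arrives, the UDP datagram cannot be reassembled, which matches the retry behavior described above.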
Resolution
Things that help:
- Adjust the SLP Maximum Transmission Unit setting in the Novell Client. The default of 1400 bytes seems to bring on the fragmented reply packet. Changing this to 1700 bytes helps.
- Replace the D-Link switch. Connecting either directly to the Cisco switch or via an intermediate Cisco desktop switch helps. It is not clear why this would matter.
- Change the MS Kerberos client to work over TCP. This can be done on the machine or via a Group Policy in the domain. Once changed, the Kerberos reply packet will still come in fragmented, but all fragments will be seen and reassembled by the stack, so the Kerberos client will work.
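For the per-machine Kerberos/TCP change, Microsoft documents a MaxPacketSize registry value under the Kerberos Parameters key; setting it to 1 causes the Kerberos client to use TCP. The sketch below only generates the text of a .reg file for that value so it can be reviewed before being imported. The function name build_kerberos_tcp_reg is hypothetical, and the key path and value should be checked against Microsoft's documentation for the Windows version in use. A restart is normally needed for the change to take effect.

```python
# Sketch: build the text of a .reg file that sets the Kerberos
# MaxPacketSize value. The key path and value name follow Microsoft's
# published guidance for forcing Kerberos over TCP; verify them for
# your Windows version before importing the file.
KERBEROS_PARAMS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Control\Lsa\Kerberos\Parameters"
)


def build_kerberos_tcp_reg(max_packet_size: int = 1) -> str:
    """Return .reg file text; a MaxPacketSize of 1 forces Kerberos onto TCP."""
    if not 0 <= max_packet_size <= 0xFFFFFFFF:
        raise ValueError("MaxPacketSize must fit in a 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KERBEROS_PARAMS_KEY}]",
        f'"MaxPacketSize"=dword:{max_packet_size:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the file contents for review; saving and importing is left
    # to the administrator.
    print(build_kerberos_tcp_reg())
```

Generating the file rather than writing to the registry directly keeps the change easy to inspect and easy to distribute, for example through a login script or a Group Policy preference.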
Additional Information
As long as the SLP MTU is larger than the UDP packet size
required for the Kerberos reply packet, login will happen
quickly. But since the UDP packet size will gradually
increase over time as users are added to groups, eventually this
problem will be seen. If Client32 is installed, it will be
seen sooner, but even without Client32 it will be seen
eventually.
This seems to be a known problem in the MS world. Once
the Kerberos/TCP setting was found, a Google search turned up
other sites that have found that, as the domain expands, there is a
threshold at which domain login traffic slows down
dramatically.
It appears that the only reason the Novell Client is involved
is that the SLP MTU setting actually lowers the UDP MTU on the
machine, exposing the problem much earlier than it would otherwise be
seen. This could potentially affect other UDP applications as
well, if they do not correctly handle packet fragmentation.