Login times taking between 2.5 and 3 minutes

  • 3013441
  • 11-Jul-2006
  • 30-Apr-2012

Environment

Novell Client for Windows 2000/XP/2003 4.91 Support Pack 2
Novell Client for Windows 2000/XP/2003 4.91 Support Pack 1
Novell Client for Windows 2000/XP/2003 4.91
Novell Client for Windows 2000/XP/2003 4.90 Support Pack 2
Novell Client for Windows 2000/XP/2003 4.91 Support Pack 1a
Microsoft Windows XP Professional Service Pack 2

Situation

The workstations are all joined to a Microsoft Active Directory (AD) domain. With the Novell Client32 installed, login takes 2.5 to 3 minutes. Without Client32 installed, login takes only 1 to 2 seconds.
Watching the logon traffic with Ethereal:
  1. On login, the MS client makes a request to a Domain Controller for a Kerberos TGT (ticket-granting ticket)
  2. The client gets an answer back from the Domain Controller
  3. If the Novell Client32 is installed, it waits for 15 to 30 seconds with no network traffic at all, then repeats the request/response/pause pattern several more times.
  4. If the Novell client is not installed, it continues on with the login process immediately.
This happens both with and without "workstation only" checked, using domain authentication.
The machine tested is a new Dell workstation, configured with a Broadcom gigabit NIC. It is plugged into a D-Link 10/100 Ethernet desktop switch, which connects to a 100 Mbit Cisco switch.
The problem is in the MS Kerberos client and involves the way it uses UDP by default. This conflicts with a seemingly unrelated setting in Client32. The MS Kerberos reply packet contains the usual Kerberos data, plus MS-specific information appended to the end, such as group memberships. This tends to inflate the reply packet.
Without the Novell client installed, the reply packet seems to be around 1661 bytes for this particular person, in this domain. It comes in one UDP packet.
With the Novell client installed, the reply packet seems to come in at 1400 bytes, with the fragmentation bit set in the headers. No following fragment is seen in the trace. The result is a packet that Kerberos cannot use, so the client backs off 5/10/20/30 seconds and tries again, over and over, until it finally gives up.
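To show how the back-off delays add up, here is a minimal Python sketch. The 5/10/20/30 second schedule is taken from the trace described above. The function name cumulative_wait and the assumption that the schedule simply repeats after the fourth attempt are illustrative only and are not confirmed by the trace.

```python
from itertools import accumulate, cycle, islice

# Back-off delays in seconds, as observed in the trace.
BACKOFF_SECONDS = [5, 10, 20, 30]


def cumulative_wait(attempts, schedule=BACKOFF_SECONDS):
    """Return the total seconds spent waiting after each failed attempt.

    Assumption: the schedule repeats from the start once it is exhausted.
    """
    delays = islice(cycle(schedule), attempts)
    return list(accumulate(delays))


if __name__ == "__main__":
    # Print the running total of idle time after each retry.
    for attempt, total in enumerate(cumulative_wait(8), start=1):
        print(f"after retry {attempt}: {total} s waited")
```

One pass through the four delays already costs 65 seconds of idle time, so a few more failed requests are enough to push the login into the range of minutes.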

Resolution

Things that help:
  1. Adjust the SLP Maximum Transmission Unit setting in the Novell Client. The default of 1400 bytes seems to bring on the fragmented reply packet. Changing this to 1700 bytes helps.
  2. Replace the D-Link switch. Connecting either directly to the Cisco switch or through an intermediate Cisco desktop switch helped. It is not clear why this would matter.
  3. Change the MS Kerberos client to work over TCP. This can be done on the machine, or via a Group Policy in the domain. Once changed, the Kerberos reply packet will come in fragmented, but all fragments will be seen, and reassembled by the stack, so the Kerberos client will work.
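For option 3, this article does not name the setting. Microsoft's documentation on forcing Kerberos to use TCP describes a MaxPacketSize DWORD value under the Lsa\Kerberos\Parameters registry key, where a value of 1 sends all Kerberos traffic over TCP. The Python sketch below only generates the text of a .reg file for that value. The key path and value name are taken from Microsoft's documentation, not from this article, and should be verified against it before use. The function name kerberos_tcp_reg is hypothetical.

```python
# Registry key documented by Microsoft for Kerberos client parameters.
# Assumption: the key may not exist by default and is created on import.
KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Control\Lsa\Kerberos\Parameters"
)


def kerberos_tcp_reg(max_packet_size=1):
    """Build .reg file text that sets MaxPacketSize (1 = always use TCP)."""
    if max_packet_size < 1:
        raise ValueError("max_packet_size must be a positive integer")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        f'"MaxPacketSize"=dword:{max_packet_size:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Review the output, save it as a .reg file, and test on one machine first.
    print(kerberos_tcp_reg())
```

The script changes nothing on the machine by itself. The generated file would still have to be imported on a test workstation, and a restart is normally needed before the Kerberos client picks up the change.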

Additional Information

As long as the SLP MTU is larger than the UDP packet size required for the Kerberos reply packet, login will happen quickly. But since the UDP packet size will gradually increase over time as users are added to groups, eventually this problem will be seen. If Client32 is installed, it will be seen sooner, but even without Client32 it will be seen eventually.
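The threshold described here can be sketched as a simple comparison in Python. The 1661, 1400 and 1700 byte figures come from this article. The model ignores IP and UDP header overhead, and the bytes_per_group figure is a purely hypothetical placeholder, since the article does not say how much each group membership adds to the reply.

```python
def reply_fits(reply_bytes, limit_bytes):
    """True if the Kerberos reply fits within the UDP size limit."""
    return reply_bytes <= limit_bytes


def groups_until_overflow(reply_bytes, limit_bytes, bytes_per_group):
    """Number of added group memberships that push the reply past the limit.

    bytes_per_group is a hypothetical average growth per membership.
    """
    if bytes_per_group <= 0:
        raise ValueError("bytes_per_group must be positive")
    if reply_bytes > limit_bytes:
        return 0  # already too large
    return (limit_bytes - reply_bytes) // bytes_per_group + 1


if __name__ == "__main__":
    reply = 1661  # reply size observed for this user, from the article
    for limit in (1400, 1700):  # default and adjusted SLP MTU
        print(limit, reply_fits(reply, limit))
    # 40 bytes per group is an invented figure for illustration only.
    print(groups_until_overflow(reply, 1700, bytes_per_group=40))
```

With the figures from this article, the reply does not fit under the default limit of 1400 bytes but does fit under 1700. The remaining headroom is small, so raising the SLP MTU only delays the problem.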
This seems to be a known problem in the MS world. Once the Kerberos/TCP setting was found, a Google search turned up other sites reporting that, as the domain expands, there is a threshold at which domain login traffic slows down dramatically.
It appears that the only reason the Novell Client is involved is that the SLP MTU setting actually lowers the UDP MTU on the machine, exposing the problem much earlier than it would otherwise be seen. This could potentially affect other UDP applications as well, if they do not correctly handle packet fragmentation.