Excessive Xtier Process Errors in System Log

  • 7002867
  • 30-Mar-2009
  • 27-Apr-2012

Environment

Novell Open Enterprise Server 1 (OES 1) Support Pack 1 Linux

Situation

The Novell Xtier server (novell-xsrvd) writes an excessive number of error messages to syslog, which can fill up the /var partition or mount point.  The errors look like this:

Jan 20 07:35:18 OES1SRV novell-xsrvd-6[31313]: XSrvD -ServiceConnections- Unable to bind socket, error = 13
Jan 20 07:35:18 OES1SRV [XTCOM]: novell-xsrvd: Server re-started after it terminated unexpectedly
Jan 20 07:35:18 OES1SRV novell-xsrvd-0[31362]: XSrvD -ServiceConnections- Unable to bind socket, error = 13
Jan 20 07:35:18 OES1SRV [XTCOM]: novell-xsrvd: Server re-started after it terminated unexpectedly
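
To gauge how many of these messages have accumulated and how full /var is, a quick check along the following lines may help.  This is a minimal sketch only: the log path /var/log/messages is the usual SLES default and is an assumption here, and count_bind_errors is a hypothetical helper name, not part of any Novell tool.

```shell
#!/bin/sh
# Hypothetical helper: count the "Unable to bind socket" errors in a log file.
count_bind_errors() {
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'Unable to bind socket, error = 13' "$1"
}

# Assumption: syslog writes to /var/log/messages on this server.
LOG=/var/log/messages
if [ -r "$LOG" ]; then
    echo "bind errors in $LOG: $(count_bind_errors "$LOG")"
    # Show how full the file system holding /var currently is
    df -h /var
fi
```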

Resolution

NOTE:  On OES 2 this problem was fixed with new runlevel scripts that set the directory permissions every time the processes start.  On OES 1 SP2, the latest updates, novell-xtier-base-3.1.5-20070821_114026.sles9 and novell-xtier-web-3.1.5-20070821_114026.sles9, contain fixes for the problem.

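The root cause is that the UID and GID of the xtier users were changed, but the ownership of the xtier directories was not updated to match.  Error 13 is EACCES (permission denied), so the xsrvd process is being denied access when it tries to create its socket files.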

To verify, check the directory /var/opt/novell/xtier and see if the entries are owned by the xtier users and the xtier group.  Correct output will look like:

# ls -la /var/opt/novell/xtier
  total 1
  drwxrwx---   4 novlxregd novlxtier 96 May 29  2007 .
  drwxr-xr-x  13 root root 312 May 19  2008 ..
  drwxrwx---   3 novlxregd novlxtier  104 Oct 18 18:51 xregd
  drwxrwx---   2 novlxsrvd novlxtier 1168 Oct 18 18:52 xsrvd

Incorrect output shows numeric IDs in place of the user and group names:

# ls -la /var/opt/novell/xtier
  total 1
  drwxrwx---   4 104 103 96 May 29  2007 .
  drwxr-xr-x  13 root root 312 May 19  2008 ..
  drwxrwx---   3 104 103 104 Oct 18 18:51 xregd
  drwxrwx---   2 103 103 1168 Oct 18 18:52 xsrvd
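
Entries whose owner or group no longer resolves to a name can also be listed directly instead of being read from the ls output.  The following is a sketch that assumes GNU find (for the -nouser and -nogroup tests); list_unresolved is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every entry under a directory whose numeric
# UID or GID does not map to a known user or group name.
list_unresolved() {
    find "$1" \( -nouser -o -nogroup \) -print
}

XTIER_DIR=/var/opt/novell/xtier
if [ -d "$XTIER_DIR" ]; then
    # No output means every entry has a resolvable owner and group
    list_unresolved "$XTIER_DIR"
fi
```

Note that empty output only shows that the IDs resolve to some name; it does not confirm that the entries belong to the xtier users.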

Once the root cause has been verified, check whether the xtier users' UID and GID are resolved via LUM (Linux User Management):

# id novlxsrvd
  uid=104(novlxsrvd) gid=103(novlxtier) groups=103(novlxtier),8(www)
# id novlxregd
  uid=103(novlxregd) gid=103(novlxtier) groups=103(novlxtier)
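
The same lookup can be scripted for both accounts.  This is a minimal sketch; resolves is a hypothetical helper name, and the check relies only on the exit status of the standard id command.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given user name can be resolved.
resolves() {
    id "$1" >/dev/null 2>&1
}

# The two xtier accounts named in this document
for user in novlxsrvd novlxregd; do
    if resolves "$user"; then
        echo "$user resolves: $(id "$user")"
    else
        echo "$user does NOT resolve"
    fi
done
```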

If the correct user and group names are returned, the xtier processes may simply need to be restarted.  If the names are not returned, the cause may be LUM-related; refer to TID 3280667 - Troubleshooting Linux User Management for details.

After restarting the xtier processes, if the ownership is still incorrect, change it manually on those directories:

# chown -R novlxregd:novlxtier /var/opt/novell/xtier
# chown -R novlxsrvd:novlxtier /var/opt/novell/xtier/xsrvd

Then, verify the new ownership:

# ls -la /var/opt/novell/xtier
  total 1
  drwxrwx---   4 novlxregd novlxtier   96 May 29  2007 .
  drwxr-xr-x  13 root      root       312 May 19  2008 ..
  drwxrwx---   3 novlxregd novlxtier  104 Oct 18 18:51 xregd
  drwxrwx---   2 novlxsrvd novlxtier 1168 Oct 18 18:52 xsrvd
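
The comparison against the expected owners can also be automated.  The sketch below assumes GNU stat (for the -c format option); owner_of and check_owner are hypothetical helper names, and the expected user:group values are taken from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: print "user:group" for a path (GNU stat assumed).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Hypothetical helper: compare the actual owner with the expected one.
# $1 = path, $2 = expected user:group
check_owner() {
    actual=$(owner_of "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK       $1 ($actual)"
    else
        echo "MISMATCH $1 ($actual, expected $2)"
        return 1
    fi
}

if [ -d /var/opt/novell/xtier ]; then
    check_owner /var/opt/novell/xtier       novlxregd:novlxtier
    check_owner /var/opt/novell/xtier/xregd novlxregd:novlxtier
    check_owner /var/opt/novell/xtier/xsrvd novlxsrvd:novlxtier
fi
```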

Then restart the xtier processes again:

# rcnovell-xsrvd restart
# rcnovell-xregd restart

Verify that the xtier socket files were created:

# ls -la /var/opt/novell/xtier/xsrvd
  total 0
  drwxrwx---  2 novlxsrvd novlxtier 368 Jan 20 07:37 .
  drwxrwx---  4 novlxsrvd novlxtier  96 Oct 19  2007 ..
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-0
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-1
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-2
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-3
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-4
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-5
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-6
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-7
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-8
  srwxrwxrwx  1 novlxsrvd novlxtier   0 Jan 20 07:37 srv-socket-9
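
The socket files can also be counted rather than inspected by eye.  This is a sketch assuming a find that supports -maxdepth (GNU find does); count_sockets is a hypothetical helper name.  The listing above shows ten sockets, but the number on a given server is not guaranteed by this check.

```shell
#!/bin/sh
# Hypothetical helper: count the socket files directly inside a directory.
count_sockets() {
    # -type s matches UNIX domain sockets; tr strips any padding from wc
    find "$1" -maxdepth 1 -type s | wc -l | tr -d ' '
}

SOCK_DIR=/var/opt/novell/xtier/xsrvd
if [ -d "$SOCK_DIR" ]; then
    echo "socket files in $SOCK_DIR: $(count_sockets "$SOCK_DIR")"
fi
```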
 
The error messages should now no longer appear in syslog.