DSfW: xadsd and rpcd hang when a cthread is canceled, causing the server to become unresponsive

This document (7013412) is provided subject to the disclaimer at the end of this document.

Environment

Novell Open Enterprise Server 11 SP1 (OES11SP1)
Domain Services for Windows
DSfW

Situation

xadsd and rpcd hang when a cthread is canceled
xadsd becomes unresponsive
DSfW becomes unresponsive

Resolution

Cause

Two DSfW daemons, rpcd and xadsd, link to the dcerpc libraries.
As a result, each daemon creates call threads (cthreads).
By default, the number of cthreads in a daemon is fixed: 10 in xadsd and 5 in rpcd.
In some environments, both xadsd and rpcd have been observed to hang.

When a hang occurs, the cthread count of the affected daemon drops by one. The cthread that went away was holding the mutex cthread_mutex, so cthread_mutex now stays locked forever.
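The orphaned-lock condition described above can be illustrated with a minimal sketch. This is not the dcerpc code: Python has no thread cancellation, so the worker below simply ends without releasing its lock, and threading.Lock stands in for the daemon's pthread mutex. The function names (canceled_cthread, lock_is_stuck, demo_orphaned_mutex) are hypothetical and chosen only for this illustration.

```python
import threading


def canceled_cthread(lock: threading.Lock) -> None:
    """Take the lock and end without releasing it.

    Assumption: this mimics a cthread that is canceled while it still
    owns cthread_mutex. Python cannot cancel threads, so the thread
    just returns with the lock held.
    """
    lock.acquire()
    # No release here: the thread ends inside its critical section.


def lock_is_stuck(lock: threading.Lock, timeout: float = 0.2) -> bool:
    """Return True if the lock cannot be acquired within `timeout` seconds."""
    if lock.acquire(timeout=timeout):
        lock.release()
        return False
    return True


def demo_orphaned_mutex() -> bool:
    """Run one 'canceled' worker and report whether its lock stayed locked."""
    cthread_mutex = threading.Lock()  # stand-in for the daemon's cthread_mutex
    worker = threading.Thread(target=canceled_cthread, args=(cthread_mutex,))
    worker.start()
    worker.join()  # the worker is gone, as when the cthread count drops by one
    return lock_is_stuck(cthread_mutex)


if __name__ == "__main__":
    print("cthread_mutex stuck:", demo_orphaned_mutex())
```

The lock outlives the thread that took it, which is the state the article describes: nobody is left to unlock cthread_mutex, so every later attempt to take it waits.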

Two mutexes are important here: cthread_mutex and rpc_g_global_mutex (the global mutex).
After receiving all the queued data, a receiver thread attempts to trigger a cthread to execute the RPC. To do this, the receiver thread, which already holds the global mutex (rpc_g_global_mutex), attempts to lock cthread_mutex. Because cthread_mutex is locked forever, the receiver thread waits on it indefinitely and therefore never releases the global mutex (rpc_g_global_mutex).
With both mutexes locked indefinitely, the respective daemon hangs.
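The lock sequence above can be sketched as a small simulation. It is an illustration under stated assumptions, not the daemon's implementation: threading.Lock objects stand in for rpc_g_global_mutex and cthread_mutex, the function name simulate_daemon_hang and the probe_timeout parameter are hypothetical, and the final release of cthread_mutex exists only so the demo can terminate (in the real daemon that release never happens).

```python
import threading


def simulate_daemon_hang(probe_timeout: float = 0.2) -> dict:
    """Model a receiver thread stuck on cthread_mutex while holding the global mutex."""
    rpc_g_global_mutex = threading.Lock()  # stand-in for the global mutex
    cthread_mutex = threading.Lock()       # stand-in for cthread_mutex
    receiver_holds_global = threading.Event()

    # Assumption: a canceled cthread left cthread_mutex locked. Model that by
    # locking it here and keeping it locked until the end of the demo.
    cthread_mutex.acquire()

    def receiver() -> None:
        with rpc_g_global_mutex:           # receiver already holds the global mutex
            receiver_holds_global.set()
            cthread_mutex.acquire()        # blocks; in the daemon this never returns
            cthread_mutex.release()

    thread = threading.Thread(target=receiver)
    thread.start()
    receiver_holds_global.wait()

    # Any other thread that needs the global mutex is now stuck as well.
    global_available = rpc_g_global_mutex.acquire(timeout=probe_timeout)
    if global_available:
        rpc_g_global_mutex.release()
    receiver_blocked = thread.is_alive()

    # Demo-only cleanup so the program can exit.
    cthread_mutex.release()
    thread.join()
    return {
        "global_mutex_available": global_available,
        "receiver_blocked": receiver_blocked,
    }


if __name__ == "__main__":
    print(simulate_daemon_hang())
```

The probe uses a timeout only so the sketch can report the result; a thread in the daemon that waits on the global mutex without a timeout would block for as long as the receiver does.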

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7013412
  • Creation Date: 04-OCT-13
  • Modified Date: 04-OCT-13
    • Novell: Open Enterprise Server
    • SUSE: SUSE Linux Enterprise Server
    • NetIQ: eDirectory
