
AppNote: Writing NetWare Loadable Modules (NLMs) as Shared Libraries

Novell Cool Solutions: AppNote
By Russell Bateman


Posted: 17 Jun 2004

Russell Bateman
Senior Software Engineer
Server-library Development

This AppNote discusses how to write NetWare Loadable Modules (NLMs) that serve as shared libraries in the NLM environment.


Introduction to Single-process Address Spaces
The Elements of a Well-Architected NLM Library
Tricks in Writing Dual-Ring NLMs
Sample Code
Topics: Library development, NLM development, kernel and low-level code, DLLs, shared libraries, POSIX
Products: NetWare 5, NetWare 6, NetWare 6.5
Audience: Developers
Level: Advanced
Prerequisite Skills: Familiarity with NLM programming
Operating System: NetWare 5 and above
Tools: NDK
Sample Code: Yes


There are three traditional veins of NetWare Loadable Module (NLM) writing: (1) low-level kernel extensions such as drivers, protocol stacks, namespaces, and so on; (2) applications; and (3) libraries. Of the three, libraries have been the vaguest and most contradictory topic of discussion over the 15-year history of NLM writing. An extensive AppNote addressed this topic in May 2003. The present article updates that one for more POSIX-oriented work.

Linking LibC's prelude results in support for three different approaches to library writing: coding a main, coding a _NonAppStart, or coding a DllMain (à la Windows).

In fact, a new way is now in place that is closer to the UNIX world. At present there is no way to honor the GNU __attribute__((constructor)) and __attribute__((destructor)) mechanism, so we recognize and permit _init and _fini in place of DllMain or main in an NLM, as was the custom in certain quarters of the UNIX world.

As noted in the earlier article, LibC has offered dlfcn.h interfaces to aid in the porting of applications and libraries from other platforms. The functionality is simple:

  • Call dlopen with the path to the desired library.
  • Call dlsym for each symbol to be consumed.
  • Call dlclose once finished.
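
The three steps above can be sketched as follows. This is only a minimal illustration using the standard dlfcn.h calls; the helper name call_library_function is hypothetical, as is the assumption that the symbol being looked up is a function taking no arguments and returning int.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not from the earlier article): load the library at
 * 'path', look up 'symbol' (assumed to be an int (*)(void) function),
 * call it, store its return value in *result, and unload the library.
 * Returns 0 on success, -1 if the library or the symbol cannot be found. */
int call_library_function(const char *path, const char *symbol, int *result)
{
    void *handle;
    int (*fn)(void);

    handle = dlopen(path, RTLD_LAZY);        /* step 1: open the library */
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* step 2: resolve each symbol to be consumed (only one here) */
    *(void **) &fn = dlsym(handle, symbol);
    if (!fn) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    *result = fn();

    dlclose(handle);                         /* step 3: release the library */
    return 0;
}
```

A caller would pass the path of the library NLM and the (prefixed) name of an exported function, and should treat a -1 return as "library or symbol not available."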

The POSIX solution, while tolerating the use of DllMain, does not encourage the use of any other windows.h solution, to wit, LoadLibrary. Moreover, in general support of the dlfcn.h solution, you should employ symbol prefixing as explained in the earlier article.

The Single-process Address Space
Every process on UNIX (or Linux) abides within its own address space. That is in fact the essence or meaning of "process containment." A call to fork creates a copy of the calling process in a new process or address space separate from the first except for shared resources.

NetWare has long suffered from a lack of process containment. With increased POSIX compliance, however, a solution to the lack of containment was inevitable and as we contemplated fork, the protected address space seemed the best candidate for hosting the concept. And so, the solution is to be found in a new, "single-process address space." This is a ring 3, protected address space set up to hold only one process consisting of an application NLM plus consumed libraries.

The behavior of this new NetWare concept is simple: any number of library NLMs may be loaded before or during execution of one, single application NLM—defined as any NLM sporting a main entry point.

NLM libraries may be autoloaded or loaded using dlopen. They may themselves be implemented using the UNIX _init and _fini or the Windows DllMain approach.

As soon as an NLM is loaded that contains a main, it is considered to be the application and no other NLMs with a main may be loaded. Execution of the first active thread then begins: the autoloaded libraries' initialization code is called in order and, once all of that code has reported successful completion, main is executed.

In order to create a single-process address space instance, the NLM(s) are load protected in the usual way:

MYSERVER: load protected address space = FOO mynlm

All NLMs participating in this scenario must, however, obey certain rules when linked. First, they must link a recent version of LibC's prelude object containing POSIX_Start, POSIX_Exit and POSIX_CheckUnload. Statements to the linker formerly specifying _LibCPrelude, _LibCPostlude and _LibCCheckUnload are replaced with these. The check-unload function is, as always, purely optional and, in fact, we presently discourage it for use as part of a POSIX solution since it has no counterpart on other platforms.

(Note: As of this writing, the new POSIX features are in libcpre.o, but we have not ruled out the possibility of relegating this prelude object to a legacy role and placing the new POSIX features in a second prelude object to avoid confusion. Check out the June NDK read-me on this topic.)

As already noted, the best way to write a library NLM that participates in this environment is using the UNIX or Windows methods and this is the very subject of this article.

Backward Compatibility with Kernel Execution
Should the application and its library suite be loaded in the kernel (ring 0), they will still work: in this case, execution under POSIX_Start is at a certain point redirected to the old _LibCPrelude function and start-up proceeds as usual. The _init and _fini approach is still supported in that environment. The restriction against more than one NLM with a main is, of course, no longer in force at that point.

The reason behind this is to continue legacy use of NLMs–an expectation as old as the NetWare 5.0 protected address space in which any NLM or NLMs that run in ring 3 will also run in ring 0.

fork and exec interfaces are projected in NetWare's future. It is impossible to support these calls in the kernel. If you call fork from the kernel, it will return -1 and set errno to ENOSYS. If your NLM application must run both in the kernel and in user space and you want to use fork, you must code conditionally and handle a great number of difficult differences using procve or procxe in the kernel. We do not advise doing this. If your application consumes fork, it probably comes from another platform like UNIX/Linux where you ran in user space. There is little reason not to run in user space on NetWare too.
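
The ENOSYS behavior described above can be detected at run time. The following is a minimal sketch using only standard POSIX calls; the names run_in_child and sample_work are hypothetical, and the availability of waitpid in a given LibC release is an assumption. It deliberately does not attempt the procve or procxe fallback, since the text advises against that route.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical example of work to perform in the child process. */
int sample_work(void)
{
    return 7;
}

/* Hypothetical helper: run 'work' in a forked child and return the child's
 * exit status (0..255). Returns -1 on failure; errno is ENOSYS when fork
 * is not supported, as the text says is the case in the kernel. */
int run_in_child(int (*work)(void))
{
    pid_t pid;
    int status;

    pid = fork();

    if (pid == -1)
        return -1;                /* caller inspects errno (e.g. ENOSYS) */

    if (pid == 0)
        _exit(work() & 0xff);     /* child: do the work and leave at once */

    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller that sees a -1 return with errno equal to ENOSYS can report that the NLM must be loaded in a protected address space rather than try to emulate fork.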

Which Development Environment?
The supported environments remain CodeWarrior, Watcom and gcc. Gcc can build a Linux executable that the nlmconv utility then converts into a NetWare Loadable Module (NLM). If Watcom is to be used, we suggest the more recent OpenWatcom implementation.

The Elements of a Well-architected NLM Library

Outside its actual utility, the elements of a well-written NLM library are:

  1. Initialization
  2. Interface signatures and prefixing
  3. Insulation against "data incest" (cross-contamination of clients)
  4. Data instancing
  5. Attach and detach mechanisms and safeguards
  6. Kernel and protected address space support

#2 and #3 are well enough covered in the previous article that we won't spend the time here. Since our focus is on the protected or user address space, we'll also relegate the kernel-environment discussions of #6 to the last article. Writing libraries to run in the kernel is more of a traditional NetWare approach and has been well covered. The remaining items do, however, need some comment as well as frequent contrasting with the traditional kernel-load setting.

Initialization of a so-called POSIX user-space library is very simple. You can allocate and initialize global data directly instead of solving the age-old problem of instancing it, as must be done in the kernel or even in a protected address space that is not a single-process address space.

Use DllMain or _init as the place for initialization code. DllMain has long been a possibility and _init becomes a possibility with the new POSIX prelude. (It will not be a possibility in the legacy prelude if we decide to split the two feature sets for simplicity.) As already noted, the POSIX prelude is perfectly capable of functioning in the kernel as well as the single-process address space.

Note: If the library must run in the kernel as well as in user space, then instance data must be maintained as discussed in the earlier article.

Otherwise, global application data can be just that: globals in C or C++ since NetWare will refuse to load more than one instance of an application per single-process address space. If you can afford to distribute, install and manage separate binaries for the two purposes (protected address space and kernel usage), then certainly the user mode library will be more stable, predictable and accurate in its behavior as well as being easier to write.

What differentiates a single-process address space from simply loading in a NetWare protected address space is the existence of the POSIX prelude. Anytime an NLM linked with the POSIX prelude is loaded protected, this situation is considered to be in force. Therefore, while so-called POSIX NLMs are backward-compatible with running in the kernel, they are not so compatible with a protected address space and cannot be loaded there without spoiling it. If such an NLM is loaded in an existing protected address space, at best it will sit there uninitialized until an NLM linked with the POSIX prelude is loaded and its main executes. However, from that point on instability may ensue depending on what other NLMs in that address space are doing to interact with the POSIX NLM suite. We don't specifically prohibit this scenario, but we discourage it.

Last, NetWare may refuse to load a library implementing both DllMain and _init. As of this writing, the combination is prohibited. Do not attempt it: it leads to confusion even if, for whatever reason, the library is unable to detect your attempt to fool it.

Also, do not mix in an implementation of _NonAppStart. This has been discussed in How to Write Start-up Code for NLMs and How to Write NetWare Loadable Modules as Dynamic Libraries. Again, the _NonAppStart approach has no counterpart in POSIX. Nevertheless, it is neither impossible nor dangerous to load such an NLM, first or last, into a single-process address space where it could behave as a library. It is just not recommended.

Data Instancing
As implied, the problem of application data instancing does not occur in the single-process address space (just as it does not in Windows or on UNIX).

Yet, there is still a problem of data instancing at the thread level since enforcing a single process per address space does not also confer the ability to use global variables for thread-specific data.

As part of global start-up (DLL_PROCESS_ATTACH or the call to _init), a key can be allocated using pthread_key_create. The result is saved in a global or static variable and then used every time a library function needs to deal with thread-specific data for the calling thread. The specific data is fetched by calling pthread_getspecific with the allocated key or set onto the thread using pthread_setspecific. This was covered in the previous article.
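
As a minimal sketch of the pattern just described, using only standard POSIX thread calls: the names lib_startup, lib_thread_slot and sKey are hypothetical, and lib_startup merely stands in for whatever DllMain or _init does at start-up. The sample code at the end of this article shows the library's own version.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t sKey;        /* allocated once at library start-up */

/* Hypothetical start-up hook: call once from DLL_PROCESS_ATTACH or _init.
 * Returns 0 on success or an error number from pthread_key_create. */
int lib_startup(void)
{
    /* free() is run on each thread's stored value when that thread exits */
    return pthread_key_create(&sKey, free);
}

/* Return the calling thread's private int, allocating it (zeroed) on first
 * use. Returns NULL if the memory cannot be allocated or stored. */
int *lib_thread_slot(void)
{
    int *slot = pthread_getspecific(sKey);

    if (!slot) {
        slot = calloc(1, sizeof *slot);

        if (slot && pthread_setspecific(sKey, slot) != 0) {
            free(slot);
            slot = NULL;
        }
    }

    return slot;
}
```

Every library function that needs per-thread state calls lib_thread_slot and works through the returned pointer; two threads calling it get two different blocks.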

The nice thing about single-process address space libraries is that none of the interfaces from LibC's nonstandard library.h need to be used unless you are trying to write a dual-mode NLM. Instead, such a library is coded very nearly the same as it would be on other platforms.

Attach and Detach Mechanisms and Safeguards
In Windows, which has the most formal specification of dynamically loaded libraries (or DLLs), DllMain is a function coded by the library developer to handle several messages, including:

  • DLL_PROCESS_ATTACH
  • DLL_PROCESS_DETACH
  • DLL_THREAD_ATTACH
  • DLL_THREAD_DETACH

DllMain is called on behalf of client applications (a client application is one that calls LoadLibrary to connect to the dynamically linked library whose pathname is specified in the argument) at opportune times and asked to perform these essential tasks.

In process attach, the library has the opportunity to allocate a block of memory associated with the calling application. (I discussed doing this under the heading "Application Data" in the original article.) On Windows, this association is handled transparently because Windows libraries are not written in the kernel, but in a protected address space (also referred to as user address space). All memory allocated belongs transparently to the calling application because the Windows operating system instances the DLL for each of its client applications. NetWare does this too, but only for the single-process address space library.

In practice, the DllMain message DLL_PROCESS_ATTACH is synonymous with _init and therefore identical to DLL_NLM_STARTUP since there is only one process that can attach. (Of course, this is NOT the case outside the single-process address space and, at the risk of repeating myself, if your library is to load in the kernel, you must take this into account using the information from the May 2003 article.) Note too that both of these messages will be issued, so protect yourself against double initialization.

Then DLL_PROCESS_DETACH and DLL_NLM_SHUTDOWN are the same and an opportunity for the library to deallocate any process instance data created for its client. This may include memory, synchronization objects like mutexes, and per-thread keys. Again, all the caveats for process-attach and NLM-start-up apply to these messages.

DllMain is still called with DLL_THREAD_ATTACH and DLL_THREAD_DETACH only under specific circumstances that can be easily gotten around, so it's useful to retain the GetOrSetInstanceData call illustrated in the older article even if only to invoke pthread_getspecific.

Tricks in Writing Dual-Ring NLMs

As already noted, dual-ring NLMs are those written to load indiscriminately in the kernel or in a protected address space on NetWare. Remember that if your library NLM is to serve both environments, it cannot rely on the implicit per-application data-instancing available in the single-process address space, but must manage this just as a traditionally written library does.

Sample Code

I won't repeat the interfaces exported by the sample library of the earlier article. Here are DllMain, _init and _fini. Remember, you need _init and _fini as a set or just DllMain, but not both sets.

Sample Code and Commentary
Let's examine some definitions. Here's the internal header the library uses. Another header, foolib.h, merely contains the prototypes for foobar and __foo_errno, the only symbols we export.

#ifndef __private_h__
#define __private_h__

#include <pthread.h>   // pthread_key_t

typedef struct
{
   int   thrX;
   int   thrErrno;
} thrdata_t;

// static data...
extern void           *gModuleHandle;  // NLM handle
extern pthread_key_t  gKey;            // our per-thread data solution

// internal library function prototypes...
int GetOrSetInstanceData( thrdata_t **thrdata );
int DisposeThrData      ( thrdata_t *data );

#endif


DllMain or _init can be used, but not both:

#include <windows.h>	// (only if writing DllMain)
#include "private.h"

void           *gModuleHandle = 0;
pthread_key_t  gKey = 0;

int DllMain		// don't mix with _init and _fini
(
   void           *hinstDLL,
   unsigned long  fdwReason,
   void           *lvpReserved
)
{
   static int  initted = FALSE;

   switch (fdwReason)
   {
      case DLL_THREAD_ATTACH :
      case DLL_THREAD_DETACH :
         /*
         ** Nothing to do here: calls to pthread_getspecific or
         ** pthread_setspecific are permitted from any thread in the
         ** NLM without doing anything and any data allocated on a key
         ** will be disposed of pursuant to pthread key/value data-pair
         ** semantics by DisposeThrData.
         */
         return TRUE;

      case DLL_NLM_STARTUP :
         if (initted)            // guard against double initialization
            return TRUE;

         gModuleHandle = lvpReserved;

         if (pthread_key_create(&gKey, (void (*)(void *)) DisposeThrData))
            return FALSE;

         initted = TRUE;         // only once the key really exists
         return TRUE;

      case DLL_NLM_SHUTDOWN :
         if (!initted)
            return TRUE;

         pthread_key_delete(gKey);   // give back the per-thread key
         initted = FALSE;
         return TRUE;
   }

   return FALSE;
}

#include <library.h>   // getnlmhandle
#include "private.h"

static int initted = 0;    // a little protection--probably unnecessary

int _init( void ) // don't write this if DllMain is to be used
{
    int err;

    if (initted)
        return 0;

    gModuleHandle = getnlmhandle();

    if ((err = pthread_key_create(&gKey, (void (*)(void *)) DisposeThrData)))
        return err;

    initted = 1;
    return 0;
}

int _fini( void ) // don't write this without implementing _init
{
    if (initted)
    {
        pthread_key_delete(gKey);   // give back the per-thread key
        initted = 0;
    }

    return 0;
}

Here are the exported entry points consuming thread-specific data.

// foo_errno is assumed to be a macro from foolib.h expanding to (*__foo_errno())

int foobar( void )
{
   int         err;
   thrdata_t   *data;

   if ((err = GetOrSetInstanceData(&data)))
   {
      foo_errno = err;
      return -1;
   }

   /*
   ** Do stuff that foobar must do as a library function
   ** whatever this is...
   */
   return 0;
}

int *__foo_errno( void )
{
   thrdata_t   *data;
   static int  MINUS_ONE = (-1);

   /*
   ** This library errno is implemented here only to show how a library
   ** might use one key to store any thread-specific data that might be
   ** needed, like an errno.
   */
   return (GetOrSetInstanceData(&data))
                  ? &MINUS_ONE
                  : &data->thrErrno;
}

Here's GetOrSetInstanceData which only handles thread-specific data. Code for such a function also handling application instance data can be found in the library stationery that ships with the CodeWarrior PDK for NetWare and perhaps also in our on-line documentation. This appears to have been overlooked in the earlier article.

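#include <errno.h>
#include <stdlib.h>
#include "private.h"

int GetOrSetInstanceData
(
   thrdata_t   **data
)
{
   int         err;
   thrdata_t   *t;

   // POSIX form: the stored value comes back as the function result
   t = (thrdata_t *) pthread_getspecific(gKey);

   if (!t)                // thread-specific data as yet...
   {                      // ...unallocated for this thread
      t = calloc(1, sizeof(thrdata_t));

      if (!t)
         return ENOMEM;

      err = pthread_setspecific(gKey, t);

      if (err)
      {
         free(t);
         return err;
      }
   }

   if (data)              // tolerate callers that pass NULL
      *data = t;

   return 0;
}

int DisposeThrData
(
   thrdata_t   *data
)
{
   if (data)
      free(data);         // release what GetOrSetInstanceData allocated

   return 0;
}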
