Get the most out of Novell Open Enterprise Server 2.

  • 7006996
  • 05-Oct-2010
  • 06-Dec-2012

Environment

Novell Open Enterprise Server 2 (OES 2) Linux

Situation

As was the case with NetWare, Open Enterprise Server is tuned for a default environment when installed out of the box.
As most environments are not default, some initial tuning is required to match the settings of the core services to your environment.

This document covers considerations for improving the performance and scalability of Novell Open Enterprise Server 2 by tuning OES2.
It also lists initial steps to address poor performance or occasional, intermittent sluggishness of Novell Open Enterprise Server 2.

This TID can be considered a placeholder for general tunings and pointers to other TIDs that also focus on the core file services that come with Novell Open Enterprise Server.
This TID does not claim to be complete or suited to all environments, as each environment is unique; it merely points to a few things to consider when installing and tuning Novell Open Enterprise Server 2 servers.
To tune the additional services, consult the documentation, including TIDs and Cool Solutions articles, for the specific product.

Resolution

During the installation phase:

Novell Open Enterprise Server 2 (OES2) is made available as an add-on for SUSE Linux Enterprise Server 10 (SLES10).
Make sure, however, that the versions match each other: OES2 was made available for SLES10SP1, and OES2SP1 was made available for SLES10SP2.
When downloading the OES2 add-on, verify for which version of SLES10 it is intended.
It is recommended to use the latest available versions that match each other.

As the 64-bit architecture takes better advantage of most current hardware and can address more memory directly, it is recommended to install Novell Open Enterprise Server 2 x86_64 when the hardware allows this. Make sure, though, that all services you intend to run on Novell Open Enterprise Server are compatible with and capable of running on the 64-bit architecture.

Regarding the choice of file system underneath the services hosted on Novell Open Enterprise Server 2, it is recommended to think through whether the Novell Storage Services (NSS) rights system and rich metadata are needed or whether a POSIX file system would be sufficient.
For instance, GroupWise does not need the extensive metadata of NSS, and the overhead of the file system may even cause delays in GroupWise. Additionally, GroupWise uses several temporary files in its directory structures, which can produce a large amount of purgeable data, which in turn may cause sluggishness in NSS.
When GroupWise is installed on an NSS volume, it is strongly recommended to disable salvage at least for the queue directories of all agents.

Most databases do not take any advantage of Novell Storage Services metadata. Several database vendors have a preferred file system, therefore it is recommended to verify with the appropriate vendor which file system to use.
On the other hand, NSS may be the right choice for user and group home directories, as it allows controllable, granular, and inherited access levels.

A POSIX file system can be exported as an NCP volume. To the end user it is presented as a regular Novell Storage Services volume, though without additional functionality such as salvage.
A POSIX volume can even be used with Novell Cluster Services.

During the installation or lifespan of the server, do not create a local Linux user named 'admin' or with the same name as the eDirectory [root] administrative account.
If such a user is created during the installation phase of the server, it can interfere with the configuration and installation process of the server.
More details on this can be found in TID 7008635.



During the lifespan of the server:

As security and performance updates are released regularly over the update channels, it is strongly recommended to keep the server's code up to date with the code available on the update channels.



NDSD (eDirectory):

Configure the NDS agent so that it preallocates its cache instead of allocating memory dynamically, as dynamic allocation may cause memory fragmentation and other unwanted side effects.
This can be accomplished by going through these steps:

1.  Open Novell iMonitor -- https://[server ip address]:8030
2.  Under "Links", select "Agent Configuration", then "Database Cache".  This is where we will make the adjustments.
3.  At the top of the screen, note the "DIB Size (KB)" value.
4.  Scroll down to "Database Cache Configuration".  There are three sections here.  To set a hard limit for DS, select the "Hard Limit" radio button under "Database Cache Configuration" and enter the required limit in the text field.  For most environments, double the number obtained in the previous step and round up, with a minimum of 200 MB. For instance, for a DIB of 1364, set it to 2800 (1364*2=2728, rounded up to 2800).
If your DIB size is 300 MB or larger, add roughly 200 MB rather than doubling the DIB size.  The limit must be greater than the current DIB size; a larger value allows for growth. However, do not exceed a Cache Maximum Size of 1.5 GB on 64-bit or 1 GB on 32-bit systems. Also make sure that the server has twice that amount of free system memory available (i.e. setting the Maximum Cache Size to 1 GB implies that the server has at least 2 GB of free system memory).
5.  Towards the bottom of the page, check the box marked "Cache Settings Permanent", then click "Submit". If this check box is not ticked, the tuning is not saved and will be lost with the next restart of ndsd.
6.  If preallocation was not enabled before the cache was reconfigured, it is required to restart at least ndsd and ncp2nss.
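
The sizing rule from step 4 can be sketched as a small calculation. This is only an illustration under stated assumptions: the function names are hypothetical, 1 MB is taken as 1024 KB, and the result is rounded up to the next multiple of 100 KB to mirror the 2728 to 2800 example. Because the 200 MB minimum is applied last, a very small DIB such as the 1364 KB example ends up at the minimum.

```python
KB_PER_MB = 1024  # assumption: binary megabytes


def recommend_cache_limit_kb(dib_size_kb, is_64bit=True):
    """Suggest a hard limit (in KB) for the eDirectory database cache."""
    if dib_size_kb < 300 * KB_PER_MB:
        target_kb = dib_size_kb * 2                 # small DIB: double it
    else:
        target_kb = dib_size_kb + 200 * KB_PER_MB   # large DIB: add roughly 200 MB
    # Round up to the next multiple of 100 KB (assumption, see lead-in).
    target_kb = -(-target_kb // 100) * 100
    minimum_kb = 200 * KB_PER_MB                    # minimum from the text
    maximum_kb = (1536 if is_64bit else 1024) * KB_PER_MB  # 1.5 GB / 1 GB cap
    return int(min(max(target_kb, minimum_kb), maximum_kb))


def required_free_memory_kb(cache_limit_kb):
    """The server should have twice the cache limit as free system memory."""
    return 2 * cache_limit_kb


if __name__ == "__main__":
    limit = recommend_cache_limit_kb(500 * KB_PER_MB)
    print(limit, required_free_memory_kb(limit))
```

If the DIB itself is larger than the cap, the rule that the limit must exceed the DIB size cannot be met; that case is outside this sketch.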

As the DIB size on the servers can grow over time, it is recommended to revisit these pages every now and then to verify if the preallocated cache is still sufficient.

When in doubt about configuring the preallocation for ndsd, or if the server is also running IDM, it is recommended to contact Novell Technical Services (NTS) or a Novell consultant.


If the local server has no local replica or only a few replicas, and therefore relies on other NDS servers for its eDirectory information, it is strongly recommended to enable Advanced Referral Costing (ARC), which is disabled by default.
This can be enabled by loading ndstrace as root and then executing 'set NDSTRACE =!ARC1'

The fewer replicas a server has, the more benefit you gain from ARC, although even servers that do have replicas can gain from enabling it.
More on Advanced Referral Costing can be found here.



NCP services:

On a Novell Open Enterprise Server, NCP runs as a thread forked from ndsd, therefore it is required to tune ndsd to match your environment.
The main tuning here consists of making sure that the server can start enough threads. This is achieved by tuning n4u.server.max-threads.
The basic rule of thumb is to have one thread per actively reading or writing client connection. However, this parameter is comparable to the "Maximum Service Processes" setting of NetWare: it merely limits the number of threads the server can start, not the number of threads it starts initially. There is therefore no harm in tuning it to the maximum, provided the server has sufficient CPU and memory available, as each started thread uses additional memory.
This value must be divisible by 4 and can be as high as 512.

To determine if the server's n4u.server.max-threads is sufficient, execute this command as root:
ndstrace -c threads
This prints a summary in which the Pool Workers entry shows a Peak value; this value should not be higher than the value of n4u.server.max-threads.
If the Peak value of the Pool Workers is the same or close to the n4u.server.max-threads, tune this parameter, using this command as root:
ndsconfig set n4u.server.max-threads=[new value]
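
The constraints on n4u.server.max-threads (divisible by 4, at most 512) and the "peak close to the limit" check can be sketched as follows. The function names and the 90% margin used for "close to" are assumptions made for this illustration; the text does not define an exact threshold.

```python
MAX_THREADS_CEILING = 512  # upper bound stated in the text


def normalize_max_threads(desired):
    """Round a desired thread count up to a multiple of 4, capped at 512."""
    rounded = ((desired + 3) // 4) * 4
    return max(4, min(rounded, MAX_THREADS_CEILING))


def needs_more_threads(peak_pool_workers, max_threads, margin=0.9):
    """Return True when the Pool Workers peak is at or near max-threads.

    margin is a hypothetical definition of "close to" (90% of the limit).
    """
    return peak_pool_workers >= margin * max_threads


if __name__ == "__main__":
    if needs_more_threads(peak_pool_workers=120, max_threads=128):
        new_value = normalize_max_threads(2 * 128)
        print("ndsconfig set n4u.server.max-threads=%d" % new_value)
```

The peak and current values still have to be read manually from the output of the commands above; the sketch only covers the arithmetic.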

Using Novell Remote Manager, there are a couple more general NCP server improvements you can implement.
Log into NoRM via https://[server ip address]:8009. Then, in the left-hand pane, open "Manage NCP Services" and click the "Manage Server" option.
After the "Server Parameter Information" page opens, you can click the values to tune them.
These are a few server tunings to consider:

Maximum_Cached_SubDirectories_Per_Volume tuned to 300000 (or higher)

Maximum_Cached_Files_Per_Volume tuned to 120000 (or higher)
Maximum_Cached_Files_Per_Subdirectory tuned to 6000 (or higher)


Additionally, since the January 2011 Maintenance Update for OES2, it is possible to increase the number of concurrent NCP threads.
It is recommended to verify regularly that the server is not running out of concurrent async requests, preferably when the server has been up for a few days.
This is accomplished by executing this command as root:
ncpcon threads
This results in a screen showing the NCP THREAD STATISTICS.
If the "Thread Peak Size" value in the "Async (eDir) Threads and Requests Statistics" section is close to or matches the "Max Thread Size", it is strongly recommended to increase the value of CONCURRENT_ASYNC_REQUESTS. OES2 has a default value of 15; 50 would be a decent new initial value.
This is accomplished by executing this command as root:

ncpcon set CONCURRENT_ASYNC_REQUESTS=50


More information on this can be found in TID 7007134 "OES2 NCP Engine upcoming changes and NCPCON new Functionality".

TID 7004888 and TID 7004848 also shed light on, and assist in, tuning NCP.

Changes to the NCP settings and tunings take effect on the fly and do not require any process or the server to be restarted. However, if you do not notice the change, you may want to restart the ndsd and ncp2nss daemons.



CIFS

As of Novell Open Enterprise Server 2 SP3 and the "January 2011 Scheduled Maintenance Patch" for Novell Open Enterprise Server 2 SP2, CIFS cache usage was made tunable.
The logic of tuning CIFS is basically the same as the logic of tuning NCP. When CIFS becomes sluggish over time, these are a few suggested values for tuning Novell CIFS:
  • Maximum Cached Subdirectories Per Volume: 1024000
         This is settable by executing "novcifs -k SDIRCACHE=1024000" as root.
  • Maximum Cached Files Per Subdirectory: 102400
         This is settable by executing "novcifs -k DIRCACHE=102400" as root.
  • Maximum Cached Files Per Volume: 2048000
         This is settable by executing "novcifs -k FILECACHE=2048000" as root.

If, however, you are experiencing inaccessible CIFS shares, CIFS no longer listening or communicating, an unresponsive or hanging novell-cifs daemon, or other issues related to CIFS or the novell-cifs daemon, please check TID 7008956 "Troubleshooting and Debugging CIFS on Open Enterprise Server".



NAMCD (LUM):


As several services rely on the Novell Authentication Module for their authentication to eDirectory, it is recommended to tune it so that it preferably uses the local server if this has a local replica, or otherwise a server on the local subnet that has a replica of the needed tree partitions.
By default the "preferred-server" is set to the first Novell Open Enterprise Server that was installed in the eDirectory tree, so there is a good chance that this parameter requires adjustment.
To get the current namcd configuration, execute as root:

namconfig get


If the preferred-server value does not point to the local IP address, or to the IP address of a server on the same physical subnet, it can be changed with the following sequence of commands as root:

namconfig set preferred-server=[local ip address (*)]
namconfig -k
namconfig cache_refresh

(*) When in doubt use 'ndsconfig get' and use the ip address listed in n4u.server.interfaces without the @524.

Alternatively, if the server does not have a local replica, it is also possible to add one or more alternative LDAP servers.
Adding a single alternative LDAP server is possible by executing this command as root:

namconfig set alternative-ldap-server-list=[ds-server01]

Adding more than one alternative LDAP server is possible using a comma separated list and can be accomplished by executing the following command as root:

namconfig set alternative-ldap-server-list=[ds-server01],[ds-server02],[ds-server03]

Alternative LDAP servers can also be used to make LUM more scalable.
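
When the list of alternative LDAP servers is maintained elsewhere, for instance in a script or an inventory, the comma-separated command line can be assembled as in the sketch below. The helper name is hypothetical and the host names are the placeholders used above; the helper only builds the string and does not run namconfig.

```python
def alternative_ldap_command(servers):
    """Build the namconfig command line for a list of alternative LDAP servers."""
    cleaned = [s.strip() for s in servers if s.strip()]
    if not cleaned:
        raise ValueError("at least one server is required")
    # The list must be comma separated, without spaces between entries.
    return "namconfig set alternative-ldap-server-list=" + ",".join(cleaned)


if __name__ == "__main__":
    print(alternative_ldap_command(["ds-server01", "ds-server02", "ds-server03"]))
```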

It is also recommended to turn persistent search off and cache-only on.
This can be accomplished by executing these commands as root:

namconfig set persistent-search=no

namconfig set cache-only=yes


As namconfig cache_refresh includes a restart of namcd, there is no need to restart this daemon separately after running it.
For the other changes and tunings to take effect, it is recommended to restart namcd.



NSS:

Although Novell Storage Services must be tuned to the requirements of the environment in which it is used, these are some tunings worth considering:
- Increase the NSS IDCacheSize to 128K. This can be accomplished by executing, as root:

nsscon /idcachesize=131072


- Disable the Access Time by executing the following line as root:

nsscon /noatime=[volume name]


- The Unplugalways parameter should be set to its default value as of OES2SP1, which is disabled. To disable it, execute as root:

nsscon /nounplugalways


To make these tunings persistent, add them to /etc/opt/novell/nss/nssstart.cfg, as they are not there by default. Take care not to make typos in this file, as they can cause novell-nss to fail to start.
Make sure that all unmarked entries in this file start with "/" and not with "nss /".
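
Because a typo in nssstart.cfg can keep novell-nss from starting, a quick check of the file before restarting may be worthwhile. The sketch below flags every entry that does not start with "/". It rests on an assumption: "unmarked" is read as "not commented out", and lines starting with "#" or ";" are treated as comments, which the text does not confirm. The function name is hypothetical.

```python
def find_bad_nssstart_lines(text, comment_prefixes=("#", ";")):
    """Return (line number, line) pairs for entries not starting with '/'.

    comment_prefixes is an assumption about how entries are marked as
    comments; adjust it to match the actual file.
    """
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(comment_prefixes):
            continue  # skip blank lines and (assumed) comment lines
        if not line.startswith("/"):
            problems.append((number, raw))
    return problems


if __name__ == "__main__":
    sample = "/idcachesize=131072\nnss /noatime=VOL1\n/nounplugalways\n"
    for number, line in find_bad_nssstart_lines(sample):
        print("line %d looks wrong: %s" % (number, line))
```

In practice the text passed in would be the contents of /etc/opt/novell/nss/nssstart.cfg; VOL1 is a placeholder volume name.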

Using nssmu, launched as root, increase the "read ahead blocks" for all NSS volumes to 64. If this proves insufficient over time, it can be increased to 128. Going beyond this value is not recommended, as it may have a negative impact on performance.
This change is activated "on the fly" and does not require a server restart.

Further information regarding tuning NSS performance can be found in the online documentation.

During the lifespan of an NSS volume it is recommended to preserve at least 20% free space, of which half is not purgeable data. If a volume drops below these thresholds, the background process that purges the oldest data may run at a higher priority and cause a higher I/O load and sluggishness.
When a volume has almost no true free space but a lot of purgeable data, each write to disk requires a purge first, which can cause several unwanted side effects such as high disk I/O, high system load, or even data corruption or loss.
For this reason it is recommended, for NSS volumes hosting services that make heavy use of temporary files, to mark either the folders used for temporary storage or the whole volume as "purge immediate", which effectively disables salvage for those directories or the volume.
When a volume passes these thresholds, it is recommended to either expand the pool and volume or delete obsolete data; as a temporary measure, the volume can be purged manually.
This can be accomplished by executing the following command as root:

ncpcon purge volume [volume name]
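
The free-space guideline above can be expressed as a simple check. The sketch assumes one reading of the rule: free space including purgeable data should be at least 20% of the volume, and true (non-purgeable) free space should be at least half of that, i.e. 10% of the volume. The function name and dictionary keys are hypothetical; all sizes must use the same unit.

```python
def volume_space_status(total, free_incl_purgeable, purgeable):
    """Check an NSS volume against the 20% free / 10% true-free guideline."""
    true_free = free_incl_purgeable - purgeable
    # Integer comparisons avoid floating-point rounding at the thresholds.
    enough_free = free_incl_purgeable * 100 >= 20 * total
    enough_true_free = true_free * 100 >= 10 * total
    return {
        "true_free": true_free,
        "enough_free": enough_free,
        "enough_true_free": enough_true_free,
        "healthy": enough_free and enough_true_free,
    }


if __name__ == "__main__":
    # Example: 1000 GB volume, 250 GB free of which 200 GB is purgeable.
    print(volume_space_status(1000, 250, 200))
```

A volume that fails only the true-free check is a candidate for a manual purge or for marking its temporary directories "purge immediate"; one that fails both more likely needs expansion or cleanup.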



SMS (tsafs / backup)

If you are suffering from slow backups or from backup speed that degrades over time, the documentation for optimizing SMS on OES2 can be found here.



UDEV:

If there are many ndpapp errors in /var/log/messages, or "Device /dev/ndp not ready" errors in /var/log/boot.msg, reconfigure /etc/sysconfig/udev so that it contains:
UDEVD_MAX_CHILDS=1024
UDEVD_MAX_CHILDS_RUNNING=1024

This change requires a reboot of the system.
More information on this can be found in TID 7004877
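
Whether /etc/sysconfig/udev already carries the two values can be checked with a small parser. The sketch below works on the file contents as a string and assumes the usual sysconfig layout of KEY=value or KEY="value" lines with "#" comments; the function names are hypothetical.

```python
REQUIRED = {
    "UDEVD_MAX_CHILDS": "1024",
    "UDEVD_MAX_CHILDS_RUNNING": "1024",
}


def parse_sysconfig(text):
    """Parse KEY=value lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def udev_mismatches(text):
    """Return the settings that are missing (None) or differ from 1024."""
    values = parse_sysconfig(text)
    return {k: values.get(k) for k, want in REQUIRED.items()
            if values.get(k) != want}


if __name__ == "__main__":
    sample = 'UDEVD_MAX_CHILDS="64"\nUDEVD_MAX_CHILDS_RUNNING=1024\n'
    print(udev_mismatches(sample))
```

An empty result means both settings already match; remember that changing them still requires a reboot.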



NCS:

All the previous steps should also address most issues seen with OES2 servers running Novell Cluster Services and their cluster-enabled resources.
Nonetheless, if you are suffering from random split brains without a physical cause (LAN or SAN outage), it is advisable to investigate whether there are time jumps (backward and/or forward) caused by the CPU. Clock jumps can occur on both physical and virtual CPUs.
A potential workaround is to add "clock=hpet" as a boot parameter and reboot the system.

More information on this is captured in TID 7005916. Although it mentions a few specific CPU and hardware models, different and later versions of the hardware may be affected too. Most commonly affected are servers with a "green" BIOS that has built-in power management able to stop and start CPUs or their cores without the knowledge of the OS, though other systems can be affected as well.
Forcing the server to use HPET as its time source has no significant impact.



PKI (Certificate Server):

During the installation of any OES server, a few default Server Certificates are generated.
These are valid for 2 years by default. When the certificates expire after this period, the server, and all servers that use it as their LDAP source, will no longer be able to access the required eDirectory information, and all services that rely on it will fail.
Therefore it is recommended, as one of the troubleshooting steps, to validate the Default Server Certificates when some or all services of a particular server fail. If the failing server uses a different server as its preferred_server or LDAP source, that server's Default Certificate Material should be checked too.
The LDAP servers that the server might be using are listed under /etc/sysconfig/novell/ldap_servers/, though it is recommended to also check each service to determine which LDAP server it is using, as they may differ.

A method of validating the Default Certificates of a server is via iManager.
This can be accomplished under the "Novell Certificate Access", the "Server Certificates" task.

If one or more certificates are deemed invalid, the next step is to repair them.
This task can be accomplished under "Novell Certificate Server", the "Repair Default Certificates" task.

More details on this can be found in TID 7000075


In case the LDAP server being used is a Novell NetWare server, these tasks can also be accomplished using the pkidiag.nlm


After the certificates have been replaced or repaired, it is recommended to at least restart ndsd (rcndsd restart), so that the server uses the new certificate for its LDAPS communication, and to run namconfig -k to update the certificate that namcd is using.


A more sustainable option might be to enable "Self-Provisioning" for the Certificate Authority (CA) of the tree.
This feature is described here in the Novell Documentation.
Enabling this for the CA of the tree enables the feature for all servers in the tree that use this CA.
As soon as a server or its ndsd is restarted, it will be aware that "Server Self Provisioning" is enabled.

With this feature enabled, the certificates that are expired or about to be expired are extended automatically when executing either:
ndstrace
unload pkiserver
load pkiserver
or
rcndsd restart



Anti-Virus scanners:

When an Anti-Virus suite is installed, try to avoid "on access" scanning, as this can create severe overhead.

The Anti-Virus suite, whether running locally or remotely, should be configured not to scan system-critical directories such as the ._NETWARE directory of all NCP-exported file systems (both NSS and POSIX), /_admin, and the Linux system directories.

If Novell Cluster Services is installed, /admin should also be excluded from anti-virus scanning.

If Novell GroupWise is installed, all repository and queue directories should be excluded from anti-virus scanning as well. Novell GroupWise stores its information in encrypted form, so scanning the physical files stored on disk is unnecessary and can even cause severe problems such as file locking or corruption.
There are several Anti-Virus suites that can scan inside the Novell GroupWise mail storage and / or incoming e-mail.

Additional Information

This information is also available for Novell Open Enterprise Server 11 as TID 7003585.