Troubleshooting ZENworks for Servers install and initial load after rebooting the server

  • 7004353
  • 02-Sep-2009
  • 30-Apr-2012

Environment

Novell ZENworks for Servers 2.0.
Novell ZENworks for Servers 3.0.2 - ZfS302.
Novell ZENworks 6.5 Server Management - ZfS65.
Novell ZENworks 7 Server Management - ZfS7.
Novell Tiered Electronic Distribution (TED).
Novell Management and Monitoring Services.

Situation

Troubleshooting ZENworks for Servers install and initial load after rebooting the server.

Resolution

I.  Before installing:

1. Have all the 'Minimum requirements' of the on-line Novell ZFS Install Documentation been met? (ZFS 3: Meeting Basic ZfS Installation Requirements; ZSM 6.5: Prerequisites; ZSM 7: Prerequisites.)

2. Is there a R/W replica on the ZFS server for every partition in which ZFS objects will be created during the install process? (See KB 10059335 Error: "Atlas exception: unable to connect to Atlas Manager at IP address"; the FIX section has links to documentation.)  This is required for the Site Server (MMS) install, but not for other MMS components.  For TED / POL, a replica may not be needed for the install, but the Distributor reads DS for some of its processes, and Subscribers access DS when creating NAL Application Objects for NAL Application Distributions.  A replica on the Distributor would benefit performance.  A replica on Subscribers isn't necessary, but may also benefit performance if strategically placed, especially in WAN environments.  If the initial ZFS (MMS) install was done before the appropriate NDS replica(s) were installed on the server, a reinstall of ZFS (MMS) is needed (see KB 10059335 FIX item 2, referenced above, and Note C below).

3. Is the user admin.root of the tree (or equivalent)?  The user must have full control of and access to NDS.  NDS forces the install of the new schema to the Master Root partition so that the new schema can propagate to the rest of the tree and stay synchronized across it. The propagation of the schema relies on a healthy NDS.

4. Has DSREPAIR been run until it reports zero errors?  Use Advanced Repair, Repair local database, with all settings at their defaults plus 'Rebuild operational schema' enabled (there is no need to enable 'Rebuild operational schema' on eDir servers).  This will lock the local database and prevent logins, but will not disconnect users.  [Not all versions of DSREPAIR.NLM should be used.  Contact Novell DS Support if uncertain which ones to use.]  This should be done before and after the NetWare Support Pack is installed, and again just before the ZFS install if more than a few hours have passed.  Was it run in the same health check on all servers in the ring until all reported zero errors?  Potentially, this should be done on every server in the tree.  KB 3197766 'Checking the OS and DS Health for Inconsistent ZENworks behavior' can and should be used before installing ZFS products.  The TID's 'Cause' information may not always be relevant, or may already have been addressed, but the 'Fix' process must be done before the ZFS install to verify the health of the schema.  KB 3196677 has links to the OS Performance & Optimization and DS Health Check TIDs.

5. Was the SYS:JAVA directory ever renamed, and not renamed back, or restored with the original NW OS install of the 1.2 directories and files included? Do not rename the SYS:JAVA directory to install a new JAVA version. The NW OS install installs items that are not installed with the JVMxxx.EXE install. Copy the \JAVA directory to a secure location and restore the original when needed.

6. Was the SYS:PUBLIC\MGMT\CONSOLEONE\1.2 directory ever renamed, and not renamed back, or restored with the original NW OS install of the 1.2 directories and files included? Do not rename the SYS:PUBLIC\MGMT\CONSOLEONE\1.2 directory to install a new C1 version. The NW OS install installs items that are not installed with the C1.EXE install. Copy the SYS:PUBLIC\MGMT\CONSOLEONE\1.2 directory (*.* /s) to a secure location and restore the original when needed. If problems occur with the SYS:PUBLIC\MGMT\CONSOLEONE\1.2 directory, delete it, restore the original directory structure, and start fresh. The install path for C1 should always end with the \1.2 folder.

7. With an RCONSOLE or RCONJ session active during the install, activity or errors on the server can be monitored and captured to provide Novell Technical Support, if engaged.

8.  During the install, choose 'NO' to starting services on the server.  They can be started later.

9. After the install is complete, restarting the server is recommended. Before restarting, remark out all the ZFS load commands that the install added. The 'search add' commands and BSTART.NCF need not be remarked out. Also confirm that 'LOAD CONLOG.NLM MAXIMUM=100' (or greater if needed) loads in the AUTOEXEC.NCF before INITSYS.NCF or the NIC drivers are loaded and bound. Do not unload CONLOG.NLM in the AUTOEXEC.NCF, as it will then miss the needed information.  Specifically, be sure to remark out SLOADER.NCF in the AUTOEXEC.NCF file to prevent it from loading when the server is restarted.  Let MWSERVER.NCF and NETXPLOR.NCF run at least overnight, or for a weekend, before running SLOADER.NCF.  This allows the discovery modules to 'flesh out' the discovery database file (NETXPLOR_0.DAT) with the details of the discovered hardware before SLOADER.NCF starts moving the data to the Sybase database that is viewed via ConsoleOne (see KB 10075469 Error: "interface not found.").
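The AUTOEXEC.NCF checks in item 9 can be sketched as a small script run against a copy of the file on a workstation. This is only an illustrative sketch, not a Novell tool: the function names are hypothetical, it assumes that lines beginning with REM, '#' or ';' are remarked out, and it checks only the CONLOG.NLM, INITSYS.NCF and SLOADER.NCF conditions named above (not the NIC driver lines or the other ZFS load commands).

```python
def active_tokens(line):
    """Return the upper-cased tokens of an NCF line, or [] if blank or remarked out."""
    tokens = line.strip().upper().split()
    if not tokens:
        return []
    first = tokens[0]
    # Assumption: REM, '#' and ';' mark a remarked-out line.
    if first == "REM" or first.startswith("#") or first.startswith(";"):
        return []
    return tokens


def check_autoexec(text):
    """Return a list of problems found against the item 9 checklist."""
    problems = []
    conlog_at = None   # line number of the first active CONLOG load
    initsys_at = None  # line number of the first active INITSYS.NCF call
    for number, line in enumerate(text.splitlines(), start=1):
        tokens = active_tokens(line)
        if not tokens:
            continue
        names_conlog = any(t.startswith("CONLOG") for t in tokens)
        if tokens[0] == "UNLOAD" and names_conlog:
            problems.append(f"line {number}: CONLOG.NLM is unloaded")
        elif names_conlog and conlog_at is None:
            conlog_at = number
        if initsys_at is None and any(t.startswith("INITSYS") for t in tokens):
            initsys_at = number
        if any(t.startswith("SLOADER") for t in tokens):
            problems.append(f"line {number}: SLOADER.NCF is not remarked out")
    if conlog_at is None:
        problems.append("CONLOG.NLM is never loaded")
    elif initsys_at is not None and conlog_at > initsys_at:
        problems.append("CONLOG.NLM loads after INITSYS.NCF")
    return problems


# Hypothetical file content, for illustration only.
SAMPLE = """LOAD CONLOG.NLM MAXIMUM=100
INITSYS.NCF
BSTART.NCF
REM SLOADER.NCF
"""

for problem in check_autoexec(SAMPLE):
    print(problem)
```

An empty result means none of the three conditions was violated; it does not prove the rest of the file is correct.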

Notes:

A.  License (NLS) issues have been known to cause the install of ZFS to fail.  Confirm all NLS issues have been addressed.  A good TID to start with is KB 10013723 "Understanding NetWare 5 Licensing" .

B.  Certificate Server issues have been known to cause the install of ZFS to fail.  One symptom would be:

'An error occurred while attempting to create object: Distributor_Servername.context  Return Code = 1610677670' .

The Distributor object was actually created, even though the error said it was not.  The issue was the server had two NIC cards and was connected to two networks. The CA grabbed the wrong IP address for the CertificateDNS object.  PKIDIAG identified this problem. Deleting the SSL CertificateDNS object in ConsoleOne and running PKIDIAG again recreated the object with the proper IP address.  The reinstall was successful after that.

C.  If the schema install is successful but the schema verify fails, see KB 10058497 Error: "Severe The selected tree does not have the required schema extensions" followed by "press OK to continue" (NO_SCHEMA_CHECK switch).


II.  Initial load of ZFS modules:

1. All the items in 'I.  Before installing' must be done first.

2. Log in as the same user that did the install. The ZFS install grants this user ownership of all the RBS objects, so this user must be used to assign, add, or change roles and role ownership.

3. Were the ZFS snapins installed to the server's SYS:PUBLIC\MGMT\CONSOLEONE\1.2 directory structure?  If not, they must be, because the JAVA processes running on the server, and even C1 running on a local box, access files in the SYS:PUBLIC\MGMT\CONSOLEONE\1.2 directory structure. Once all the ZFS modules are loaded on the server, C1 must run correctly from SYS:PUBLIC\MGMT\CONSOLEONE\1.2 when launched from a workstation. Once C1 is confirmed to work properly when run in that manner, C1 can be installed to a local w/s (to get all the local registry settings) and the ZFS snapins installed to the local w/s.  ZENworks for Desktops provides the utility SYS:PUBLIC\ZENWORKS\C1UPDATE.EXE to move the C1 version and snapins from the server to the workstation.  Once SYS:PUBLIC\MGMT\CONSOLEONE\1.2\BIN\CONSOLEONE.EXE is working, make a backup of the directory.

4. Load the ZFS modules manually, one at a time, as listed in the AUTOEXEC.NCF (before loading modules, see KB 10075469 Error: "interface not found." and item I. 9. above).  If this is done from a w/s with an RCONSOLE or RCONJ session active, the first error (if any) generated on the server should be captured with a screen shot and investigated.  If an error is returned when loading the modules manually, run 'java -show' on the server console, then unload CONLOG.NLM shortly after.  Confirm the data is in the SYS:ETC\CONSOLE.LOG file. Search the Knowledgebase at support.novell.com for possible TIDs.  If none are found, a copy of the CONSOLE.LOG file and the screen shots can be provided to Novell Technical Support.

5.  The status of the MMS modules and services running on the server can be checked with C1, Tools, Management Site Server Status.  Typically 'ampurge' is the only service not running if SLOADER.NCF (Atlas Manager) loads successfully.  Another way to see whether SLOADER.NCF is loading successfully is to watch the server console screen when SLOADER.NCF is manually executed.  One .NCF file should run (MWSETENV.NCF) and three NLMs should load:  JNCPV2.NLM, TRAPRCVR.NLM, and NDSNAME.NLM (if java was not already loaded, a java NLM may also auto-load).  If the NLMs don't load, then most likely the install was not successful.  This is often due to DS problems that need to be addressed, or to there being no DS replica on the server before the initial install (see I. 'Before installing' item 2 above).
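The console check in item 5 can also be done against captured text, such as a copy of SYS:ETC\CONSOLE.LOG taken after SLOADER.NCF was run manually. The sketch below is illustrative only: the function name is hypothetical, and it assumes the file names simply appear somewhere in the captured text, since the exact console message format is not described here.

```python
# Names that item 5 says should appear when SLOADER.NCF is run manually.
EXPECTED_ON_SLOADER = ("MWSETENV.NCF", "JNCPV2.NLM", "TRAPRCVR.NLM", "NDSNAME.NLM")


def missing_after_sloader(console_text):
    """Return the expected names that do not appear in the captured console text."""
    seen = console_text.upper()  # NetWare file names are not case-sensitive
    return [name for name in EXPECTED_ON_SLOADER if name not in seen]


# Hypothetical capture, for illustration only; real console output will differ.
capture = "mwsetenv.ncf\nJNCPV2.NLM\nTRAPRCVR.NLM\n"
print(missing_after_sloader(capture))
```

A non-empty list points back to the DS health and replica checks in section I rather than to a specific fault.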

III.  Notes:

A.  MGMTDBS.NCF must be loaded for both Management & Monitoring Services (MMS) and TED/POLICIES to have access to the Sybase database.  If loading manually, run MGMTDBS.NCF before starting either MMS or TED/POLICIES.  If TED/POLICIES is installed first, MGMTDBS.NCF should be the first item to load in AUTOEXEC.NCF.  If MMS is installed second, the MGMTDBS.NCF line can be moved to just before MWSERVER.NCF.  If MGMTDBS.NCF is not loaded, the server will run out of memory as SLOADER.NCF prepares Discovery data to be inserted into Sybase.

B.  Is the server in the production tree or a test tree?  Typically, the test tree works and the problems arise in the production tree.  If all the above items are addressed, this issue should be minimized.

C.  If all the above steps have been completed without resolution of the issue(s), a reinstall of ZFS is recommended.  Use TIDs:

10060637  Uninstalling ZENworks for Servers 2 Policy and Distribution Services; Uninstalling ZENworks for Servers 2 Site Management

10072445 Uninstalling / Reinstalling Novell ZENworks for Servers.

D.  Other files to provide to Novell Technical Support are CONFIG.TXT ('LOAD CONFIG.NLM /sd') and SYS:SYSTEM\SYS$LOG.ERR.

E.  A bad NIC (or driver) in the ZFS Site Server has caused corrupt SNMP packets preventing the proper SNMP communication.  If symptoms occur that are unexplainable, trying different hardware is always an option.

F.  If Discovery appears to be inconsistent and erratic, see KB 10021547 "MPK module is causing a server to have high utilization or abend" .  The PSM driver should not be loaded if the server has only one CPU.
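The load order described in note A (MGMTDBS.NCF before MWSERVER.NCF) can be verified against a copy of AUTOEXEC.NCF. This is a minimal sketch under stated assumptions: the function names are hypothetical, lines beginning with REM, '#' or ';' are treated as remarked out, and only the MMS ordering is checked because the TED/POLICIES start commands are not named in this document.

```python
def first_active_line(autoexec_text, name):
    """Return the 1-based number of the first active line that mentions name, or None."""
    wanted = name.upper()
    for number, line in enumerate(autoexec_text.splitlines(), start=1):
        tokens = line.strip().upper().split()
        # Assumption: REM, '#' and ';' mark a remarked-out line.
        if not tokens or tokens[0] == "REM" or tokens[0][0] in "#;":
            continue
        if any(t.startswith(wanted) for t in tokens):
            return number
    return None


def mgmtdbs_order_ok(autoexec_text):
    """True if MGMTDBS.NCF is active and precedes MWSERVER.NCF when that is present."""
    db = first_active_line(autoexec_text, "MGMTDBS")
    mms = first_active_line(autoexec_text, "MWSERVER")
    if db is None:
        return False
    return mms is None or db < mms


# Hypothetical file content, for illustration only.
print(mgmtdbs_order_ok("MGMTDBS.NCF\nMWSERVER.NCF\n"))
```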

Additional Information

Formerly known as TID# 10072684