This section provides troubleshooting suggestions for typical NetWare server problems such as abends, disk I/O errors, insufficient disk space, and insufficient memory.
The NetWare 4 operating system is very resilient, but errors can and will arise. Serious problems are usually accompanied by abend (abnormal end) messages.
Abend messages are usually caused by consistency check errors or CPU-detected software errors.
Consistency checks are internal tests placed in the NetWare operating system to ensure the stability and integrity of internal operating system data. A consistency check error is reported when one of these tests fails.
Consistency check errors might be caused by a corrupted operating system file, by corrupted or outdated drivers and NetWare Loadable Module™ (NLM™) programs, or by hardware failure.
When the server abends, it displays an abend message similar to the following:
Abend: SERVER-4.10-message_number message_string
ADDITIONAL INFORMATION: message
The Additional Information section states the probable cause of the abend. It indicates where the problem occurred and gives the name of any NLM program associated with the abend. This information helps you determine how to resolve the abend.
You can respond to the abend manually or have the server respond automatically.
When you respond manually, the server determines the nature of the abend and displays the appropriate response option on the screen, along with additional options for bringing down the server or executing a core dump. You must execute an option to respond to the abend.
When the server responds automatically, it executes the appropriate response without intervention.
IMPORTANT: Sometimes an abend (or a faulty NLM program) can cause the server console to hang (stop functioning). In this case, the abend message is not displayed and you cannot enter commands at the console prompt.
If this happens, press <Ctrl>+<Alt>+<Esc>. A message asks if you want to down the server. Enter Y to down the server and exit to DOS, or N to return to the console prompt.
The default method of responding to an abend is automatic. For more information about automatic response to abends, see Responding to the Abend Automatically.
To respond manually to abends, change either of the following SET parameters to the values shown:
AUTO RESTART AFTER ABEND = 0
DEVELOPER OPTION = ON
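As a minimal sketch, the two settings above could be entered at the console prompt, or placed in an .NCF file such as STARTUP.NCF, in the same SET form used for other parameters in this section. Either line alone is enough to select manual response; the REM lines are comments.

```
REM Either setting alone selects manual response to abends.
SET AUTO RESTART AFTER ABEND = 0
REM ...or, alternatively:
SET DEVELOPER OPTION = ON
```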
When an abend occurs, the server displays a short list of options appropriate to the nature of the abend. To respond to the abend, you must execute one of the options by typing the first letter of the option.
The following options may be displayed. Note that several of the options have the same first letter (such as R, S, or X). In a given abend situation, the option list will include only one option for any given first letter.
This option appears if the abend was software-detected, that is, detected by NetWare. It is important to save files, shut down the server, and try to solve the problem that caused the abend. Review the ABEND.LOG file to help determine the source of the problem.
When you execute this option, the server sends a message every two minutes to users advising them to save their files and log out. The server then stops the running process, updates the ABEND.LOG file, and attempts to shut down and restart the computer.
The amount of time before the server shuts down and restarts is determined by the SET parameter AUTO RESTART AFTER ABEND DELAY TIME. You can set this value from 2 to 60 minutes.
This option appears if the abend was an NMI (nonmaskable interrupt), indicating a parity error or a machine check processor exception. It is important to save files, shut down the server, and solve the problem causing the abend. Review the ABEND.LOG file to help determine the source of the problem.
When you execute this option, the server sends a message every two minutes to users advising them to save their files and log out. The server then resumes the running process, updates the ABEND.LOG file, and attempts to shut down and restart the computer.
The amount of time before the server shuts down and restarts is determined by the SET parameter AUTO RESTART AFTER ABEND DELAY TIME. You can set this value from 2 to 60 minutes.
This option appears if the abend was hardware-detected, that is, detected by the processor. All hardware-detected abends have the words processor exception in the abend message. These abends include page faults, protection faults, and invalid op codes.
When this option is available, the server has determined that it cannot return the process to a safe state, but that it does not need to shut down the computer immediately to resolve the problem. You may still need to shut down the computer and restart it at a later time.
When you execute this option, the server suspends the currently running process and updates the ABEND.LOG file, but does not shut down the computer. Server performance may be poor, because a loaded NLM is probably malfunctioning. Read the Additional Information part of the abend message to learn which NLM might be causing the problem. Wait until a convenient time, then shut down the server and restart it. Examine the ABEND.LOG file for more information about the source of the problem.
Like the previous option, this option appears if the abend was hardware-detected, that is, detected by the processor. All hardware-detected abends have the words processor exception in the abend message. These abends include page faults, protection faults, and invalid op codes.
When this option is listed, the server has determined that it can return the process to a safe state. When you execute this option, the server returns the running process to a safe state and updates the ABEND.LOG file, but it does not shut down the computer. In most cases, the machine completely recovers and no further action is necessary. This option resolves most page fault abends.
Execute this option to perform a core dump that can be examined to determine the cause of an abend.
This option appears only if DOS has been removed. Execute this option if you want to restart the server. NOTE: If DOS has been removed, the server will not create or update an ABEND.LOG file.
Execute this option if you want to bring down the server and exit to DOS. If you power off the server without first executing one of the S or R options to resolve the abend, the server will not update the ABEND.LOG file.
If the console has been secured, you must power off and then back on to restart the server. If you power off the server without first executing one of the S or R options to resolve the abend, the server will not update the ABEND.LOG file.
NOTE: When the server restarts, it moves the ABEND.LOG file from the DOS partition to SYS:SYSTEM.
You can require the server to respond automatically to abends. Two automatic responses are possible.
AUTO RESTART AFTER ABEND = 1
DEVELOPER OPTION = OFF
Because these are the default values of the parameters, the default mode is to respond to abends automatically.
AUTO RESTART AFTER ABEND = 2
DEVELOPER OPTION = OFF
Use the following parameter to specify how long the server waits after an abend before attempting to shut down and restart the computer:
AUTO RESTART AFTER ABEND DELAY TIME = minutes
Use the SET command or the SERVMAN or MONITOR utilities to set the parameter values. See SET, SERVMAN, or MONITOR in Utilities Reference.
The DEVELOPER OPTION parameter is found in the Miscellaneous category of parameters.
The AUTO RESTART AFTER ABEND and AUTO RESTART AFTER ABEND DELAY TIME parameters are found in the Error Handling category.
All parameters can be set in the STARTUP.NCF file.
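For illustration, a STARTUP.NCF fragment that selects automatic response might look like the sketch below. The first two values are the defaults described above; the delay of 5 minutes is an arbitrary example value chosen from the 2 to 60 minute range, not a recommendation.

```
REM Respond to abends automatically (these are the default values).
SET AUTO RESTART AFTER ABEND = 1
SET DEVELOPER OPTION = OFF
REM Example delay only: wait 5 minutes before shutting down and restarting.
SET AUTO RESTART AFTER ABEND DELAY TIME = 5
```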
IMPORTANT: Because the server responds to the abend automatically, you may not know when an abend has occurred. Therefore, you should periodically check the ABEND.LOG file or the Server Up Time statistic on the Connection Information screen of MONITOR.
To resolve a general disk I/O error on the server, try one or more of the following remedies:
SCAN FOR NEW DEVICES <Enter>
This causes the operating system to request controller information about all devices.
To change the Hot Fix Redirection Area on an existing drive, back up all the data on the partition, delete the volumes on the partition, and delete the partition; then re-create it. Assign the partition a different percentage for the Hot Fix Redirection Area; then recreate the volumes and restore the data.
If you have tried all the preceding suggestions without success, contact your Novell® Authorized Reseller℠ representative or drive manufacturer.
To resolve an insufficient disk space error, you should do one or more of the following:
For information on SET parameters, see Managing Server Hard Disks and SET in Utilities Reference.
To free up server memory temporarily (until you can add more memory to the server), do one or more of the following:
(This setting uses a lot of disk space but increases the amount of memory available.) For more information, see Maintaining the NetWare Server and SET in Utilities Reference.
To resolve a locked device error, try one or more of the following:
If you have tried all of the above without success, contact your Novell Authorized Reseller representative or the drive manufacturer.
To resolve a file I/O error, try one or more of the following:
To increase the amount of free space, do one or more of the following:
See also Resolving Volume I/O Errors.
To resolve a volume I/O error, try one or more of the following:
If you have tried all of the above without success, contact your Novell Authorized Reseller representative or disk drive manufacturer.
Event control block allocation system messages can occur when you first bring up the server or after the server has been running for some time.
These messages indicate that the server was unable to acquire sufficient packet receive buffers, usually called event control blocks (ECBs). Running out of ECBs is not a fatal condition.
Servers that run for several days and experience peaks of high load might exceed the set maximum number of ECBs, causing the system to generate ECB system messages.
If these situations are caused by occasional peaks in the memory demand, you should probably maintain your current maximum ECB allocation and allow the message to be generated at those times.
On the other hand, if your server memory load is very high and you receive frequent ECB allocation errors, you should probably set your maximum ECB allocation higher. Use the following SET command in the STARTUP.NCF file:
SET MAXIMUM PACKET RECEIVE BUFFERS=number
NOTE: Memory allocated for ECBs cannot be used for other purposes.
The minimum number of buffers available for the server can also be set in the STARTUP.NCF file with the following command:
SET MINIMUM PACKET RECEIVE BUFFERS=number
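As a sketch, the two commands might appear together in STARTUP.NCF as follows. The numbers 500 and 100 are placeholder values chosen only for illustration; check the valid ranges under SET in Utilities Reference and choose values that suit your server's memory, because memory allocated for ECBs cannot be used for other purposes.

```
REM Illustrative values only; adjust to your server's load and memory.
SET MAXIMUM PACKET RECEIVE BUFFERS=500
SET MINIMUM PACKET RECEIVE BUFFERS=100
```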
To diagnose server console command problems, you should identify whether the following conditions exist:
To resolve server console command problems, you should perform the following actions:
Bring down the server, if possible. If not, wait a few minutes after all users have logged out; then reboot the server.
If your recorded network board configurations do not agree with the actual hardware configurations, reload the LAN driver with the correct parameters or change the hardware settings to match the LAN driver parameters.
The most common conflict occurs when a network board is set to interrupt 4 and a printer is connected to the server's serial port, which also uses interrupt 4.
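As a hedged illustration of reloading a LAN driver with corrected parameters, the console commands below assume a hypothetical server that uses the NE2000 driver with a board physically set to I/O port 300 and interrupt 3. The driver name and both values are assumptions; substitute the driver and settings recorded for your own network board.

```
REM Hypothetical example: driver name, port, and interrupt are assumptions.
UNLOAD NE2000
LOAD NE2000 PORT=300 INT=3
```

Unloading a LAN driver also removes its protocol bindings, so any protocols must be bound to the driver again after it is reloaded.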
To diagnose keyboard locking problems when copying files from CD-ROM, you should identify whether the following conditions exist:
If you have a CD-ROM device that shares a SCSI bus with a disk subsystem containing volumes to which NetWare installation files are copied (typically volume SYS:), your keyboard may lock up while loading drivers or copying files to the volume.
Figure 36 shows possible configuration conflicts.
Figure 36. SCSI Adapter Conflicts
Remove the CD-ROM device drivers that you used to set up the CD-ROM drive as a DOS device from your CONFIG.SYS file. This will avoid possible conflicts when the NetWare 4.2 Operating System CD-ROM is mounted as a NetWare volume.
To resolve keyboard locking problems when copying files from CD-ROM, you should use the following procedure.
Press <Alt>+<Esc> until you are at the console prompt (:).
Type
DOWN <Enter>
Then type
EXIT <Enter>
Using a text editor, remove the CD-ROM device drivers from your CONFIG.SYS file.
Save the updated CONFIG.SYS file.
Using a text editor, remove any references to the CD-ROM drivers from your AUTOEXEC.BAT file.
Save the updated AUTOEXEC.BAT file.
Reboot the server by pressing <Ctrl>+<Alt>+<Del>.
(Conditional) If the server doesn't boot automatically from the AUTOEXEC.BAT file, change to the subdirectory that contains the SERVER.EXE file and other boot files (the default directory is C:\NWSERVER), and type
CD\NWSERVER <Enter>
SERVER <Enter>
(Conditional) If you are using ASPI device drivers (for example, for an Adaptec controller), you need to perform one of the following commands:
LOAD AHA1540 <Enter>
or
LOAD ASPICD <Enter>
or
LOAD CDNASPI <Enter>
At the console, type
LOAD NWPA <Enter>
At the console, type the following
LOAD CDROM <Enter>
CD MOUNT NW410 <Enter>
At the console, type
LOAD INSTALL <Enter>
To diagnose problems when the server hangs after mounting the last volume, you should identify whether the following conditions exist:
To resolve problems when the server hangs after mounting the last volume, you should perform the following actions or ensure that the following conditions exist:
Volume SYS: is the backout volume for TTS™ (Transaction Tracking System™). Volume SYS: also contains the NetWare system files and the NLM programs.
If volume SYS: does not mount when the server is booted, then the AUTOEXEC.NCF file does not execute, LAN drivers do not load, TTS can't be enabled, and the volume does not become part of the Directory tree.
To diagnose problems when no volumes mount, you should identify whether the following conditions exist:
To resolve problems when no volumes mount, you should perform the following actions:
To diagnose problems when only some volumes mount, you should identify whether the following conditions exist:
To resolve problems when only some volumes mount, you should perform the following actions:
To diagnose problems when disk errors occur while a volume is mounting, you should identify whether the following conditions exist:
To resolve problems when disk errors occur while a volume is mounting, you should perform the following actions:
To diagnose problems when memory errors occur while a volume is mounting, you should identify whether the following conditions exist:
To resolve problems when memory errors occur while a volume is mounting, you should perform the following actions or ensure that the following conditions exist:
If you combine directories so that most directories have about 32 files, and then purge the deleted subdirectories and files, you will free up memory.
See Appendix A, Calculate RAM Requirements, in Installation for a formula for estimating the total amount of server memory needed.
If the percentage is below 20%, you should add more memory.
WARNING: This is a destructive step that destroys all the extended file information. Before taking this step, try to free up enough memory so that the volume mounts and you can back up the data.
Have all users log out, and then unload all modules except the volume's disk drivers. Dismount any mounted volumes. To remove the name space, load VREPAIR and choose the Remove Name Space Support From The Volume and Write All Directory and FAT Entries Out to Disk options. Then run VREPAIR on the volume that would not mount.
To diagnose problems when mismatches exist in the duplicate copies of the File Allocation Table (FAT) and Directory Entry Table (DET), you should identify whether the following conditions exist:
To resolve problems when mismatches in the duplicate copies of the FAT and DET exist, you should perform the following actions:
Use INSTALL to unmirror the hard disks (select the hard disk you think is least reliable and delete it from the mirroring list). Then run VREPAIR on the volume and mount the volume. If the volume still does not mount or the data shows some corruption, read the next suggestion before remirroring the hard disks.
Use INSTALL to unmirror the hard disks and to salvage the orphaned (Out Of Sync) hard disk as a new volume. Run VREPAIR on both the old and the new volumes. Mount both volumes and compare the files. Use INSTALL to delete the volume that has the least useful information; rename the salvaged volume, if necessary. Then use INSTALL to remirror the hard disks.
Once a volume has been configured to support more than the DOS naming convention, the name space loadable module must be loaded before the volume can be mounted.
To diagnose problems when a volume cannot mount because the name space module is not loaded, you should identify whether the following conditions exist:
To resolve problems when a volume cannot mount because the name space module is not loaded, you should perform the following actions:
WARNING: This is a destructive step that destroys all of the extended file information.