Novell Documentation: NetWare 6 - Resolving Volume Audit Problems

Resolving Volume Audit Problems

This section describes solutions to potential volume audit problems. These include audit trail overflow and catastrophic failure.

Audit Trail Overflow

Preventing Loss of Audit Data describes how audit data can be lost when the audit trail is improperly configured and either the configured number of audit files fills up or the volume runs out of disk space.

Audit Options Configuration describes the three overflow configuration options for volume audit trails:

Archive audit file

Disable auditable events

Disable event recording

The only option that prevents the loss of audit events (from audit overflow situations) is to disable auditable events. With this setting, the server goes into an overflow state when the current audit file reaches its maximum size or the server cannot write the current audit event.

To recover from this overflow state, an auditor (with the Write right to the Audit Policy property of the volume's Audit File object) must reset the current audit trail.

If volume SYS: overflows, the server will allow an auditor to perform a read-only login to reset the audit file.

WARNING: To perform a read-only login when volume SYS: has overflowed, you must have sufficient software in your workstation to perform the login without downloading anything from the \LOGIN directory. The specific software this requires depends on your workstation.

In general, having a copy of the contents of \LOGIN will be sufficient. To do this, create a \LOGIN directory on your workstation, and copy everything from SYS:\LOGIN to your workstation \LOGIN directory. The required contents include not only the \LOGIN directory itself, but also subdirectories of \LOGIN (that is, \LOGIN\NLS and \LOGIN\NLS\ENGLISH).

If you don't keep a copy of \LOGIN on a workstation, you will be unable to recover from an audit overflow on the SYS: volume.
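The copy described above can be sketched in Python, assuming a workstation where Python 3.9 or later is available and SYS: is mapped to a drive letter. The function name and any paths passed to it are hypothetical and not part of NetWare; the sketch copies the directory tree recursively and then reports any file that did not arrive, which covers the \LOGIN\NLS and \LOGIN\NLS\ENGLISH subdirectories.

```python
import shutil
from pathlib import Path


def mirror_login_dir(source: Path, target: Path) -> list[str]:
    """Copy source (for example, the mapped SYS:\\LOGIN) to target, subdirectories included.

    Returns the relative paths of files still missing from target after the
    copy; an empty list means the local copy is complete.
    """
    # copytree recurses, so subdirectories such as NLS and NLS/ENGLISH are
    # copied along with the files in LOGIN itself.
    shutil.copytree(source, target, dirs_exist_ok=True)

    missing = []
    for src_file in source.rglob("*"):
        if src_file.is_file():
            rel = src_file.relative_to(source)
            if not (target / rel).is_file():
                missing.append(str(rel))
    return sorted(missing)
```

For example, with F: mapped to SYS: (an assumption), calling `mirror_login_dir(Path(r"F:\LOGIN"), Path(r"C:\LOGIN"))` should return an empty list once the local copy is complete. Rerun it whenever the contents of SYS:\LOGIN change.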

When in the overflow state, you can log in using your local copy of \LOGIN by changing to that directory and running LOGIN.EXE (or any other appropriate programs).

If you want to keep the oldest old audit file and have not already backed it up, copy it to offline storage (for example, a file on a server or workstation file system, or removable media).

Reset the current volume audit file, as described in Reset Audit Data File. This rolls over the current audit file (to an old audit file), deleting the oldest old audit file, and initializes a new audit file.

If you want to save any audit files that you haven't already saved (including the newest of the old audit files), copy those audit files to offline storage.
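The order of the steps above matters because a reset discards the oldest old audit file once the configured number of old files is reached. The following toy model illustrates that rollover behavior; it is only a sketch, and the class name, file names, and `max_old_files` parameter are hypothetical rather than anything AUDITCON exposes.

```python
from collections import deque


class AuditTrailModel:
    """Toy model: one current audit file plus a bounded set of old audit files."""

    def __init__(self, max_old_files: int):
        if max_old_files < 1:
            raise ValueError("max_old_files must be at least 1")
        self.old_files = deque(maxlen=max_old_files)  # oldest file on the left
        self.generation = 0
        self.current = f"audit-{self.generation}"

    def reset(self):
        """Roll the current file over; return the old file that was discarded, if any."""
        discarded = None
        if len(self.old_files) == self.old_files.maxlen:
            discarded = self.old_files[0]  # this file is lost by the reset
        self.old_files.append(self.current)  # a full deque drops its oldest entry
        self.generation += 1
        self.current = f"audit-{self.generation}"  # fresh, empty audit file
        return discarded


trail = AuditTrailModel(max_old_files=2)
print(trail.reset())  # None: there is still room for old files
print(trail.reset())  # None
print(trail.reset())  # audit-0: the oldest old file is discarded
```

In this model, anything returned by `reset()` is gone afterward, which is why the oldest old audit file has to be copied to offline storage before the reset, while the newer old files can still be copied after it.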

Follow these suggestions to help prevent volume audit overflow:

Review the status and size of the audit file frequently.

Manually reset the audit file before it overflows, if necessary.

Enable Automatic audit file archiving as described in Changing a Volume Audit Configuration. Set the Audit file maximum size large enough and the Days between audit archives low enough that the audit file will not overflow. Use caution in setting these parameters to prevent destruction of audit data.

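Don't over-audit. Record only the events you need, because every additional audited event fills the audit file faster.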
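One way to judge whether the Audit file maximum size and Days between audit archives settings leave enough headroom is a rough estimate of how quickly the audit file grows. The sketch below assumes you have measured an average record size and a daily record count for your own server; the function names, the safety factor, and all numbers in the example are hypothetical.

```python
def days_until_full(max_size_bytes: int, avg_record_bytes: int, records_per_day: int) -> float:
    """Estimate how many days an empty audit file takes to reach its maximum size."""
    daily_growth = avg_record_bytes * records_per_day
    if daily_growth <= 0:
        raise ValueError("average record size and records per day must be positive")
    return max_size_bytes / daily_growth


def archive_interval_is_safe(
    max_size_bytes: int,
    avg_record_bytes: int,
    records_per_day: int,
    days_between_archives: int,
    safety_factor: float = 2.0,
) -> bool:
    """Return True if the file should last safety_factor times the archive interval."""
    fill_days = days_until_full(max_size_bytes, avg_record_bytes, records_per_day)
    return fill_days >= days_between_archives * safety_factor


# Hypothetical workload: 1,000,000-byte limit, 50-byte records, 5,000 records per day.
print(days_until_full(1_000_000, 50, 5_000))                                     # 4.0
print(archive_interval_is_safe(1_000_000, 50, 5_000, days_between_archives=1))  # True
print(archive_interval_is_safe(1_000_000, 50, 5_000, days_between_archives=3))  # False
```

The safety factor is a design choice to absorb bursts of audited activity; an estimate based on averages alone can overflow on a busy day.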

WARNING: If the audit trail for a volume is full, the auditor's actions (for example, deleting data files, resetting the audit file) cannot be audited for that volume. In this case, you must keep a manual log of your actions for use when generating a complete history of actions performed on the server. You will be informed via a message from the server to your workstation when this occurs.

When the audit trail reaches its configured threshold, you will receive the following notification on your workstation screen:

The audit overflow file for volume volname is almost full. Auditors must begin manual auditing now!

When the audit trail is completely full, you will receive the following notification on your workstation screen:

The audit overflow file for volume volname is full.

To avoid missing these messages, do not issue the SEND /A=N or SEND /A=P command. If you are using Windows with the NetWare User Tools, do not disable network warnings.

Catastrophic Failure Recovery

This section describes what to do after a catastrophic failure (for example, when the volume being audited is destroyed by a hard disk failure) and you need to restore the audit state to what it was before the failure. It also explains how to handle planned upgrades, such as moving a volume from a small disk drive to a larger one.

There are several potential losses not addressed here:

Loss of offline audit data. Your offline audit data (whether stored in server or workstation file systems, or on removable media) should be backed up frequently enough that its loss would not be catastrophic.

Loss of some, but not all, copies of the Audit File object describing the volume audit trail, due to failure of one or more servers holding an NDS partition. In this case, NDS automatically uses whatever copies are available. If a server configured for the partition is brought back online, it is automatically updated with the Audit File object information.

There are two major catastrophic failures possible for volume audit:

Loss of all copies of the Audit File object describing the volume audit trail. If all copies of the Audit File object are lost (for example, because there was only one copy, and the server it was on suffered a disk failure), you might be able to recover the Audit File object from a backup of your Directory tree, provided you have one. If so, you regain access to the existing online audit data. If not, the online audit data can no longer be accessed, and you must recreate the volume audit trail using the procedures in Enabling Volume Auditing (including selecting events, audit full actions, and so on).

Loss of a volume (for example, because of a disk failure). Because volume audit files are stored in an inaccessible directory which cannot be backed up, loss of a volume means that the online audit files (both the current audit file and any old audit files) are lost. Use AUDITCON to perform regular backups of audit data to avoid loss of online audit data.

WARNING: If you restore a volume from a backup, it will come back without auditing enabled. To avoid unaudited actions while you are configuring the audit system, you should take the server offline for the restoration process until the volume audit has been reconfigured. To do this, disconnect the server from any networks it is connected to, and attach it to a protected LAN containing only a trusted workstation located in a secure location. Then restore the volume from the backup. Use the trusted workstation to run AUDITCON to re-enable volume auditing. Restore the previous configuration, using your manual logs of which files are audited (as described in Changing a Volume Audit Configuration). Finally, reconnect the server to the standard networks.

You might need to take more than one server offline to perform this restoration, for example, if the server being restored does not have replicas of any NDS containers with administrative users, or if the Audit File object for the volume audit trail will not be stored in a container found on the server.

In addition to the above scenarios, if you restore an NDS User object from an NDS backup, it will come back without its per-user audit flag. To prevent a user from performing unaudited actions, you should take the server offline before restoring the User object, and use AUDITCON to set the per-user audit flag using the manual logs of audited users (as described in Audit by User).

If you upgrade a volume (for example, replacing it with a larger disk), that is equivalent to recovery from a catastrophic disk failure. To do an upgrade, you must first back up the old volume, and then restore it on the new disk. This loses all audit data. Therefore, before performing a volume upgrade, you should also back up all volume audit data stored on that server. Because the backup does not include the per-file audit flags, you should use the procedure described above to take the server offline for the recovery process, and use the manual logs of which files are audited to configure the audit system correctly before bringing the server back online.

Immediacy of Changes

When you modify the volume audit trail configuration (for example, to change the maximum size of the audit file or the set of events to be recorded), the change is made both to the Audit Policy property of the Audit File object and to the header of the current audit file.

Both changes will usually occur immediately. However, the effect of the change might not be immediate if the server holding the audit data is unavailable to receive the configuration change (for example, because it is down or the network has been split), even though the Audit File object can be modified. In this case, the delay will depend on how long it takes for the two servers to synchronize their NDS replicas.

In addition, if an auditor is performing audit trail management functions, changing the ACL will not affect the auditor's capabilities (either to increase or decrease them). An auditor's rights are recalculated every time he or she restarts AUDITCON and establishes access to an audit trail. To stop the auditor's actions immediately, you should break the auditor's connection to the server using the console MONITOR utility or the CLEAR STATION console command.