The purpose of the Pool Verify and Pool Rebuild utilities is to make sure you have a valid metadata structure for a pool. Use the utilities only when you have problems with the pool’s metadata structure.
Verifying a pool does not fix any problems. It is a read-only assessment of the pool’s metadata structure to identify the types of errors, the severity of errors, and in what volumes the errors occur.
Rebuilding a pool restores the consistency of the pool’s metadata structure. Rebuilding a pool does not restore lost data and does not repair the data integrity of corrupted data.
Rebuilding a pool does not fix problems for the following:
Journaling errors
Hardware and media errors
File system trustee assignments, trustee rights, and inherited rights filters
File system attributes for files and directories
Opportunistic locking
Content of files
Volume errors are typically transactions left unfinished during a system crash of some kind. Most volume errors are fixed automatically during volume mount when NSS resolves the journaled errors. If the pool can be mounted, mount its volume to allow the NSS journaling feature to repair any transactional errors that occurred during a system failure.
Afterwards, if there are still problems, use diagnostic tools to rule out hardware problems as the cause.
If non-hardware errors persist, and if you have a viable backup to restore the pool to the last known good state, restore the backup to recover the pool and restore the data. It is probably not necessary to verify or rebuild the pool.
If non-hardware errors persist, and if you do not have a viable backup, use the following Pool Verify utilities to identify any errors in the pool’s metadata:
Review the verification log to determine the type and severity of problems with the pool’s metadata. If necessary, rebuild the pool’s metadata, using the following utilities:
| Operating System | Utilities |
|---|---|
| Linux | RAVSUI (build option); RAVVIEW (reformats log files to human-readable format) |
| NetWare | PoolRebuild |
WARNING: You should rebuild a pool only as a last resort to restore the consistency of the pool’s metadata. The rebuild repairs the metadata; it does not recover lost data or repair the integrity of the data itself. Data loss occurs during a rebuild if the utility must prune leaves in the data structure to restore metadata consistency.
If all of the following conditions exist, then you should rebuild the pool to restore its metadata integrity.
Errors were not corrected by mounting the volume, or you could not mount the volume.
Errors were not caused by media or hardware problems, or they persisted after correcting any media or hardware issues.
You have no viable backup of the pool’s volumes to restore the pool to the last known good state.
The Pool Verify process reports errors in the physical integrity of any of the volumes’ metadata that would definitely cause data corruption if no action is taken.
More data will be lost from continued data corruption than will be lost from rebuilding the pools now.
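The checklist above is an all-or-nothing test, which the following minimal Python sketch makes explicit. It is only an illustration: the `PoolStatus` class, its field names, and `should_rebuild` are hypothetical and are not part of NSS or any Novell utility. The answers must come from an administrator's own assessment of the pool.

```python
from dataclasses import dataclass


@dataclass
class PoolStatus:
    """Answers to the rebuild checklist (all field names are hypothetical)."""
    errors_remain_after_mount: bool   # mounting did not fix the errors, or the mount failed
    hardware_ruled_out: bool          # not a media/hardware fault, or errors persisted after repair
    has_viable_backup: bool           # a backup can restore the last known good state
    verify_reported_errors: bool      # Pool Verify logged errors, not only warnings
    corruption_loss_exceeds_rebuild_loss: bool  # the administrator's judgment call


def should_rebuild(status: PoolStatus) -> bool:
    """Return True only when every condition in the checklist holds."""
    return (
        status.errors_remain_after_mount
        and status.hardware_ruled_out
        and not status.has_viable_backup
        and status.verify_reported_errors
        and status.corruption_loss_exceeds_rebuild_loss
    )
```

If any single answer differs, for example a viable backup exists, the function returns False, which matches the guidance to restore from backup rather than rebuild.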
(NetWare) If you are not sure whether you can tolerate a system rebuild, take a pool snapshot and run the rebuild against the pool snapshot instead. (The NSS pool snapshot feature is not available on Linux in OES SP2 and earlier.) Then if the rebuild is acceptable, you can replace the pool with the rebuilt snapshot.
If the Pool Verify process did not report errors, but you cannot create files or directories, you should run rebuild with the ReZID option. For information, see Section 11.12, ReZIDing Volumes in an NSS Pool.
Volume errors are typically transactions left unfinished during a system crash of some kind. This type of error is fixed automatically during volume mount by the NSS journaling feature. Journaling in NSS handles the same level of problems as Vrepair does on NetWare Traditional volumes.
If errors persist after you mount the volume, or if you cannot mount the volume, first rule out hardware causes for the problems. For information, see Section 11.11.3, Ruling Out Hardware Causes.
If a volume cannot be mounted or problems persist after journaling errors are resolved, check the hardware for faulty media or controller problems.
Make sure you have a good backup of the data.
Use the latest diagnostic software and utilities from the manufacturer of your hard drives and controllers to troubleshoot the hard drives without destroying the data.
For example, verify the media integrity and that devices are operating correctly.
If necessary, repair the media or controllers.
If errors persist after you have ruled out hardware causes, and you do not have a viable backup to restore to the last known good state, you should check the pool for metadata inconsistencies. For information, see Section 11.11.4, Verifying the Pool to Identify Metadata Inconsistencies.
The verify process is a read-only assessment of the pool. The Pool Verify option searches the pool for inconsistent data blocks or other errors in the file system’s metadata and reports data in the verification log. For information on where to find the verification log and how to interpret any reported errors, see Section 11.11.5, Reviewing Log Files for Errors.
Place the pool in maintenance mode.
At a terminal prompt, enter
nsscon
In nsscon, enter
nss /poolmaintenance=poolname
Verify a pool by entering the following at a terminal prompt:
ravsui verify poolname
Use RAVVIEW to read the logs.
For information about using RAVVIEW, see Section A.14, RAVVIEW (Linux).
Do one of the following:
If the log reports no errors with the pool’s metadata, it is safe to activate the pool and mount the volumes.
If the log reports no errors with the pool’s metadata, but you still cannot create files or directories, run a Pool Rebuild with the ReZID option. For information about renumbering storage object IDs in volumes, see Section 11.12, ReZIDing Volumes in an NSS Pool.
If the log reports errors with the pool’s metadata, the volumes affected remain in Maintenance mode. Decide whether to rebuild the pool based on the type of error and potential outcomes. For information about rebuilding the pool, see Section 11.11.6, Rebuilding NSS Pools to Repair Metadata Consistency.
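For runbooks, the Linux verify commands above can be written down as an ordered list. The Python sketch below only assembles the command strings given in this procedure and notes where each is entered; it does not run anything. The function name `linux_verify_steps` and the sample pool name `POOL1` are assumptions made for illustration.

```python
from typing import List, Tuple


def linux_verify_steps(pool_name: str) -> List[Tuple[str, str]]:
    """Return (where to enter it, command) pairs for a Linux pool verify.

    The command text is copied from the procedure; nothing is executed here.
    """
    if not pool_name or any(ch.isspace() for ch in pool_name):
        raise ValueError("pool name must be a non-empty token without whitespace")
    return [
        ("terminal", "nsscon"),                                 # open the NSS console
        ("nsscon", f"nss /poolmaintenance={pool_name}"),        # place the pool in maintenance mode
        ("terminal", f"ravsui verify {pool_name}"),             # read-only verify
    ]


if __name__ == "__main__":
    # POOL1 is a hypothetical pool name.
    for where, command in linux_verify_steps("POOL1"):
        print(f"[{where}] {command}")
```

Keeping the steps as data rather than executing them leaves the administrator in control of each stage, which suits a procedure that depends on reading the log before deciding what to do next.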
Place the pool in maintenance mode. At a terminal prompt, enter
nss /poolmaintenance=poolname
Verify the pool by entering the following at the server console:
nss /poolverify=poolname
Review any errors on screen or in the filename.vlf file, located at the root of the server’s DOS drive.
Do one of the following:
If the log reports no errors with the pool’s metadata, the pools and volumes are automatically activated. It is safe to mount the volumes.
If the log reports no errors with the pool’s metadata, but you still cannot create files or directories, run a Pool Rebuild with the ReZID option. For information about renumbering storage object IDs in volumes, see Section 11.12, ReZIDing Volumes in an NSS Pool.
If the log reports errors with the pool’s metadata, the volumes affected remain in Maintenance mode. Decide whether to rebuild the pool based on the type of error and potential outcomes. For information about rebuilding the pool, see Section 11.11.6, Rebuilding NSS Pools to Repair Metadata Consistency.
Make sure to check the error log whenever an NSS volume does not come up in active mode after a verify or rebuild.
Messages are written to the following logs:
Table 11-2 Location of Log Files for the NSS Pool Verify and Pool Rebuild Utilities
| Platform | Log | Purpose |
|---|---|---|
| Linux | /var/opt/novell/log/nss/rav/filename.vbf This is the default location, but you can specify the location and the filename. | Log of the pool verify process using ravsui verify. If a volume has errors, the errors are displayed on the screen and written to this log file of errors and transactions. On Linux, use the RAVVIEW utility to read logs. For information, see Section A.14, RAVVIEW (Linux). |
| Linux | /var/opt/novell/log/nss/rav/filename.rtf | Log of the pool rebuild process using ravsui rebuild. This log contains information about data that has been lost during a rebuild by the pruning of leaves in the data structure. |
| NetWare | filename.vlf, located at the root of the server’s DOS drive | Log of the pool verify process using poolverify. If a volume in the pool has errors, the errors are displayed on the screen and written to this log file of errors and transactions. |
| NetWare | filename.rlf, located at the root of the server’s DOS drive | Log of the pool rebuild process using poolrebuild. This log contains information about data that has been lost during a rebuild by the pruning of leaves in the data structure. |
Whenever you verify or rebuild a pool, the new information is appended at the end of the log file. If you want to keep old log files intact, rename the log file or move it to another location before you start the verify or rebuild process.
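Because new output is appended to an existing log, one way to keep an earlier log intact is to rename it with a timestamp before starting the next run. The Python sketch below assumes a Linux server and a log path supplied by the caller; the function name `archive_log` and the timestamp suffix format are illustrative choices, not an NSS convention.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def archive_log(log_path: str) -> Optional[str]:
    """Rename an existing log file so later output is not appended to it.

    Returns the new path, or None if there was no log file to archive.
    """
    src = Path(log_path)
    if not src.is_file():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.{stamp}")
    counter = 1
    while dst.exists():  # avoid overwriting an archive made in the same second
        dst = src.with_name(f"{src.name}.{stamp}.{counter}")
        counter += 1
    shutil.move(str(src), str(dst))
    return str(dst)
```

For example, calling `archive_log` on a file under the default /var/opt/novell/log/nss/rav/ directory before a verify leaves the previous results in a separate, timestamped file.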
Warnings indicate that there are problems with the metadata, but that there is no threat of data corruption. Performing a data restore from a backup tape or rebuilding the pool’s metadata is optional. However, rebuilding a pool’s metadata typically results in some data loss.
Errors indicate that there are physical integrity problems with the pool’s metadata, and data corruption will definitely occur, or it will continue to occur, if you continue to use the pool as it is. Do one of the following:
If you decide to rebuild the pool, use the Pool Rebuild utility. For information, see Section 11.11.6, Rebuilding NSS Pools to Repair Metadata Consistency.
If the verify log does not report errors, but you continue to be unable to create files or directories on volumes in the pool, it might be because the files’ ID numbers have exceeded the maximum size of the file numbering field. You might need to rebuild the pool with the ReZID option. For information about how to decide if a ReZID is needed, see Section 11.12, ReZIDing Volumes in an NSS Pool.
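The guidance on interpreting the verify log can be summarized as a small lookup. The Python sketch below is only a reading aid: the severity labels "none", "warning", and "error" and the function `next_action` are assumptions made for this example, and the returned text paraphrases this section rather than any tool output.

```python
def next_action(severity: str, can_create_files: bool = True) -> str:
    """Map the worst finding in the verify log to the follow-up in this section.

    severity is "none", "warning", or "error" (labels chosen for this sketch).
    """
    if severity == "error":
        # Corruption will occur or continue if the pool is used as it is.
        return "Restore from a viable backup, or rebuild the pool with Pool Rebuild."
    if severity == "warning":
        # No threat of data corruption; a rebuild typically loses some data.
        return "Optional: restore from backup or rebuild the pool's metadata."
    if severity == "none":
        if not can_create_files:
            return "Consider a Pool Rebuild with the ReZID option."
        return "Activate the pool and mount the volumes."
    raise ValueError(f"unknown severity: {severity!r}")
```

For instance, `next_action("none", can_create_files=False)` points to the ReZID option, which corresponds to the case where the log is clean but files or directories still cannot be created.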
The purpose of a pool rebuild is to repair the metadata consistency of the file system. Rebuild uses the existing leaves of an object tree to rebuild all the other trees in the system to restore visibility of files and directories. It checks all blocks in the system. Afterwards, the NSS volume remains in maintenance mode if there are still errors in the data structure; otherwise, it reverts to the active state.
WARNING: Data will be lost during the rebuild.
A pool rebuild depends on many variables, so it is difficult to estimate how long it might take. The number of storage objects in a pool, such as volumes, directories, and files, is the primary consideration in determining the rebuild time, not the size of the pool. This is because a pool rebuild reconstructs the metadata for the pool, not its data. For example, it would take longer to rebuild the metadata for a 200 GB pool with many files than for a 1 TB pool with only a few files. Other key variables are the number of processors, the speed of the processors, and the size of the memory available in the server.
You do not need to bring down the server to rebuild a pool. NSS allows you to temporarily place an individual storage pool in maintenance mode while you verify or rebuild it. While the pool is deactivated, users do not have access to any of the volumes in that pool.
Depending on the nature of the reported errors, you might want to open a call with Novell Technical Support before you begin the rebuild process.
Before you rebuild, you must place the pool in maintenance mode.
At a terminal prompt, enter
nsscon
In nsscon, enter
nss /poolmaintenance=poolname
To rebuild a pool, enter the following at a terminal prompt:
ravsui rebuild poolname
For information about options to set the pruning parameters for the rebuild, see Section A.13, RAVSUI (Linux).
Rebuilding can take several minutes to several hours, depending on the number of storage objects in the pool.
Review the log on screen or in the filename.rtf file to learn what data has been lost during the rebuild.
For information, see Section 11.11.5, Reviewing Log Files for Errors.
Do one of the following:
No Errors: If errors do not exist at the end of the rebuild, the pool’s volumes are available for mounting.
Errors: If errors still exist, the pool remains in the maintenance state. Repeat the pool verify to determine the nature of the errors, then call Novell Technical Support for assistance.
Depending on the nature of the reported errors, you might want to open a call with Novell Technical Support before you begin the rebuild process.
Place the pool in maintenance mode. At a terminal prompt, enter
nss /poolmaintenance=poolname
To run Rebuild, enter the following command at the server console:
nss /poolrebuild=poolname
Replace poolname with the name of the pool you want to rebuild.
Rebuilding can take several minutes to several hours, depending on the number of objects in the pool.
Read the filename.rlf file at the root of the DOS drive on your server for information about data that has been lost.
For information, see Section 11.11.5, Reviewing Log Files for Errors.
Do one of the following:
No Errors: If errors do not exist at the end of the rebuild, the pool is activated automatically. It is safe to mount the volumes.
Errors: If errors still exist, the pool remains in the maintenance state. Repeat the pool verify to determine the nature of the errors, then call Novell Technical Support for assistance.