16.3 ReZIDing Volumes in an NSS Pool

16.3.1 What Is a ZID?

When a file is created, it is assigned a unique file number, called a ZID. NSS ZIDs are 64-bit numbers, and NSS provides for up to 8 trillion (8E12) ZIDs, so it was designed not to reuse ZIDs. However, NCP clients and other traditional applications can work only with 32-bit IDs, which support up to 4 billion (4E9) ZIDs. NSS therefore restricts ZIDs, and thus the number of files, to the lower value.
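As a quick check on the arithmetic, the 32-bit ceiling can be computed directly. This is a minimal sketch that assumes the largest usable 32-bit ZID is the largest unsigned 32-bit value (the text above says only "up to 4 billion"). The names MAX_32BIT_ZID and fits_in_32_bits are illustrative and are not part of any NSS API.

```python
# Assumption: the 32-bit ceiling is the largest unsigned 32-bit value.
MAX_32BIT_ZID = 2**32 - 1  # 4,294,967,295, roughly 4 billion (4E9)


def fits_in_32_bits(zid: int) -> bool:
    """Return True if a ZID can be represented as an unsigned 32-bit ID."""
    return 0 <= zid <= MAX_32BIT_ZID


print(MAX_32BIT_ZID)
print(fits_in_32_bits(3_999_999_999))  # still representable in 32 bits
print(fits_in_32_bits(2**32))          # one past the 32-bit range
```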

It is possible for a file system to reach the 32-bit limit on the ZID number. For example, if many files and directories are regularly created, exist for a short time, and are then deleted, every creation consumes a new ZID even though the number of files stays low, because ZIDs are not reused. Otherwise, a volume rarely reaches the upper limit.

The ZID of a file is internal file system information. Under Linux, a file's inode number and its ZID are the same, so you can view a file's ZID by using the command ls --inode. The highest ZID number in use for each volume is reported when you verify a pool.
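Because inode numbers and ZIDs are the same under Linux, the value that ls --inode prints can also be read programmatically. The following is a minimal sketch that uses Python's standard os.stat call. The helper name inode_number is illustrative, and the demonstration runs on a temporary file rather than on an NSS volume.

```python
import os
import tempfile


def inode_number(path: str) -> int:
    """Return the inode number of a file, as reported by the operating system."""
    return os.stat(path).st_ino


# Demonstrate on a temporary file. On an NSS volume under Linux, the text
# above states that this number is the same as the file's ZID.
with tempfile.NamedTemporaryFile() as tmp:
    print(tmp.name, inode_number(tmp.name))
```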

16.3.2 Understanding ReZID

The ReZID option for a pool rebuild changes the ZIDs for all the files on the volume, thus freeing ZIDs so they are available for creating new files and directories. The reZID does not modify any other metadata on the volumes, nor does it modify any file’s content. It is unrelated to any other rebuild activities that might occur.

IMPORTANT: The reZID step in a rebuild adds a third pass over the pool and can increase the time of a rebuild by about 50%.

16.3.3 When to ReZID

After you verify a pool, the log reports the highest ZID (highestZID parameter) for each of the pool’s volumes. If the highest ZID number is close to the 4 billion ZID limit (4E9), you should reZID the volume. For NSS volumes on Linux, if the highest ZID is 2 billion (2E9) or greater, the reZID occurs automatically when you rebuild a pool. You can optionally specify a different ZID limit to trigger the reZID, or use the /noReZID option to stop the reZID from occurring with that rebuild.
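The decision described above can be sketched as a small helper. The two constants come from the figures quoted in this section (4 billion as the limit, 2 billion as the automatic trigger on Linux). The function name rezid_advice and the wording of its return values are illustrative assumptions, not output produced by NSS.

```python
ZID_LIMIT = 4_000_000_000             # approximate 32-bit limit cited in the text
AUTO_REZID_THRESHOLD = 2_000_000_000  # automatic trigger on Linux cited in the text


def rezid_advice(highest_zid: int, threshold: int = AUTO_REZID_THRESHOLD) -> str:
    """Classify a volume's highest ZID against the limit and the trigger threshold."""
    if highest_zid >= ZID_LIMIT:
        return "limit reached: reZID needed before new files can be created"
    if highest_zid >= threshold:
        return "at or above threshold: a rebuild will reZID this volume"
    return "below threshold: no reZID needed"


# Hypothetical highestZID values as they might be read from a verify log.
for zid in (150_000_000, 2_500_000_000, 4_000_000_000):
    print(zid, "->", rezid_advice(zid))
```

Passing a different threshold mirrors the option to specify your own ZID limit for triggering the reZID.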

No errors are reported while ZIDs are nearing the upper limit of 4 billion for a volume. Instead, you might get errors when creating a file or directory that suggest a reZID needs to be done. For example:

  • NDS database is locked.

  • Server hangs at the end of load stage 1.

  • Cannot copy to a volume.

NSS API calls return Error 20108 zERR_ZID_GREATER_THAN_32_BITS, which means that the ZID numbering has exceeded the 4 billion (4E9) limit. NSS also sends a volume alert to the server console that reZID needs to be done on a specified volume. The calling application gets only a generic error when it attempts and fails to create the file.

After you rebuild a pool with the ReZID option, the errors you were getting when creating files and directories no longer occur. You can also verify the pool again, then check the highest ZID number reported for each of the pool’s volumes to confirm that it is well under the 4 billion ZID limit.

If you do not place the pool in maintenance mode before rebuilding the pool with the ReZID option, you receive NSS Error 21726:

NSS error: PoolVerify results
   Status: 21726
     Name: zERR_RAV_STATE_MAINTENANCE_REQUIRED
   Source: nXML.cpp[1289]
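
If you capture this output in a script, the status code can be extracted with a few lines of parsing. The sketch below assumes the output always uses the "Key: value" layout shown above. The function name parse_nss_error is illustrative and is not part of any NSS tool.

```python
def parse_nss_error(text: str) -> dict:
    """Collect the Status, Name, and Source fields from an NSS error report."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in {"Status", "Name", "Source"}:
            fields[key.strip()] = value.strip()
    return fields


# The sample text is the error report quoted in this section.
sample = """NSS error: PoolVerify results
   Status: 21726
     Name: zERR_RAV_STATE_MAINTENANCE_REQUIRED
   Source: nXML.cpp[1289]
"""

info = parse_nss_error(sample)
if info.get("Status") == "21726":
    print("Place the pool in maintenance mode, then retry the rebuild.")
```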

16.3.4 Viewing the Highest ZID for a Volume

To view the highest ZID per volume:

  • After verifying a pool, look in the log to find the highest ZID value that has been assigned for each of the pool’s volumes. Check each value to see whether you should reZID the pool as part of the rebuild process.

    OR

  • Open the file _admin\Manage_NSS\Volume\SYS\VolumeInfo.xml and search for nextAllocatableZid.
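
The second approach can be scripted. The following is a minimal sketch that searches an XML document for a nextAllocatableZid element with Python's standard xml.etree.ElementTree module. The sample XML layout is hypothetical (only the element name comes from this section), and the sketch assumes the value is written as a decimal number.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical sample: the real VolumeInfo.xml layout may differ.
SAMPLE_XML = """<volumeInfo>
  <basicInfo>
    <nextAllocatableZid>3500000123</nextAllocatableZid>
  </basicInfo>
</volumeInfo>"""


def next_allocatable_zid(xml_text: str) -> Optional[int]:
    """Return the first nextAllocatableZid value found, or None if absent."""
    root = ET.fromstring(xml_text)
    # iter() walks the root and all descendants, so the element's depth
    # in the document does not matter.
    for elem in root.iter("nextAllocatableZid"):
        text = (elem.text or "").strip()
        if text:
            return int(text)  # assumption: the value is decimal
    return None


print(next_allocatable_zid(SAMPLE_XML))
```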

You should be aware of the rate at which you are consuming ZIDs by creating and deleting files. If the highest ZID for a given volume reaches the limit of 4 billion (4E9), you cannot create new files on the volume until you reZID the pool.
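
To put a number on the consumption rate, you can estimate how long a volume has before it reaches the limit. This is a rough sketch that assumes a constant rate of ZID consumption and uses the 4 billion figure quoted above as the limit. The function name and the sample values are illustrative.

```python
ZID_LIMIT = 4_000_000_000  # approximate limit cited in the text


def days_until_limit(highest_zid: int, zids_per_day: float) -> float:
    """Estimate the days remaining before a volume reaches the ZID limit."""
    if zids_per_day <= 0:
        raise ValueError("zids_per_day must be positive")
    remaining = max(ZID_LIMIT - highest_zid, 0)
    return remaining / zids_per_day


# Hypothetical volume: highest ZID of 3 billion, 5 million new ZIDs per day.
print(days_until_limit(3_000_000_000, 5_000_000))  # 200.0
```

You can measure the rate by comparing the highest ZID reported at two different times.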

16.3.5 ReZIDing Volumes

  1. For a 32-bit machine, make sure you have enough space available in the Linux kernel cache memory to run a pool rebuild.

    When running ravsui(8) for a pool verify or a pool rebuild on Linux, the utility needs contiguous space in kernel memory separate from the space allocated to the core NSS process. The larger the pool, the larger the space that is needed. To make space available, you might need to reduce the space used by other processes. You can optionally reduce the minimum number of buffers reserved for the core NSS process to as little as 10,000 4-KB buffers.

    1. Open a terminal console as the root user.

    2. At the console prompt, enter

      nsscon
      
    3. In nsscon, enter

      nss MinBufferCacheSize=10000
      
  2. Place the pool in maintenance mode.

    1. At a terminal prompt, enter

      nsscon
      
    2. In nsscon, enter

      nss /PoolMaintenance=poolname
      
  3. If you have not already verified the pool, enter the following at a command prompt:

    ravsui verify poolname
    

    For information, see Section B.8, ravsui.

  4. Review any errors on-screen or in the filename.vbf file, located where you specified.

    For information, see Section 16.2.4, Reviewing Log Files for Metadata Consistency Errors.

  5. Rebuild the pool by entering the following at a command prompt:

    ravsui --rezid=zid rebuild poolname
    

    Replace zid with the threshold ZID value that triggers a reZID of a volume. The default value is 0xefffffff. For information about options that set the pruning parameters for the rebuild, see Section B.8, ravsui.

    For NSS on OES Linux, rebuild automatically causes a reZID of a volume if rebuild finds a ZID over 2 billion.

    The rebuild checks all blocks in the system. Rebuilding can take several minutes to several hours, depending on the number of objects in the pool. For all systems, reZID adds a third pass to the rebuild, which increases the time to rebuild a volume by about 50%.

  6. Review the log on-screen or in the filename.rtf file to learn what data has been lost during the rebuild.

    For information, see Section 16.2.4, Reviewing Log Files for Metadata Consistency Errors.

  7. Do one of the following:

    • Errors: If errors still exist, the pool remains in the maintenance state. Repeat the pool verify to determine the nature of the errors, then contact Novell Support for assistance.

    • No Errors: If errors do not exist, the pool’s volumes are mounted automatically.

  8. For a 32-bit machine, if you modified the MinBufferCacheSize setting in Step 1, you can change it back to its original setting.

    1. Open a terminal console as the root user.

    2. At the console prompt, enter

      nsscon
      
    3. In nsscon, enter

      nss MinBufferCacheSize=value
      

      Replace value with the desired minimum number of 4-KB buffers. The default value is 30000.
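
The core commands from the procedure above can be collected into a checklist for a given pool. The sketch below only builds the command strings shown in Steps 2, 3, and 5 and does not run them. The function name rezid_plan is illustrative, and passing the threshold in hexadecimal form is an assumption based on the default value being written as 0xefffffff.

```python
from typing import List, Tuple

DEFAULT_REZID_THRESHOLD = 0xEFFFFFFF  # default value quoted in Step 5


def rezid_plan(poolname: str,
               threshold: int = DEFAULT_REZID_THRESHOLD) -> List[Tuple[str, str]]:
    """Return (where, command) pairs for the maintenance, verify, and rebuild steps."""
    return [
        ("nsscon", f"nss /PoolMaintenance={poolname}"),
        ("shell", f"ravsui verify {poolname}"),
        # Assumption: ravsui accepts the threshold written in hexadecimal.
        ("shell", f"ravsui --rezid={hex(threshold)} rebuild {poolname}"),
    ]


# POOL1 is a placeholder pool name.
for where, command in rezid_plan("POOL1"):
    print(f"[{where}] {command}")
```

Review the verify and rebuild logs between steps as described above. The sketch does not replace those checks.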