16.3 ReZIDing Volumes in an NSS Pool

16.3.1 What Is a ZID?

When a file is created, it is assigned a unique file number, called a ZID. In NSS, a file ZID is a 64-bit number, which provides for up to 8 trillion (8E12) ZIDs, so NSS was designed not to reuse ZIDs.

A file's ZID is a piece of internal file system information. Under Linux, a file's inode number and its ZID are the same, so you can view a file's ZID with the ls --inode command. The highest ZID number in use for each volume is reported when you verify a pool.
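Because the inode number and the ZID coincide under Linux, any tool that reports inode numbers also reports ZIDs. The following is a minimal Python sketch that reads the inode number with the standard os.stat call. The helper name file_zid is hypothetical, and the temporary file only stands in for a file on an NSS volume.

```python
import os
import tempfile


def file_zid(path):
    """Return the inode number of path (the ZID, for a file on an NSS volume under Linux)."""
    return os.stat(path).st_ino


# Demonstration on a temporary file; on a real server, pass a path on an NSS volume.
with tempfile.NamedTemporaryFile() as tmp:
    zid = file_zid(tmp.name)
    print(f"{tmp.name}: inode/ZID = {zid} (0x{zid:x})")
```

Printing the value in hexadecimal as well makes it easier to compare against the thresholds quoted elsewhere in this section.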

16.3.2 Understanding ReZID

The ReZID option for a pool rebuild changes the ZIDs for all the files on the volume, thus freeing ZIDs so they are available for creating new files and directories. The reZID does not modify any other metadata on the volumes, nor does it modify any file’s content. The reZID is unrelated to any other rebuild activities that might occur.

IMPORTANT: The reZID step in a rebuild adds a third pass over the pool and can increase the rebuild time by about 50%.

16.3.3 When to ReZID

After you verify a pool, the log reports the highest ZID (highestZID parameter) for each volume in the pool. For NSS volumes, if the nextAllocatableZid is greater than 0xefffffff (the default threshold), the reZID occurs automatically when you rebuild the pool. You can optionally specify a different ZID threshold to trigger the reZID.

Beginning with the May 2017 patch:

  • 64-bit ZID support is enabled only when you force-enable it with the /ForceEnableZID64 command. To check whether the server is already enabled for 64-bit ZIDs, use the /DisplayZID64Status command.

  • If the nextAllocatableZid is greater than 0xffffffff, the reZID operation is skipped for that volume.

  • When a resource with a volume whose nextAllocatableZid is greater than 0xffffffff is migrated to a node that is not 64-bit ZID force enabled, you are still allowed to create new files and folders for that volume.
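The threshold rules above can be summarized in a short Python sketch. It only restates the documented comparisons and is not NSS code; the function name rezid_decision and the returned labels are hypothetical.

```python
DEFAULT_REZID_THRESHOLD = 0xEFFFFFFF  # default trigger value quoted in the text
MAX_32_BIT_ZID = 0xFFFFFFFF           # upper bound of the 32-bit ZID range


def rezid_decision(next_allocatable_zid, threshold=DEFAULT_REZID_THRESHOLD):
    """Classify a volume by the documented rules (illustrative only)."""
    if next_allocatable_zid > MAX_32_BIT_ZID:
        # May 2017 patch and later: the reZID is skipped for this volume.
        return "skipped"
    if next_allocatable_zid > threshold:
        # A pool rebuild reZIDs the volume automatically.
        return "rezid"
    return "none"


for zid in (0x10000000, 0xF0000000, 0x100000000):
    print(f"0x{zid:x}: {rezid_decision(zid)}")
```

Passing a lower threshold models the optional custom ZID limit mentioned above.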

If 64-bit ZID support is not enabled, be aware of the following:

No errors are reported while ZIDs are approaching the upper limit of 4 billion for a volume. Once the limit is reached, you might get errors when creating a file or directory that suggest a reZID needs to be done. For example:

  • NDS database is locked.

  • Server hangs at the end of load stage 1.

  • Cannot copy to a volume.

NSS API calls return Error 20108 zERR_ZID_GREATER_THAN_32_BITS, which means that the ZID numbering has reached the 4 billion (4E9) limit. NSS also sends a volume alert to the server console stating that a reZID needs to be done on the specified volume. The calling application gets only a generic error when it attempts and fails to create the file.

After you rebuild a pool with the ReZID option, the errors you were getting when creating files and directories no longer occur. You can also verify the pool again, then check the highest ZID number reported for each of the pool’s volumes to confirm that it is well under the 4 billion ZID limit.

If you do not place the pool in maintenance mode before rebuilding the pool with the ReZID option, you receive NSS Error 21726:

NSS error: PoolVerify results
   Status: 21726
     Name: zERR_RAV_STATE_MAINTENANCE_REQUIRED
   Source: nXML.cpp[1289]

16.3.4 Viewing the Highest ZID for a Volume

To view the highest ZID per volume:

  • After you verify a pool, look in the log to find the highest ZID value that has been assigned for each of the pool’s volumes. Use each value to decide whether you should reZID the pool as part of the rebuild process.

    OR

  • Open the file _admin\Manage_NSS\Volume\SYS\VolumeInfo.xml and search for nextAllocatableZid.
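The search for nextAllocatableZid can also be scripted. The sketch below is an assumption-laden illustration: it assumes the value appears as the text of a nextAllocatableZid element, which this section does not confirm, and the sample fragment is made up rather than captured from a server. The function name next_allocatable_zid is hypothetical.

```python
import re

# Assumed layout: the value is the text of a <nextAllocatableZid> element.
_ZID_PATTERN = re.compile(r"<nextAllocatableZid>\s*([^<\s]+)\s*</nextAllocatableZid>")


def next_allocatable_zid(xml_text):
    """Return the first nextAllocatableZid value in xml_text, or None if absent."""
    match = _ZID_PATTERN.search(xml_text)
    if match is None:
        return None
    value = match.group(1)
    # Accept either hexadecimal (0x...) or decimal notation.
    return int(value, 16) if value.lower().startswith("0x") else int(value, 10)


# Made-up fragment for illustration; not taken from a real VolumeInfo.xml.
sample = "<volumeInfo><nextAllocatableZid>3221225472</nextAllocatableZid></volumeInfo>"
print(hex(next_allocatable_zid(sample)))
```

To use it on a server, read the contents of the volume's VolumeInfo.xml file into a string and pass that string to the function.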

Be aware of the rate at which you consume ZIDs by creating and deleting files. If the nextAllocatableZid for a given volume reaches the upper limit of 0xffffffffffffffff, you cannot create new files on the volume.
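One way to keep track of ZID consumption is to estimate how long a volume can continue at its current rate before it crosses the reZID threshold. The following Python sketch assumes a steady rate, which real workloads rarely have; the function name days_until_threshold and the sample figures are hypothetical. Because ZIDs are not reused, deleting files does not lower the estimate.

```python
def days_until_threshold(next_zid, zids_per_day, threshold=0xEFFFFFFF):
    """Estimate the days left before next_zid exceeds threshold at a steady rate."""
    if zids_per_day <= 0:
        raise ValueError("zids_per_day must be positive")
    # A volume already at or past the threshold has no headroom left.
    remaining = max(threshold - next_zid, 0)
    return remaining / zids_per_day


# Hypothetical volume: next ZID at 0xC0000000, two million new ZIDs per day.
print(round(days_until_threshold(0xC0000000, 2_000_000)))
```

The daily rate can be approximated from two nextAllocatableZid readings taken a known number of days apart.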

16.3.5 ReZIDing Volumes

  1. Place the pool in maintenance mode.

    1. At a terminal prompt, enter

      nsscon
    2. In nsscon, enter

      nss /PoolMaintenance=poolname
  2. If you have not already verified the pool, enter the following at a command prompt:

    ravsui verify poolname

    For information, see Section B.7, ravsui.

  3. Review any errors on-screen or in the filename.vbf file, located where you specified.

    For information, see Section 16.2.4, Reviewing Log Files for Metadata Consistency Errors.

  4. Rebuild the pool by entering the following at a command prompt:

    ravsui --rezid=zid rebuild poolname

    Replace zid with the threshold value that triggers a reZID of a volume. The default value is 0xefffffff. For options that set the pruning parameters for the rebuild, see Section B.7, ravsui.

    For NSS, a rebuild automatically causes a reZID of a volume if the rebuild finds a ZID over the default value.

    The rebuild checks all blocks in the system. Rebuilding can take several minutes to several hours, depending on the number of objects in the pool. For all systems, reZID adds a third pass to the rebuild, which increases the time to rebuild a volume by about 50%.

  5. Review the log on-screen or in the filename.rtf file to learn what data has been lost during the rebuild.

    For information, see Section 16.2.4, Reviewing Log Files for Metadata Consistency Errors.

  6. Do one of the following:

    • Errors: If errors still exist, the pool remains in the maintenance state. Repeat the pool verify to determine the nature of the errors, then contact Novell Support for assistance.

    • No Errors: If errors do not exist, the pool’s volumes are mounted automatically.
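The command sequence in the procedure above can be generated for a given pool with a small Python sketch. It only prints the command lines shown in steps 1, 2, and 4 and does not run them. The function name rezid_commands and the pool name POOL1 are hypothetical, and the sketch assumes that --rezid accepts the hexadecimal notation used for the default value in this section.

```python
def rezid_commands(pool, threshold=0xEFFFFFFF):
    """Return the command lines from the procedure, in order, for the given pool."""
    if not pool or any(ch.isspace() for ch in pool):
        raise ValueError("pool name must be a non-empty string without whitespace")
    return [
        "nsscon",                                          # step 1, at a terminal prompt
        f"nss /PoolMaintenance={pool}",                    # step 1, entered inside nsscon
        f"ravsui verify {pool}",                           # step 2
        f"ravsui --rezid=0x{threshold:x} rebuild {pool}",  # step 4
    ]


for line in rezid_commands("POOL1"):
    print(line)
```

Reviewing the verify and rebuild logs (steps 3 and 5) remains a manual task, so the sketch deliberately stops short of executing anything.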