A backup accesses files across the file system in a predictable order. TSAFS has been redesigned to take advantage of this property: it incorporates read-ahead caching while maintaining backward compatibility with the current serial model of usage.
The TSA library model has been modified to decouple the serial usage of the interface from the file system access. In this model, the TSA takes advantage of the predictable nature of requests and caches data ahead of time, so that engine requests can be serviced from memory instead of from disk. The TSA achieves this using a multithreaded model.
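The read-ahead idea described above can be illustrated with a minimal sketch. This is not the TSAFS implementation: the function read_ahead, its parameters, and the fetch callback are hypothetical names chosen for illustration. A background thread fetches items in their predictable order and places the results in a bounded cache, while the caller continues to consume them serially.

```python
import queue
import threading


def read_ahead(items, fetch, depth=4):
    """Yield fetch(item) for each item, in order, while a background
    thread fetches ahead of the consumer.

    Hypothetical helper for illustration only. At most `depth` fetched
    results are queued in the cache at any time.
    """
    cache = queue.Queue(maxsize=depth)

    def prefetch():
        # Runs ahead of the consumer; put() blocks when the cache is full.
        try:
            for item in items:
                cache.put(("ok", fetch(item)))
            cache.put(("end", None))
        except Exception as exc:
            # Hand the failure to the consumer instead of losing it.
            cache.put(("error", exc))

    threading.Thread(target=prefetch, daemon=True).start()

    # The consumer side stays serial: one request, one answer, in order.
    while True:
        kind, value = cache.get()
        if kind == "end":
            return
        if kind == "error":
            raise value
        yield value


if __name__ == "__main__":
    for data in read_ahead(["a.txt", "b.txt"], lambda name: name.upper()):
        print(data)
```

The bounded queue is the design choice that matters here: it lets the producer run ahead of the consumer without allowing the cache to grow without limit.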
The four primary tasks that constitute a backup operation and use a co-operative pre-fetching mechanism are:
Scan: The Scan request (NWSMTSScanDataSetBegin) defines the scope of the backup operation. The Scan thread uses this as a hint to predict and build the metadata cache and the list of data sets to be opened. Further scan requests (NWSMTSScanNextDataSet) are serviced from this cache.
Open: The Open thread works through the data set list built by the Scan thread and opens data sets in parallel with the other tasks. Open requests (NWSMTSOpenDataSetForBackup) are serviced from the resulting cache of opened data sets.
Read: The Read task is implemented by multiple threads that issue simultaneous read requests to the file system and build the data cache. This results in multiple pending I/O requests at the disk hardware, which enables the disk subsystem to minimize seek and rotational latencies. The read requests also build up a cache of data blocks ahead of engine requests (NWSMTSReadDataSet), thereby reducing the service time to the engine.
Close: The Close thread implements a lazy close mechanism wherein data sets are closed asynchronously with respect to the engine close requests (NWSMTSCloseDataSet).
In the TSAFS model, each of these tasks is executed in parallel.
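The cooperation between the four tasks can be sketched as a small pipeline. The following is an illustration under stated assumptions, not the TSAFS implementation and not the SMS API: run_backup, its parameters, and the dictionary standing in for the file system are all hypothetical. One thread scans, one opens data sets and schedules reads on a pool of reader threads, and one closes data sets lazily, while the engine loop remains serial.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

END = object()  # sentinel passed down the pipeline


def run_backup(file_system, depth=4, readers=3):
    """Hypothetical sketch. `file_system` maps a name to its data.

    Returns (results, closed): the (name, data) pairs in scan order and
    the names in the order the Close thread closed them.
    """
    scanned = queue.Queue(maxsize=depth)  # Scan -> Open
    ready = queue.Queue(maxsize=depth)    # Open/Read -> engine
    to_close = queue.Queue()              # engine -> Close
    closed = []

    def scan():
        # Builds the list of data sets ahead of the engine.
        for name in sorted(file_system):
            scanned.put(name)
        scanned.put(END)

    def open_and_schedule(pool):
        while (name := scanned.get()) is not END:
            handle = {"name": name, "open": True}  # stand-in for an open data set
            # The read itself runs on one of the pool's reader threads.
            future = pool.submit(file_system.__getitem__, name)
            ready.put((handle, future))
        ready.put(END)

    def lazy_close():
        while (handle := to_close.get()) is not END:
            handle["open"] = False
            closed.append(handle["name"])

    results = []
    with ThreadPoolExecutor(max_workers=readers) as pool:
        workers = [
            threading.Thread(target=scan),
            threading.Thread(target=open_and_schedule, args=(pool,)),
            threading.Thread(target=lazy_close),
        ]
        for worker in workers:
            worker.start()
        # The engine consumes data sets strictly in scan order.
        while (entry := ready.get()) is not END:
            handle, future = entry
            results.append((handle["name"], future.result()))
            to_close.put(handle)  # the engine does not wait for the close
        to_close.put(END)
        for worker in workers:
            worker.join()
    return results, closed
```

In this sketch the bounded queues limit how far each stage may run ahead of the next, and several reads can be pending at once even though the engine sees one data set at a time.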
From this new model, it may be evident that backup engines that use the API appropriately can exploit the performance benefits delivered by TSAFS.