NASD/AFS Prototype
AFS also requires enforcement of a per-volume quota on allocated disk space. This is more difficult in NASD because quotas are logically managed by the file manager on each write, but the file manager is not accessed on each write. However, because NASD capabilities carry a byte-range restriction, the file manager can create a write capability that escrows space for the file to grow by selecting a byte range larger than the current object. After the capability has been relinquished to the file manager (or has expired), the file manager can examine the object to determine its new size and update the quota data structures appropriately.

We have implemented a client and server for NASD/AFS on Digital UNIX and AFS 3.4a. NASD/AFS stores each AFS file in a single NASD object, with the AFS FID (file identifier) constructed from the NASD identifier and the identity of the NASD drive on which the corresponding object is stored. Some files are stored locally on a file manager's disk: specifically, the root object for each AFS volume and the volume index files. NASD/AFS objects use their fs-specific NASD attribute to hold the AFS VnodeDiskData structure, which includes such information as the file's owner, group, UNIX mode bits, and the AFS notion of the modification time.

This implementation also entailed a few simple modifications to the Transarc AFS code. File and directory creation operations were modified to return a capability for accessing the newly created object, as well as the object FID. Three new RPCs were added: GetCapability, GetWCapability, and ReturnCapability. GetCapability is called by a client to obtain a read and getattr capability whose lifetime is equal to the unexpired time of the client's AFS token. GetCapability also registers a callback. GetWCapability returns a limited-duration capability that allows the client a reasonably small window of opportunity to write an object.
ReturnCapability indicates that a client is done writing an object, that the write capability returned may be revoked, and that outstanding callbacks on the object may be broken.

As Table 1 shows, times for the various components of the Andrew benchmark differ by roughly 10% between SAD and NASD/AFS. Because these tests involve no parallelism and no contention, and because we deliberately make no attempt to tune our NASD/AFS implementation more than our SAD implementation, we do not expect NASD/AFS to run any faster. Moreover, because our experimental setup consists of regular workstations with SCSI disks representing NASD drives, the RPC communication costs of all operations mediated by the file manager suffer extra overhead.

Our raw bandwidth benchmark reads a large file sequentially in 512 KB chunks. Our results demonstrate that, as the number of client/disk pairs increases, the NASD configuration scales the aggregate transfer bandwidth linearly, while the SAD configuration is limited by the throughput of the AFS server. In this figure, NASD/AFS bandwidth is about 14% lower than SAD bandwidth with one and two client-drive pairs. This results from the use of a different RPC package in SAD (RX) than in NASD/AFS (DCE RPC).

While the raw read benchmark demonstrates the improved scalability of NASD over SAD for data transfer, it does not reflect performance in a real workload. For this we use the agrep and gnuld benchmarks. Our agrep test reads many small files sequentially and must perform directory operations to traverse a multilevel tree to locate each file. Our results show that, for this workload, NASD/AFS also offers 20% lower runtime relative to SAD at four client-drive pairs.

The gnuld benchmark offers a richer set of activity than either of these benchmarks. To complete its task, it must read a large number of object files of varying size in a nonsequential manner and write a single large output file.
The amount of computation required is considerable, so the workload is not a continuous stream of I/Os but bursts of filesystem activity interleaved with periods of computation. Our results demonstrate that NASD scales more effectively than SAD on this workload, running in 30% less time in the four client-drive configuration.
Table 1: Comparison of NASD/NFS and NASD/AFS performance for the Andrew benchmark against NFSv3 and AFS 3.4a on Digital UNIX. All results are averages of five runs with standard deviations in parentheses.
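
The quota escrow scheme described at the start of this section can be illustrated with a minimal sketch. It is not the NASD/AFS implementation: the class names, fields, and method signatures below are hypothetical, the drive's enforcement of the byte range is only assumed, and the sketch assumes at most one outstanding write capability per object. It shows only the bookkeeping idea: the file manager charges the quota for the largest permitted growth when it issues the write capability, then settles against the object's actual size once the capability is returned or has expired.

```python
from dataclasses import dataclass, field


@dataclass
class WriteCapability:
    """Hypothetical stand-in for a NASD write capability with a byte-range limit."""
    object_id: int
    max_offset: int   # highest byte offset the drive is assumed to allow
    escrowed: int     # bytes reserved against the volume quota in advance


@dataclass
class VolumeQuota:
    """Hypothetical per-volume quota record kept by the file manager."""
    limit: int                                   # per-volume quota in bytes
    used: int = 0                                # bytes accounted to existing objects
    escrowed: int = 0                            # bytes reserved by outstanding capabilities
    sizes: dict = field(default_factory=dict)    # object_id -> last known size

    def get_w_capability(self, object_id, growth):
        """Issue a write capability that escrows `growth` bytes past the current size."""
        if self.used + self.escrowed + growth > self.limit:
            raise PermissionError("volume quota would be exceeded")
        self.escrowed += growth
        current = self.sizes.get(object_id, 0)
        return WriteCapability(object_id, current + growth, growth)

    def return_capability(self, cap, observed_size):
        """Settle the escrow after the capability is returned or has expired.

        `observed_size` is the object size the file manager reads back
        from the drive when it examines the object.
        """
        if observed_size > cap.max_offset:
            raise ValueError("object grew past the capability's byte range")
        previous = self.sizes.get(cap.object_id, 0)
        self.escrowed -= cap.escrowed              # release the reservation
        self.used += observed_size - previous      # charge only the actual growth
        self.sizes[cap.object_id] = observed_size


quota = VolumeQuota(limit=1000)
cap = quota.get_w_capability(object_id=7, growth=400)   # reserves 400 bytes
quota.return_capability(cap, observed_size=250)         # client wrote only 250 bytes
print(quota.used, quota.escrowed)
```

In this sketch the unused part of the reservation returns to the volume as soon as the capability is settled, so the size of the escrowed range trades off how often a client must contact the file manager against how much quota is tied up in the meantime.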