NASD Extensions
With asynchronous filesystem oversight and direct communication between clients and storage, NASD drives are uniquely qualified to offer extensions that client applications can use both with and without file manager involvement. The primary reason for this direct communication is to avoid the latency of store-and-forward copying through the file manager, but it also allows storage to export specialized functionality, such as read-modify-write and informed prefetching, to clients with minimal file manager involvement. Since NASD filesystems are likely to evolve more slowly than specialized niche extensions, this minimal involvement is important: it allows storage implementors and application writers to collaborate for better performance without waiting for filesystem support.
For application writers to take direct advantage of extensions, the filesystem must still provide two services. Clients must be able to name pieces of files in the NASD namespace (that is, to translate names and offsets in their own namespace to drive IDs, object IDs, and offsets), and they must be able to obtain NASD capabilities for those pieces.
Extensions in our NASD prototype are identified by their numeric RPC operation code. The interface provides a well-known mapping from an operation code to an area of the Drive Control object that describes the extension. Drives also enumerate the extensions they support at a well-known location in the Drive Control object. To solve the name resolution and capability problems, we propose an interface similar to SCSI pass-through. The application specifies an extension type, a list of files and offsets, and extension-specific data. The filesystem code on the client resolves the names and offsets to NASD drives, partitions, objects, and offsets, and then forwards the specified RPC to the drives with the extension-specific data as arguments.
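The pass-through flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the prototype's interface: the names `FileRange`, `NasdTarget`, `ExtensionRequest`, `ToyDrive`, `pass_through`, the opcode value, the file paths, and the capability bytes are all hypothetical, and the drives are in-memory stand-ins rather than RPC endpoints.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class FileRange:
    """A piece of a file, named in the client's own namespace."""
    path: str
    offset: int


@dataclass(frozen=True)
class NasdTarget:
    """The same piece named in the NASD namespace, plus a capability."""
    drive_id: int
    partition: int
    object_id: int
    offset: int
    capability: bytes  # opaque placeholder; real capabilities are not modeled here


@dataclass(frozen=True)
class ExtensionRequest:
    opcode: int                    # numeric RPC operation code of the extension
    ranges: Tuple[FileRange, ...]  # list of files and offsets
    ext_data: bytes                # extension-specific data, forwarded unchanged


class ToyDrive:
    """Stand-in for a drive: records the extension RPCs it receives."""

    def __init__(self, supported_opcodes: Set[int]) -> None:
        # Models the drive enumerating the extensions it supports.
        self.supported_opcodes = set(supported_opcodes)
        self.received: List[Tuple[int, List[NasdTarget], bytes]] = []

    def rpc(self, opcode: int, targets: List[NasdTarget], ext_data: bytes) -> int:
        self.received.append((opcode, targets, ext_data))
        return len(targets)  # toy reply: number of targets accepted


def pass_through(
    req: ExtensionRequest,
    resolve: Callable[[FileRange], NasdTarget],
    drives: Dict[int, ToyDrive],
) -> Dict[int, int]:
    """Resolve each file range to a NASD target, then forward one RPC per drive."""
    per_drive: Dict[int, List[NasdTarget]] = {}
    for file_range in req.ranges:
        target = resolve(file_range)
        per_drive.setdefault(target.drive_id, []).append(target)

    replies: Dict[int, int] = {}
    for drive_id, targets in per_drive.items():
        drive = drives[drive_id]
        if req.opcode not in drive.supported_opcodes:
            raise ValueError(f"drive {drive_id} does not support opcode {req.opcode}")
        replies[drive_id] = drive.rpc(req.opcode, targets, req.ext_data)
    return replies


# Usage with a hypothetical layout: path -> (drive ID, partition, object ID).
PREFETCH_HINT = 0x40  # hypothetical opcode, chosen only for this example
layout = {"/vol/cube.dat": (1, 0, 17), "/vol/index.dat": (2, 0, 5)}


def toy_resolve(file_range: FileRange) -> NasdTarget:
    drive_id, partition, object_id = layout[file_range.path]
    return NasdTarget(drive_id, partition, object_id, file_range.offset, b"toy-capability")


drives = {1: ToyDrive({PREFETCH_HINT}), 2: ToyDrive({PREFETCH_HINT})}
request = ExtensionRequest(
    PREFETCH_HINT,
    (FileRange("/vol/cube.dat", 0), FileRange("/vol/cube.dat", 65536), FileRange("/vol/index.dat", 0)),
    b"hint",
)
replies = pass_through(request, toy_resolve, drives)
```

The application in this sketch never sees drive or object IDs: it supplies names, offsets, and opaque extension data, and the client-side filesystem code does the translation. That separation is what would let an extension be deployed without changing the filesystem itself.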
As a platform for NASD extension experiments, we have built a preliminary implementation of remote informed prefetching on NASD. To evaluate its performance, we used a hinting version of the XDataSlice 3D scientific visualization application to render 25 random planar slices through a 256 x 256 x 256 cube of 32-bit data values. We used a DEC 3000/400 (133 MHz, 64 MB, Digital UNIX 3.2g-3) as the client, a DEC 3000/600 (175 MHz, 64 MB, Digital UNIX 3.2g-3) as the drive, and an AlphaStation 600 5/266 (266 MHz, 64 MB, Digital UNIX 3.2g-3) as the file manager. The drive machine striped its data over three local 1.0 GB HPC2247 disks. Without prefetching, slice rendering took an average of 120.2 seconds. With prefetching and read notifications, slice rendering took an average of 68.1 seconds, a speedup of 1.76. We believe that benefits like this will motivate users of specific applications to purchase appropriate disks, provided the specialized disks' advantages can be achieved without changing the community's distributed filesystem.
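The figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative. It shows that the cube occupies 64 MiB and that the ratio of the two averages is about 1.765, which agrees with the reported 1.76 when truncated to two decimals.

```python
# Size of the data set: a 256 x 256 x 256 cube of 32-bit (4-byte) values.
values_per_side = 256
bytes_per_value = 4
dataset_bytes = values_per_side ** 3 * bytes_per_value
dataset_mib = dataset_bytes / 2 ** 20

# Average slice rendering times reported in the text, in seconds.
t_no_prefetch = 120.2
t_prefetch = 68.1

speedup = t_no_prefetch / t_prefetch            # ratio of the two averages
time_saved_fraction = 1 - t_prefetch / t_no_prefetch

print(f"data set: {dataset_bytes} bytes ({dataset_mib:.0f} MiB)")
print(f"speedup: {speedup:.3f}")
print(f"rendering time reduced by {time_saved_fraction:.1%}")
```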