Other PDL Research Areas
Contact: Greg Ganger
We propose that applications issue hints that disclose their future I/O accesses. These hints help applications exploit disk array parallelism and take full advantage of available network bandwidth, minimizing access latency.
For this project, we have implemented an aggressive prefetching strategy based on application access-pattern disclosure (hints) that allows high storage-system throughput to be converted to low application latency. We have also built a mechanism which allocates file buffers dynamically where they will have the best impact on application execution time. To do this, we estimate the impact of various allocations independently, weigh the costs against the benefits, and then allocate the buffers for the most benefit.
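The cost-benefit idea can be illustrated with a toy model. The sketch below is not the TIP implementation: the consumer names, the benefit figures, and the greedy loop are all assumptions made for illustration. Each consumer reports the estimated stall time saved by receiving one more buffer, and each buffer goes to whichever consumer currently reports the largest estimate. Giving a buffer to one consumer implicitly costs the benefit the others would have drawn from it.

```python
from typing import Callable, Dict

# Hypothetical marginal-benefit estimators: given the number of buffers a
# consumer already holds, return the estimated reduction in application
# stall time (in ms) from giving it one more buffer.
def prefetch_benefit(held: int) -> float:
    # Assumed: prefetching deeper than 4 buffers yields no further benefit.
    return 5.0 if held < 4 else 0.0

def lru_cache_benefit(held: int) -> float:
    # Assumed: diminishing returns as the cache grows.
    return 8.0 / (held + 1)

def allocate(total_buffers: int,
             estimators: Dict[str, Callable[[int], float]]) -> Dict[str, int]:
    """Greedily give each buffer to the consumer with the highest marginal benefit."""
    held = {name: 0 for name in estimators}
    for _ in range(total_buffers):
        best = max(estimators, key=lambda n: estimators[n](held[n]))
        if estimators[best](held[best]) <= 0.0:
            break  # no consumer benefits from another buffer
        held[best] += 1
    return held

allocation = allocate(8, {"prefetch": prefetch_benefit,
                          "lru_cache": lru_cache_benefit})
```

With these assumed estimators, the eight buffers end up split evenly: the prefetcher stops benefiting after four, and the cache absorbs the rest.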
Manually modifying applications to issue prefetching hints can require substantial programming and debugging effort. To address this concern, we have built a binary modification tool that transforms application binaries so that they generate prefetching hints automatically. In particular, the transformation causes the applications to discover their future data needs by performing speculative execution while they would ordinarily be stalled on I/O.
- For more details on hint generation, visit Automatic I/O Hint Generation through Speculative Execution
- Within the list of TIP Publications, see the conference paper "Automatic I/O Hint Generation through Speculative Execution".
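For a sense of what a manually inserted hint can look like, here is a minimal sketch that uses the standard POSIX advisory call as a stand-in. The TIP hint interface itself is not shown on this page, so `disclose_future_reads` is a hypothetical helper and `posix_fadvise` is an analogous mechanism, not the project's API.

```python
import os
import tempfile

def disclose_future_reads(fd: int, ranges) -> int:
    """Tell the OS which (offset, length) byte ranges will be read soon.

    Returns the number of hints issued. On platforms without
    os.posix_fadvise, no hints are issued and 0 is returned.
    """
    if not hasattr(os, "posix_fadvise"):
        return 0
    issued = 0
    for offset, length in ranges:
        # POSIX_FADV_WILLNEED is advisory: the kernel may start reading
        # the range asynchronously, or may ignore the hint.
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
        issued += 1
    return issued

# Usage: hint two blocks of a scratch file, then read the first one.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * 65536)
    f.flush()
    hints = disclose_future_reads(f.fileno(), [(0, 4096), (32768, 4096)])
    f.seek(0)
    first = f.read(4096)
```

Even in this small case the programmer must know the access pattern in advance and keep the hint calls in step with the reads, which is the effort that automatic hint generation aims to remove.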
We are employing broad strategies that expose parallelism in high levels of system and application software, allowing storage devices to be used more efficiently and application programs' actual throughput and latency needs to be better met.
The Scotch Parallel File System (SPFS) is an advanced multicomputer file system, developed concurrently with the first-generation Scotch High-Performance Storage Testbed (Scotch-1). SPFS has a number of interesting features, including:
- High scalability by striping over independent storage servers,
- Fault tolerance and availability selectable on a file-by-file basis,
- Disclosure-based prefetching, write-behind, and resource management, and
- Application-controlled cache consistency on read-write files.
- Scotch Parallel Storage System Publications
- A section of the paper "The Scotch Parallel Storage Systems" discusses the design of
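As a rough illustration of striping over independent storage servers, the sketch below maps a byte range of a file onto per-server pieces. The round-robin layout, the `stripe_map` name, and the parameter values are assumptions made for illustration; the actual SPFS layout is not described here.

```python
from typing import List, Tuple

def stripe_map(offset: int, length: int, num_servers: int,
               stripe_unit: int) -> List[Tuple[int, int, int]]:
    """Split a byte range into (server, server_offset, nbytes) pieces
    under round-robin striping."""
    pieces = []
    end = offset + length
    while offset < end:
        unit = offset // stripe_unit       # global stripe-unit index
        within = offset % stripe_unit      # position inside that unit
        server = unit % num_servers        # round-robin placement
        server_offset = (unit // num_servers) * stripe_unit + within
        nbytes = min(stripe_unit - within, end - offset)
        pieces.append((server, server_offset, nbytes))
        offset += nbytes
    return pieces

# A 128-byte read starting at offset 96, with four servers and 64-byte units.
pieces = stripe_map(offset=96, length=128, num_servers=4, stripe_unit=64)
# pieces == [(1, 32, 32), (2, 0, 64), (3, 0, 32)]
```

The request touches three different servers, so the three pieces can be fetched in parallel.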
We have built a new network parallel flow service that supports high-bandwidth data movement from parallel storage servers to parallel clients over switched networks (e.g., ATM, HiPPI, switched Ethernet, and switched FDDI). The two main components of the parallel flow service are: 1) a new network abstraction that supports parallel communication and bandwidth planning; and 2) a coordinated routing mechanism that dictates the flow of data, leading to better use of network resources and high application throughput.
Redundant disk arrays are emerging as an important architecture for high-performance, high-reliability, cost-effective secondary storage. The broad goal of this project is to advance the state of the art of redundant disk arrays. Each of the following projects deals with reducing the complexity of RAID designs.
Correctness verification of RAID implementations is difficult, and often more than 50 percent of the code is devoted to error handling. The overall goal of this project is to enable correctness verification, decrease design-cycle time, and ensure that our error-handling method is extensible by decoupling implementation from design. We use antecedence graphs to achieve these goals.
- The PDL RAID web pages.
- Our publications on error recovery.
- An SCS paper on Proving Correctness of a Controller Algorithm for the RAID Level 5 System.
To evaluate new RAID architectures and algorithms, we have developed an extensible RAID driver that runs as a simulator, a user-level software array controller, and a device driver in the kernel. This controller, RAIDframe, is based upon representing RAID read and write operations as directed acyclic graphs (DAGs) of primitive operations.
- RAIDframe code and documentation
- The PDL RAID web pages
- A port of the RAIDframe kernel driver to NetBSD.
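To make the DAG representation concrete, the following sketch expresses a RAID level 5 small write as a graph of primitive operations and runs them in dependency order. It is not RAIDframe code: the node names, the in-memory "disks", and the use of Python's `graphlib` are all assumptions chosen for illustration.

```python
from graphlib import TopologicalSorter

def small_write(disks, data_disk, parity_disk, new_data):
    """Run a RAID level 5 small write expressed as a DAG of primitives."""
    state = {}

    def read_old_data():
        state["old_data"] = disks[data_disk]

    def read_old_parity():
        state["old_parity"] = disks[parity_disk]

    def xor():
        # new parity = old parity XOR old data XOR new data
        state["new_parity"] = bytes(
            p ^ o ^ n for p, o, n in
            zip(state["old_parity"], state["old_data"], new_data))

    def write_new_data():
        disks[data_disk] = new_data

    def write_new_parity():
        disks[parity_disk] = state["new_parity"]

    # Each node maps to the set of nodes that must complete before it.
    dag = {
        read_old_data: set(),
        read_old_parity: set(),
        xor: {read_old_data, read_old_parity},
        write_new_data: {read_old_data},
        write_new_parity: {xor},
    }
    order = list(TopologicalSorter(dag).static_order())
    for node in order:
        node()
    return [node.__name__ for node in order]

# Two data units and their parity unit (0x01 ^ 0x10 = 0x11, 0x02 ^ 0x20 = 0x22).
disks = [b"\x01\x02", b"\x10\x20", b"\x11\x22"]
order = small_write(disks, data_disk=0, parity_disk=2, new_data=b"\x0f\x0f")
```

Because the graph records only the dependencies, any execution order that respects them leaves the parity consistent with the data, and a different RAID operation can be described by supplying a different graph over the same primitives.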
This project, now finished, resulted in a RAID organization for improving performance during the recovery of a failed disk.
- Mark Holland's thesis and other papers on declustering research.
This project, now finished, resulted in a RAID organization for improving throughput for workloads that emphasize small, random writes.
There is great demand for small disks that combine low maintenance with high storage capacity. Our goal is to replace the single disk in portable systems with an array of four to six 1" disks that would provide both the needed capacity and lower power use. Adding a redundant disk will also increase the reliability and availability of storage on mobile computers.
- Rachad Youssef's Master's thesis "RAID for Mobile Computers"
The first Scotch testbed, Scotch-1, no longer in use, was primarily used for the early Transparent Informed Prefetching research. Scotch-1 was composed of a 25-MHz DECstation 5000/200 with a TURBOchannel system bus (100 MB/s) running the Mach 3.0 operating system. It was equipped with two SCSI buses and four 300 MB IBM 0661 "Lightning" drives.
The second Scotch testbed, Scotch-2, was a larger and faster version of Scotch-1 used for the RAID architecture and implementation research in the Error Recovery and RAIDframe projects and for second-generation TIP experiments. Scotch-2 was composed of a 150-MHz DEC 3000/500 (Alpha) workstation running the OSF/1 operating system and equipped with six fast SCSI bus controllers. Each bus had five HP 2247 drives, giving the total system a capacity of 30 GB.
The third testbed, Scotch-3, was the storage component in a heterogeneous multicomputer composed of 38 workstations, 30 DEC 3000 (Alpha) and 8 IBM RS6000 (Power PC), distributed over switched-HIPPI and OC3 ATM networks. This multicomputer was used for parallel application, parallel programming tool, and multicomputer operating system experiments in addition to TIP research. Scotch-3 was composed of ten DEC 3000 (Alpha) workstations with TURBOchannel system buses. Each workstation contained one fast, wide, differential SCSI adapter connected to both controllers of an AT&T (NCR) 6299 disk array. All workstations were interconnected by OC3 (155 Mbit/s) links to a FORE ASX-200 ATM switch complex, and five of the workstations were also connected by HIPPI (800 Mbit/s) links to an NSC PS-32 HIPPI switch complex. All storage was available to any node through the Scotch parallel file system and the appropriate routing.
- An HTML version of "The Scotch Parallel Storage Systems"
We thank the members and companies of the PDL Consortium: Broadcom, Ltd., Citadel, Dell EMC, Facebook, Google, Hewlett-Packard Labs, Hitachi Ltd., Intel Corporation, Microsoft Research, MongoDB, NetApp, Inc., Oracle Corporation, Samsung Information Systems America, Seagate Technology, Tintri, Two Sigma, Uber, Veritas and Western Digital for their interest, insights, feedback, and support.