Date: Mon, 25 Jan 93 09:34:21 EST
To: margie@GAUSS.ECE.CMU.EDU
Subject: Executive summary section on storage and computer systems integration

The main objective of the thrust on storage and computer systems integration is to explore the efficient integration of storage systems into high-performance computing environments. This thrust area was created to foster interdisciplinary work with computer science. It has been transformed over its first three years by staff changes and by the learning implicit in any new interdisciplinary effort. We believe its current shape provides the framework for innovative and productive research in advanced architectures for storage systems.

During our first three years we have constructed a distributed file system tracing package for Mach/UNIX and have collected about 140 GB of usage data from about 20 Mach workstations in the School of Computer Science. This flexible tracing system has a negligible impact on system performance. These traces have been used to resolve portable computer storage design questions, to capture magnetic disk performance for geometry diagnosis tools, and as a model for a synthetic reference generation tool. The tracing tools are also used for understanding and debugging our computer systems.

We have also worked alongside the developing Nectar gigabit local area network. By examining the performance of a file server on existing Ethernet and on prototype Nectar, we determined system bottlenecks in the server's architecture and software structure. This knowledge is being used in our network server interface designs. We are constructing a high-performance storage system based on disk arrays for the soon-to-be-completed Nectar network hardware.

Disk arrays have become a major focus of this thrust area. Disk arrays are the essential mechanism for delivering storage system throughput for tomorrow's high-performance computing.
In addition to developing a disk array platform for Nectar network experimentation, we have begun to explore other aspects of disk arrays. We have developed an implementation scheme for trading off cost and reliability in highly available disk arrays. This scheme is based on balanced incomplete block designs and has been used to demonstrate that proposed optimizing algorithms are not uniformly effective. We have also developed efficient mechanisms for reconstructing a failed disk in a redundant disk array. An important effort on the structure of disk arrays based on small-diameter (less than one inch) magnetic disks is getting under way.

The project that has undergone the most change in the last three years is our effort to develop a new model for the interaction between storage and computer systems. We began by examining advanced controller designs, progressed by recognizing the limitations of such controllers in the absence of substantial amounts of work, and arrived at a model of interaction that exploits knowledge of pending work in applications and other higher levels of the system. This "Transparent Informed Prefetching" approach can convert the high throughput of disk arrays to low latency for applications while preserving the sound software engineering principles of modularity and layering. We have done some preliminary user-level experimentation that shows the necessity of an operating-system-integrated implementation and an attached disk array. We are currently developing such a system.
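
As a rough illustration of how knowledge of pending work can turn array throughput into lower latency, the toy model below compares a run of reads issued one at a time with the same reads disclosed in advance and issued to all disks at once. It is only a sketch under stated assumptions, not the system under development: the function names, the round-robin block placement, and the fixed per-block service time are all hypothetical.

```python
from collections import Counter


def demand_read_time(blocks, service_time=1.0):
    """No hints: each read is issued only after the previous one completes,
    so only one disk is busy at a time (assumed model)."""
    return len(blocks) * service_time


def hinted_read_time(blocks, num_disks, service_time=1.0):
    """Hints: all future reads are disclosed up front and queued immediately.
    Disks work in parallel, so elapsed time is set by the longest queue."""
    # Assumed placement: block b lives on disk b % num_disks (round-robin).
    per_disk = Counter(block % num_disks for block in blocks)
    return max(per_disk.values(), default=0) * service_time


# Hypothetical access pattern: 22 non-sequential blocks.
blocks = list(range(0, 64, 3))

print("demand reads:", demand_read_time(blocks))
print("hinted reads:", hinted_read_time(blocks, num_disks=8))
```

With these assumptions, the 22 reads take 22.0 time units when issued on demand and 3.0 time units when hinted across 8 disks. The model ignores seek time, caching, and queueing effects, so it shows only the direction of the benefit, not its size.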