PDL Abstract

A Statistical Study for File System Meta Data On High Performance Computing Sites

M.S. Thesis, Information Networking Institute, Carnegie Mellon University. May 2012.

Yifan Wang

Information Networking Institute
Carnegie Mellon University
Pittsburgh, PA 15213

*Intel Labs


High-performance parallel file systems are critical to the performance of supercomputers. They are specialized to provide different computing services, their hardware and software are changing rapidly, and their unusual access patterns have drawn great research interest. Yet little is known about how these file systems evolve and how the ways people use them change, even though significant effort and money have been put into upgrading storage devices and designing new file systems. In this paper, we report statistics on supercomputing file systems from the Parallel Data Lab (PDL) and Los Alamos National Lab (LANL), and compare the current data against statistics from the same sites 4 years earlier to discover changes in technology and usage patterns and to observe interesting new characteristics.