DATE: Thursday, July 26, 2007
TIME: 11:30 am - 12:30 pm
PLACE: Intel Research Pittsburgh, 4720 Forbes Avenue, CIC Building 4th Floor, Suite 410

Randal Bryant
Carnegie Mellon University

Data-Intensive Super Computing: Taking Google-Style Computing Beyond Web Search

Web search engines have become fixtures in our society, but few people realize that they are actually publicly accessible supercomputing systems, where a single query can unleash the power of several hundred processors operating on a data set of over 200 terabytes. With Internet search, computing has risen to entirely new levels of scale, especially in terms of the sizes of the data sets involved. Google and its competitors have created a new class of large-scale computer systems, which we label "Data-Intensive Super Computer" (DISC) systems. DISC systems differ from conventional supercomputers in their focus on data: they acquire and maintain continually changing data sets, in addition to performing large-scale computations over the data.

With the massive amounts of data arising from such diverse sources as telescope imagery, medical records, online transaction records, and web pages, DISC systems have the potential to achieve major advances in science, health care, business, and information access. DISC opens up many important research topics in system design, resource management, programming models, parallel algorithms, and applications. By engaging the academic research community in these issues, we can explore fundamental aspects of a societally important style of computing more systematically and in a more open forum. Recent papers on parallel programming by researchers at Google (OSDI '04) and Microsoft (EuroSys '07) present the results of using up to 1800 processors to perform computations accessing up to 10 terabytes of data. Operating at this scale requires fundamentally new approaches to scheduling, load balancing, and fault tolerance. The academic research community must start working at these scales to have an impact on the future of computing and to ensure the relevance of its educational programs.

Randal E. Bryant is Dean of the Carnegie Mellon University School of Computer Science.  He has been on the faculty at Carnegie Mellon since 1984, starting as an Assistant Professor and progressing to his current rank of University Professor of Computer Science.  He also holds a courtesy appointment in the Electrical and Computer Engineering Department.

Dr. Bryant teaches courses in computer systems.  Along with David R. O'Hallaron, he developed a novel approach to teaching about the hardware, networking, and system software that comprise a system from the perspective of an advanced programmer, rather than from that of the system designers.  Their textbook "Computer Systems: A Programmer's Perspective" is now in use at over 110 universities worldwide and has been translated into Chinese and Russian.

Dr. Bryant received his B.S. in Applied Mathematics from the University of Michigan in 1973, and his PhD from MIT in 1981.  He was on the faculty at Caltech from 1981 to 1984.

