Thursday, May 1, 2008
2:00 pm - 4:00 pm NOTE SPECIAL TIME
PLACE: CIC 2101 NOTE SPECIAL LOCATION
Randal Bryant, CMU
Data-Intensive Scalable Computing
Search engine companies have devised a class of systems for supporting web search, providing interactive response to queries over hundreds of terabytes of data. These "Data-Intensive Scalable Computer" (DISC) systems differ from more traditional high-performance systems in their focus on data: they acquire and maintain continually changing data sets, in addition to performing large-scale computations over the data. With the massive amounts of data arising from such diverse sources as telescope imagery, medical records, online transaction records, and web pages, DISC systems have the potential to achieve major advances in science, health care, business, and information access. DISC opens up many important research topics in system design, resource management, programming models, parallel algorithms, and applications. By engaging the academic research community in these issues, we can more systematically and in a more open forum explore fundamental aspects of a societally important style of computing.
Randal E. Bryant is Dean of the Carnegie Mellon University School of
Computer Science. He has been on the faculty at Carnegie Mellon since
1984, starting as an Assistant Professor and progressing to his
current rank of University Professor of Computer Science. He also
holds a courtesy appointment in the Electrical and Computer
Engineering Department.
Dr. Bryant's research focuses on methods for formally verifying
digital hardware, and more recently some forms of software. His 1986
paper on symbolic Boolean manipulation using Ordered Binary Decision
Diagrams (BDDs) has the highest citation count of any publication in
the Citeseer database of computer science literature. In addition, he
has developed several techniques to verify circuits by symbolic
simulation, with levels of abstraction ranging from transistors to
very high-level representations.
Dr. Bryant has received widespread recognition for his work. He is a
fellow of the IEEE and the ACM, as well as a member of the National
Academy of Engineering. His awards include the 2007 IEEE Piore Award,
the 1997 ACM Kanellakis Theory and Practice Award (shared with Edmund
M. Clarke, Ken McMillan, and Allen Emerson) for contributing to the
development of symbolic model checking, as well as the 1989 IEEE
W.R.G. Baker Prize for the best paper appearing in any IEEE
publication during the preceding year.
Dr. Bryant teaches courses in computer systems. Along with David
R. O'Hallaron, he developed a novel approach to teaching about the
hardware, networking, and system software that comprise a system from
the perspective of an advanced programmer, rather than from that of
the system designers. Their textbook "Computer Systems: A
Programmer's Perspective" is now in use at over 110 universities worldwide
and has been translated into Chinese and Russian.
Dr. Bryant received his B.S. in Applied Mathematics from the
University of Michigan in 1973, and his PhD from MIT in 1981. He was
on the faculty at Caltech from 1981 to 1984.
Steven Schlosser, Intel Research Pittsburgh
Building Ground Models of S. California
Earthquake simulation not only generates a great deal of data, but also requires a great deal of input data describing the physical ground characteristics of the region to be simulated. In the past, these characteristics were generated using discrete programs (often written in creaky Fortran) developed over decades by teams of seismologists. These programs are expensive to run, query, and cross-validate. Recently the seismology community has moved to a new model in which pre-built physical datasets efficiently stored in etrees are shared between scientists, improving repeatability, enabling cross-validation, and increasing query performance. This talk will be about our experiences using Hadoop as a means to generate these datasets, and the interesting things we learned along the way.
Joint work with Mike Ryan, Ricardo Taborda, Julio Lopez, Dave O'Hallaron, Jacobo Bielak.
Steve Schlosser is a research scientist at Intel Research Pittsburgh. His research focuses on increasing the performance and manageability of modern data storage systems. Most recently, Steve has been involved in a variety of projects relating to what he calls Big Data: an effort to develop new techniques for building, programming, managing, and using large-scale computing clusters, making them more widely available to a range of scientific disciplines.
For more information, visit http://www.pdl.cmu.edu/SDI/