Thursday, June 26, 2008
12:00 noon - 1:00 pm
Intel Research Pittsburgh, 4720 Forbes Avenue, CIC Building 4th Floor, Suite 410
NOTE SPECIAL LOCATION
H. Andrés Lagar-Cavilla
University of Toronto
SnowFlock: Parallel Cloud Computing Made Agile
Cloud computing represents an excellent opportunity for the execution of emerging high-performance scientific and engineering applications. In particular, bioinformatics workloads, which are characterized by their access to large data sets and their embarrassing parallelism, are especially amenable to this model. Many researchers would like to test new algorithms or complete their jobs with the maximum possible number of processors, something they cannot always obtain easily or quickly. Encapsulating their applications in VMs and submitting them to shared compute clusters allows them to get hold of vast computing resources beyond their usual reach, without the need to learn new software tools or even rebuild their applications. Further, access to large data sets (genomic, phylogenetic) that are too unwieldy or dynamic to cache locally can be simplified by co-locating them with a compute cluster.
By virtue of their embarrassingly parallel quality, many bioinformatics applications are able to shrink their completion times to the order of seconds, providing quasi-interactive response times to external requests arriving via, for example, a web server interface. Achieving such speedups demands an agility in spawning new computing elements that virtualization-based compute clusters currently lack. Today, when a cluster user requests new VMs, the cluster provides them by booting a copy of the user's VM from scratch, resuming it from a saved state on secondary storage, or live-migrating an idle VM from a host where it was being consolidated. These primitives do not scale gracefully, and they typically take longer than the processing done by a single worker thread.
In this talk I will present a new primitive, VM cloning, that is able to replicate a running VM to a large number of hosts in sub-second time, with a runtime overhead that is not noticeable for most workloads. This allows applications in a shared virtualized cluster to scale in an agile manner, and to shrink the runtimes of easily parallelizable jobs to mere seconds. While conceived within a bioinformatics scope, this work applies to many other fields with large-scale parallel applications: finance, rendering, search, etc.
H. Andrés Lagar-Cavilla is an experimental computer systems researcher. He is a PhD candidate in Computer Science at the University of Toronto, where he finished his MSc in December 2004. Prior to that he obtained a bachelor's degree in Computer Systems Engineering at the Universidad Nacional del Sur in Bahia Blanca, Argentina. Throughout his PhD research he has explored applications of virtualization to high-performance and cluster computing, security, computation migration, and graphics-intensive interactive applications.
Host: Dave O'Hallaron
Visitor Coordinator: Tracy Farbacher, email@example.com, 8-8824
or visit http://www.pdl.cmu.edu/SDI/