PARALLEL DATA LAB 

PDL News

2018

December 2018
Mor Harchol-Balter made an IEEE Fellow

Mor Harchol-Balter has been elevated to fellow status in the Institute of Electrical and Electronics Engineers (IEEE), the world's largest technical professional organization. Fellow status is a distinction reserved for select members who have demonstrated extraordinary accomplishments in an IEEE field of interest. Mor, a professor in CSD since 1999, was cited "for contributions to performance analysis and design of computer systems." Her work on designing new resource-allocation policies includes load-balancing policies, power-management policies and scheduling policies for distributed systems. She is heavily involved in the SIGMETRICS/PERFORMANCE research community and is the author of a popular textbook, "Performance Analysis and Design of Computer Systems."
-- The Piper, CMU Community News, Dec. 12, 2018

December 2018
PDL Team Designing Record-breaking Supercomputing File System Framework at Los Alamos National Lab

Trinity occupies a footprint the size of an entire floor of most office buildings, but its silently toiling workers are not flesh and blood. Trinity is a supercomputer at Los Alamos National Laboratory in New Mexico, made up of row upon row of CPUs stacked from the white-tiled floor to the fluorescent ceiling.

The machine is responsible for helping to maintain the United States’ nuclear stockpile, but it is also a valuable tool for researchers from a broad range of fields. The supercomputer can run huge simulations, modeling some of the most complex phenomena known to science.

However, continued advances in computing power have raised new issues for researchers.

“If you find a way to double the number of CPUs that you have,” says George Amvrosiadis, “you still have a problem of building software that will scale to use them efficiently.” He’s an assistant research professor in Carnegie Mellon’s Parallel Data Lab.

Amvrosiadis was part of a team that included Professors Garth Gibson and Greg Ganger, Systems Scientist Chuck Cranor, and Ph.D. student Qing Zheng. The team recently lent a hand to a cosmologist from Los Alamos struggling to simulate complex plasma phenomena. The problem wasn’t that Trinity lacked the power to run the simulations, but rather that it was unable to create and store the massive amounts of data quickly and efficiently. That’s where Amvrosiadis and the DeltaFS team came in.

DeltaFS is a file system designed to alleviate the significant burden placed on supercomputers by data-intensive simulations like the cosmologist’s plasma simulation. When it comes to supercomputing, efficiency is the name of the game. If a task can’t be completed within the amount of time allotted, then the simulation will go incomplete, and precious time will have been wasted. With researchers vying for limited computing resources, any time wasted is a major loss.

DeltaFS was able to streamline the plasma simulation, bringing what had once been too resource-demanding a task within the supercomputer’s capabilities by tweaking a couple of parts of how Trinity processed and moved the data.

First, DeltaFS changed the size and quantity of files the simulation program created. Rather than taking large snapshots encompassing every particle in the simulation—which numbered more than a trillion—at once, DeltaFS created a much smaller file for each individual particle. This made it much easier for the scientists to track the activity of individual particles.

Through DeltaFS, Trinity was able to create a record-breaking trillion files in just two minutes. Additionally, DeltaFS was able to take advantage of the roughly 10% of simulation time that is usually spent storing the data created, during which Trinity’s CPUs are sitting idle. The system tagged data as it flowed to storage and created searchable indices that eliminated hours of time that scientists would have had to spend combing through data manually. This allowed the scientists to retrieve the information they needed 1,000–5,000 times faster than prior methods.
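The reorganization described above, from whole-simulation snapshots to per-particle records with a searchable index, can be illustrated with a toy sketch. This is not DeltaFS code: the snapshot layout, the `energy` field, and the function names are hypothetical, and an in-memory dictionary stands in for the file system's index.

```python
from collections import defaultdict

# Hypothetical snapshot format: one list per timestep of (particle_id, energy).
snapshots = [
    [(pid, float(pid * 10 + step)) for pid in range(5)]
    for step in range(3)
]

def scan_snapshots(snapshots, particle_id):
    """Baseline: read every record of every snapshot to follow one particle."""
    return [
        energy
        for snap in snapshots
        for pid, energy in snap
        if pid == particle_id
    ]

def build_particle_index(snapshots):
    """Group records by particle as they are written, one entry per particle."""
    index = defaultdict(list)
    for snap in snapshots:
        for pid, energy in snap:
            index[pid].append(energy)
    return index

index = build_particle_index(snapshots)
trajectory = index[3]  # direct lookup, no scan over unrelated particles
```

Both approaches return the same trajectory for a given particle; the difference is that the indexed lookup touches only that particle's records, while the scan reads every record in every snapshot.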

The team could not have been more thrilled with the success of DeltaFS’ first real-world test run and are already looking ahead to the future. “We're looking to get it into production and have the cosmologist who originally contacted us use it in his latest experiment,” says Amvrosiadis. “To me that's more of a success story than anything else. Often a lot of the work ends with just publishing a paper and then you're done; that’s just anticlimactic.”

But he and the rest of the team aren’t just looking to limit their efforts to cosmological simulations. They’re currently looking at ways to expand DeltaFS for use with everything from earthquake simulations to crystallography. With countries across the globe striving to create machines that can compute at the exascale, meaning 10^18 calculations per second, there’s a growing need to streamline these demanding processes wherever possible.

The trick to finding a one-size-fits-all (or at least most) replacement for the current purpose-built systems in use is designing the file system to be flexible enough for scientists and researchers to tailor it to their own specific needs.

“What researchers end up doing is stitching a solution together that is customized to exactly what they need, which takes a lot of developer hours,” says Amvrosiadis. “As soon as something changes they have to sit back down to the drawing board and start from scratch and redesign all their code.”

Amvrosiadis and the team have already demonstrated a couple of ways that efficiency can be improved, such as indexing or altering file size and quantity. Now they’re looking into further ways to take advantage of potential inefficiencies, like using in-process analysis to eliminate unneeded data before it ever reaches storage or compressing information in preparation for transfer to other labs.

Solutions like these center around repurposing CPU downtime to perform tasks that will contribute back into the information pipeline and creating smarter ways to organize and store data, increasing overall efficiency. The idea is to let the expert scientists identify the areas where they have room for improvement or untapped resources, and to take advantage of the toolkit and versatile framework DeltaFS can provide.

As the world moves toward exascale computing, the speed at which software development must advance to keep pace with hardware improvements will only increase. Amvrosiadis even hopes that one day more advanced AI techniques could be incorporated to do much of the observational work performed by scientists, cutting down on observation time and freeing them to focus on analysis and study. But for him and the rest of the DeltaFS team, all of that starts with finding little solutions to improve huge processes.

“I don’t know if there’s one framework to rule them all yet—but that’s the goal.”
-- Dan Carroll, CMU Engineering News, December 1, 2018.

November 2018
Gauri Joshi Recipient of 2018 IBM Faculty Award

Gauri Joshi, an assistant professor of electrical and computer engineering, has been named a recipient of a 2018 IBM Faculty Award for her research in distributed machine learning. Faculty Award recipients are nominated by IBM employees in recognition of a specific project that is of significant interest to the company and receive a cash award in support of the selected project. Joshi’s research is about distributing deep learning training algorithms. The data sets used to train neural network models are massive in size, so a single machine is not sufficient to handle the amount of data and the computing required to analyze that data. Therefore, data sets and computations are typically divided across multiple computing nodes (i.e., computers, machines, or servers), with each node responsible for one part of the data set. In a distributed machine learning system with data sets divided across nodes, researchers use an algorithm called stochastic gradient descent (SGD), which is at the center of Joshi’s research. The algorithm is distributed across the nodes and helps achieve the lowest possible error in the data. It requires exact synchronization, which can lead to delays. “My work is about trying to strike the best balance between the error and the delay in distributed SGD algorithms,” Joshi said. “In particular, this framework fits well with the IBM Watson machine learning platform. I will be working with the IBM Watson Machine Learning vision; I will be working with the IBM Research AI team.”
-- The Piper, CMU Community News, Nov. 1, 2018
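The synchronous form of distributed SGD described in the item above can be sketched in a few lines. This is a single-process simulation for illustration only, not Joshi's method: the one-parameter model, the synthetic data (y = 3x plus noise), and all function names and hyperparameters are assumptions chosen to keep the example small.

```python
import random

def make_shards(num_nodes, points_per_node, seed=0):
    """Split a synthetic data set (y = 3x + noise) evenly across nodes."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_nodes):
        shard = []
        for _ in range(points_per_node):
            x = rng.uniform(-1.0, 1.0)
            shard.append((x, 3.0 * x + rng.gauss(0.0, 0.01)))
        shards.append(shard)
    return shards

def local_gradient(w, batch):
    """Gradient of the mean of 0.5 * (w*x - y)^2 over one mini-batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def synchronous_sgd(shards, steps=200, lr=0.5, batch_size=8, seed=1):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # Each node computes a gradient on a mini-batch from its own shard.
        grads = [
            local_gradient(w, rng.sample(shard, batch_size))
            for shard in shards
        ]
        # Synchronization point: the update needs every node's gradient,
        # so in a real cluster each step lasts as long as the slowest node.
        w -= lr * sum(grads) / len(grads)
    return w

shards = make_shards(num_nodes=4, points_per_node=64)
w = synchronous_sgd(shards)
print(round(w, 2))
```

Averaging the gradients from every node at each step keeps the error low, but it is also the source of the delay mentioned above: one slow node stalls the whole update. Relaxing that barrier trades some accuracy for speed, which is the balance the quoted research aims to strike.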

October 2018
Best Student Paper at SoCC '18!

Congratulations to Andrew Chung and Jun Woo Park, who won the Best Student Paper award at SoCC '18. Their paper at the Symposium on Cloud Computing, titled "Stratus: Cost-aware Container Scheduling in the Public Cloud," discusses cost considerations of a new cluster scheduler specialized to orchestrate batch job execution on virtual clusters, which dynamically allocates collections of virtual machine instances on public IaaS platforms.

September 2018
PDL Alum Wei Dai Winner of Pittsburgh Business Times 30 under 30 Award!

Wei (David) Dai, who graduated with his Ph.D. in Machine Learning from CMU in 2018, has been listed as one of Pittsburgh's 30 under 30 by the Pittsburgh Business Times. Wei is now the Senior Director of Engineering at Petuum, where they build scalable machine learning platforms for enterprises to easily create and manage complex ML workflows.

July 2018
PDL Team Tests New File System on Trinity Supercomputer

A team from the PDL recently completed work with Los Alamos National Lab simulating physical phenomena involving as many as a trillion individual particles. Their project used the Trinity supercomputer to test a new file system that created a trillion files in just two minutes, allowing them to retrieve data one to five thousand times faster than conventional methods. The team included George Amvrosiadis, Chuck Cranor, Greg Ganger, and Ph.D. student Qing Zheng; the PDL’s Garth Gibson; and Los Alamos National Lab’s Brad Settlemyer and Gary Grider. The full article is available in Wired.
-- ECE News July 31, 2018

May 2018
Best Paper at SIGMOD 2018!

The Carnegie Mellon Database Group is pleased to announce that their latest paper, "SuRF: Practical Range Query Filtering with Fast Succinct Tries," has won the 2018 SIGMOD Best Paper Award. The paper’s lead author was CMU CSD Ph.D. student Huanchen Zhang. This work was in collaboration with CMU professors Dave Andersen and Andy Pavlo, CMU post-doc Hyeontaek Lim, TUM visiting scholar Viktor Leis, Hewlett Packard Labs’ Distinguished Technologist Kimberly Keeton, and Intel Labs’ senior research scientist Michael Kaminsky.

April 2018
Andy Pavlo Receives 2018 Joel & Ruth Spira Teaching Award

The School of Computer Science honored outstanding faculty and staff members April 5 during the annual Founder’s Day ceremony in Rashid Auditorium. It was the seventh year for the event and was hosted by Dean Andrew Moore. Andy Pavlo, Assistant Professor in the Computer Science Department (CSD), was the winner of the Joel and Ruth Spira Teaching Award, sponsored by Lutron Electronics Co. of Coopersburg, Pa., in honor of the company’s founders and the inventor of the electronic dimmer switch.
-- CMU SCS news, April 5, 2018

April 2018
Srinivasan Seshan Appointed Head of CSD

Srinivasan Seshan has been appointed head of the Computer Science Department (CSD), effective July 1. He succeeds Frank Pfenning, who will return to full-time teaching and research. "We are all excited about Srini Seshan's new role as head of CSD," said School of Computer Science Dean Andrew Moore. "He is an outstanding researcher and teacher, and I'm confident that his expanded role in leadership will help the department reach even greater heights." Seshan joined the CSD faculty in 2000, and served as the department's associate head for graduate education from 2011 to 2015. His research focuses on improving the design, performance and security of computer networks, including wireless and mobile networks. He earned his bachelor's, master's and doctoral degrees in computer science at the University of California, Berkeley. He worked as a research staff member at IBM's T.J. Watson Research Center for five years before joining Carnegie Mellon.
-- CMU Piper, April 5, 2018

April 2018
Lorrie Cranor Receives IAPP Leadership Award

Lorrie Cranor has received the 2018 Leadership Award from The International Association of Privacy Professionals (IAPP). Cranor, a professor in the Institute for Software Research and the Department of Engineering and Public Policy, accepted the award at the IAPP’s Global Privacy Summit on March 27. “Lorrie Cranor, for 20 years, has been a leading voice and a leader in the privacy field,” said IAPP President and CEO Trevor Hughes. “She developed some of the earliest privacy enhancing technologies, she developed a groundbreaking program at Carnegie Mellon University to create future generations of privacy engineers and she has been a steadfast supporter, participant and leader of the field of privacy for that entire time. Her merits as recipient for our privacy leadership award are unimpeachable. She’s as great a person as we have in our world.” The IAPP Leadership Award is given annually to individuals who demonstrate an “ongoing commitment to furthering privacy policy, promoting recognition of privacy issues and advancing the growth and visibility of the privacy profession.” Cranor helped develop and is now co-director of CMU's MSIT-Privacy Engineering master's degree program as well as director of the CyLab Usable Privacy and Security Laboratory.
-- CMU Piper, April 5, 2018

March 2018
Rashmi Vinayak Wins Facebook Communications & Networking Research Award

Congratulations to Rashmi on winning a 2018 Facebook Communications and Networking Research Award for her work on "Navigating the Latency-Quality Tradeoff in Personalized Live Video Streaming." A total of five awards were granted, each valued at $50,000. Rashmi will present her research at a Faculty Research Summit at Facebook offices later this year.

March 2018
Andy Pavlo Wins Google Faculty Research Award

The CMU Database Group and the PDL are pleased to announce that Prof. Andy Pavlo has won a 2018 Google Faculty Research Award. This award was for his research on automatic database management systems. Andy was one of a total of 14 faculty members at Carnegie Mellon University selected for this award. The Google Faculty Research Awards is an annual open call for proposals on computer science and related topics such as machine learning, machine perception, natural language processing, and quantum computing. Grants cover tuition for a graduate student and provide both faculty and students the opportunity to work directly with Google researchers and engineers. This round received 1033 proposals covering 46 countries and over 360 universities, of which 152 were chosen for funding. The subject areas that received the most support this year were human computer interaction, machine learning, machine perception, and systems.
-- Google and CMU Database Group news, March 20, 2018

February 2018
Lorrie Cranor Wins Top SIGCHI Award

Lorrie Cranor, a professor in the Institute for Software Research and the Department of Engineering and Public Policy, is this year’s recipient of the Social Impact Award from the Association for Computing Machinery Special Interest Group on Computer Human Interaction (SIGCHI).

The Social Impact Award is given to mid-level or senior individuals who promote the application of human-computer interaction research to pressing social needs and includes an honorarium of $5,000, the opportunity to give a talk about the awarded work at the CHI conference, and lifetime invitations to the annual SIGCHI award banquet.

“Lorrie's work has had a huge impact on the ability of non-technical users to protect their security and privacy through her user-centered approach to security and privacy research and development of numerous tools and technologies,” said Blase Ur, who prepared Lorrie's nomination. Ur is a former Ph.D. student of Lorrie's, and is now an assistant professor at the University of Chicago.

In addition to Ur, three former students from Cranor’s CyLab Usable Privacy and Security Lab – Michelle Mazurek, Florian Schaub and Yang Wang – supported Lorrie's nomination. “All four of us are currently assistant professors, spread out across the United States,” said Ur, who received his doctorate degree in 2016. “In addition to this impact on end users, the four of us who jointly nominated her have also benefitted greatly from her mentorship.”

A full summary of this year’s SIGCHI award recipients can be found on the organization’s website.
-- info from CyLab News, Daniel Tkacik, Feb 23, 2018

February 2018
Andy Pavlo Awarded a Sloan Fellowship

"The Sloan Research Fellows represent the very best science has to offer," said Sloan President Adam Falk. "The brightest minds, tackling the hardest problems, and succeeding brilliantly — fellows are quite literally the future of 21st century science."

Andrew Pavlo, an assistant professor of computer science, specializes in the study of database management systems, specifically main memory systems, non-relational systems (NoSQL), transaction processing systems (NewSQL) and large-scale data analytics. He is a member of the Database Group and the Parallel Data Laboratory. He joined the Computer Science Department in 2013 after earning a Ph.D. in computer science at Brown University. He won the 2014 Jim Gray Doctoral Dissertation Award from the Association for Computing Machinery's (ACM) Special Interest Group on the Management of Data.
-- Carnegie Mellon University News, Feb. 15, 2018

2017

December 2017
Mor Harchol-Balter and Onur Mutlu made Fellows of the ACM

Congratulations to Mor (Professor of CS) and Onur (adjunct Professor of ECE), who have been made Fellows of the ACM.

From the ACM website: “To be selected as a Fellow is to join our most renowned member grade and an elite group that represents less than 1 percent of ACM’s overall membership,” explains ACM President Vicki L. Hanson. “The Fellows program allows us to shine a light on landmark contributions to computing, as well as the men and women whose hard work, dedication, and inspiration are responsible for groundbreaking work that improves our lives in so many ways.”

Mor was selected "for contributions to performance modeling and analysis of distributed computing systems." Onur, who is now at ETH Zurich, was chosen for "contributions to computer architecture research, especially in memory systems."
-- with info from www.acm.org

October 2017
Lorrie Cranor Awarded FORE Systems Chair of Computer Science

We are very pleased to announce that, in addition to a long list of accomplishments, which has included a term as the Chief Technologist of the Federal Trade Commission, Lorrie Cranor has been made the FORE Systems Professor of Computer Science and Engineering & Public Policy at CMU.

Lorrie provided information that "the founders of FORE Systems, Inc. established the FORE Systems Professorship in 1995 to support a faculty member in the School of Computer Science. The company’s name is an acronym formed by the initials of the founders’ first names. Before it was acquired by Great Britain’s Marconi in 1998, FORE created technology that allows computer networks to link and transfer information at a rapid speed. Ericsson purchased much of Marconi in 2006." The chair was previously held by CMU University Professor Emeritus, Edmund M. Clarke.
