Recent Publications

  • Mochi: Composing Data Services for High-Performance Computing Environments. Robert B. Ross, George Amvrosiadis, Philip Carns, Charles D. Cranor, Matthieu Dorier, Kevin Harms, Greg Ganger, Garth Gibson, Samuel K. Gutierrez, Robert Latham, Bob Robey, Dana Robinson, Bradley Settlemyer, Galen Shipman, Shane Snyder, Jerome Soumagne, Qing Zheng. Journal of Computer Science and Technology 35(1): 121–144 Jan. 2020.
    Abstract / PDF [1.3M]

  • Processing-in-Memory: A Workload-Driven Perspective. S. Ghose, A. Boroumand, J. S. Kim, J. Gómez-Luna, O. Mutlu. To appear in IBM Journal of Research and Development (JRD), November 2019.
    Abstract / PDF [2.1M]

  • Multiversioned Page Overlays: Enabling Faster Serializable Hardware Transactional Memory. Ziqi Wang, Michael A. Kozuch, Todd C. Mowry, Vivek Seshadri. 28th Parallel Architecture and Compiler Technologies 2019 (PACT'19), Sept 21-25, 2019, Seattle, WA.
    Abstract / PDF [475K]

  • Compact Filters for Fast Online Data Partitioning. Qing Zheng, Charles D. Cranor, Ankush Jain, Gregory R. Ganger, Garth A. Gibson, George Amvrosiadis, Bradley W. Settlemyer, Gary Grider. IEEE CLUSTER 2019. September 23 - 26, 2019, Albuquerque, New Mexico, USA.
    Abstract / PDF [1M]

  • File Systems Unfit as Distributed Storage Backends: Lessons from 10 Years of Ceph Evolution. Abutalib Aghayev, Sage Weil, Michael Kuchnik, Mark Nelson, Gregory R. Ganger, George Amvrosiadis. SOSP ’19, October 27–30, 2019, Huntsville, ON, Canada.
    Abstract / PDF [870K]

  • Parity Models: Erasure-Coded Resilience for Prediction Serving Systems. Jack Kosaian, K. V. Rashmi, Shivaram Venkataraman. SOSP ’19, October 27–30, 2019, Huntsville, ON, Canada.
    Abstract / PDF [1M]

  • PipeDream: Generalized Pipeline Parallelism for DNN Training. Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons, Matei Zaharia. SOSP ’19, October 27–30, 2019, Huntsville, ON, Canada.
    Abstract / PDF [1M]

  • TVARAK: Software-Managed Hardware Offload for DAX NVM Storage Redundancy. Rajat Kateja, Nathan Beckmann, Greg Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-105, Aug 2019.
    Abstract / PDF [975K]

  • STRADS-AP: Simplifying Distributed Machine Learning Programming without Introducing a New Programming Model. Jin Kyu Kim, Abutalib Aghayev, Garth A. Gibson, Eric P. Xing. Proceedings of the 2019 USENIX Annual Technical Conference, July 10–12, 2019 • Renton, WA.
    Abstract / PDF [490K]

  • Rateless Codes for Distributed Computations with Sparse Compressed Matrices. Ankur Mallick, Gauri Joshi. IEEE International Symposium on Information Theory (ISIT), July 7-12, 2019, Paris, France.
    Abstract / PDF [672K]

  • Peering through the Dark: An Owl’s View of Inter-job Dependencies and Jobs’ Impact in Shared Clusters. Andrew Chung, Carlo Curino, Subru Krishnan, Konstantinos Karanasos, Panagiotis Garefalakis, Gregory R. Ganger. SIGMOD ’19, June 30–July 5, 2019, Amsterdam, Netherlands.
    Abstract / PDF [1.6M]

  • Distribution-based Cluster Scheduling. Jun Woo Park. Carnegie Mellon University School of Computer Science PhD Dissertation, June 2019.
    Abstract / PDF [1.47M]

  • Enabling Practical Processing in and Near Memory for Data-Intensive Computing. O. Mutlu, S. Ghose, J. Gómez-Luna, R. Ausavarungnirun. Proc. of the Design Automation Conference (DAC), Las Vegas, NV, June 2019.
    Abstract / PDF [477K]

  • CROW: A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability. H. Hassan, M. Patel, J. S. Kim, A. G. Yaglikçi, N. Vijaykumar, N. Mansouri Ghiasi, S. Ghose, O. Mutlu. Proc. of the International Symposium on Computer Architecture (ISCA), Phoenix, AZ, June 2019.
    Abstract / PDF [1.45M]

  • CoNDA: Efficient Cache Coherence Support for Near-Data Accelerators. A. Boroumand, S. Ghose, M. Patel, H. Hassan, B. Lucia, R. Ausavarungnirun, K. Hsieh, N. Hajinazar, K. T. Malladi, H. Zheng, O. Mutlu. Proc. of the International Symposium on Computer Architecture (ISCA), Phoenix, AZ, June 2019.
    Abstract / PDF [1.1M]

  • Understanding the Interactions of Workloads and DRAM Types: A Comprehensive Experimental Study. S. Ghose, T. Li, N. Hajinazar, D. Senol Cali, O. Mutlu. Proc. of the Joint ACM SIGMETRICS/IFIP Performance Conference, Phoenix, AZ, June 2019; To appear in Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), 2019.
    Abstract / PDF [2M]

  • Compact Filter Structures for Fast Data Partitioning. Qing Zheng, Charles D. Cranor, Ankush Jain, Gregory R. Ganger, Garth A. Gibson, George Amvrosiadis, Bradley W. Settlemyer, Gary A. Grider. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-104, June 2019.
    Abstract / PDF [574K]

  • Improving ML Applications in Shared Computing Environments. Aaron Harlap. Carnegie Mellon University Electrical and Computer Engineering PhD Dissertation, May 2019.
    Abstract / PDF [1.4M]

  • This is Why ML-driven Cluster Scheduling Remains Widely Impractical. Michael Kuchnik, Jun Woo Park, Chuck Cranor, Elisabeth Moore, Nathan DeBardeleben, George Amvrosiadis. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-103, May 2019.
    Abstract / PDF [715K]

  • Fast and Efficient Distributed Matrix-Vector Multiplication Using Rateless Fountain Codes. Ankur Mallick, Malhar Chaudhari, Gauri Joshi. International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 12 - 17 May, 2019 · Brighton, UK.
    Abstract / PDF [485K]

  • Reconciling LSM-Trees with Modern Hard Drives using BlueFS. Abutalib Aghayev, Sage Weil, Greg Ganger, George Amvrosiadis. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-102, April 2019.
    Abstract / PDF [735K]

  • Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems. Graham Gobieski, Brandon Lucia, Nathan Beckmann. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’19), April 13th – April 17th, Providence, RI.
    Abstract / PDF [3.35M]

  • Lazy Redundancy for NVM Storage: Handing the Performance-Reliability Tradeoff to Applications. Rajat Kateja, Andy Pavlo, Greg Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-101, April 2019.
    Abstract / PDF [800K]

  • Scaling Video Analytics on Constrained Edge Nodes. Christopher Canel, Thomas Kim, Giulio Zhou, Conglong Li, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, Subramanya R. Dulloor. 2nd SysML Conference (SysML ’19). March 31-April 2, 2019, Palo Alto, CA.
    Abstract / PDF [8.5M]

  • Datacenter RPCs can be General and Fast. Anuj Kalia, Michael Kaminsky, David G. Andersen. 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Feb. 26–28, 2019, Boston, MA. Best Paper award!
    Abstract / PDF [555K]

  • Cluster Storage Systems Gotta Have HeART: Improving Storage Efficiency by Exploiting Disk-reliability Heterogeneity. Saurabh Kadekodi, K. V. Rashmi, Gregory R. Ganger. 17th USENIX Conference on File and Storage Technologies (FAST '19) Feb. 25–28, 2019 Boston, MA.
    Abstract / PDF [1.1M]

  • A Scalable Priority-Aware Approach to Managing Data Center Server Power. Yang Li, Charles R. Lefurgy, Karthick Rajamani, Malcolm S. Allen-Ware, Guillermo J. Silva, Daniel D. Heimsoth, Saugata Ghose, Onur Mutlu. HPCA 2019: The 25th International Symposium on High-Performance Computer Architecture, February 16 - 20, 2019, Washington D.C.
    Abstract / PDF [610K]

  • What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study. S. Ghose, A. G. Yaglikçi, R. Gupta, D. Lee, K. Kudrolli, W. X. Liu, H. Hassan, K. K. Chang, N. Chatterjee, A. Agrawal, M. O'Connor, O. Mutlu. Proc. of the ACM SIGMETRICS Conference, Irvine, CA, June 2018; Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Vol. 2, No. 3, December 2018.
    Abstract / PDF [2.6M]

  • Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation. Y. Luo, S. Ghose, Y. Cai, E. F. Haratsch, O. Mutlu. Proc. of the ACM SIGMETRICS Conference, Irvine, CA, June 2018; Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Vol. 2, No. 3, December 2018.
    Abstract / PDF [3.2M]

  • SRPT for Multiserver Systems. Isaac Grosof, Ziv Scully, Mor Harchol-Balter. Performance Evaluation, vol. 127-128, Nov. 2018, pp. 154-175. Also in Proc. 36th International Symposium on Computer Performance, Modeling, Measurements, and Evaluation (Performance 2018), Toulouse, France, December 2018. Best Student Paper Award.
    Abstract / PDF [780K]

  • Towards Lightweight and Robust Machine Learning for CDN Caching. Daniel S. Berger. HotNets-XVII, November 15–16, 2018, Redmond, WA, USA.
    Abstract / PDF [610K]

  • Scaling Embedded In-Situ Indexing with DeltaFS. Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, Garth A. Gibson, Bradley W. Settlemyer, Gary Grider, Fan Guo. SC18, November 11-16, 2018, Dallas, Texas, USA.
    Abstract / PDF [927K]

  • Stratus: Cost-aware Container Scheduling in the Public Cloud. Andrew Chung, Jun Woo Park, Gregory R. Ganger. ACM Symposium on Cloud Computing, 2018 (SoCC’18), Carlsbad, CA October 11-13, 2018.
    Abstract / PDF [1.5M]

  • Focus: Querying Large Video Datasets with Low Latency and Low Cost. Kevin Hsieh, Ganesh Ananthanarayanan, Peter Bodik, Shivaram Venkataraman, Paramvir Bahl, Matthai Philipose, Phillip B. Gibbons, Onur Mutlu. 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Oct. 8–10, 2018, Carlsbad, CA.
    Abstract / PDF [1.2M]

  • RobinHood: Tail Latency Aware Caching—Dynamic Reallocation from Cache-Rich to Cache-Poor. Daniel S. Berger, Benjamin Berg, Timothy Zhu, Siddhartha Sen, Mor Harchol-Balter. 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18). October 8–10, 2018 • Carlsbad, CA, USA.
    Abstract / PDF [2.9M]

  • SOAP Bubbles: Robust Scheduling Under Adversarial Noise. Ziv Scully, Mor Harchol-Balter. 56th Annual Allerton Conference on Communication, Control, and Computing, 2-5 Oct. 2018. Monticello, IL.
    Abstract / PDF [245K]

  • Exploiting Locality in Graph Analytics through Hardware-Accelerated Traversal Scheduling. Anurag Mukkara, Nathan Beckmann, Maleen Abeydeera, Xiaosong Ma, Daniel Sanchez. 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 20-24 Oct. 2018, Fukuoka, Japan.
    Abstract / PDF [660K]

  • The Parallel Persistent Memory Model. Guy E. Blelloch, Phillip B. Gibbons, Yan Gu, Charles McGuffey, Julian Shun. SPAA ’18, July 16–18, 2018, Vienna, Austria.
    Abstract / PDF [760K]

  • Putting the “Micro” Back in Microservice. Sol Boucher, Anuj Kalia, David G. Andersen, Michael Kaminsky. 2018 USENIX Annual Technical Conference (USENIX ATC ’18). July 11–13, 2018 • Boston, MA.
    Abstract / PDF [740K]

  • Geriatrix: Aging What You See and What You Don’t See -- A File System Aging Approach for Modern Storage Systems. Saurabh Kadekodi, Vaishnavh Nagarajan, Gregory R. Ganger, Garth A. Gibson. 2018 USENIX Annual Technical Conference (USENIX ATC ’18). July 11–13, 2018 • Boston, MA.
    Abstract / PDF [1.44M]

  • Cavs: An Efficient Runtime System for Dynamic Neural Networks. Shizhen Xu, Hao Zhang, Graham Neubig, Wei Dai, Jin Kyu Kim, Zhijie Deng, Qirong Ho, Guangwen Yang, Eric P. Xing. 2018 USENIX Annual Technical Conference (USENIX ATC ’18). July 11–13, 2018 • Boston, MA.
    Abstract / PDF [1.7M]

  • Litz: Elastic Framework for High-Performance Distributed Machine Learning. Aurick Qiao, Abutalib Aghayev, Weiren Yu, Haoyang Chen, Qirong Ho, Garth A. Gibson, Eric P. Xing. 2018 USENIX Annual Technical Conference (USENIX ATC ’18). July 11–13, 2018 • Boston, MA.
    Abstract / PDF [298K]

  • Mainstream: Dynamic Stem-Sharing for Multi-Tenant Video Processing. Angela H. Jiang, Daniel L.K. Wong, Christopher Canel, Lilia Tang, Ishan Misra, Michael Kaminsky*, Michael A. Kozuch*, Padmanabhan Pillai*, David G. Andersen, Gregory R. Ganger. 2018 USENIX Annual Technical Conference (USENIX ATC ’18). July 11–13, 2018 • Boston, MA, USA.
    Abstract / PDF [1.5M]

  • Tributary: Spot-dancing for Elastic Services with Latency SLOs. Aaron Harlap, Andrew Chung, Alexey Tumanov, Gregory R. Ganger, Phillip B. Gibbons. 2018 USENIX Annual Technical Conference. July 11–13, 2018 Boston, MA, USA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-18-102.
    Abstract / PDF [1.25M]

  • On the Diversity of Cluster Workloads and its Impact on Research Results. George Amvrosiadis, Jun Woo Park, Gregory R. Ganger, Garth A. Gibson, Elisabeth Baseman, Nathan DeBardeleben. 2018 USENIX Annual Technical Conference (ATC '18), Boston, MA, July 11-13, 2018.
    Abstract / PDF [285K]

  • A Case for Packing and Indexing in Cloud File Systems. Saurabh Kadekodi, Bin Fan, Adit Madan, Garth A. Gibson, Gregory R. Ganger. 10th USENIX Workshop on Hot Topics in Cloud Computing, July 9, 2018, Boston, MA. Supersedes CMU-PDL-17-105.
    Abstract / PDF [250K]

  • FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives. A. Tavakkol, M. Sadrosadati, S. Ghose, J. Kim, Y. Luo, Y. Wang, N. M. Ghiasi, L. Orosa, J. Gómez-Luna, O. Mutlu. Proc. of the International Symposium on Computer Architecture (ISCA), Los Angeles, CA, June 2018.
    Abstract / PDF [888K]

  • Learning a Code: Machine Learning for Approximate Non-Linear Coded Computation. Jack Kosaian, K.V. Rashmi, Shivaram Venkataraman. arXiv:1806.01259v1 [cs.LG], 4 Jun 2018
    Abstract / PDF [575K]

  • Practical Bounds on Offline Caching with Variable Object Sizes. Daniel Berger, Nathan Beckmann, Mor Harchol-Balter. Proc. ACM Meas. Anal. Comput. Syst., Vol. 2, No. 2, Article 32. June 2018. POMACS 2018.
    Abstract / PDF [1.2M]

  • Query-based Workload Forecasting for Self-Driving Database Management Systems. Lin Ma, Dana Van Aken, Ahmed Hefny, Gustavo Mezerhane, Andrew Pavlo, Geoffrey J. Gordon. SIGMOD/PODS '18 International Conference on Management of Data, Houston, TX, USA, June 10 - 15, 2018.
    Abstract / PDF [1.25M]

  • Building a Bw-Tree Takes More Than Just Buzz Words. Ziqi Wang, Andrew Pavlo, Hyeontaek Lim, Viktor Leis, Huanchen Zhang, Michael Kaminsky, David G. Andersen. SIGMOD’18, June 10–15, 2018, Houston, TX, USA.
    Abstract / PDF [2.2M]

  • SuRF: Practical Range Query Filtering with Fast Succinct Tries. Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G. Andersen, Michael Kaminsky, Kimberly Keeton, Andrew Pavlo. SIGMOD’18, June 10–15, 2018, Houston, TX, USA.
    Abstract / PDF [1.9M]

  • The Locality Descriptor: A Holistic Cross-Layer Abstraction to Express Data Locality in GPUs. Nandita Vijaykumar, Eiman Ebrahimi, Kevin Hsieh, Phillip B. Gibbons, Onur Mutlu. The 45th International Symposium on Computer Architecture - June 2-6, ISCA 2018. Los Angeles, California, USA.
    Abstract / PDF [3.1M]

  • A Case for Richer Cross-layer Abstractions: Bridging the Semantic Gap with Expressive Memory. Nandita Vijaykumar, Abhilasha Jain, Diptesh Majumdar, Kevin Hsieh, Gennady Pekhimenko, Eiman Ebrahimi, Nastaran Hajinazar, Phillip B. Gibbons, Onur Mutlu. 45th International Symposium on Computer Architecture (ISCA), Los Angeles, CA, USA, June 2018.
    Abstract / PDF [2M]

  • Practical Bounds on Optimal Caching with Variable Object Sizes. Daniel S. Berger, Nathan Beckmann, Mor Harchol-Balter. Proceedings of the ACM on Measurement and Analysis of Computing Systems. Vol. 2, No. 2, Article 32, June 2018.
    Abstract / PDF [1.2M]

  • Implicit Decomposition for Write-Efficient Connectivity Algorithms. Naama Ben-David, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, Charles McGuffey, and Julian Shun. 2018 International Parallel and Distributed Processing Symposium (IPDPS '18). May 21-25, 2018, Vancouver, BC, Canada.
    Abstract / PDF [716K]

  • 3Sigma: Distribution-based Cluster Scheduling for Runtime Uncertainty. Jun Woo Park, Alexey Tumanov, Angela Jiang, Michael A. Kozuch, Gregory R. Ganger. EuroSys ’18, April 23–26, 2018, Porto, Portugal. Supersedes CMU-PDL-17-107, Nov. 2017.
    Abstract / PDF [1.4M]

  • LHD: Improving Cache Hit Rate by Maximizing Hit Density. Nathan Beckmann, Haoxian Chen, Asaf Cidon. 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI '18), April 9-11, 2018, Renton, WA.
    Abstract / PDF [1.1M]

  • Better Caching in Search Advertising Systems with Rapid Refresh Predictions. Conglong Li, David G. Andersen, Qiang Fu, Sameh Elnikety, Yuxiong He. Proceedings of the 2018 World Wide Web Conference, Lyon, France, April 23 - 27, 2018.
    Abstract / PDF [1.1M]

  • Rateless Codes for Near-Perfect Load Balancing in Distributed Matrix-Vector Multiplication. Ankur Mallick, Malhar Chaudhari, Gauri Joshi. arXiv:1804.10331v2 [cs.DC] 30 Apr 2018.
    Abstract / PDF [1.1M]

  • Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks. Amirali Boroumand, Saugata Ghose, Youngsok Kim, Rachata Ausavarungnirun, Eric Shiu, Rahul Thakur, Daehyun Kim, Aki Kuusela, Allan Knies, Parthasarathy Ranganathan, Onur Mutlu. ASPLOS’18, March 24–28, 2018, Williamsburg, VA, USA.
    Abstract / PDF [885K]

  • LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching. Mohammad Sadrosadati, Amirhossein Mirhosseini, Seyed Borna Ehsani, Hamid Sarbazi-Azad, Mario Drumond, Babak Falsafi, Rachata Ausavarungnirun, Onur Mutlu. ASPLOS2018. The 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, March 24th – March 28th, Williamsburg, VA, USA.
    Abstract / PDF [1.M]

  • MASK: Redesigning the GPU Memory Hierarchy to Support Multi-Application Concurrency. Rachata Ausavarungnirun, Vance Miller, Joshua Landgraf, Saugata Ghose, Jayneel Gandhi, Adwait Jog, Christopher J. Rossbach, Onur Mutlu. ASPLOS2018. The 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, March 24th – March 28th, Williamsburg, VA, USA.
    Abstract / PDF [1.1M]

  • Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability. Maciej Besta, Syed Minhaj Hassan, Sudhakar Yalamanchili, Rachata Ausavarungnirun, Onur Mutlu, Torsten Hoefler. ASPLOS2018. The 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, March 24th – March 28th, Williamsburg, VA, USA.
    Abstract / PDF [1.6M]

  • SOAP: One Clean Analysis of All Age-Based Scheduling Policies. Ziv Scully, Mor Harchol-Balter, Alan Scheller-Wolf. Proc. ACM Meas. Anal. Comput. Syst., Vol. 2, No. 1, Article 16, March 2018.
    Abstract / PDF [885K]

  • MLtuner: System Support for Automatic Machine Learning Tuning. Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons. arXiv:1803.07445v1 [cs.LG] 20 Mar 2018.
    Abstract / PDF [1M]

  • Dynamic Stem-Sharing for Multi-Tenant Video Processing. Angela Jiang, Christopher Canel, Daniel Wong, Michael Kaminsky, Michael A. Kozuch, Padmanabhan Pillai, David G. Andersen, Gregory R. Ganger. SysML 18, February 15–16, 2018. Stanford, CA.
    Abstract / PDF [450K]

  • MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices. A. Tavakkol, J. Gómez-Luna, M. Sadrosadati, S. Ghose, and O. Mutlu. USENIX Conference on File and Storage Technologies (FAST), Oakland, CA, February 2018.
    Abstract / PDF [2.25M]

  • 3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning. Hyeontaek Lim, David G. Andersen, Michael Kaminsky. arXiv:1802.07389v1 [cs.LG] 21 Feb 2018.
    Abstract / PDF [586K]

  • Efficient Multi-Tenant Inference on Video using Microclassifiers. Giulio Zhou, Thomas Kim, Christopher Canel, Conglong Li, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, Subramanya R. Dulloor. SysML’18, February 15–16, 2018, Stanford, CA.
    Abstract / PDF [1.5M]

  • PipeDream: Fast and Efficient Pipeline Parallel DNN Training. Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Nikhil Devanur, Greg Ganger, Phil Gibbons. SysML '18, Feb. 15-16, 2018, Stanford, CA.
    Abstract / PDF [615K]

  • Intermittent Deep Neural Network Inference. Graham Gobieski, Nathan Beckmann, Brandon Lucia. SysML 2018, February 15-16, 2018, Stanford, CA.
    Abstract / PDF [450K]

  • Picking Interesting Frames in Streaming Video. Christopher Canel, Thomas Kim, Giulio Zhou, Conglong Li, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, Subramanya R. Dulloor. SysML’18, February 15–16, 2018, Stanford, CA.
    Abstract / PDF [913K]

  • Tributary: Spot-dancing for elastic services with latency SLOs. Aaron Harlap, Andrew Chung, Alexey Tumanov, Gregory R. Ganger, Phillip B. Gibbons. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-18-102, Jan. 2018.
    Abstract / PDF [990K]

  • Addressing the Long-Lineage Bottleneck in Apache Spark. Haoran Wang, Jinliang Wei, Garth Gibson. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-18-101, January 2018.
    Abstract / PDF [250K]

  • Towards Optimality in Parallel Job Scheduling. Benjamin Berg, Jan-Pieter Dorsman, Mor Harchol-Balter. Proc. ACM Meas. Anal. Comput. Syst., Vol. 1, No. 2, Article 40. Publication date: December 2017.
    Abstract / PDF [4.3M]

  • SlimDB: A Space-Efficient Key-Value Storage Engine For Semi-Sorted Data. Kai Ren, Qing Zheng, Joy Arulraj, Garth Gibson. Proceedings of the VLDB Endowment, Vol. 10, No. 13, 2017.
    Abstract / PDF [2.15M]

  • 3Sigma: Distribution-based cluster scheduling for runtime uncertainty. Jun Woo Park, Alexey Tumanov, Angela Jiang, Michael A. Kozuch, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-17-107, November 2017.
    Abstract / PDF [800K]

  • Software-Defined Storage for Fast Trajectory Queries using a DeltaFS Indexed Massive Directory. Qing Zheng, George Amvrosiadis, Saurabh Kadekodi, Garth Gibson, Chuck Cranor, Brad Settlemyer, Gary Grider, Fan Guo. PDSW-DISCS 2017: 2nd Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems held in conjunction with SC17, Denver, CO, November 2017.
    Abstract / PDF [1.25M]

  • Aging Gracefully with Geriatrix: A File System Aging Tool. Saurabh Kadekodi, Vaishnavh Nagarajan, Garth A. Gibson. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-17-106, October 2017. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-105. October, 2016.
    Abstract / PDF [560K]

  • A Case for Packing and Indexing in Cloud File Systems. Saurabh Kadekodi, Bin Fan, Adit Madan, Garth A. Gibson. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-17-105, October 2017.
    Abstract / PDF [280K]

  • Bigger, Longer, Fewer: What do cluster jobs look like outside Google? George Amvrosiadis, Jun Woo Park, Gregory R. Ganger, Garth A. Gibson, Elisabeth Baseman, Nathan DeBardeleben. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-17-104, October 2017.
    Abstract / PDF [360K]

  • Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes. Rachata Ausavarungnirun, Joshua Landgraf, Vance Miller, Saugata Ghose, Jayneel Gandhi, Christopher J. Rossbach & Onur Mutlu. Proc. of the International Symposium on Microarchitecture (MICRO), Cambridge, MA, October 2017.
    Abstract / PDF [1.32M]

  • Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology. Vivek Seshadri, Donghyuk Lee, Thomas Mullins, Hasan Hassan, Amirali Boroumand, Jeremie Kim, Michael A. Kozuch, Onur Mutlu, Phillip B. Gibbons & Todd C. Mowry. Proceedings of the 50th International Symposium on Microarchitecture (MICRO), Boston, MA, USA, October 2017.
    Abstract / PDF [2.5M]

  • Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content. Samira Khan, Chris Wilkerson, Zhe Wang, Alaa R. Alameldeen, Donghyuk Lee & Onur Mutlu. Proceedings of the 50th International Symposium on Microarchitecture (MICRO), Boston, MA, USA, October 2017.
    Abstract / PDF [1.5M]

  • WorkloadCompactor: Reducing datacenter cost while providing tail latency SLO guarantees. Timothy Zhu, Michael A. Kozuch & Mor Harchol-Balter. ACM Symposium on Cloud Computing (SoCC'17), Santa Clara, Oct 2017.
    Abstract / PDF [3.25M]

  • Utility-Based Hybrid Memory Management. Yang Li, Saugata Ghose, Jongmoo Choi, Jin Sun, Hui Wang & Onur Mutlu. In Proc. of the IEEE Cluster Conference (CLUSTER), Honolulu, HI, September 2017.
    Abstract / PDF [588K]

  • A Better Model for Job Redundancy: Decoupling Server Slowdown and Job Size. Kristen Gardner, Mor Harchol-Balter, Alan Scheller-Wolf & Benny Van Houdt. Transactions on Networking, September 2017.
    Abstract / PDF [544K]

  • Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-State Drives. Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo & Onur Mutlu. Proceedings of the IEEE Volume: 105, Issue: 9, Sept. 2017.
    Abstract / PDF [5.3M]

  • Workload Analysis and Caching Strategies for Search Advertising Systems. Conglong Li, David G. Andersen, Qiang Fu, Sameh Elnikety, Yuxiong He. SoCC ’17, September 24–27, 2017, Santa Clara, CA, USA.
    Abstract / PDF [650K]

  • Scheduling for Efficiency and Fairness in Systems with Redundancy. Kristen Gardner, Mor Harchol-Balter, Esa Hyytiä & Rhonda Righter. Performance Evaluation, July 2017.
    Abstract / PDF [784K]

  • Litz: An Elastic Framework for High-Performance Distributed Machine Learning. Aurick Qiao, Abutalib Aghayev, Weiren Yu, Haoyang Chen, Qirong Ho, Garth A. Gibson, Eric P. Xing. Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-17-103. June 2017.
    Abstract / PDF [424K]

  • Cachier: Edge-caching for Recognition Applications. Utsav Drolia, Katherine Guo (Bell Labs), Jiaqi Tan, Rajeev Gandhi, Priya Narasimhan. The 37th IEEE International Conference on Distributed Computing Systems (ICDCS 2017), June 5 – 8, 2017, Atlanta, GA, USA.
    Abstract / PDF [5.4M]

  • Carpool: A Bufferless On-Chip Network Supporting Adaptive Multicast and Hotspot Alleviation. Xiyue Xiang, Wentao Shi, Saugata Ghose, Lu Peng, Onur Mutlu & Nian-Feng Tzeng. In Proc. of the International Conference on Supercomputing (ICS), Chicago, IL, June 2017.
    Abstract / PDF [6.7M]

  • Viyojit: Decoupling Battery and DRAM Capacities for Battery-Backed DRAM. Rajat Kateja, Anirudh Badam, Sriram Govindan, Bikash Sharma, Greg Ganger. ISCA ’17, June 24-28, 2017, Toronto, ON, Canada.
    Abstract / PDF [1M]

  • Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms. Kevin K. Chang, A. Giray Yaglikçi, Saugata Ghose, Aditya Agrawal, Niladrish Chatterjee, Abhijith Kashyap, Donghyuk Lee, Mike O’Connor, Hasan Hassan & Onur Mutlu. Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Vol. 1, No. 1, June 2017.
    Abstract / PDF [4M]

  • Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms. Donghyuk Lee, Samira Khan, Lavanya Subramanian, Saugata Ghose, Rachata Ausavarungnirun, Gennady Pekhimenko, Vivek Seshadri & Onur Mutlu. Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Vol. 1, No. 1, June 2017.
    Abstract / PDF [2.5M]

  • Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last. Prashanth Menon, Todd C. Mowry & Andrew Pavlo. Proceedings of the VLDB Endowment, Vol. 11, No. 1, 2017.
    Abstract / PDF [970K]

  • Efficient Redundancy Techniques for Latency Reduction in Cloud Systems. Gauri Joshi, Emina Soljanin & Gregory Wornell. ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) Volume 2 Issue 2, May 2017.
    Abstract / PDF [1.38M]

  • Automatic Database Management System Tuning Through Large-scale Machine Learning. Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, Bohan Zhang. ACM SIGMOD International Conference on Management of Data, May 14-19, 2017. Chicago, IL, USA.
    Abstract / PDF [760K]

  • Online Deduplication for Databases. Lianghong Xu, Andrew Pavlo, Sudipta Sengupta, Gregory R. Ganger. ACM SIGMOD International Conference on Management of Data, May 14-19, 2017.
    Abstract / PDF [890K]

  • Proteus: Agile ML Elasticity through Tiered Reliability in Dynamic Resource Markets. Aaron Harlap, Alexey Tumanov, Andrew Chung, Greg Ganger, Phil Gibbons. ACM European Conference on Computer Systems, 2017 (EuroSys'17), 23rd-26th April, 2017, Belgrade, Serbia. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-102. May 2016.
    Abstract / PDF [743K]

  • An Empirical Evaluation of In-Memory Multi-Version Concurrency Control. Yingjun Wu, Joy Arulraj, Jiexi Lin, Ran Xian, Andrew Pavlo. Proceedings of the VLDB Endowment, vol. 10, iss. 7, pages. 781—792, March 2017.
    Abstract / PDF [660K]

  • AdaptSize: Orchestrating the Hot Object Memory Cache in a Content Delivery Network. Daniel S. Berger, Ramesh K. Sitaraman, Mor Harchol-Balter. 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI '17). March 27–29, 2017, Boston, MA.
    Abstract / PDF [560K]

  • Improving the Reliability of Chip-off Forensic Analysis of NAND Flash Memory Devices. Aya Fukami, Saugata Ghose, Yixin Luo, Yu Cai, Onur Mutlu. DFRWS Digital Forensics Research Conference Europe (DFRWS EU), March 21 - 23, 2017 Lake Constance, Germany.
    Abstract / PDF [1.5M]

  • Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds. Kevin Hsieh, Aaron Harlap, Nandita Vijaykumar, Dimitris Konomis, Gregory R. Ganger, Phillip B. Gibbons, Onur Mutlu. 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI), March 27–29, 2017, Boston, MA.
    Abstract / PDF [1.5M]

  • Towards Edge-caching for Image Recognition. Utsav Drolia, Katherine Guo, Jiaqi Tan, Rajeev Gandhi, Priya Narasimhan. First Workshop on Smart Edge Computing and Networking (SmartEdge) '17, held in conjunction with PerCom 2017, March 13 - 17, 2017, Hawaii, USA.
    Abstract / PDF [5.1M]

  • Evolving Ext4 for Shingled Disks. Abutalib Aghayev, Theodore Ts’o, Garth Gibson, Peter Desnoyers. 15th USENIX Conference on File and Storage Technologies (FAST '17), Feb 27–Mar 2, 2017. Santa Clara, CA.
    Abstract / PDF [1.4M]

  • Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques. Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, Erich F. Haratsch. 23rd IEEE Symposium on High Performance Computer Architecture, Industrial session, February 2017.
    Abstract / PDF [8.4M]

  • SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies. Hasan Hassan, Nandita Vijaykumar, Samira Khan, Saugata Ghose, Kevin Chang, Gennady Pekhimenko, Donghyuk Lee, Oguz Ergin, Onur Mutlu. International Symposium on High-Performance Computer Architecture (HPCA), February 2017.
    Abstract / PDF [1.6M]

  • An Evaluation of Distributed Concurrency Control. Rachael Harding, Dana Van Aken, Andrew Pavlo, Michael Stonebraker. Proceedings of the VLDB Endowment, vol. 10, iss. 5, pages. 553—564, January 2017.
    Abstract / PDF [421K]

  • Self-Driving Database Management Systems. A. Pavlo, G. Angulo, J. Arulraj, H. Lin, J. Lin, L. Ma, P. Menon, T. Mowry, M. Perron, I. Quah, S. Santurkar, A. Tomasic, S. Toor, D. V. Aken, Z. Wang, Y. Wu, R. Xian, and T. Zhang. In CIDR 2017, Conference on Innovative Data Systems Research. January 8-11, 2017, Chaminade, CA.
    Abstract / PDF [680K]

  • Write-Behind Logging. J. Arulraj, M. Perron, A. Pavlo. Proc. VLDB Endow., vol. 10, pp. 337-348, December, 2016.
    Abstract / PDF [931K]

  • Prescriptive Safety-Checks through Automated Proofs for Control-Flow Integrity. Jiaqi Tan. Carnegie Mellon University Electrical and Computer Engineering PhD Dissertation, November 2016.
    Abstract / PDF [5.75M]

  • A Survey of Security Vulnerabilities in Bluetooth Low Energy Beacons. Hui Jun Tay, Jiaqi Tan, Priya Narasimhan. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-109. November 2016.
    Abstract / PDF [110K]

  • AUSPICE-R: Automatic Safety-Property Proofs for Realistic Features in Machine Code. Jiaqi Tan, Hui Jun Tay, Rajeev Gandhi, Priya Narasimhan. 14th Asian Symposium on Programming Languages and Systems (APLAS), November 2016.
    Abstract / PDF [325K]

  • FaSST: Fast, Scalable and Simple Distributed Transactions with Two-sided (RDMA) Datagram RPCs. Anuj Kalia, Michael Kaminsky, David G. Andersen. 12th USENIX Symposium on Operating Systems Design and Implementation, November 2–4, 2016, Savannah, GA, USA.
    Abstract / PDF [608K]

  • Stateless Model Checking with Data-Race Preemption Points. Ben Blum, Garth Gibson. SPLASH 2016 OOPSLA, Oct 30 - Nov 4, 2016, Amsterdam, Netherlands.
    Abstract / PDF [704K]

  • MLtuner: System Support for Automatic Machine Learning Tuning. Henggang Cui, Gregory R. Ganger, and Phillip B. Gibbons. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-108, October 2016.
    Abstract / PDF [900K]

  • Aging Gracefully with Geriatrix: A File System Aging Suite. Saurabh Kadekodi, Vaishnavh Nagarajan, Garth A. Gibson. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-105. October, 2016.
    Abstract / PDF [503K]

  • Benchmarking Apache Spark with Machine Learning Applications. Jinliang Wei, Jin Kyu Kim, Garth A. Gibson. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-107, October 2016.
    Abstract / PDF [360K]

  • Zorua: A Holistic Approach to Resource Virtualization in GPUs. Nandita Vijaykumar, Kevin Hsieh, Gennady Pekhimenko, Samira Khan, Ashish Shrestha, Saugata Ghose, Adwait Jog, Phillip B. Gibbons, Onur Mutlu. 49th IEEE/ACM International Symposium on Microarchitecture (MICRO’16), October 15-19, 2016, Taipei, Taiwan.
    Abstract / PDF [1.5M]

  • Principled Workflow-centric Tracing of Distributed Systems. Raja R. Sambasivan, Ilari Shafer, Jonathan Mace, Benjamin H. Sigelman, Rodrigo Fonseca, Gregory R. Ganger. ACM Symposium on Cloud Computing 2016 (SoCC ’16), October 5-7, 2016, Santa Clara, CA, USA.
    Abstract / PDF [590K]

  • SNC-Meister: Admitting More Tenants with Tail Latency SLOs. Timothy Zhu, Daniel S. Berger, Mor Harchol-Balter. SoCC ’16, October 05-07, 2016, Santa Clara, CA, USA.
    Abstract / PDF [500K]

  • A Model for Application Slowdown Estimation in On-Chip Networks and Its Use for Improving System Fairness and Performance. Xiyue Xiang, Saugata Ghose, Onur Mutlu, Nian-Feng Tzeng. International Conference on Computer Design (ICCD), October 3-5, 2016, Phoenix, USA.
    Abstract / PDF [399K]

  • Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation. Kevin Hsieh, Samira Khan, Nandita Vijaykumar, Kevin K. Chang, Amirali Boroumand, Saugata Ghose, Onur Mutlu. International Conference on Computer Design (ICCD), October 3-5, 2016, Phoenix, USA.
    Abstract / PDF [1.67M]

  • PCFIRE: Towards Provable Preventative Control-Flow Integrity Enforcement for Realistic Embedded Software. Jiaqi Tan, Hui Jun Tay, Utsav Drolia, Rajeev Gandhi, Priya Narasimhan. EMSOFT’16, October 01-07, 2016, Pittsburgh, PA, USA.
    Abstract / PDF [722K]

  • Poster Abstract: BUFS: Towards Bottom-Up Foundational Security for Software in the Internet-of-Things. Jiaqi Tan, Rajeev Gandhi, Priya Narasimhan. 1st IEEE/ACM Symposium on Edge Computing (SEC 2016), October 2016.
    Abstract / PDF [682K]

  • Addressing the Straggler Problem for Iterative Convergent Parallel ML. Aaron Harlap, Henggang Cui, Wei Dai, Jinliang Wei, Gregory R. Ganger, Phillip B. Gibbons, Garth A. Gibson, Eric P. Xing. ACM Symposium on Cloud Computing 2016. Oct 5-7, Santa Clara, CA. Supersedes Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-15-102, April 2015.
    Abstract / PDF [519K]

  • μC-States: Fine-grained GPU Datapath Power Management. Onur Kayıran, Adwait Jog, Ashutosh Pattnaik, Rachata Ausavarungnirun, Xulong Tang, Mahmut T. Kandemir, Gabriel H. Loh, Onur Mutlu, Chita R. Das. Proceedings of the 25th International Conference on Parallel Architectures and Compilation Techniques (PACT 2016), Haifa, Israel, September 2016.
    Abstract / PDF [823K]

  • Online Deduplication for Distributed Databases. Lianghong Xu. Ph.D. Dissertation, Carnegie Mellon University, Electrical and Computer Engineering, September 2016.
    Abstract / PDF [1.8M]

  • JamaisVu: Robust Scheduling with Auto-Estimated Job Runtimes. Alexey Tumanov, Angela Jiang, Jun Woo Park, Michael A. Kozuch, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-104. September 2016.
    Abstract / PDF [1.6M]

  • A Better Model for Job Redundancy: Decoupling Server Slowdown and Job Size. Kristen Gardner, Mor Harchol-Balter, Alan Scheller-Wolf. IEEE Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2016), London, UK, September 2016.
    Abstract / PDF [244K]

  • Soundness Proofs for Iterative Deepening. Ben Blum. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-103, September 6, 2016.
    Abstract / PDF [356K]

  • Efficient Algorithms with Asymmetric Read and Write Costs. Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, Julian Shun. 24th European Symposium on Algorithms (ESA’16). August, 2016.
    Abstract / PDF [623K]

  • Parallel Algorithms for Asymmetric Read-Write Costs. Naama Ben-David, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, Charles McGuffey, Julian Shun. 28th ACM Symposium on Parallelism in Algorithms and Architectures, Jul 11, 2016 - Jul 13, 2016. Asilomar State Beach, California, USA.
    Abstract / PDF [386K]

  • Larger-than-Memory Data Management on Modern Storage Hardware for In-Memory OLTP Database Systems. Lin Ma, Joy Arulraj, Sam Zhao, Andrew Pavlo, Subramanya R. Dulloor, Michael J. Giardino, Jeff Parkhurst, Jason L. Gardner, Kshitij Doshi, Col. Stanley Zdonik. DaMoN’16, June 26-July 01, 2016, San Francisco, CA, USA.
    Abstract / PDF [1.25M]

  • Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads. Joy Arulraj, Andrew Pavlo, Prashanth Menon. SIGMOD’16, June 26-July 01, 2016, San Francisco, CA, USA.
    Abstract / PDF [575K]

  • PARBOR: An Efficient System-Level Technique to Detect Data-Dependent Failures in DRAM. Samira Khan, Donghyuk Lee, Onur Mutlu. Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Toulouse, France, June 28 - July 1, 2016.
    Abstract / PDF [630K]

  • Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems. Kevin Hsieh, Eiman Ebrahimi, Gwangsun Kim, Niladrish Chatterjee, Mike O'Connor, Nandita Vijaykumar, Onur Mutlu, Stephen W. Keckler. Proceedings of the 43rd International Symposium on Computer Architecture (ISCA), Seoul, South Korea, June 18 - 22, 2016.
    Abstract / PDF [1M]

  • Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization. Kevin K. Chang, Abhijith Kashyap, Hasan Hassan, Saugata Ghose, Kevin Hsieh, Donghyuk Lee, Tianshi Li, Gennady Pekhimenko, Samira Khan, Onur Mutlu. Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Antibes Juan-Les-Pins, France, June 14 - 18, 2016.
    Abstract / PDF [3M]

  • Design Guidelines for High Performance RDMA Systems. Anuj Kalia, Michael Kaminsky, David G. Andersen. 2016 USENIX Annual Technical Conference (USENIX ATC'16), June 2016.
    Abstract / PDF [553K]

  • Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes. Huanchen Zhang, Andy Pavlo, David G. Andersen, Michael Kaminsky, Lin Ma, Rui Shen. ACM SIGMOD International Conference on Management of Data 2016 (SIGMOD'16), June 2016.
    Abstract / PDF [715K]

  • Achieving One Billion Key-Value Requests Per Second on a Single Server. Sheng Li, Hyeontaek Lim, Victor Lee, Jung Ho Ahn, Anuj Kalia, Michael Kaminsky, David G. Andersen, Seongil O, Sukhan Lee, Pradeep Dubey. IEEE Micro's Top Picks from the Computer Architecture Conferences 2016, May/June 2016. Top Picks 2016 Award!
    Abstract / PDF [176K]

  • A Case for Hierarchical Rings with Deflection Routing: An energy-efficient on-chip communication substrate. Rachata Ausavarungnirun, Chris Fallin, Xiangyao Yu, Kevin Kai-Wei Chang, Greg Nazario, Reetuparna Das, Gabriel H. Loh, Onur Mutlu. Parallel Computing, Volume 54, May 2016, Pages 29-45, ISSN 0167-8191.
    Abstract / PDF [2M]

  • TierML: Using Tiers of Reliability for Agile Elasticity in Machine Learning. Aaron Harlap, Gregory R. Ganger, Phillip B. Gibbons. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-102. May 2016.
    Abstract / PDF [590K]

  • Similarity-based Deduplication for Databases. Lianghong Xu, Andrew Pavlo, Sudipta Sengupta, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-101, April 2016.
    Abstract / PDF [1M]

  • GeePS: Scalable Deep Learning on Distributed GPUs with a GPU-Specialized Parameter Server. Henggang Cui, Hao Zhang, Gregory R. Ganger, Phillip B. Gibbons, and Eric P. Xing. ACM European Conference on Computer Systems, 2016 (EuroSys'16), 18th-21st April, 2016, London, UK.
    Abstract / PDF [617K]

  • TetriSched: Global Rescheduling with Adaptive Plan-ahead in Dynamic Heterogeneous Clusters. Alexey Tumanov, Timothy Zhu, Jun Woo Park, Michael A. Kozuch, Mor Harchol-Balter, Gregory R. Ganger. ACM European Conference on Computer Systems, 2016 (EuroSys'16), 18th-21st April, 2016, London, UK.
    Abstract / PDF [8M]

  • STRADS: A Distributed Framework for Scheduled Model Parallel Machine Learning. Jin Kyu Kim, Qirong Ho, Seunghak Lee, Xun Zheng, Wei Dai, Garth A. Gibson, Eric P. Xing. ACM European Conference on Computer Systems, 2016 (EuroSys'16), 18th-21st April, 2016, London, UK.
    Abstract / PDF [1.6M]

  • Full-Stack Architecting to Achieve a Billion Requests Per Second Throughput on a Single Key-Value Store Server Platform. Sheng Li, Hyeontaek Lim, Victor Lee, Jung Ho Ahn, Anuj Kalia, Michael Kaminsky, David G. Andersen, Seongil O, Sukhan Lee, Pradeep Dubey. ACM Transactions on Computer Systems (TOCS), Vol. 34, No. 2, April 2016.
    Abstract / PDF [1.14M]

  • ChargeCache: Reducing DRAM Latency by Exploiting Row Access Locality. Hasan Hassan, Gennady Pekhimenko, Nandita Vijaykumar, Vivek Seshadri, Donghyuk Lee, Oguz Ergin, Onur Mutlu. Proceedings of the 22nd International Symposium on High-Performance Computer Architecture (HPCA), Barcelona, Spain, March 2016.
    Abstract / PDF [2M]

  • Be Fast, Cheap and in Control with SwitchKV. Xiaozhou Li, Raghav Sethi, Michael Kaminsky, David G. Andersen, Michael J. Freedman. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI'16), Santa Clara, CA, March 2016.
    Abstract / PDF [594K]

  • Low-Cost Inter-Linked Subarrays (LISA): Enabling Fast Inter-Subarray Data Movement in DRAM. Kevin K. Chang, Prashant J. Nair, Donghyuk Lee, Saugata Ghose, Moinuddin K. Qureshi, and Onur Mutlu. Proceedings of the 22nd International Symposium on High-Performance Computer Architecture (HPCA), Barcelona, Spain, March 2016.
    Abstract / PDF [768K]

  • A Case for Toggle-Aware Compression for GPU Systems. Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C. Mowry, Stephen W. Keckler. Proceedings of the 22nd International Symposium on High-Performance Computer Architecture (HPCA), Barcelona, Spain, March 2016.
    Abstract / PDF [713K]

  • SizeCap: Efficiently Handling Power Surges in Fuel Cell Powered Data Centers. Yang Li, Di Wang, Saugata Ghose, Jie Liu, Sriram Govindan, Sean James, Eric Peterson, John Siegler, Rachata Ausavarungnirun, Onur Mutlu. 22nd International Symposium on High Performance Computer Architecture (HPCA), March 12-16, 2016, Barcelona, Spain.
    Abstract / PDF [1.32M]

  • Achieving both High Energy Efficiency and High Performance in On-Chip Communication using Hierarchical Rings with Deflection Routing. Rachata Ausavarungnirun, Chris Fallin, Xiangyao Yu, Kevin Kai-Wei Chang, Greg Nazario, Reetuparna Das, Gabriel H. Loh, Onur Mutlu. arXiv:1602.06005v1 [cs.DC], 18 Feb 2016.
    Abstract / PDF [576K]

  • A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps. Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Saugata Ghose, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita R. Das, Mahmut T. Kandemir, Todd C. Mowry, Onur Mutlu. arXiv:1602.01348v1 [cs.AR]. 3 Feb 2016.
    Abstract / PDF [1.87M]

  • Towards Accurate and Fast Evaluation of Multi-Stage Log-Structured Designs. Hyeontaek Lim, David G. Andersen, Michael Kaminsky. In 14th USENIX Conference on File and Storage Technologies (FAST'16), Santa Clara, CA, February 2016.
    Abstract / PDF [2M]

  • Simultaneous Multi-Layer Access: Improving 3D-Stacked Memory Bandwidth at Low Cost. Donghyuk Lee, Saugata Ghose, Gennady Pekhimenko, Samira Khan, Onur Mutlu. ACM Transactions on Architecture and Code Optimization (TACO), Vol. 12, January 2016. Presented at the 11th HiPEAC Conference, Prague, Czech Republic, January 2016.
    Abstract / PDF [2M]

  • Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory. Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, Onur Mutlu. JSAC Special Issue, 2016.
    Abstract / PDF [4.2M]

  • ThyNVM: Enabling Software-Transparent Crash Consistency in Persistent Memory Systems. Jinglei Ren, Jishen Zhao, Samira Khan, Jongmoo Choi, Yongwei Wu, Onur Mutlu. Proceedings of the 48th International Symposium on Microarchitecture (MICRO), Waikiki, Hawaii, USA, December 2015.
    Abstract / PDF [460K]

  • Scheduling Techniques for Hybrid Circuit/Packet Networks. He Liu, Matthew K. Mukerjee, Conglong Li, Nicolas Feltman, George Papen, Stefan Savage, Srinivasan Seshan, Geoffrey M. Voelker, David G. Andersen, Michael Kaminsky, George Porter, Alex C. Snoeren. In 11th International Conference on emerging Networking EXperiments and Technologies (CoNEXT 2015), Heidelberg, Germany, December 2015. Nominated for Best Paper.
    Abstract / PDF [510K]

  • The Application Slowdown Model: Quantifying and Controlling the Impact of Inter-Application Interference at Shared Caches and Main Memory. Lavanya Subramanian, Vivek Seshadri, Arnab Ghosh, Samira Khan, Onur Mutlu. Proceedings of the 48th International Symposium on Microarchitecture (MICRO), Waikiki, Hawaii, USA, December 2015.
    Abstract / PDF [604K]

  • Gather-Scatter DRAM: In-DRAM Address Translation to Improve the Spatial Locality of Non-unit Strided Accesses. Vivek Seshadri, Thomas Mullins, Amirali Boroumand, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry. Proceedings of the 48th International Symposium on Microarchitecture (MICRO), Waikiki, Hawaii, USA, December 2015.
    Abstract / PDF [874K]

  • DeltaFS: Exascale File Systems Scale Better Without Dedicated Servers. Qing Zheng, Kai Ren, Garth Gibson, Bradley W. Settlemyer, Gary Grider. PDSW2015: 10th Parallel Data Storage Workshop, held in conjunction with SC15, Austin, TX, November 16, 2015.
    Abstract / PDF [930K]

  • High-Performance and Lightweight Transaction Support in Flash-Based SSDs. Youyou Lu, Jiwu Shu, Jia Guo, Shuai Li, Onur Mutlu. IEEE Transactions on Computers (TC), October 2015.
    Abstract / PDF [1.4M]

  • Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM. Donghyuk Lee, Lavanya Subramanian, Rachata Ausavarungnirun, Jongmoo Choi, Onur Mutlu. Proceedings of the 24th International Conference on Parallel Architectures and Compilation Techniques (PACT), San Francisco, CA, USA, October 2015.
    Abstract / PDF [1.8M]

  • Tracking and Reducing Uncertainty in Dataflow Analysis-Based Dynamic Parallel Monitoring. Michelle Goodstein, Phillip Gibbons, Michael Kozuch, Todd Mowry. International Conference on Parallel Architectures and Compilation Techniques (PACT 2015), Oct 18, 2015 - Oct 21, 2015, San Francisco, CA.
    Abstract / PDF [341K]

  • Scalable Deep Learning on Distributed GPUs with a GPU-specialized Parameter Server. Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons. Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-15-107, October 2015.
    Abstract / PDF [537K]

  • Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance. Rachata Ausavarungnirun, Saugata Ghose, Onur Kayiran, Gabriel H. Loh, Chita R. Das, Mahmut T. Kandemir, Onur Mutlu. Proceedings of the 24th International Conference on Parallel Architectures and Compilation Techniques (PACT 2015), San Francisco, October 2015.
    Abstract / PDF [556K]

  • Krowd: A Key-Value Store for Crowded Venues. Utsav Drolia, Nathan Mickulicz, Rajeev Gandhi, Priya Narasimhan. 10th ACM Workshop on Mobility in the Evolving Internet Architecture (MobiArch), held in Paris, France in September 2015. Best Paper.
    Abstract / PDF [696K]

  • A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips. Mohammad Fattah, Antti Airola, Rachata Ausavarungnirun, Nima Mirzaei, Pasi Liljeberg, Juha Plosila, Siamak Mohammadi, Tapio Pahikkala, Onur Mutlu, Hannu Tenhunen. Proceedings of the 9th ACM/IEEE International Symposium on Networks on Chip (NOCS), Vancouver, BC, Canada, September 2015.
    Abstract / PDF [1M]

  • Resource-Efficient Data-Intensive System Designs for High Performance and Capacity. Hyeontaek Lim. Carnegie Mellon University PhD Dissertation CMU-CS-15-132, September 2015.
    Abstract / PDF [3.1M]

  • ShardFS vs. IndexFS: Replication vs. Caching Strategies for Distributed Metadata Management in Cloud Storage Systems. Lin Xiao, Kai Ren, Qing Zheng, Garth Gibson. ACM Symposium on Cloud Computing 2015. Aug. 27 - 29, 2015, Kohala Coast, HI.
    Abstract / PDF [275K]

  • Using Data Transformations for Low-latency Time Series Analysis. Henggang Cui, Kimberly Keeton, Indrajit Roy, Krishnamurthy Viswanathan, Gregory R. Ganger. ACM Symposium on Cloud Computing 2015. Aug. 27 - 29, 2015, Kohala Coast, HI. See the extended Technical Report for more information.
    Abstract / PDF [1.3M]

  • Managed Communication and Consistency for Fast Data-Parallel Iterative Analytics. Jinliang Wei, Wei Dai, Aurick Qiao, Qirong Ho, Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons, Garth A. Gibson, Eric P. Xing. ACM Symposium on Cloud Computing 2015. Aug. 27 - 29, 2015, Kohala Coast, HI.
    Abstract / PDF [369K]

  • Reducing Replication Bandwidth for Distributed Document Databases. Lianghong Xu, Andrew Pavlo, Sudipta Sengupta, Jin Li, Gregory R. Ganger. ACM Symposium on Cloud Computing 2015. Aug. 27 - 29, 2015, Kohala Coast, HI.
    Abstract / PDF [501K]

  • Scaling Up Clustered Network Appliances with ScaleBricks. Dong Zhou, Bin Fan, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, Michael Mitzenmacher, Ren Wang, Ajaypal Singh. Proc. ACM SIGCOMM 2015, August 17-21, 2015, London, United Kingdom.
    Abstract / PDF [626K]

  • Cuckoo Linear Algebra. Li Zhou, David G. Andersen, Mu Li, Alexander J. Smola. KDD’15, August 10-13, 2015, Sydney, NSW, Australia.
    Abstract / PDF [611K]

  • AUSPICE: Automated Safety Property Verification for Unmodified Executables. Jiaqi Tan, Hui Jun Tay, Rajeev Gandhi, and Priya Narasimhan. In 7th Working Conference on Verified Software: Theories, Tools, and Experiments (VSTTE), July 2015.
    Abstract / PDF [390K]

  • WARM: Improving NAND Flash Memory Lifetime with Write-hotness Aware Retention Management. Yixin Luo, Yu Cai, Saugata Ghose, Jongmoo Choi, Onur Mutlu. MSST 2015: 31st International Conference on Massive Storage Systems and Technologies, Jun 1, 2015 - Jun 5, 2015, Santa Clara, CA.
    Abstract / PDF [1.5M]

  • Architecting to Achieve a Billion Requests Per Second Throughput on a Single Key-Value Store Server Platform. Sheng Li, Hyeontaek Lim, Victor Lee, Jung Ho Ahn, Anuj Kalia, Michael Kaminsky, David G. Andersen, Seongil O, Sukhan Lee, Pradeep Dubey. In Proceedings of the 42nd International Symposium on Computer Architecture (ISCA 2015), Portland, OR, June 2015. Fast-tracked to Transactions on Computer Systems (TOCS).
    Abstract / PDF [350K]

  • Reducing Latency via Redundant Requests: Exact Analysis. Kristen Gardner, Sam Zbarsky, Sherwin Doroudi, Mor Harchol-Balter, Esa Hyytia, Alan Scheller-Wolf. Proceedings of ACM Sigmetrics/Performance 2015 Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 15), Portland, OR. June 2015.
    Abstract / PDF [725K]

  • A Case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling Efficient Data Compression. Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu. Proceedings of the 42nd International Symposium on Computer Architecture (ISCA), Portland, OR, June 2015.
    Abstract / PDF [1M]

  • Page Overlays: An Enhanced Virtual Memory Framework to Enable Fine-grained Memory Management. Vivek Seshadri, Gennady Pekhimenko, Olatunji Ruwase, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry, Trishul Chilimbi. Proceedings of the 42nd International Symposium on Computer Architecture (ISCA), Portland, OR, June 2015.
    Abstract / PDF [2.1M]

  • SMPFRAME: A Distributed Framework for Scheduled Model Parallel Machine Learning. Jin Kyu Kim, Qirong Ho, Seunghak Lee, Xun Zheng, Wei Dai, Garth Gibson, Eric Xing. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-15-103, May 2015.
    Abstract / PDF [1.57M]

  • PocketTrend: Timely Identification and Delivery of Trending Search Content to Mobile Users. Gennady Pekhimenko, Dimitrios Lymberopoulos, Oriana Riva, Karin Strauss, Doug Burger. Proceedings of the 24th International World Wide Web Conference (WWW), Florence, Italy, May 2015.
    Abstract / PDF [504K]

  • Using Data Transformations for Low-latency Time Series Analysis. Henggang Cui, Kimberly Keeton, Indrajit Roy, Krishnamurthy Viswanathan, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-15-106. April 2015. Extended version of the 2015 SoCC paper.
    Abstract / PDF [925K]

  • Caveat-Scriptor: Write Anywhere Shingled Disks. Saurabh Kadekodi, Swapnil Pimpale, Garth Gibson. Proc. of the Seventh USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage’15), Santa Clara, CA, July 2015. Expanded paper available: Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-15-101.
    Abstract / PDF [3.4M]

  • BenchPress: Dynamic Workload Control in the OLTP-Bench Testbed. D. Van Aken, D. E. Difallah, A. Pavlo, C. Curino, and P. Cudré-Mauroux. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, 2015, pp. 1069-1073.
    Abstract / PDF [1.2M]

  • Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems. Joy Arulraj, Andrew Pavlo, Subramanya R. Dulloor. Proceedings ACM SIGMOD, Melbourne, Victoria, Australia, May 31-June 4, 2015.
    Abstract / PDF [1M]

  • Raising the Bar for Using GPUs in Software Packet Processing. Anuj Kalia, Dong Zhou, Michael Kaminsky, David G. Andersen. 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI'15). May 4-6, 2015, Oakland, CA.
    Abstract / PDF [386K]

  • Efficient Hypervisor Based Malware Detection. Peter Friedrich Klemperer. Ph.D. Dissertation, Carnegie Mellon University, Electrical and Computer Engineering, May 2015.
    Abstract / PDF [1.3M]

  • Optimal Scheduling for Jobs with Progressive Deadlines. Kristen Gardner, Sem Borst, Mor Harchol-Balter. IEEE INFOCOM 15, Hong Kong, April, 2015.
    Abstract / PDF [558K]

  • Managed Communication and Consistency for Fast Data-Parallel Iterative Analytics. Jinliang Wei, Wei Dai, Aurick Qiao, Qirong Ho, Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons, Garth A. Gibson, Eric P. Xing. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-15-105. April 2015.
    Abstract / PDF [2.62M]

  • ShardFS vs. IndexFS: Replication vs. Caching Strategies for Distributed Metadata Management in Cloud Storage Systems. Lin Xiao, Kai Ren, Qing Zheng, Garth Gibson. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-15-104, April 2015.
    Abstract / PDF [696K]

  • Solving the Straggler Problem for Iterative Convergent Parallel ML. Aaron Harlap, Henggang Cui, Wei Dai, Jinliang Wei, Gregory R. Ganger, Phillip B. Gibbons, Garth A. Gibson, Eric P. Xing. Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-15-102, April 2015.
    Abstract / PDF [519K]

  • A Cloud Computing Course: From Systems To Services. M. Suhail Rehman, Jason Boles, Mohammad Hammoud, Majd F. Sakr. Proceedings of the 46th ACM Special Interest Group on Computer Science Education Conference (SIGCSE 2015), Kansas City, USA, March 2015.
    Abstract / PDF [356K]

  • Exploiting Compressed Block Size as an Indicator of Future Reuse. Gennady Pekhimenko, Tyler Huberty, Rui Cai, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, and Todd C. Mowry. Proceedings of the 21st International Symposium on High-Performance Computer Architecture (HPCA), Bay Area, CA, February 2015.
    Abstract / PDF [2.4M]

  • Data Retention in MLC NAND Flash Memory: Characterization, Optimization and Recovery. Yu Cai, Yixin Luo, Erich F. Haratsch, Ken Mai, Onur Mutlu. HPCA-21, February 7-11, 2015. Best Paper Runner Up.
    Abstract / PDF [1.6M]

  • Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case. Donghyuk Lee, Yoongu Kim, Gennady Pekhimenko, Samira Khan, Vivek Seshadri, Kevin Chang, Onur Mutlu. Proceedings of the 21st International Symposium on High-Performance Computer Architecture (HPCA), Bay Area, CA, February 2015.
    Abstract / PDF [1.67M]

  • High-Performance Distributed ML at Scale through Parameter Server Consistency Models. Wei Dai, Abhimanu Kumar, Jinliang Wei, Qirong Ho, Garth Gibson, Eric P. Xing. 29th AAAI Conf. on Artificial Intelligence (AAAI-15), Jan 25-29, 2015, Austin, Texas.
    Abstract / PDF [733K]

  • Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks. Vivek Seshadri, Samihan Yedkar, Hongyi Xin, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry. ACM Transactions on Architecture and Code Optimization (TACO), Volume 11 Issue 4, January 2015, Article No. 51.
    Abstract / PDF [1.1M]

  • Research Problems and Opportunities in Memory Systems. Onur Mutlu, Lavanya Subramanian. Invited Article in Supercomputing Frontiers and Innovations (SUPERFRI), 2015.
    Abstract / PDF [1.72M]

  • The Main Memory System: Challenges and Opportunities. Onur Mutlu, Justin Meza, Lavanya Subramanian. Invited Article in Communications of the Korean Institute of Information Scientists and Engineers (KIISE), 2015.
    Abstract / PDF [813K]

  • Main Memory Scaling: Challenges and Solution Directions. Onur Mutlu. Invited Book Chapter in More than Moore Technologies for Next Generation Computer Design, pp. 127-153, Springer, 2015.
    Abstract / PDF [1.02M]