Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization
Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Antibes Juan-Les-Pins, France, June 14 - 18, 2016.
Kevin K. Chang¹, Abhijith Kashyap¹, Hasan Hassan¹,², Saugata Ghose¹, Kevin Hsieh¹, Donghyuk Lee¹, Tianshi Li¹,³, Gennady Pekhimenko¹, Samira Khan⁴, Onur Mutlu⁵,¹
1 Carnegie Mellon University
2 TOBB ETÜ
3 Peking University
4 University of Virginia
5 ETH Zürich
Long DRAM latency is a critical performance bottleneck in current systems. DRAM access latency is defined by three fundamental operations that take place within the DRAM cell array: (i) activation of a memory row, which opens the row to perform accesses; (ii) precharge, which prepares the cell array for the next memory access; and (iii) restoration of the row, which restores the values of cells in the row that were destroyed due to activation. There is significant latency variation for each of these operations across the cells of a single DRAM chip due to irregularity in the manufacturing process. As a result, some cells are inherently faster to access, while others are inherently slower. Unfortunately, existing systems do not exploit this variation.
The goal of this work is to (i) experimentally characterize and understand the latency variation across cells within a DRAM chip for these three fundamental DRAM operations, and (ii) develop new mechanisms that exploit our understanding of the latency variation to reliably improve performance. To this end, we comprehensively characterize 240 DRAM chips from three major vendors, and make several new observations about latency variation within DRAM. We find that (i) there is large latency variation across the cells for each of the three operations; (ii) variation characteristics exhibit significant spatial locality: slower cells are clustered in certain regions of a DRAM chip; and (iii) the three fundamental operations exhibit different reliability characteristics when the latency of each operation is reduced.
Based on our observations, we propose Flexible-LatencY DRAM (FLY-DRAM), a mechanism that exploits latency variation across DRAM cells within a DRAM chip to improve system performance. The key idea of FLY-DRAM is to exploit the spatial locality of slower cells within DRAM, and access the faster DRAM regions with reduced latencies for the fundamental operations. Our evaluations show that FLY-DRAM improves the performance of a wide range of applications by 13.3%, 17.6%, and 19.5%, on average, for each of the three different vendors' real DRAM chips, in a simulated 8-core system. We conclude that the experimental characterization and analysis of latency variation within modern DRAM, provided by this work, can lead to new techniques that improve DRAM and system performance.
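The key idea above can be illustrated with a minimal sketch of how a memory controller might pick timing parameters per DRAM region. This is only an illustration under stated assumptions, not the paper's implementation: the names `Timing`, `LatencyMap`, `STANDARD`, and `REDUCED`, the fixed-size row regions, and all latency values are hypothetical placeholders rather than measurements or interfaces from this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Latencies (in ns) for the three fundamental DRAM operations."""
    activation_ns: float
    precharge_ns: float
    restoration_ns: float


# Placeholder values chosen for illustration only (assumption, not measured).
STANDARD = Timing(activation_ns=13.0, precharge_ns=13.0, restoration_ns=35.0)
REDUCED = Timing(activation_ns=10.0, precharge_ns=10.0, restoration_ns=30.0)


class LatencyMap:
    """Hypothetical table marking which row regions contain slower cells."""

    def __init__(self, rows_per_region, slow_regions):
        self.rows_per_region = rows_per_region
        self.slow_regions = frozenset(slow_regions)

    def timing_for_row(self, row):
        # Slower cells cluster spatially, so a coarse per-region lookup
        # is enough to decide which timing set is safe for this row.
        region = row // self.rows_per_region
        return STANDARD if region in self.slow_regions else REDUCED


# Assume profiling found slow cells only in region 3 (rows 1536 to 2047).
latency_map = LatencyMap(rows_per_region=512, slow_regions={3})
fast_timing = latency_map.timing_for_row(100)    # region 0: reduced latency
slow_timing = latency_map.timing_for_row(1600)   # region 3: standard latency
```

Because slow cells are clustered, the lookup table in this sketch needs only one entry per region rather than one per cell, which is what makes a region-granularity scheme plausible to implement in a controller.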