Thursday, May 20, 2004
Noon - 1:30 pm
Intel Seminar (417 S. Craig Street - 3rd Floor)
EVENTS PAGE: http://intel-research.net/pittsburgh/Seminars.asp
SPEC and Beyond
This talk covers two separate areas of architecture research. The premise of the first part is that the latency of a traditionally sized first-level data cache (DL1) is growing and will continue to grow to many cycles. For the DL1 to remain an efficient mechanism, the data cache hierarchy must include one or more caches that are smaller and faster than the traditional DL1. A key problem with the conventional cache hierarchy is that all load instructions are treated uniformly, so all target data compete for positions in each level of the hierarchy regardless of how important the data are. Our proposal is a new mechanism, termed Punctual Data Supply, which focuses on providing each load with its data on time. The mechanism classifies loads by their load-use distance and then directly assigns them to caches with appropriate latencies. Overall, a cache organization employing Punctual Data Supply outperforms a conventional cache organization across the SPEC benchmark suite.
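As a rough illustration of the load classification described above, the sketch below assigns a load to a cache based on its load-use distance. Everything in it is an assumption made for illustration and is not taken from the talk: the cache names and latencies, the choice to measure load-use distance in cycles, and the selection policy (pick the slowest cache that can still deliver the data on time, so that the fastest cache is kept for loads that need it).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cache:
    name: str
    latency: int  # access latency in cycles (hypothetical value)


# Hypothetical hierarchy; names and latencies are illustrative only.
CACHES = [
    Cache("small-fast", 1),
    Cache("medium", 3),
    Cache("traditional-DL1", 6),
]


def assign_cache(load_use_distance, caches=CACHES):
    """Return the cache a load is assigned to.

    load_use_distance is assumed to be the number of cycles between a
    load and the first instruction that uses its result.
    Assumed policy: choose the slowest cache whose latency still fits
    within the load-use distance; if none fits, use the fastest cache.
    """
    fitting = [c for c in caches if c.latency <= load_use_distance]
    if not fitting:
        return min(caches, key=lambda c: c.latency)
    return max(fitting, key=lambda c: c.latency)


if __name__ == "__main__":
    for distance in (0, 1, 4, 10):
        print(distance, assign_cache(distance).name)
```

Under this assumed policy, a load whose result is needed almost immediately lands in the smallest, fastest cache, while a load with a long load-use distance can tolerate the latency of the traditional DL1.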
The second part of the talk goes beyond the SPEC benchmarks and analyzes the phase behavior of server workloads. Recent studies have shown that most SPEC benchmarks exhibit strong phase behavior: the cycles per instruction (CPI) performance metric can be accurately predicted from a program's control-flow behavior simply by observing the sequence of program counters, or extended instruction pointers (EIPs). One motivation of this work is to see whether server workloads also exhibit such phase behavior. In particular, can EIPs effectively predict CPI in server workloads? To quantify these relationships accurately, we propose using regression trees to measure the theoretical upper bound on the accuracy of predicting CPI from EIPs. We show that server workloads and, surprisingly, even SPEC benchmarks exhibit a wide range of phase behavior. Instead of classifying benchmarks by phase behavior alone, we place workloads into four quadrants based on two factors: their CPI variance and their phase behavior.
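To make the idea of an upper bound on predicting CPI from EIPs concrete, the sketch below uses a much simpler stand-in for a regression tree: it groups samples by EIP and predicts the mean CPI of each group, which is what a tree split until every leaf holds a single EIP value would predict on its own training data. The function name, the sample format, and the use of explained variance as the accuracy measure are assumptions made for illustration, not the method used in the talk.

```python
from collections import defaultdict


def explained_variance_by_eip(samples):
    """Fraction of CPI variance explained by per-EIP mean CPI.

    samples is a list of (eip, cpi) pairs (hypothetical format).
    Returns 1 - SSE/SST, where SSE is the squared error of predicting
    each sample's CPI by the mean CPI of its EIP group and SST is the
    total squared deviation from the overall mean CPI.
    """
    if not samples:
        raise ValueError("need at least one sample")

    cpis = [cpi for _, cpi in samples]
    overall_mean = sum(cpis) / len(cpis)
    sst = sum((c - overall_mean) ** 2 for c in cpis)
    if sst == 0:
        return 1.0  # CPI never varies, so there is nothing left to explain

    groups = defaultdict(list)
    for eip, cpi in samples:
        groups[eip].append(cpi)

    sse = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        sse += sum((v - group_mean) ** 2 for v in values)

    return 1.0 - sse / sst


if __name__ == "__main__":
    # CPI fully determined by EIP: strong phase behavior in this sense.
    strong = [(0x10, 1.0), (0x10, 1.0), (0x20, 3.0), (0x20, 3.0)]
    # CPI unrelated to EIP: knowing the EIP does not help.
    weak = [(0x10, 1.0), (0x10, 3.0), (0x20, 1.0), (0x20, 3.0)]
    print(explained_variance_by_eip(strong))
    print(explained_variance_by_eip(weak))
```

A value near 1 means the EIP alone almost determines CPI, while a value near 0 means it carries little information. Because the score is computed on the same samples used to form the group means, it is an optimistic, in-sample figure.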
Ryan Rakvic has been a senior research scientist in the Microarchitecture Research Lab (MRL) at Intel Corporation since October 2000. Ryan's past research spans multiple areas of the microprocessor, including instruction supply, data supply, and other aspects of both in-order and out-of-order microprocessor cores. Currently, he is studying server workloads on parallel systems, including both database and Java application servers. Prior to joining Intel, Ryan spent three summers interning at none other than Intel Corporation, two of them as part of the Itanium processor team. Ryan received his B.S. in Computer Engineering from the University of Michigan and his M.S. and Ph.D., both in Computer Engineering, from Carnegie Mellon University.
Contact Kim Kaan, 412-605-1203,
or visit http://www.intel-research.net.
SDI Home: http://www.pdl.cmu.edu/SDI/