PDL CONSORTIUM SPEAKER SERIES
A ONE-AFTERNOON SERIES OF SPECIAL SDI TALKS BY
DATE: Tuesday, May 8, 2018
SPEAKERS:
BIO: Shao-Wen Yang is a senior staff research scientist at Intel Labs, Intel Corporation. He received his PhD in computer science from National Taiwan University in 2011. He joined Intel Labs in Taiwan in 2013 as a resident scientist in the Intel-NTU Connected Context Computing Center, where he focused on research and development of Internet of Things technology and middleware. In 2016 he moved to California and began focusing on the Internet of Video Things. He has served on the technical program committee of the Design Automation Conference and as a guest editor for several international journals. His current research interests include various aspects of visual fog computing, especially ease-of-use frameworks for workload creation, and partitioning and orchestration of visual fog workloads. He has over 20 technical publications, and over 60 patents and pending patent applications.
In this talk we will go over the challenges faced in incorporating SPDK within a complex enterprise-class application: Oracle RDBMS. There is a significant impedance mismatch between the deployment model of classical SPDK-enabled applications and the Oracle process and memory model. An Oracle database consists of a large number of processes (more than 10K) per node and a large System Global Area (SGA) that is symmetrically mapped into each process. Oracle RDBMS implements a comprehensive memory management infrastructure spanning the SGA, process private memory (PGA), and a Managed Global Area (MGA) optimized for efficient data transfer over high-performance networks such as InfiniBand and RoCE. Most NVMe SSDs contain a limited number of hardware IO queues. To enable high-performance IO dispatch and completion from a large number of processes to local NVMe drives, Oracle has implemented a dispatcher model. The IO dispatcher provides lightweight dispatch of IOs using lock-free queues in shared memory and polling for completions. The Oracle dispatcher implements various scheduling policies to optimize overall throughput and latency depending on the workload, as well as QoS-aware scheduling of IOs. For seamless integration with the memory model and RDMA data transfer optimizations available in Oracle, we have worked with the SPDK community to decouple the storage libraries from the underlying DPDK runtime that provides the memory management, threading, and messaging primitives. Oracle has implemented an SPDK environment library (ORAENV) that provides features similar to the DPDK RTE environment using Oracle runtime services. The ORAENV library provides dynamic allocation of shared memory from the SGA, PGA, and MGA pools that is optimized for local storage IO as well as remote network IO with NVMeoF.
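The dispatcher pattern described above can be illustrated with a small model. This is only a sketch under stated assumptions, not Oracle's implementation: it runs in a single Python process, so the rings merely mimic the index discipline of a single-producer/single-consumer queue and are not real shared-memory lock-free code. The names `Ring`, `Dispatcher`, `poll_once`, and `hw_queue_depth` are hypothetical. The fixed scan order stands in for the scheduling policies mentioned in the abstract, and the simulated device completes every IO by the next poll.

```python
class Ring:
    """Fixed-capacity ring modelling a single-producer/single-consumer queue.

    Only the producer advances `tail` and only the consumer advances `head`.
    (Model only: real lock-free code would need atomics and shared memory.)
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0  # next slot to read
        self.tail = 0  # next slot to write

    def push(self, item):
        if self.tail - self.head == len(self.slots):
            return False  # ring is full; caller must retry later
        self.slots[self.tail % len(self.slots)] = item
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # ring is empty
        item = self.slots[self.head % len(self.slots)]
        self.head += 1
        return item


class Dispatcher:
    """Owns the scarce device queue; clients reach it only through rings."""

    def __init__(self, n_clients, ring_size=8, hw_queue_depth=4):
        self.submit = [Ring(ring_size) for _ in range(n_clients)]
        self.complete = [Ring(ring_size) for _ in range(n_clients)]
        self.hw_queue_depth = hw_queue_depth  # assumed device queue limit
        self.in_flight = []  # stands in for the hardware IO queue

    def poll_once(self):
        # Reap: the simulated device has finished everything in flight.
        stalled = []
        for client, req in self.in_flight:
            if not self.complete[client].push(("done", req)):
                stalled.append((client, req))  # completion ring full
        self.in_flight = stalled
        # Dispatch: at most one request per client per pass (placeholder
        # policy), never exceeding the device queue depth.
        for client, ring in enumerate(self.submit):
            if len(self.in_flight) >= self.hw_queue_depth:
                break
            req = ring.pop()
            if req is not None:
                self.in_flight.append((client, req))


dispatcher = Dispatcher(n_clients=3)
dispatcher.submit[0].push("read blk 10")
dispatcher.submit[2].push("write blk 7")
dispatcher.poll_once()  # both requests move to the simulated device
dispatcher.poll_once()  # both completions land in the clients' rings
first_done = dispatcher.complete[0].pop()
```

Each client would then poll its own completion ring rather than block, which is what lets many processes share a handful of hardware queues in this model.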
High-performance NVMeoF communication using ORAENV is accomplished through optimizations such as Shared Protection Domain, which allows a single memory mapping to be registered with the RDMA adapter for use by all Oracle processes. This significantly reduces Memory Translation Table (MTT) cache thrashing on the RDMA adapter, which has been shown to bottleneck IO throughput as cache misses increase.
BIO: Zahra Khatami is a Member of Technical Staff in the Virtual Operating System (VOS) group at Oracle. She received her PhD and Master's degrees in computer science from Louisiana State University. She is currently developing a framework for supporting SPDK in Oracle Database.
BIO: Gosia Steinder is an IBM Fellow at IBM Research, Yorktown Heights, NY. In recent years she has been working on resource management and orchestration in clouds, container management, and container security.
BIO: Xiaoyong Liu is currently a Senior Staff Engineer and Chief Architect of heterogeneous computing for machine learning at the Alibaba Infrastructure Service Group. Prior to joining Alibaba, Xiaoyong was a senior staff engineer at Xilinx, leading the SDAccel development environment, which brings the FPGA's performance/watt advantage to data-center application acceleration. Before that, he worked for Synopsys, leading the performance effort for its VCS-MX HDL Simulator. Xiaoyong Liu received his B.S. and M.Sc. in Computer Science from the University of Science and Technology of China.
BIO: Larry Rudolph is a Senior Research Scientist at Two Sigma, an Affiliate at MIT/CSAIL, and an academic advisor to the Massachusetts Open Cloud. Larry joined the "Labs" team at Two Sigma, where he continues his academic engagements, publishing and co-advising students at various institutions. He has worked and published in the areas of parallel processing, from theory to practice. He has been on the faculty of CMU, The Hebrew University, and MIT.
BIO: Weiwei Gong leads the Vector Flow Analytic team in the Data and In-Memory Technologies group, which is responsible for designing and developing the data storage and processing engine for the Oracle database. She joined Oracle in 2014 and has since been working on leveraging new hardware technologies to accelerate SQL processing. She obtained a Ph.D. in Computer Science from the University of Massachusetts Boston and holds an M.Sc. from Renmin University of China.
The talk will conclude by discussing potential future work in space-efficient maps and other potential applications of these data structures.
BIO: Rob Johnson is a Senior Researcher at VMware and a Research Assistant Professor at Stony Brook University. He developed BetrFS, invented the quotient filter, developed the Squeakr and Mantis tools for large-scale computational biology, created the foundational theory of cache-adaptive analysis, broke the High-bandwidth Digital Content Protection (HDCP) crypto-system, and co-authored CQual, a static analysis tool that has found dozens of bugs in the Linux kernel.