INDUSTRY JOB OPPORTUNITY

Oracle: Big-data, In-memory Data Management

Position Type: multiple positions

Overview

We design and develop the data storage and processing engine for the Oracle Database and the Exadata Database Machine. Our focus is on big data, in-memory data management, high-performance OLTP, and real-time in-memory analytics. Highlights of the team's work are described below.

To learn more about the recent in-memory innovations by this team with the Oracle Database’s In-Memory option, featuring SIMD vector processing, memory-optimized compression and columnar technology, please see:

1. Recorded keynote by Larry Ellison at the July 2014 launch of Database In-Memory at http://www.oracle.com/us/corporate/events/dbim/index.html.

2. Slides from a keynote presentation at the ADMS workshop during VLDB 2014 at http://www.adms-conf.org/2014-slides/VLDB-2014-Oracle-ADMS-Keynote-Public.pdf.

We are one of the fastest innovating teams in the database industry, with our technology regularly named among the top five features of the database by external sources across multiple releases. Our innovations extend the Oracle Database to be far more than just a data processing framework. For example, Flashback / Temporal technology allows the Oracle Database user to travel back in time; SecureFiles and the Database File System allow Oracle to serve as a very high-performance unstructured content repository; and Advanced Compression and Advanced Data Optimization allow the Oracle Database to act as an automatic broker for information lifecycle management. Our largest customers entrust their data management needs to the Oracle Database because of technologies built by this team.

The Data Technologies group covers four functional areas.

1. Transaction Processing

The Transaction Processing team develops the transaction processing engine for the database and the mid-tier. We are responsible for transaction locking, multi-version concurrency control, parallel and distributed coordination protocols, cache fusion protocols for clusters, self-learning undo management, and transaction recovery. We are leveraging our infrastructure and competencies in these core areas to build next-generation technologies such as columnar storage, transactional storage web service, continuous query notification, flashback technologies, active database history, heterogeneous standby, and cluster transaction fusion. Ongoing and future projects are in the areas of in-memory column store maintenance, continuous query notifications, cluster-wide distributed transactions (transaction fusion), historical data store, flashback transaction, auto-correcting undo management, and application integration.

Work in this area is systems-related: a mix of operating systems, databases, parallel and distributed systems, as well as mid-tier infrastructure.

2. Data Storage

The Data Storage group designs storage and access structures for the entire database: the upcoming in-memory columnar data format, B-tree and bitmap indexes, heap tables, index-organized tables, SecureFiles and LOBs, hybrid columnar storage, and more. Given the constantly changing landscape of processor and storage technology and new application requirements (such as XML, JSON, media streaming, and indexing), this is a pivotal group with the charter of providing technology leadership for the Oracle database server. We are building the world's fastest and most feature-rich database storage engine and indexing technology. Some of our recent efforts have been in areas related to in-memory columnar databases, compression, encryption, sliding inserts for efficient XML storage, snapshots, filesystem caching and performance, and scalability in clustered server environments.

The Data Storage layer for the Oracle database is responsible for the storage and retrieval of all data stored in Oracle (relational, XML, text, spatial and graph, OLAP warehouses, unstructured files, etc.). We organize data both in memory, for in-memory columnar and row processing, and inside disk blocks. We create and manage the efficient structures through which those blocks are accessed (e.g. B-trees, bitmap indexes, LOBs, and clusters), along with the methods for accessing data from these transactional data structures. Our group has some very hard and interesting problems to solve in the area of distributed systems, where databases span clusters comprising hundreds of nodes but must provide a single-system image to the end user.

We are also working on an intelligent storage subsystem to which the database can push predicate evaluation, projection, aggregation, and similar operations, effectively moving evaluation logic down to the data as opposed to pulling massive amounts of data to the CPU. Additionally, there are new row-major and column-major storage schemes that we need to design for the business intelligence world, to process petabytes of information while optimizing for the new generation of multi-byte SIMD register instructions provided by present and upcoming CPUs.
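The difference between pulling data to the CPU and pushing evaluation down to storage can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `StorageNode`, `scan_all`, `scan_pushed`, and the sample rows are hypothetical and are not Oracle or Exadata interfaces. The sketch only counts how many rows would have to leave the storage layer in each model.

```python
from dataclasses import dataclass


@dataclass
class StorageNode:
    """Hypothetical storage node holding rows as dicts (not an Oracle API)."""
    rows: list

    def scan_all(self):
        # Pull model: every row is shipped to the database for evaluation.
        return list(self.rows)

    def scan_pushed(self, predicate, columns):
        # Push model: filter and project at the storage layer,
        # so only qualifying rows and requested columns are shipped.
        return [{c: r[c] for c in columns} for r in self.rows if predicate(r)]


def query_pull(node, predicate, columns):
    shipped = node.scan_all()
    result = [{c: r[c] for c in columns} for r in shipped if predicate(r)]
    return result, len(shipped)


def query_push(node, predicate, columns):
    result = node.scan_pushed(predicate, columns)
    return result, len(result)


# Illustrative data: 1000 rows, one in ten belongs to region "EU".
node = StorageNode([
    {"id": i, "region": "EU" if i % 10 == 0 else "US", "amount": i * 1.5}
    for i in range(1000)
])


def is_eu(row):
    return row["region"] == "EU"


pull_result, pull_shipped = query_pull(node, is_eu, ["id", "amount"])
push_result, push_shipped = query_push(node, is_eu, ["id", "amount"])

print(pull_shipped, push_shipped)  # 1000 100
```

Both queries return the same answer; the pushed-down version ships a tenth of the rows in this toy data set, which is the effect the paragraph above describes.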

3. Space Management

People want to store everything on persistent media: books, pictures, health records, music, videos, everything. Disks are getting bigger and cheaper, but they are not getting much faster or easier to manage. At the core of Oracle's business is its ability to store data in a high-performance, scalable, reliable, and manageable way. Now consider that we need to do this just as well for an exabyte of data. Space truly is the final frontier!

Space management is a fundamental component of the RDBMS that provides an abstraction over the database storage subsystem. Space requirements for the database are primarily of two kinds: temporary scratch space for intermediate results generated in the database, and persistent storage for user data. From managing the temporary space for sorting a terabyte of data to finding the best slot in a petabyte-scale disk volume for storing the next piece of employee payroll information, intelligent space management is one of the foundations of high-performance OLTP and data warehouse systems. Space management needs vary with the kind of data stored in the database: storing a streaming video has different requirements from storing product item names. Developing an efficient storage management component that works for all data types and also scales to several hundred thousand concurrent users will be one of the toughest challenges we face.

4. Database File System (DBFS) and SecureFiles

In Oracle Database 11g, the Data Storage group introduced Oracle SecureFiles LOBs. SecureFiles LOBs provide high-performance storage for files, comparable to the performance of traditional file systems. SecureFiles LOBs support advanced compression, deduplication, and encryption features for files.

Along with SecureFiles LOBs, the Data Storage group conceived and delivered the Oracle Database File System (DBFS). DBFS uses the features of the database to store files, and the strengths of the database in efficiently managing relational data, to implement a standard file system interface for files stored in the database. DBFS provides an out-of-the-box POSIX-conformant filesystem backed by the full feature set and robustness of the Oracle RDBMS. Files in the database can be transparently accessed using any operating system (OS) program that acts on files. For example, ETL (Extract, Transform and Load) tools can transparently store staging files in the database.
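As a rough illustration of the ETL example, the sketch below stages a CSV file using only ordinary file operations. It assumes a DBFS file system has been mounted at some directory; the mount point, file name, and helper functions are hypothetical, and a temporary directory stands in for the mount so the sketch runs anywhere. Nothing in the code is DBFS-specific, which is the point of a standard file system interface.

```python
import csv
import tempfile
from pathlib import Path


def write_staging_file(mount_point, name, rows):
    """Write rows as CSV under mount_point using plain file I/O."""
    path = Path(mount_point) / name
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


def read_staging_file(path):
    """Read the staged CSV back as a list of string lists."""
    with Path(path).open(newline="") as f:
        return [row for row in csv.reader(f)]


# A temporary directory stands in for a hypothetical DBFS mount point.
with tempfile.TemporaryDirectory() as mount_point:
    staged = write_staging_file(
        mount_point,
        "orders_stage.csv",  # hypothetical staging file name
        [["id", "amount"], ["1", "9.50"]],
    )
    loaded = read_staging_file(staged)

print(loaded)
```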

Oracle customers have embraced both SecureFiles LOBs and DBFS and the adoption rate continues to grow. We have challenging projects in the area of DBFS and SecureFiles performance improvements and in integration with the new Oracle Database In-Memory Option.

To explore these and many more challenges in developing the next-generation data management platform at Oracle, please come and check us out. If you are excited by complex problems spanning computer systems, algorithms, and theoretical computer science, as well as the opportunity to work in a fun, creative, and fast-paced team, this is the right group for you.

How to Apply

If interested, please contact:

Tirthankar Lahiri
Vice President, Data and In-Memory Technologies
tirthankar.lahiri@oracle.com
650 506 6279

Last updated 14 May, 2015