PARALLEL DATA LAB 

PDL Abstract

Workload Analysis and Caching Strategies for Search Advertising Systems

SoCC ’17, September 24–27, 2017, Santa Clara, CA, USA..

Conglong Li, David G. Andersen, Qiang Fu*, Sameh Elnikety*, Yuxiong He*

Carnegie Mellon University
* Microsoft Research

http://www.pdl.cmu.edu

Search advertising depends on accurate predictions of user behavior and interest, accomplished today using complex and computationally expensive machine learning algorithms that estimate the potential revenue gain of thousands of candidate advertisements per search query. The accuracy of this estimation is important for revenue, but the cost of these computations represents a substantial expense, e.g., 10% to 30% of the total gross revenue. Caching the results of previous computations is a potential path to reducing this expense, but traditional domain-agnostic and revenue-agnostic approaches to do so result in substantial revenue loss. This paper presents three domain-specific caching mechanisms that successfully optimize for both factors. Simulations on a trace from the Bing advertising system show that a traditional cache can reduce cost by up to 27.7% but has negative revenue impact as bad as −14.1%. On the other hand, the proposed mechanisms can reduce cost by up to 20.6% while capping revenue impact between −1.3% and 0%. Based on Microsoft’s earnings release for FY16 Q4, the traditional cache would reduce the net profit of Bing Ads by $84.9 to $166.1 million in the quarter, while our proposed cache could increase the net profit by $11.1 to $71.5 million.

FULL PAPER: pdf