Connections: Using Context to Enhance File Search

Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-05-105, April 2005. Superseded by SOSP'05, October 23–26, 2005, Brighton, United Kingdom.

Craig A.N. Soules, Gregory R. Ganger

Parallel Data Laboratory, Carnegie Mellon University.
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

The continued growth of personal file systems demands a shift from manual file organization to effective on-demand search tools. Today’s best search tools use content analysis techniques to provide targeted, ranked results for user queries. However, these tools are missing a key way that users remember and search for their data: context. Context is the set of external events that a user associates with a file’s use: the user’s current task, other files being accessed, the time of day, etc. This paper presents Connections, a search system that combines content analysis with context information using temporal locality of file accesses. Through this combination, Connections improves both the false-negative rate (recall) and false-positive rate (precision) over content analysis alone. That is, by adding context information, our system finds more of the desired files and ranks them more accurately.
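As a rough illustration of the idea, and not the algorithm from the paper, the sketch below links files that are accessed within a short time window of each other and then uses those links to extend a content-only result list. The function names, the 30-second window, and the 0.5 propagation weight are all assumptions chosen for the example; the content scores are assumed to come from some existing content-analysis tool.

```python
from collections import defaultdict


def build_relations(trace, window=30.0):
    """Link files accessed within `window` seconds of each other.

    `trace` is a list of (timestamp_seconds, path) tuples sorted by time.
    Returns a mapping path -> {neighbor_path: co-access count}.
    The window length is an illustrative assumption.
    """
    relations = defaultdict(lambda: defaultdict(int))
    start = 0
    for i, (t, path) in enumerate(trace):
        # Slide the window start forward past accesses that are too old.
        while trace[start][0] < t - window:
            start += 1
        for j in range(start, i):
            other = trace[j][1]
            if other != path:
                relations[path][other] += 1
                relations[other][path] += 1
    return relations


def expand_results(content_hits, relations, weight=0.5):
    """Extend content-only hits with temporally related files.

    `content_hits` maps path -> content score. Each hit passes a fraction
    (`weight`, a hypothetical parameter) of its score to its neighbors in
    proportion to how often they were accessed together.
    Returns paths ranked by combined score, ties broken by path name.
    """
    scores = dict(content_hits)
    for path, score in content_hits.items():
        neighbors = relations.get(path, {})
        total = sum(neighbors.values())
        for other, count in neighbors.items():
            scores[other] = scores.get(other, 0.0) + weight * score * count / total
    return sorted(scores, key=lambda p: (-scores[p], p))


def precision_recall(returned, relevant):
    """Precision and recall of a returned set against the relevant set."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up access trace: three files used together, one unrelated file later.
trace = [(0, "paper.tex"), (5, "figure.eps"), (10, "refs.bib"), (500, "taxes.xls")]
relations = build_relations(trace)

# Suppose content analysis matches only the LaTeX source.
content_hits = {"paper.tex": 1.0}
ranked = expand_results(content_hits, relations)
relevant = {"paper.tex", "figure.eps", "refs.bib"}

print(precision_recall(content_hits, relevant))
print(precision_recall(ranked, relevant))
```

In this toy trace the content-only result misses the figure and bibliography files, which contain no matching text, while the context-expanded result recovers them because they were accessed alongside the matching file. A real system would also need to keep loosely related files from diluting precision, which this sketch does not attempt.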

KEYWORDS: file search, contextual search, successor models

FULL PAPER (TR VERSION): pdf
FULL PAPER (CONFERENCE VERSION): pdf