PARALLEL DATA LAB 

PDL Abstract

The Read-Only Semi-External Model

APOCS 2021, January 13, 2021 Virtual Conference, Alexandria, Virginia, U.S.

Guy E. Blelloch1, Laxman Dhulipala3, Phillip B. Gibbons1, Yan Gu2, Charles McGuffey1, Julian Shun3

1Carnegie Mellon University
2University of California, Riverside
3Massachusetts Institute of Technology

http://www.pdl.cmu.edu

We introduce the Read-Only Semi-External (ROSE) Model for the design and analysis of algorithms on large graphs. As in the well-studied semi-external model for graph algorithms, we assume that the vertices but not the edges fit in a small fast (shared) random-access memory, the edges reside in an unbounded (shared) external memory, and transfers between the two memories are done in blocks of size B. A key difference in ROSE, however, is that the external memory can be read from but not written to. This difference is motivated by important practical considerations: because the graph is not modified, a single instance of the graph can be shared among parallel processors and even among unrelated concurrent graph algorithms without synchronization, that instance can be stored compressed without the need for re-compression, the graph can be accessed without cache coherence issues, and the wear-out problems of non-volatile memory, such as Optane NVRAM, can be avoided.

Using ROSE, we analyze parallel algorithms (some existing, some new) for 18 fundamental graph problems. We show that these algorithms are work-efficient, highly parallel, and read the external memory using only a block-friendly (and compression-friendly) primitive: fetch all the edges for a given vertex. Analyzing the maximum times this primitive is called for any vertex yields an (often tight) bound on the (low) I/O cost of our algorithms. We present new, speciallydesigned ROSE algorithms for triangle counting, FRT trees, and strongly connected components, devising new parallel algorithm techniques for ROSE and beyond.

FULL PAPER: pdf