Parallel Data Laboratory

Freeblock Scheduling for Busy Disks

Freeblock scheduling is a new approach to using more of a disk's potential media bandwidth. By interleaving low-priority disk activity with the normal workload (here referred to as background and foreground, respectively), one can replace many foreground rotational latency delays with useful background media transfers. With appropriate freeblock scheduling, background tasks can receive 20-50% of a disk's potential media bandwidth without any increase in foreground request service times. Thus, this background disk activity is completed "for free" during the mechanical positioning that foreground requests already require.
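As a rough illustration of the idea above, the following Python sketch models only the simplest case: the head has arrived on the foreground request's destination track and must wait for the target sector to rotate under it. The function name, the sector-numbered track model, and the `wanted` set are assumptions made for illustration, not the project's implementation; a real freeblock scheduler would also have to account for seek times, track layout, and opportunities on other tracks.

```python
def free_sectors_on_track(arrival_sector, target_sector, sectors_per_track, wanted):
    """Return the background sectors that pass under the head during the
    rotational latency of a foreground request.

    Hypothetical model: sectors on one track are numbered 0..sectors_per_track-1
    in rotation order, the head arrives over `arrival_sector`, and the
    foreground transfer starts at `target_sector`.
    """
    picked = []
    sector = arrival_sector
    # Walk the sectors that rotate past the head before the target arrives.
    while sector != target_sector:
        if sector in wanted:
            picked.append(sector)  # read "for free" while waiting anyway
        sector = (sector + 1) % sectors_per_track
    return picked


# Example: a 16-sector track, head lands at sector 10, foreground data starts at 4.
background_wanted = {2, 5, 12, 15}
print(free_sectors_on_track(10, 4, 16, background_wanted))  # prints [12, 15, 2]
```

In this example sector 5 is not picked up, because it lies beyond the foreground target; reading it would delay the foreground request, which is exactly what freeblock scheduling avoids.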

There are many disk-intensive background tasks that are designed to occur during otherwise idle time. Examples include disk reorganization, file system cleaning, back-up, prefetching, write-back, integrity checking, virus detection, tamper detection, report generation, and index reorganization. When idle time does not present itself, these tasks either compete with foreground tasks or are simply not completed. Further, when they do compete with other tasks, these background tasks do not take full advantage of their relatively loose time constraints and paucity of sequencing requirements. As a result, these "idle time" tasks often cause performance or functionality problems in busy systems. With freeblock scheduling, these background tasks can operate continuously and efficiently, even when they do not have the system to themselves.

In developing and exploring freeblock scheduling, we have demonstrated its value with concrete examples of its use for storage system management and disk-intensive applications. The first example shows that cleaning in a log-structured file system can be done for free even when there is no truly idle time, resulting in up to a 300% speedup. The second example explores the use of free bandwidth for data mining on an active on-line transaction processing (OLTP) system, showing that over 47 full scans per day of a 9GB disk can be made with no impact on OLTP performance.
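To put the data mining result in perspective, the scan rate can be converted into a sustained background bandwidth. The back-of-the-envelope calculation below assumes decimal units (1 GB = 1000 MB) and scans spread evenly over 24 hours; since the figure quoted is "over 47" scans, the result is a lower bound. The function name is hypothetical.

```python
SCANS_PER_DAY = 47        # from the text: over 47 full scans per day
DISK_SIZE_GB = 9          # from the text: 9GB disk
SECONDS_PER_DAY = 24 * 60 * 60


def background_bandwidth_mb_per_s(scans_per_day, disk_size_gb):
    """Average bandwidth needed to scan the whole disk `scans_per_day` times.

    Assumes 1 GB = 1000 MB and a uniform rate over the day.
    """
    total_mb = scans_per_day * disk_size_gb * 1000
    return total_mb / SECONDS_PER_DAY


rate = background_bandwidth_mb_per_s(SCANS_PER_DAY, DISK_SIZE_GB)
print(f"{rate:.1f} MB/s")  # prints "4.9 MB/s"
```

Under these assumptions, the background scans sustain roughly 4.9 MB/s on average while the OLTP workload continues unaffected.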

People

FACULTY

Greg Ganger
David Nagle

STUDENTS

Chris Lumb
Jiri Schindler
Eno Thereska

INDUSTRY COLLABORATORS

Erik Riedel, Seagate

Publications

Design and Implementation of a Freeblock Subsystem. Eno Thereska, Jiri Schindler, Christopher R. Lumb, John Bucy, Brandon Salmon, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-03-107, December, 2003.
Abstract / Postscript [6.5M] / PDF [165K]
A Framework for Building Unobtrusive Disk Maintenance Applications. Eno Thereska, Jiri Schindler, John Bucy, Brandon Salmon, Christopher R. Lumb, Gregory R. Ganger. Proceedings of the 3rd USENIX Conference on File and Storage Technologies (FAST '04). San Francisco, CA. March 31, 2004. Supersedes Carnegie Mellon University Technical Report CMU-CS-03-192, October 2003.
Abstract / Postscript [5.1M] / PDF [148K]
Freeblock Scheduling Outside of Disk Firmware. Christopher R. Lumb, Jiri Schindler, Gregory R. Ganger. Conference on File and Storage Technologies (FAST) January 28-30, 2002. Monterey, CA. Also available as CMU SCS Technical Report CMU-CS-01-149.
Abstract / Postscript [643K] / PDF [150K]
Towards Higher Disk Head Utilization: Extracting "Free" Bandwidth From Busy Disk Drives. Lumb, C., Schindler, J., Ganger, G.R., Nagle, D.F. and Riedel, E. Appears in Proc. of the 4th Symposium on Operating Systems Design and Implementation, 2000. Also available as CMU SCS Technical Report CMU-CS-00-130.
Abstract / Postscript [2.3M] / PDF [422K]
Data Mining on an OLTP System (Nearly) for Free. Riedel, E., Faloutsos, C., Ganger, G.R. and Nagle, D.F. Proc. of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas, May 14-19, 2000. Also available as CMU SCS Technical Report CMU-CS-99-151.
Abstract / Postscript [1.0M] / PDF [171K]

Acknowledgements

We thank the members and companies of the PDL Consortium: Amazon, Bloomberg LP, Datadog, Google, Honda, Intel Corporation, Jane Street, LayerZero Research, Meta, Microsoft Research, Oracle Corporation, Oracle Cloud Infrastructure, Pure Storage, Salesforce, Samsung Semiconductor Inc., and Western Digital for their interest, insights, feedback, and support.