DOCC Lab Reading Group

DBOS: A DBMS-oriented Operating System

Paper Link

Research Question: Is it feasible to implement a (distributed) operating system on top of a relational database management system (RDBMS)? Can doing so improve the analytics/observability of cluster/cloud computing software while also reducing code complexity?

Key Contributions: This paper presents a vision for a new operating system stack, motivated by the shift to large-scale cloud/distributed computing and the increasing complexity of system design that comes with scale. The core insight is that a general-purpose, distributed DBMS must internally solve many, if not all, of the major problems facing an operating system at cluster scale, so why not leverage that engineering effort as much as possible? Implicit in this is the assumption that we should be willing to break compatibility with existing system designs and interfaces, provided the net benefits are sufficiently high. Within the scope of this larger project and its assumptions, the goal of this paper is to offer an empirical basis for the feasibility of a DBOS in terms of performance and resource efficiency.

The authors spend much more time on the question of feasibility than on potential benefit. On the question of benefit, they cite Kubernetes, XTrace, Dapper, and Prometheus: “Today’s distributed computing stacks … provide few abstractions [to developers]… [requiring them] to build or cobble together a wide range of external tools to monitor application metrics, collect logs, and enforce security policies. In contrast, in DBOS, the entire state of the OS and the application is available in structured tables that can simply be queried.” (Section 2.1) The other benefit is reduced system complexity through a common set of abstractions (the relational data model), interfaces (SQL), and mechanisms. The authors argue for it by showing how common OS concerns (task scheduling, IPC/RPC, and the file system) can be mapped onto relational data models and SQL-like queries.
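
To make the "OS state as tables" idea concrete, here is a minimal sketch in Python. It is not the paper's schema or implementation: the `worker` and `task` tables, the column names, and the "most free capacity" policy are all hypothetical, and SQLite stands in for a distributed DBMS purely so the example runs anywhere. The point is only that a scheduling decision and an observability question can both be written as queries over the same tables.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical scheduler schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE worker (
            worker_id INTEGER PRIMARY KEY,
            capacity  INTEGER NOT NULL      -- max concurrently running tasks
        );
        CREATE TABLE task (
            task_id   INTEGER PRIMARY KEY,
            worker_id INTEGER REFERENCES worker(worker_id),
            state     TEXT NOT NULL DEFAULT 'pending'
        );
        """
    )
    return db


def schedule(db: sqlite3.Connection, task_id: int):
    """Place a task on the worker with the most free slots.

    Returns the chosen worker_id, or None if every worker is full.
    The read and the write happen inside one transaction.
    """
    with db:  # commits on success, rolls back on exception
        row = db.execute(
            """
            SELECT w.worker_id
            FROM worker AS w
            LEFT JOIN task AS t
                   ON t.worker_id = w.worker_id AND t.state = 'running'
            GROUP BY w.worker_id
            HAVING w.capacity - COUNT(t.task_id) > 0
            ORDER BY w.capacity - COUNT(t.task_id) DESC, w.worker_id
            LIMIT 1
            """
        ).fetchone()
        if row is None:
            return None
        db.execute(
            "INSERT INTO task (task_id, worker_id, state) VALUES (?, ?, 'running')",
            (task_id, row[0]),
        )
        return row[0]


def running_per_worker(db: sqlite3.Connection) -> dict:
    """Observability as a query: number of running tasks on each worker."""
    return dict(
        db.execute(
            """
            SELECT w.worker_id, COUNT(t.task_id)
            FROM worker AS w
            LEFT JOIN task AS t
                   ON t.worker_id = w.worker_id AND t.state = 'running'
            GROUP BY w.worker_id
            ORDER BY w.worker_id
            """
        )
    )
```

Note that `running_per_worker` needs no separate metrics pipeline: it reads the same rows the scheduler writes, which is the property the quoted passage emphasizes.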

There is an intrinsic “chicken-and-egg” problem here: the effort to create a production-quality DBOS can only be justified through applications, and applications require the existence of an operating system. This paper offers two mitigations to this dilemma. (1) A paradigmatic break in operating system design, with particular emphasis on a single organizing motif/principle, is not without precedent: Unix (“everything is a file”) was similarly revolutionary at its conception, and it succeeded nonetheless. (2) A three-step incremental roadmap (“Straw,” “Wood,” and “Brick” - the naysayers (big bad wolves) will be proven wrong!) in which each milestone requires more implementation effort but also promises more functionality, so that further investment is justified by the success of the previous iteration. For this paper, DBOS-straw is implemented and evaluated.

“Straw” uses VoltDB, a distributed in-memory “NewSQL” relational database optimized for small online transactions. Applications are represented as task graphs (DAGs), whose structure represents functional/data dependencies. Many tables are partitioned by a primary key in such a way that this key determines the node on which operations for that row will execute. One interesting exception is the “parallel filesystem,” which instead shards/partitions by block number to achieve higher throughput. Overall, the evaluation section does a good job of showing that, for the basic OS functionality chosen, the DBOS approach performs within an order of magnitude of current solutions. This is of course not good enough to supplant the current paradigm, but it does suggest that DBOS is not a completely crazy idea in practical (performance) terms.
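
The difference between the two partitioning choices above can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the CRC32-modulo mapping is a stand-in and not VoltDB's actual partitioning function, and the function names are hypothetical. It only shows why keying file blocks by file name concentrates one file's blocks on a single partition, while keying by block number spreads them out so that reads and writes can proceed in parallel.

```python
import zlib
from collections import Counter


def partition_for_key(key: str, num_partitions: int) -> int:
    """Map a partitioning key to a partition using a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def place_by_file(filename: str, num_blocks: int, num_partitions: int) -> Counter:
    """Blocks per partition when every block row is keyed by its file name."""
    return Counter(
        partition_for_key(filename, num_partitions) for _ in range(num_blocks)
    )


def place_by_block(num_blocks: int, num_partitions: int) -> Counter:
    """Blocks per partition when each block row is keyed by its block number."""
    return Counter(block_no % num_partitions for block_no in range(num_blocks))


if __name__ == "__main__":
    # Keyed by file name: all 8 blocks land on one partition.
    print(place_by_file("results.dat", 8, 4))
    # Keyed by block number: 2 blocks on each of the 4 partitions.
    print(place_by_block(8, 4))
```

The trade-off is that an operation touching a whole file becomes a multi-partition operation under block-number keying, which is presumably why this scheme is the exception rather than the default.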

Even though most of this paper focuses on the question of feasibility, it is worth asking: is it a good idea to “fuse” the mechanisms, data models, and interfaces used for OS and application functionality with those used for observability and debugging? Is the DBOS concept likely to replace the current landscape? If not, what are some ways that current mechanisms might be influenced by the DBOS philosophy on system design?

Opportunities for Future Work:

Presenter: Tony Astolfi