









Peter Hawkins, Alex Aiken, Kathleen Fisher, Martin Rinard, and Mooly Sagiv
Stanford University, AT&T Labs Research, MIT, Tel Aviv University
Abstract. We consider the problem of specifying data structures with complex sharing in a manner that is both declarative and results in provably correct code. In our approach, abstract data types are specified using relational algebra and functional dependencies; a novel fuse operation on relational indexes specifies where the underlying physical data structure representation has sharing. We permit the user to specify different concrete shared representations for relations, and show that the semantics of the relational specification are preserved.
Consider the data structure used in an operating system kernel to represent the set of available file systems. There are two kinds of objects: file systems and files. Each file system has a list of its files, and each file may be in one of two states, either currently in use or currently unused. Figure 1 sketches the data structure typically used:¹ each file system is the head of a linked list of its files, and two other linked lists maintain the set of files in use and files not in use. Thus, every file participates in two lists: the list of files in its file system, and one of the in-use or not-in-use lists.

A characteristic feature of this example is the sharing: the files participate in multiple data structures. Sharing usually implies that there are non-trivial high-level invariants to be maintained when the structure is updated. For example, in Figure 1, if a file is removed from a file system, it should be removed from the in-use or not-in-use list as well. A second characteristic is that the structure is highly optimized for a particular expected usage pattern. In Figure 1, it is easy to enumerate all of the files in a file system, but without adding a parent pointer to the file objects we have only a very slow way to discover which file system owns a particular file.

We are interested in the problem of how to support high-level, declarative specification of complex data structures with sharing while also achieving efficient and safe low-level implementations. Existing languages provide at most one or the other. Modern functional languages provide excellent support for inductive data structures, which are all essentially trees of some flavor. When multiple such data structures overlap (i.e., when there is more than one inductive structure and they are not separate), functional languages do not provide any support beyond what is available in conventional object-oriented and procedural languages. All of these languages require the programmer to build and maintain mutable structures with sharing by using explicit pointers or reference cells.
¹ This example is a simplified version of the file system representation in Linux, where file systems are called superblocks and files are inodes.
Fig. 1. File objects simultaneously participate in multiple circular lists. Different line types denote different lists.
While the programmer can get exactly the desired representation, there is no support for maintaining or even describing invariants of the data structure. Languages built on relations, such as SQL and logic programming languages, provide much higher-level support. We could encode the example above using the relation

  file(filesystem : int, fileid : int, inuse : bool)

Here integers suffice as unique identifiers for file systems and files, and a boolean records whether or not the file is in use. Using standard query facilities we can conveniently find for a file system fs all of its files, file(fs, _, _), as well as all of the files not in use, file(_, _, false). Even better, using functional dependencies we can specify important high-level invariants, such as that every file is part of exactly one file system, and every file is either in use or not; i.e., the fileid functionally determines the filesystem and inuse fields. Thus, there is only one tuple in the relation per fileid, and when the tuple with a fileid is deleted all trace of that file is provably removed from the relation. Finally, relations are general; since pointers are just relationships between objects, any pointer data structure can be described by a set of relations. Adding relations to general-purpose programming languages is a well-accepted idea. Missing from existing proposals is the ability to provide highly specialized implementations of relations, and in particular to take advantage of the potential for mutable data structures with sharing.

Our vision is a programming language where low-level pointer data structures are specified using high-level relations. Furthermore, because of the high-level specification, the language system can produce code that is correct by construction; even in cases where the implementation has complex sharing and destructive update, the implementation is guaranteed to be a faithful representation of the relational specification. In this paper, we take only the first step in realizing this plan, focusing on the core problem of what it means to represent a given high-level relation by a low-level representation (possibly with sharing) that is provably correct. We do not address in this paper the design of a surface syntax for integrating relational operations into a full programming language (there are many existing proposals). This paper is organized into several parts, each of which highlights a separate contribution of our work.
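For concreteness, the relational encoding just sketched can be written down directly. The following is a minimal OCaml sketch, not the interface developed in this paper: the type and function names are ours, and a plain list of tuples stands in for the relation.

(* Illustrative only: the file relation as a set of tuples, with the
   functional dependency fileid -> {filesystem, inuse} enforced by
   keeping at most one tuple per fileid. *)
type file_tuple = { filesystem : int; fileid : int; inuse : bool }

type file_relation = file_tuple list

(* file(fs, _, _): all files belonging to file system fs. *)
let files_of_filesystem (r : file_relation) (fs : int) : file_tuple list =
  List.filter (fun t -> t.filesystem = fs) r

(* file(_, _, false): all files not currently in use. *)
let unused_files (r : file_relation) : file_tuple list =
  List.filter (fun t -> not t.inuse) r

(* Inserting a tuple replaces any existing tuple with the same fileid, so
   removing that fileid later removes all trace of the file. *)
let insert_file (t : file_tuple) (r : file_relation) : file_relation =
  t :: List.filter (fun t' -> t'.fileid <> t.fileid) r

let remove_file (id : int) (r : file_relation) : file_relation =
  List.filter (fun t -> t.fileid <> id) r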
empty_d  : unit → (α1, . . . , αk) relation_d
insert_d : α1 ∗ · · · ∗ αk → (α1, . . . , αk) relation_d → unit
remove_d : α1 ∗ · · · ∗ αk → (α1, . . . , αk) relation_d → unit
query_d  : (α1, . . . , αk) relation_d → α1 option ∗ · · · ∗ αk option → (α1 ∗ · · · ∗ αk) list

Fig. 2. Primitive operations on logical relations.
records the list of successors and predecessors of each vertex v ∈ V. In ML, we might represent a graph via adjacency lists as the type
type g = (v, (v ∗ int) list) btree ∗ (v, (v ∗ int) list) btree,
assuming v is the type of vertices, and (α, β) btree is a binary tree mapping keys of type α to values of type β. Here the graph is represented as two collaborating data structures, namely a binary tree mapping each vertex to a list of its successors, together with the corresponding edge weights, and a binary tree mapping each vertex to a list of its predecessors, and the corresponding edge weights.

One problem with our proposed ML representation is that the successor and predecessor data structures represent the same set of edges; however, it is the programmer's responsibility to ensure that the two data structure representations remain consistent. Another problem is that with only tree-like data structures there is no natural place to put the edge weight—we can place it in either the successor data structure or the predecessor data structure, increasing the time complexity of certain queries, or we can duplicate the weight, as we have here, which increases the space cost and introduces the possibility of inconsistencies.

Instead, we can use a relation. We represent the edges of our directed graph as a relation g with three columns (src, dst, weight), in which each tuple represents the source, destination, and weight of an edge. The graph shown in Figure 5(a) can be represented as the relation {⟨1, 2, 17⟩, ⟨1, 3, 42⟩}. We call the usual mathematical view of a relation as a set of tuples the logical representation.

We extend ML with a new type constructor (α1, . . . , αk) relation which represents relations of arity k, together with a set of primitive operations to manipulate relations. Relations are mutable data structures conceptually similar to (α1 ∗ · · · ∗ αk) list ref, but with a very different representation. The primitives with which the client programmer manipulates relations, shown in Figure 2, are creating an empty relation, operations to insert and remove tuples from a relation, and query, which returns the list of tuples matching a tuple pattern, a tuple in which some fields are missing. We describe a minimal interface to make proofs easier; a practical implementation should provide a richer set of primitives, such as an interface along the lines of LINQ [15].
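To make the interface of Figure 2 concrete, here is a rough OCaml sketch specialized to the three-column edge relation of this example. It is an assumption of ours, not the paper's implementation: a real relation would be arity-generic and represented by an index as described below; a mutable list of tuples stands in for it here.

module EdgeRel = struct
  (* A (src, dst, weight) relation, modelled as a mutable list of tuples. *)
  type t = (int * int * int) list ref

  let empty () : t = ref []

  let insert ((s, d, w) : int * int * int) (r : t) : unit =
    r := (s, d, w) :: !r

  let remove ((s, d, w) : int * int * int) (r : t) : unit =
    r := List.filter (fun e -> e <> (s, d, w)) !r

  (* query: each field of the pattern is None (wildcard) or Some value. *)
  let query (r : t) ((ps, pd, pw) : int option * int option * int option) :
      (int * int * int) list =
    let matches p v = match p with None -> true | Some x -> x = v in
    List.filter (fun (s, d, w) -> matches ps s && matches pd d && matches pw w) !r
end

(* The graph of Figure 5(a): the relation {<1, 2, 17>, <1, 3, 42>}. *)
let g = EdgeRel.empty ()
let () = EdgeRel.insert (1, 2, 17) g
let () = EdgeRel.insert (1, 3, 42) g

(* Successors of vertex 1, with weights: pattern (Some 1, None, None). *)
let successors_of_1 = EdgeRel.query g (Some 1, None, None)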
2.2 Indices and Tree Decompositions
The data structure designer describes how to represent a logical relation using an index, which specifies how to decompose the relation into a collection of nested map and join operations over unit relations containing individual tuples. Different decompositions lead to different operations being particularly efficient. We do not maintain an underlying list of tuples; the only representation of a relation is that described by an index.
d ::= unit(c) | map(ψ, c, d′) | join(d1, d2, L)        indices
ψ ::= option | slist | dlist | btree                   data structures
l ∈ L ::= (fuse, z1, z2) | (link, z1, z2)              cross-links
z ∈ contour ::= {m, l, r}*                             static contours
y ∈ dcontour ::= {m_v, l, r}*                          dynamic contours

Fig. 3. Syntax of indices.
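For readers who prefer code, the grammar of Figure 3 can be transcribed as an OCaml datatype. This is a sketch under our own naming, not the paper's definition; the contour helper anticipates the d.z notation introduced below.

(* Index syntax of Figure 3, as an OCaml datatype (names are ours). *)
type column = string

type datastruct = Option | Slist | Dlist | Btree        (* psi *)

type step = M | L | R                                   (* letters of a static contour *)
type contour = step list                                (* z in {m, l, r}* *)

type link =
  | Fuse of contour * contour                           (* (fuse, z1, z2) *)
  | Link of contour * contour                           (* (link, z1, z2) *)

type index =
  | Unit of column list                                 (* unit(c) *)
  | Map of datastruct * column list * index             (* map(psi, c, d') *)
  | Join of index * index * link list                   (* join(d1, d2, L) *)

(* d.z: the sub-index of d reached by following static contour z. *)
let rec subindex (d : index) (z : contour) : index =
  match z, d with
  | [], _ -> d
  | M :: z', Map (_, _, d') -> subindex d' z'
  | L :: z', Join (d1, _, _) -> subindex d1 z'
  | R :: z', Join (_, d2, _) -> subindex d2 z'
  | _ -> invalid_arg "subindex: contour does not match index"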
Beyond the index definition, programmers can remain oblivious to the details of how relations are represented. Every relation r has an associated index d describing how to decompose the relation into a tree and how to lay that tree out in memory; Figure 3 shows the syntax of indices. Given an index d and a relation r we can form a tree decomposition ρ whose structure is governed by d; Figure 4 defines the syntax of tree decompositions. There are three kinds of index that we can use to decompose a relation, each of which has a corresponding kind of tree-decomposition node:
Static Contours We annotate each term in the index with a unique name called a static contour. Formally, a static contour z is a path in an index d which identifies a specific sub-index d′. A static contour z is drawn from the set {m, l, r}∗, where m means “move to the child index of a map index”, l means “move to the left sub-index of a join index”, and r means “move to the right sub-index of a join index”. We write d.z to denote the sub-index of d identified by a contour z. In our directed graph we want to find the set of successors and find the set of predecessors of a vertex efficiently. One index that satisfies this constraint is
Fig. 5. Representations of a weighted directed graph: (a) an example graph, and its representation as a relation, (b) a tree decomposition of the relation in (a), with fused data structures shown as conjoined nodes, and (c) a diagram of the memory state that represents (b).
In the graph example, we would like to share the weight of each edge between the two representations. Observe that given a (src, dst) pair, the weight is the same whether we traverse the links in the left or the right tree. That is, there is a functional dependency: any (src, dst) pair determines a unique weight, and it does not matter whether we visit the src or the dst first. Hence instead of replicating the weight, we can share it between the two trees, specified here by the fuse declaration. The declaration says that the data structure we get after looking up a src and then a dst in the left tree should be fused with the data structure we get by looking up a dst and then a src in the right tree.

Each join index takes an argument L which is a set of cross-linking declarations (link, z1, z2) and fusion declarations (fuse, z1, z2). A cross-linking declaration (link, z1, z2) states that a pointer should be maintained from each object with static contour z1 to the corresponding object with static contour z2. Similarly, a fusion declaration (fuse, z1, z2) states that objects with static contour z1 should be placed adjacent to the corresponding object with static contour z2. By "corresponding" object we mean the object with static contour z2 whose column values are drawn from the set bound by following static contour z1.

In the graph example, the contour rmm names the data structure we get by looking in the right component of the join (r) and then navigating down two map indices (mm), i.e., looking in the right tree and then following first the dst and then the src links. The contour lmm names the corresponding location in the left tree. The fuse declaration indicates these two nodes should be merged, with the weight data structure from the left tree being fused with the empty data structure from the right tree. Figure 5(b) depicts the index structure after fusion. Figure 5(c) graphically depicts the resulting physical memory state that represents the graph of Figure 5(b). The conjoined nodes in the figure are placed at a constant field offset from one another on the heap.
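Using the index datatype sketched after Figure 3, the graph index discussed here might be written as the following value. This is illustrative only: the choice of btree and slist for the individual maps is our assumption, but the overall shape and the fuse declaration follow the text.

(* Left side: src, then dst, then the edge weight (contours l, lm, lmm).
   Right side: dst, then src, then an empty unit (contours r, rm, rmm).
   The fuse merges the node at rmm with the corresponding node at lmm,
   so the weight is stored once and shared by both trees. *)
let graph_index : index =
  Join
    ( Map (Btree, ["src"], Map (Slist, ["dst"], Unit ["weight"])),
      Map (Btree, ["dst"], Map (Slist, ["src"], Unit [])),
      [ Fuse ([R; M; M], [L; M; M]) ] )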
2.4 Process Scheduler
As another example, suppose we want to represent the data for a simple operating system process scheduler (as in [13]). The scheduler maintains a list of live processes.
(TWfEmp)    {} |=T unit(c)

(TWfUnit)   |v| = |c|  ⟹  {v} |=T unit(c)

(TWfMap)    ∀i ∈ I. |vi| = |c|    ∀i ∈ I. ρi |=T d    ∀i ∈ I. αt(ρi, d) ≠ ∅
            ⟹  {vi ↦ ρi}i∈I |=T map(ψ, c, d)

(TWfJoin)   ρ1 |=T d1    ρ2 |=T d2
            αt(ρ1, d1) |= dom d1 ∩ dom d2 → dom d1 \ dom d2
            π_{dom d1 ∩ dom d2} αt(ρ1, d1) = π_{dom d1 ∩ dom d2} αt(ρ2, d2)
            ⟹  (ρ1, ρ2) |=T join(d1, d2, L)

Fig. 6. Well-formed tree decompositions: ρ |=T d.
A live process can be in any one of a number of states, e.g., running or sleeping. The scheduler also maintains a list of possible process states; for each state we maintain a tree of processes with that state. We represent the scheduler's data by a relation live(pid, state, uid, walltime, cputime), and the index
join_·( map_l(btree, [pid], unit_lm([uid, walltime, cputime])),
        map_r(dlist, [state], map_rm(btree, [pid], unit_rmm([]))),
        {(fuse, rmm, lm)} )
The index allows us both to efficiently find the information associated with the pid of a particular process, and to manipulate the set of processes with any given state and their associated data. In this case the fuse construct allows us to jump directly between the pid entry in a per-state binary tree and the data such as walltime and cputime associated with the process.
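Rendered with the same sketched OCaml datatype (again our own notation, not the paper's), the scheduler index reads as follows; the Minesweeper index of Section 2.5 differs only in its column lists and in using Link in place of Fuse.

(* live(pid, state, uid, walltime, cputime):
   left:  pid |-> (uid, walltime, cputime)     contours l, lm
   right: state |-> pid |-> ()                 contours r, rm, rmm
   fuse(rmm, lm): the per-state pid entry is placed with the process data. *)
let scheduler_index : index =
  Join
    ( Map (Btree, ["pid"], Unit ["uid"; "walltime"; "cputime"]),
      Map (Dlist, ["state"], Map (Btree, ["pid"], Unit [])),
      [ Fuse ([R; M; M], [L; M]) ] )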
2.5 Minesweeper
Another example is motivated by the game of Minesweeper. A Minesweeper board consists of a 2-dimensional matrix of cells. Each cell may or may not have a mine; each cell may also be concealed or exposed. Every cell starts off in the unexposed state; the goal of the game is to expose all of the cells that do not have mines without exposing a cell containing a mine. Some implementations of Minesweeper also implement a “peek” cheat code that iterates over the set of unexposed cells, temporarily displaying them as exposed. We represent a board by the relation board(x, y, ismined , isexposed ), with the index:
join_·( map_l(btree, [x], map_lm(btree, [y], unit_lmm([ismined, isexposed]))),
        map_r(slist, [isexposed], map_rm(btree, [x, y], unit_rmm([]))),
        {(link, rmm, lmm)} )
In this example, the index specifies a cross-link rather than a fusion. Cross-linking adds a pointer from one object in a tree decomposition to another object, providing a "short-cut" from one data structure to another.
In this and subsequent sections we give the details of how we can specify data structures with sharing at a high level using relations and then faithfully translate those specifications into efficient low-level representations. There are two
(LAUnit)    ∆ ⊢fd ∅ → c  ⟹  c; ∆ ⊢l unit(c)

(LAMap)     C2; ∆/c1 ⊢l d  ⟹  c1 ⊎ C2; ∆ ⊢l map(ψ, c1, d)

(LAJoin)    ∆ ⊢fd C1 → C2    C1 ∪ C2; ∆ ⊢l d1    C1 ∪ C3; ∆ ⊢l d2
            ⟹  C1 ⊎ C2 ⊎ C3; ∆ ⊢l join(d1, d2, L)

where ∆/C = …

Fig. 7. Rules for logical adequacy C; ∆ ⊢l d.

f ∈ {link(z1, z2), fuse(z1, z2), . . .}        field names
A = Z × f*                                     addresses
μ : A → A ∪ V                                  memory
Λ : dcontour → A                               layout
Fig. 8. Heaps
We say that an index d is adequate for a class of relations R if for every relation r ∈ R there is some tree decomposition ρ such that αt(ρ, d) = r. Figure 7 lists inference rules for a judgment C; ∆ ⊢l d that is a sufficient condition for an index to be adequate for the class of relations with columns C that satisfy a set of FDs ∆. The inference rules enforce two properties. Firstly, the (LAUnit) and (LAMap) rules ensure that every column of a relation must be represented by the index; every column must appear in a unit or map index. Secondly, in order to split a relation into two parts using a join index, the (LAJoin) rule requires a functional dependency to prevent anomalies such as spurious tuples. We have the following lemma:

Lemma 1 (Soundness of Adequacy Judgement). If C; ∆ ⊢l d, then for each relation r with columns C such that r |= ∆ there is some ρ such that ρ |=T d and αt(ρ, d) = r.
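As a rough sanity check, the adequacy judgment can be approximated in code. The sketch below is our own, not the paper's algorithm: it checks that an index covers exactly the columns C and that every join is justified by a functional dependency as in (LAJoin), leaving the FD entailment test as an assumed oracle (e.g., the standard attribute-closure algorithm), here stubbed out.

module S = Set.Make (String)

type fd = S.t * S.t                      (* a functional dependency A -> B *)

(* Assumed oracle for Delta |-fd a -> b; stubbed here. *)
let fd_entails (_delta : fd list) (_a : S.t) (_b : S.t) : bool = true

(* dom d: the columns mentioned by an index. *)
let rec dom (d : index) : S.t =
  match d with
  | Unit cs -> S.of_list cs
  | Map (_, cs, d') -> S.union (S.of_list cs) (dom d')
  | Join (d1, d2, _) -> S.union (dom d1) (dom d2)

(* Every join must satisfy the FD premise of (LAJoin): the shared columns
   determine the columns private to the left sub-index. *)
let rec joins_ok (delta : fd list) (d : index) : bool =
  match d with
  | Unit _ -> true
  | Map (_, _, d') -> joins_ok delta d'
  | Join (d1, d2, _) ->
      let shared = S.inter (dom d1) (dom d2) in
      fd_entails delta shared (S.diff (dom d1) shared)
      && joins_ok delta d1 && joins_ok delta d2

(* Approximation of C; Delta |-l d: column coverage plus the join checks. *)
let adequate (c : S.t) (delta : fd list) (d : index) : bool =
  S.equal c (dom d) && joins_ok delta d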
3.3 Physical Representation
Heaps. Figure 8 defines the syntax for our model of memory. We represent the heap as a function μ from a set of heap locations to a set of heap values. Our model of a heap location is based on C structs, except that we abstract away the layout of fields within each heap object. Heap locations are drawn from an infinite set A, and consist of a pair (n, f) of an integer address identifying a heap object, together with a string of field offsets. Each integer location notionally has an infinite number of field slots, although we only ever use a small and bounded number, which can then be laid out in consecutive memory locations. The contents of each heap cell can either be a value drawn from V or an address drawn from A; we assume that the two sets are disjoint. The set of columns that are bound by following a static contour z is given by the function bound(z, d), defined as
bound(·, d) = ∅
bound(mz, map(ψ, c, d)) = c ∪ bound(z, d)
bound(lz, join(d1, d2, L)) = bound(z, d1)
bound(rz, join(d1, d2, L)) = bound(z, d2)
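Transcribed over the index datatype sketched after Figure 3 (so the types are our assumption, but the equations are exactly the ones above), bound can be written as:

(* bound(z, d): the columns bound by following static contour z in index d. *)
let rec bound (z : contour) (d : index) : column list =
  match z, d with
  | [], _ -> []                                       (* bound(., d) = {} *)
  | M :: z', Map (_, cs, d') -> cs @ bound z' d'      (* c U bound(z, d) *)
  | L :: z', Join (d1, _, _) -> bound z' d1
  | R :: z', Join (_, d2, _) -> bound z' d2
  | _ -> invalid_arg "bound: contour does not match index"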
Layouts. We use dynamic contours to name positions in a tree. A layout function Λ is a mapping from the dynamic contours of a tree to addresses from A.
(PAUnit)    ∆; Φ ⊢p unit(c)

(PAMap)     ∆/c1; {x | mx ∈ Φ} ⊢p d  ⟹  ∆; Φ ⊢p map(ψ, c1, d)

(PAJoin)    ∀l ∈ L. ∆; Φ ⊢p d; l    Φ′ = Φ ∪ {z | (fuse, z, z′) ∈ L}
            ∆; {x | lx ∈ Φ′} ⊢p d1    ∆; {x | rx ∈ Φ′} ⊢p d2
            ⟹  ∆; Φ ⊢p join(d1, d2, L)

(PALink)    bound(rz1m, d) ⊇ bound(lz2, d)  ⟹  ∆; Φ ⊢p d; (link, rz1m, lz2)

(PAFuse)    rz1m ∉ Φ    bound(rz1m, d) = bound(lz2, d)  ⟹  ∆; Φ ⊢p d; (fuse, rz1m, lz2)

Fig. 9. Rules for physical adequacy ∆; Φ ⊢p d [; l].
Layout functions allow us to translate from semantic names for memory locations to a more machine-level description of the heap; the extra layer of indirection allows us to ignore details of memory managers and layout policies, and to describe fusion and cross-linking succinctly. All layouts must be injective; that is, different tree locations must map to different physical locations. We define operators that strip and add prefixes to the domain of a layout:

Λ/x = {y ↦ a | (xy ↦ a) ∈ Λ}    and    Λ × x = {xy ↦ a | (y ↦ a) ∈ Λ}.
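A small OCaml sketch of these two operators, over an association-list model of layouts; the representation, and the modelling of dynamic-contour letters and addresses, are assumptions of ours.

type dstep = Mv of int | Ld | Rd          (* m_v, l, r; key values modelled as ints *)
type dcontour = dstep list
type addr = int * string list             (* (object id, field path), as in Fig. 8 *)
type layout = (dcontour * addr) list      (* a finite map dcontour -> addr *)

(* Lambda / x: keep the entries whose contour has prefix x, stripping x. *)
let strip_prefix (lam : layout) (x : dcontour) : layout =
  let rec chop prefix y =
    match prefix, y with
    | [], rest -> Some rest
    | p :: prefix', y0 :: y' when p = y0 -> chop prefix' y'
    | _ -> None
  in
  List.filter_map
    (fun (y, a) -> match chop x y with Some rest -> Some (rest, a) | None -> None)
    lam

(* Lambda * x: prepend the prefix x to every contour in the layout. *)
let add_prefix (lam : layout) (x : dcontour) : layout =
  List.map (fun (y, a) -> (x @ y, a)) lam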
Data Structures. In our present implementation, a map index can be represented by an option type (option), a singly-linked list (slist), a doubly-linked list (dlist), or a binary tree (btree). It is straightforward to extend the set of data structures by implementing a common data structure interface—we present this particular selection merely for concreteness. The common interface views each data structure as a set of key-value pairs, which is a good fit to many, but not all, possible data structures. Each data structure must provide low-level functions: pempty_ψ a, which creates a new structure with its root pointer located at address a; pisempty_ψ a, which tests emptiness of the structure rooted at a; plookup_ψ a v, which returns the address a′ of the entry with value v, if any; pscan_ψ a, which returns the set of all (a′, v) pairs of a value v and its address a′; pinsert_ψ a v a′, which inserts a new value v, whose entry is located at address a′, into the data structure rooted at a; and premove_ψ a v a′, which removes the value v at address a′ from the data structure rooted at a. Typical implementations can be found in the tech report [10].

For cross-linking and fusion to be well-defined in an index d, we need d to be physically adequate. This condition ensures that for cross-linking and fusion operations between static contours z1 and z2, the mapping from z1 to z2 is a function for each cross-link declaration and an injective function for each fusion declaration. Further, as fusions constrain the location of an object in memory, we require that any object is fused at most once for feasibility. We use the judgment form ∆; Φ ⊢p d and the associated rules in Figure 9 to indicate that index d is physically adequate for functional dependencies ∆, where Φ denotes the set of static contours that have already been fused. The (PALink) and (PAFuse) rules ensure a suitable mapping by requiring the set of fields bound by the target contour of a link to be a subset of the set of fields bound by the source contour; in the case of a fusion we require equality. The rule (PAFuse) ensures that no contour is fused twice. We assume that all indices are physically adequate.
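The common data-structure interface described above might be captured by an OCaml signature along the following lines; the function names mirror the text, while the types (the heap being implicit, addresses as in the sketch above) are our assumptions.

(* One module of this signature per data structure psi (option, slist,
   dlist, btree); the heap itself is left implicit. *)
module type DATASTRUCT = sig
  type value                                       (* the values v stored in entries *)

  val pempty   : addr -> unit                      (* create a structure rooted at a *)
  val pisempty : addr -> bool                      (* is the structure rooted at a empty? *)
  val plookup  : addr -> value -> addr option      (* address a' of the entry for v, if any *)
  val pscan    : addr -> (addr * value) list       (* all (a', v) pairs in the structure *)
  val pinsert  : addr -> value -> addr -> unit     (* insert v, whose entry lives at a' *)
  val premove  : addr -> value -> addr -> unit     (* remove the value v at address a' *)
end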
returns the natural join of the two results. The qrjoin(q1, q2) operator is similar, but executes the two queries in the opposite order. Both joins produce identical results; however, the computational complexity may differ.

Fuse Join. The qfusejoin(z0, l, q1, q2) operator switches the current index data structure by following a fuse or cross-link l and executes query q2; it then switches back to the original location and executes q1. The result is the natural join of the two sub-queries. Parameter z0 identifies the join index that contains l; position y must be an instantiation of the source of l.

For example, suppose in the directed graph example of Section 2.1 we want to find the set of successors of graph vertex 1, together with their edge weights. Figure 10 depicts one possible, albeit inefficient, query plan q consisting of the operations
q = qrjoin(qnone, qscan(qlookup(qfusejoin(·, (fuse, rmm, lmm), qunit, qunit)))).
Intuitively, to execute this plan we use the right-hand side of the join to iterate over every possible value for the dst field. For each dst value we check to see whether there is a src value that matches the query input, and if so we use a fuse join to jump over to the left-hand side of the join and retrieve the corresponding weight. (A better query plan would look up the src on the left-hand side of the join first, and then iterate over the set of corresponding dst nodes and their weights, but our goal here is to demonstrate the role of the qfusejoin operator.)
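The query-plan operators named in this section can be pictured as an OCaml datatype; the datatype itself is our sketch (the paper defines plans precisely in the full report [10]), but the plan q above is transcribed directly.

type qplan =
  | Qnone                                          (* contribute nothing on this side *)
  | Qunit                                          (* read the columns of a unit node *)
  | Qscan of qplan                                 (* iterate over every entry of a map *)
  | Qlookup of qplan                               (* look up the key bound in the input pattern *)
  | Qljoin of qplan * qplan                        (* execute the left side, then the right *)
  | Qrjoin of qplan * qplan                        (* execute the right side, then the left *)
  | Qfusejoin of contour * link * qplan * qplan    (* qfusejoin(z0, l, q1, q2) *)

(* q = qrjoin(qnone, qscan(qlookup(qfusejoin(., (fuse, rmm, lmm), qunit, qunit)))) *)
let q : qplan =
  Qrjoin
    ( Qnone,
      Qscan (Qlookup (Qfusejoin ([], Fuse ([R; M; M], [L; M; M]), Qunit, Qunit))) )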
To find successors using query plan q, we start with the state (⟨src ↦ 1⟩, ·). Since the left branch of the join is qnone, the join reduces to a recursive execution of the query qscan(· · ·) with input (⟨src ↦ 1⟩, r). The qscan recursively invokes qlookup on each of the states (⟨src ↦ 1, dst ↦ 2⟩, rm_2) and (⟨src ↦ 1, dst ↦ 3⟩, rm_3).
Fig. 10. A possible query plan for the graph example of Section 2.1.
The qlookup operator in turn recursively invokes the qfusejoin operator on the states (⟨src ↦ 1, dst ↦ 2⟩, rm_2m_1) and (⟨src ↦ 1, dst ↦ 3⟩, rm_3m_1). To execute its second query argument the fuse join maps each instantiation of contour rmm to the corresponding instantiation of contour lmm; we are guaranteed that exactly one such contour instantiation exists by index adequacy. The fuse join produces the states (⟨src ↦ 1, dst ↦ 2⟩, lm_1m_2) and (⟨src ↦ 1, dst ↦ 3⟩, lm_1m_3). Finally the invocations of qunit on each state produce the tuples
{⟨src ↦ 1, dst ↦ 2, weight ↦ 17⟩, ⟨src ↦ 1, dst ↦ 3, weight ↦ 42⟩}.
We need a criterion for determining whether a particular query plan does in fact return all of the tuples that match a pattern. We say a query plan is valid, written d, z, X ⊢q q, Y, if q correctly answers queries in index d at dynamic instantiations of contour z, where X is the set of columns bound in the input tuple pattern t and Y is the set of columns bound in the output tuples (see the technical report [10]).
In this section we describe implementations for the primitive relation operators for the tree-decomposition and physical representations of a relation, and we prove our main result: that these primitive operators are sound with respect to their higher-level specification. Complete code is given in the tech report [10].
5.1 Operators on the Tree Decomposition
We implement queries over tree decompositions by a function tquery d t ρ, which finds tuples matching pattern t over tree decomposition ρ under index d. The core routine is a function tqexec ρ d q t y which, given a tree decomposition ρ, index d, and a tuple pattern t, executes plan q at the position of the dynamic contour y. Creation and update are handled by tempty d, which constructs a new empty relation with index d, tinsert d t ρ, which inserts a tuple t into a tree-decomposed relation ρ with index d, and tremove d t ρ, which removes a tuple t from a tree-decomposed relation ρ with index d. It is the client's responsibility to ensure that functional dependencies are not violated; the implementation contains dynamic checks that abort if the client fails to comply. These checks can be removed if there is an external guarantee that the client will never violate the dependencies.

To show that the primitive operations on tree decompositions faithfully implement the corresponding primitive operations on logical relations, we first show that executing valid queries over tree decompositions soundly implements logical tuple-pattern queries. We then prove a soundness result by induction.
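For orientation, the tree-decomposition operations just described can be summarized as an OCaml signature. The abstract types and the purely functional style are our assumptions; the actual definitions are given in the tech report [10].

module type TREEDEC_OPS = sig
  type tuple                                   (* a tuple t *)
  type pattern                                 (* a tuple pattern: some fields missing *)
  type treedec                                 (* a tree decomposition rho *)

  val tempty  : index -> treedec                           (* empty relation with index d *)
  val tinsert : index -> tuple -> treedec -> treedec       (* insert t; may abort on an FD violation *)
  val tremove : index -> tuple -> treedec -> treedec       (* remove t *)
  val tquery  : index -> pattern -> treedec -> tuple list  (* tuples matching the pattern *)
  val tqexec  : treedec -> index -> qplan -> pattern -> dcontour -> tuple list
  (* execute plan q at the position named by dynamic contour y *)
end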
Lemma 2 (Tree Decomposition Query Soundness). For all ρ, r, d such that ρ |=T d and αt(ρ, d) = r, if d, ·, dom t ⊢q q, dom d for a tuple pattern t and query plan q, we have tqexec ρ d q t · = query r t.
Theorem 1 (Tree Decomposition Soundness). Suppose a sequence of insert and remove operators starting from the empty relation produce a relation r. The corresponding sequence of tinsert and tremove operators given tempty d as input either produce ρ such that ρ |=T d and αt(ρ, d) = r, or abort with an error.
5.2 Physical Representation Operators
In this section we describe implementations of each of the primitive relation operations that operate over the physical representation of a relation. We prove soundness of the physical implementation with respect to the tree decomposition. For space reasons we omit the code for the physical operators, but we give a brief synopsis of each function; for a complete definition see the full paper [10].

We execute physical queries via a query execution function pqexec d q t (z, a). Function pqexec is structurally very similar to the query execution function tqexec over tree decompositions. Instead of a tree decomposition ρ the physical function accesses the heap, and in place of a dynamic contour y the physical function represents a position in the data structure by a pair (z, a) of a static contour z and an address a. The main difference in implementation is that the qfusejoin case follows a fusion or cross-link simply by performing pointer arithmetic or a pointer dereference, respectively, rather than traversing the index. Creation and update are handled by pempty d a (creates an empty relation with index d rooted at address a) and pinsert d t a (inserts tuple t into a relation with index d rooted at address a).
Inferring Shared Representations. Some static analysis algorithms infer sharing between data structures in low-level code [13; 12]. In contrast, we allow the programmer to specify sharing in a concise way, and we guarantee consistency assuming only that functional dependencies are maintained. Functional dependencies or their equivalent are an essential invariant for any shared data structure.
Verification Approaches. The Hob system uses abstract sets of objects to specify and verify properties that characterize how multiple data structures share objects [14]. Monotonic typestates enable aliased objects to monotonically change their typestates in the presence of sharing without violating type safety [9]. Researchers have developed systems to mechanically verify data structures (e.g., hash tables) that implement binary relational interfaces [22; 5]. The relation implementation presented here is more general, allowing relations of arbitrary arity and substantially more sophisticated data structures than previous research.
We have presented a system for specifying and operating on data structures at a high level as relations while implementing those relations as the composition of low-level pointer data structures. Most unusually we can express, and prove correct, the use of complex sharing in the low-level representation, allowing us to express many practical examples beyond the capabilities of previous techniques.
References

[1] C. Beeri, R. Fagin, and J. H. Howard. A complete axiomatization for functional and multivalued dependencies in database relations. In SIGMOD, pages 47–61. ACM, 1977.
[2] J. Berdine, C. Calcagno, B. Cook, D. Distefano, P. O'Hearn, T. Wies, and H. Yang. Shape analysis for composite data structures. In CAV, pages 178–192, 2007.
[3] G. Bierman and A. Wren. First-class relationships in an object-oriented language. In ECOOP, volume 3586 of LNCS, pages 262–286, 2005.
[4] J. Cai and R. Paige. "Look ma, no hashing, and no arrays neither". In POPL, pages 143–154, 1991.
[5] A. J. Chlipala, J. G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Effective interactive proofs for higher-order imperative programs. In ICFP, pages 79–90, 2009.
[6] E. F. Codd. A relational model of data for large shared data banks. Commun. ACM, 13(6):377–387, 1970.
[7] R. B. K. Dewar, A. Grand, S.-C. Liu, J. T. Schwartz, and E. Schonberg. Programming by refinement, as exemplified by the SETL representation sublanguage. ACM Trans. Program. Lang. Syst., 1(1):27–49, 1979.
[8] D. Distefano and M. J. Parkinson. jStar: towards practical verification for Java. In OOPSLA, pages 213–226, 2008.
[9] M. Fahndrich and R. Leino. Heap monotonic typestates. In Int. Work. on Alias Confinement and Ownership, July 2003.
[10] P. Hawkins, A. Aiken, K. Fisher, M. Rinard, and M. Sagiv. Data structure fusion (full), 2010. URL http://theory.stanford.edu/~hawkinsp/papers/rel-full.pdf.
[11] N. Klarlund and M. I. Schwartzbach. Graph types. In POPL, pages 196–205, Charleston, South Carolina, 1993. ACM.
[12] J. Kreiker, H. Seidl, and V. Vojdani. Shape analysis of low-level C with overlapping structures. In Proceedings of VMCAI, volume 5044 of LNCS, pages 214–230, 2010.
[13] V. Kuncak, P. Lam, and M. Rinard. Role analysis. In POPL, pages 17–32, 2002.
[14] P. Lam, V. Kuncak, and M. C. Rinard. Generalized typestate checking for data structure consistency. In VMCAI, pages 430–447, 2005.
[15] E. Meijer, B. Beckman, and G. Bierman. LINQ: Reconciling objects, relations and XML in the .NET framework. In SIGMOD, page 706. ACM, 2006.
[16] C. Olston et al. Pig Latin: A not-so-foreign language for data processing. In SIGMOD, June 2008.
[17] R. Paige and F. Henglein. Mechanical translation of set theoretic problem specifications into efficient RAM code. J. Sym. Com., 4(2):207–232, 1987.
[18] J. C. Reynolds. Separation logic: A logic for shared mutable data structures. In LICS, 2002. Invited paper.
[19] T. Rothamel and Y. A. Liu. Efficient implementation of tuple pattern based retrieval. In PEPM, pages 81–90. ACM, 2007.
[20] E. Schonberg, J. T. Schwartz, and M. Sharir. Automatic data structure selection in SETL. In POPL, pages 197–210, 1979.
[21] O. Shacham, M. Vechev, and E. Yahav. Chameleon: adaptive selection of collections. In PLDI, pages 408–418, 2009.
[22] K. Zee, V. Kuncak, and M. C. Rinard. Full functional verification of linked data structures. In PLDI, pages 349–361, 2008.