Introduction to Theseus
Note: for general info about Theseus and a quick start guide, see the top-level README.
Theseus is a new OS written from scratch in Rust to experiment with novel OS structure, better state management, and how to leverage intralingual design principles to shift OS responsibilities like resource management into the compiler.
Continue to the next chapter to learn more about Theseus, or feel free to check out our published academic papers for a deep dive into the research and design concepts behind Theseus.
What's in a name?
The ship wherein Theseus and the youth of Athens returned from Crete had thirty oars, and was preserved by the Athenians down even to the time of Demetrius Phalereus, for they took away the old planks as they decayed, putting in new and stronger timber in their places, in so much that this ship became a standing example among the philosophers, for the logical question of things that grow; one side holding that the ship remained the same, and the other contending that it was not the same. — Plutarch, Theseus
The name "Theseus" was inspired by The Ship of Theseus, an ancient Greek metaphysical paradox and thought experiment that pondered: "if you iteratively replace every individual piece of an object, is that re-built object still the same object?"
Though we do not attempt to answer this question, we do wish to enable any and every OS component to be replaced — across all layers of the system — at runtime without rebooting. This goal of easy and arbitrary live evolution was (and still is) one of the original motivating factors behind Theseus.
Theseus's Design and Structure
Theseus is a safe-language OS, in which everything runs in a single address space (SAS) and single privilege level (SPL). This includes everything from low-level kernel components to higher-level OS services, drivers, libraries, and more, all the way up to user applications. Protection and isolation are provided by means of compiler- and language-ensured type safety and memory safety, as explained in a later section.
Structure of many small Cells
Theseus is implemented as a collection of many small entities called cells; a cell is a software-defined unit of modularity that acts as the core building block of Theseus.
The cell concept is a term we coined to represent an individual entity of code and/or data that can be loaded into Theseus.
A cell is not a thread of execution, nor is it related to Rust's std::cell types.
The Biological Cell Analogy
Cells in Theseus are inspired by and akin to biological cells in an organism, as they both have many attributes in common:
- Cells are the basic structural unit
- Cells are tiny parts of a greater whole, yet remain distinct despite complex interactions and hierarchies
- Cells each have widely differing roles, but can all be viewed under a single uniform abstraction
- Cells have an identifiable boundary (cell membrane = public interface) that explicitly regulates what enters and exits (selective permeability = naming visibility)
- Cells can be arbitrarily "refactored" into multiple different units (meiosis/mitosis = live evolution)
- Cells can be replaced independently after failing or dying (cell death and replacement = fault recovery)
As such, we sometimes refer to Theseus as a cytokernel, in that it is composed of cells. This reinforces the distinction between the design of Theseus and that of other kernels, e.g., monolithic kernels, microkernels, multikernels, etc. Read more here.
Cell ≈ Crate
Currently, there is a one-to-one relationship between a cell and a Rust crate. The crate is Rust's project container that consists of source code and a dependency manifest file. The crate also serves as Rust's translation unit (elementary unit of compilation); in Theseus we configure each Rust crate to be built into a single .o object file (a relocatable ELF file).
Thus, the cell abstraction is always present in Theseus, but takes different forms as shown in the below diagram.
- At implementation time, a cell is a crate.
- At compile (build) time, a cell is a single .o object file.
- At runtime, a cell 🄲 is a structure that contains the set of sections 🅂 from its crate object file, which have been dynamically loaded and linked into memory, as well as metadata about the inter-dependencies between it and other cells.
In Theseus, the metadata stored for each cell is defined by the kernel/crate_metadata crate, which includes two main types:
- LoadedCrate, which represents a single crate loaded into memory and linked against other loaded crates. The LoadedCrate owns the memory regions holding its sections, along with other metadata about sections and symbols in that crate.
- LoadedSection, which represents an individual section within a loaded crate, as specified in its object file. A LoadedSection comprises several main items:
  - The section type, e.g., .text (an executable function), .rodata (constant data), or .data/.bss (read-write data)
  - Outgoing dependencies: the list of other sections from other crates that this section depends on (and links against).
  - Incoming dependencies: the list of other sections from other crates that depend on (link against) this section.
  - References to its containing "parent" crate and the location within that crate's memory region where this section is loaded.
Note that dependencies are tracked on a fine-grained, per-section basis in order to facilitate challenging OS goals like live evolution at runtime, system flexibility, fault recovery, and more.
Dependencies are derived from relocation entries specified in the .rela.* sections of the ELF object file. This is much more precise than deriving dependencies from crate-level Cargo.toml manifests.
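To make the shape of this metadata concrete, the sketch below shows roughly what such types could look like; the exact field names and types are illustrative assumptions, not the real definitions in the kernel/crate_metadata crate.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Weak};

// Illustrative stand-ins; Theseus's real types differ.
pub struct MappedPages; // an owned, mapped memory region
pub enum SectionType { Text, Rodata, Data, Bss }

/// A crate that has been loaded and linked into memory (sketch).
pub struct LoadedCrate {
    pub crate_name: String,
    /// The sections contained in this crate, keyed by section name.
    pub sections: HashMap<String, Arc<LoadedSection>>,
    /// The owned memory region holding this crate's executable sections.
    pub text_pages: Option<MappedPages>,
    // ... plus .rodata/.data regions, symbol maps, etc.
}

/// An individual section within a loaded crate (sketch).
pub struct LoadedSection {
    pub typ: SectionType,
    /// Outgoing dependencies: sections in other crates that this section links against.
    pub sections_i_depend_on: Vec<Weak<LoadedSection>>,
    /// Incoming dependencies: sections in other crates that link against this section.
    pub sections_dependent_on_me: Vec<Weak<LoadedSection>>,
    /// Back-reference to the "parent" crate that contains this section.
    pub parent_crate: Weak<LoadedCrate>,
}
```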
Each cell is loaded and linked into a namespace, which we refer to as a CellNamespace or CrateNamespace: a true namespace of all of the publicly-visible symbols exposed by the cells within it. Namespaces are useful for quick dependency (symbol) resolution during dynamic linking, and also play a key role in the above system goals, especially flexibility, as they can be used to efficiently realize multiple distinct OS personalities that serve different applications with disparate needs.
The above diagram depicts a simple set of three crates whose sections depend upon each other and are thus linked into a single namespace. The MappedPages (MP) objects are Theseus's abstraction of owned memory regions.
Comparison with other OS designs
The below figure shows the distinction between the structure of existing OS/kernel designs and Theseus.
Monolithic OSes are the most common, including Linux and other Unix-like OSes and most commercial systems like Windows and macOS. In a monolithic OS, all kernel components exist and run in a single kernel address space, meaning that intra-kernel communication is fast and efficient: simply use function calls and shared memory accesses. However, monolithic OSes are less resilient to failures in the kernel, as any crash in kernel space (such as a buggy driver) can bring down the entire system. Applications must use system calls to ask the kernel to perform privileged operations on their behalf, requiring a privilege mode switch.
Microkernel OSes are less common, but still widespread in certain computing domains where reliability is key, such as embedded systems. Microkernels move as much kernel functionality as possible into separate user space "system server" processes, leaving the kernel itself very small. This improves resiliency, as each kernel entity executes in user space in its own address space; if one crashes, the rest of the system can continue execution by restarting the failed system process. However, microkernels are less efficient: all inter-entity functionality requires Inter-Process Communication (IPC), requiring costly context switches and mode switches.
Multikernel OSes offer high scalability to manycore hardware architectures by running a separate instance of a small kernel replicated across each hardware core. Depending on the underlying hardware, system service processes may also be replicated redundantly across (subsets of) cores to improve performance by reducing contention. They typically borrow standard OS interfaces and abstractions from monolithic and microkernel systems, though presenting a standard shared memory abstraction can harm performance.
Theseus OS does not base its structure on any aspect of the underlying hardware, unlike the above three system designs. Everything, including applications, system services, and core kernel components, exists and runs in a single address space and a single privilege level (in "kernel space"). The structure of Theseus is purely software-defined and based on the modularity concept of cells. Thus, communication and shared memory access is efficient because isolation and protection are ensured by the compiler. However, everything must be written in a safe language like Rust. See this section for more about Theseus's safe-language OS design.
Source code organization
Source code in the Theseus repository is categorized into three main folders:
- kernel/: components that implement the core functionality of the OS.
- applications/: user applications, tests, benchmarks, etc. that can be invoked to run in Theseus.
- libs/: components that act as standalone libraries usable outside of Theseus.
1. Kernel
Crates in the kernel/ folder are considered to be "first-party" or "privileged" components that can use unsafe code if necessary, e.g., for directly interacting with hardware at the lowest levels of the OS. That being said, we go to great lengths to avoid unsafe code throughout all of Theseus.
Kernel crates cannot depend on any application crates; if they did, the application crate would be erroneously included and built into the kernel image. Kernel crates can depend on libs crates.
2. Applications
Crates in the applications/ folder are user applications that cannot use any unsafe code.
Currently this consists mostly of simple utilities and command-line tools that are developed specifically for Theseus, as well as various small apps used to test functionality or run benchmarks for performance measurements.
Application crates can depend on kernel crates, libs crates, and even other application crates, though the latter is not recommended. See this section for more details about how applications work and how to develop one.
In the future, we expect to restrict applications to depend only upon functions and types explicitly exposed through a dedicated library, i.e., libtheseus, but this remains future work.
3. Libs
Crates in the libs/ folder are standalone projects that must not depend on anything else in Theseus, such as kernel or application crates. They are intended to be re-used by other software projects and may eventually be refactored out of the Theseus repository. The libs/ folder also includes some other repositories that we may have forked and modified for use by Theseus, often included as git submodules.
Other folders
The other folders in the root of the repository are mostly build/configuration tools and scripts. Here's a quick rundown:
- book: contains the source code of this Theseus book, which you're currently reading.
- cfg: contains the project-wide configuration Makefile Config.mk and JSON files that specify the compiler target platform for various Theseus builds.
- docker: contains scripts and config files required to set up a basic Docker image that can be used to build and run Theseus.
- scripts: contains miscellaneous scripts for setting up a build environment, testing, debugging, etc.
- tools: contains custom Rust programs that run as part of Theseus's build process. See the tools/README file for more.
Project Workspace
All of the crates in the main kernel and applications folders are organized into a single top-level workspace, a way of using cargo (Rust's package manager and build tool) to build them all together into the same target/ directory.
This ensures they can all be directly linked together against each other and that the dependencies between them will be resolved properly by the compiler and linker toolchains.
You can see how the members of this workspace are defined in the root Cargo.toml file, and how all other folders are ignored. This uses cargo's virtual manifest feature.
Read more about Theseus's build process here.
External dependencies
Theseus does depend on many external, third-party crates that are hosted on the official crates.io registry or on GitHub. This is currently allowed for all crates, no matter whether they are kernel, application, or libs crates. In the future, we may restrict or forbid which kinds of crates can be used by applications and whether they can expose unsafe code or certain underlying assembly instructions.
Booting Process and Flow of Execution
Initial assembly code
The Theseus kernel takes over from the bootloader and first executes code in 32-bit protected mode, which corresponds to the start function in kernel/nano_core/src/boot/arch_x86_64/boot.asm.
Currently we use GRUB configured as a legacy bootloader (non-UEFI) and Theseus expects to be booted by a Multiboot2-compliant bootloader.
In the future, we intend to add support for booting via the UEFI standard, especially on other architectures without a legacy BIOS.
After initializing a very simple page table and other miscellaneous hardware features, the assembly file boot.asm jumps to long_mode_start, which now runs 64-bit code in long mode. Then, it jumps to start_high, such that we are now running the base kernel image in the higher half (see more about higher-half kernels here). We then set up a new Global Descriptor Table (GDT), segmentation registers, and finally call the Rust code entry point nano_core_start() with the proper arguments.
After calling nano_core_start, the assembly files are no longer used, and nano_core_start should never return.
Initial Rust code: the nano_core
The nano_core, specifically nano_core_start(), is the first Rust code to run in Theseus.
It performs a very minimal bootstrap/setup procedure, in which it performs the following duties:
- Initializes logging and a basic VGA text display, for the purpose of debugging.
- Sets up simple CPU exception handlers, for the purpose of catching early errors.
- Sets up a basic virtual memory environment.
  - This creates the first and only virtual address space and remaps all of the bootloader-loaded sections into that new single address space.
  - Importantly, Theseus doesn't depend on anything else from the bootloader after this point.
- Initializes the mod_mgmt subsystem, which creates the first CrateNamespace and allows other crates to be dynamically loaded.
- Loads and invokes the captain, which handles the rest of the OS initialization procedures.
The nano_core is quite general and minimalistic; it rarely needs to change. The majority of the OS-specific configuration and initialization happens in the captain, so changes should likely be made there.
Main Initialization routine: the captain
The captain "steers the ship" of Theseus, meaning that it contains the basic logic for initializing all of the other subsystems in the proper order and with the proper flow of data between them.
Currently, there is a single captain implementation in Theseus (for a standard x86_64 machine), which does the following:
- Initializes ACPI and APIC to discover multicore and other hardware configuration,
- Sets up interrupt and exception handlers,
- Sets up basic device drivers,
- Spawns event handling threads,
- Initializes the window manager and graphics subsystem,
- Starts the first user application, which is currently a single terminal window.
At the end, the captain must enable interrupts to allow the system to schedule other tasks. It then falls into an idle loop that does nothing and will never be run again by the scheduler.
Note: in the future, Theseus will add additional architecture-specific captains for different platforms.
Theseus Ideas and Inspiration
Theseus is a safe-language OS, meaning that it relies on type safety and memory safety guarantees from the Rust language and compiler to enforce protection and isolation between tasks and components. As such, it foregoes hardware protection, which generally results in higher efficiency due to the ability to bypass overhead stemming from switching privilege modes and address spaces.
This is possible only when all applications are written in safe Rust, which prevents them from circumventing any type-based restrictions to cause unintended or undefined behavior.
Check out this presentation slide deck to learn more about how we ensure protection and isolation in Theseus based on the foundation of Rust's type and memory safety guarantees.
For more details about Theseus's research merit and novel design principles, see our selected list of papers and presentations here.
P.I.E. Principle
The P.I.E. principle is one of the guiding lights in the design of Theseus and much of our other systems software research. The main idea is that there are three pillars of computing goals, of which the hardware should be responsible for only two:
- Performance
- Isolation
- Efficiency
Traditionally, systems software designers have looked to hardware to provide all three -- high performance, strong isolation, and efficiency (low overhead). We believe that hardware cannot fully realize all three. The P.I.E. principle asserts that hardware should only be responsible for performance and efficiency, but should have no role (or a minimal role) in providing isolation, safety, and security. Isolation should be the responsibility of software alone.
We sometimes refer to this as the PHIS principle: Performance in Hardware, Isolation in Software.
But why?
For one, speculative execution exploits like Meltdown and Spectre have shown that hardware-ensured isolation does not protect kernel data from untrusted user space applications to the extent we once thought. It is difficult if not impossible to verify the true behavior of closed-source hardware (CPU architectures), so we turn to open-source software instead, where the OS, compiler, language libraries, and more are subject to scrutiny and even formal verification.
In addition, modern languages like Rust are able to ensure type safety and memory safety at compile time, without the overhead of traditional safe/managed languages that rely upon inefficient garbage collection and transparent heap-based object management. Thus, we can leverage these safety guarantees to ensure that compiled code does not violate isolation between tasks (threads of execution) and software modules without the need for significant runtime checks.
Theseus transcends the reliance on hardware to provide isolation, and completely foregoes hardware privilege levels (x86's Ring 0 vs. Ring 3 distinction) and multiple address spaces. Instead, we run all code at Ring 0 in a single virtual address space, including user applications that are written in purely safe Rust. This maximizes efficiency whilst preserving protection, because we can guarantee at compile time that a given application or kernel component cannot violate isolation between modules, rendering hardware privilege levels obsolete. Theseus still does use virtual memory translation provided by the MMU, but simply for convenience and ease of memory management; it can be very difficult and inefficient to directly handle and allocate physical memory for applications, and also to find large contiguous chunks of physical memory.
Going beyond safety
We show that it's possible to leverage safe languages and compilers to go much further than just basic isolation and memory safety. For more details, read about Theseus's novel concept of intralingual design (coming soon).
Application Support and Development
One of the unusual features of Theseus, compared to mainstream operating systems like Linux, is that safe applications are loaded into the same single address space as the rest of the OS and run at the same kernel privilege level. Below, we provide information about how such apps are supported by Theseus and how you can develop a new app.
Dynamic Linking and Loading of Application Crates
Applications are simply object files that are loaded into the address space, just like any other kernel crate.
The only real distinction is that they must use only safe code (unsafe code is forbidden), and they must expose a public entry point function named main, shown below. If the main function is not pub, it may be removed by compiler optimizations or be undiscoverable by the application loader code, causing the application crate to be non-runnable.
pub fn main(args: Vec<String>) -> isize { ... }
Note that application-level libraries do not need to expose a main function; only applications that intend to be run as binary executables do. If you forget to include a main() function in your application, the crate manager in Theseus will load and link it successfully, but attempting to run it will fail with a runtime error.
Creating and building a new application
Theseus's build system will automatically build any crates in the applications/ directory, so all you have to do is place your new application crate there. The name of the directory holding your crate files must be the same as the name of the crate as specified in its Cargo.toml name field.
So, for example, you could create a new application crate called my_app with the following file structure:
applications/
├── my_app
│ ├── Cargo.toml
│ └── src
│ └── lib.rs
├── ...
The applications/my_app/src/lib.rs file contains the application code with at least a fn main() body (as shown above).
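For instance, a minimal lib.rs for the hypothetical my_app crate could look like the sketch below; the #![no_std] and alloc setup reflects how Theseus application crates are typically written, but the canonical template is the example application mentioned later.

```rust
// applications/my_app/src/lib.rs -- a minimal sketch of a Theseus application.
#![no_std]
extern crate alloc;

use alloc::{string::String, vec::Vec};

/// The public entry point invoked by Theseus's application loader.
/// It receives the command-line arguments and returns an exit code.
pub fn main(args: Vec<String>) -> isize {
    // A real application would do useful work here; this sketch simply
    // returns the number of arguments it was given.
    args.len() as isize
}
```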
The applications/my_app/Cargo.toml file must specify the same name as the containing directory:
[package]
name = "my_app"
...
After building and running Theseus, you can type my_app into the Theseus shell to run the application as expected.
Examples
See the many examples in the applications/ directory. The example application is designed to serve as a starting point for your new application that you can easily duplicate. We offer a ported version of getopts to help parse command-line arguments.
Dependencies: how to use OS functionality
Currently, applications can use any Theseus kernel crate as a direct dependency (via its Cargo.toml file). This is a temporary design choice to bridge the lack of a real standard library.
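For illustration, a hypothetical application could pull in a kernel crate via a path dependency like the one sketched below; the particular crate chosen here (memory) is just an example of the pattern, not a requirement.

```toml
# applications/my_app/Cargo.toml -- illustrative sketch of a kernel-crate dependency.
[package]
name = "my_app"
version = "0.1.0"

[dependencies]
# Depend directly on a Theseus kernel crate by its path within the repository.
memory = { path = "../../kernel/memory" }
```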
In the future, this will be replaced with libtheseus in combination with Rust's standard library, in which applications can only access the kernel functionality re-exported by libtheseus and any functionality offered by the Rust standard library. This has two benefits:
- Applications will not be able to access public but "sensitive" kernel functions unless they are explicitly made visible to applications via the libtheseus library.
- Applications will not have to know which kernel crate provides a specific feature; they can simply depend on the single libtheseus crate to access any OS feature. Their dependency management will be very simple.
Theseus's Build Process
Cargo
Theseus uses cargo, Rust's package manager and build tool, to automatically manage dependencies and invoke the actual Rust compiler for us.
We utilize cargo's workspace feature with a virtual manifest to group all of the main crates together into a single top-level meta project, which significantly speeds up build times.
As such, the crates from the main repository folders (kernel/ and applications/) and all of their dependencies are all compiled into a single target/ folder.
The members of this workspace are defined in the root Cargo.toml manifest file, plus the list of other folders that should be ignored by cargo.
Makefiles
Although we use cargo to build all Rust code, we still use make and Makefiles to handle high-level build tasks. You should never need to directly run cargo or rustc commands; go through make instead.
The top-level Makefile essentially just invokes the Rust toolchain and compiler via cargo, then copies the compiled object files from the appropriate target/ directory into the top-level build/ directory, and finally generates a bootable .iso image using various bootloader tools, e.g., GRUB.
The only special build action the Makefile takes is to use the nasm assembler to compile the architecture-specific assembly code in nano_core/boot/, and then fully link that against the nano_core into a separate static binary.
Configuring Theseus
Continue on to the next section to read more about configuring Theseus's build.
Configuring Theseus
Theseus source code uses the standard Rust-provided cfg options for conditional compilation via build-time configuration. We expose the ability to set these options via the THESEUS_CONFIG environment variable, which can be set on the command line, in the Makefile itself, or in a Rust build script.
To set one or more cfg options on the command line, all cfg options must be specified in one quoted string, with each individual cfg option separated by whitespace. For example:
make run THESEUS_CONFIG="cfg_option_1 cfg_option_2"
Here's how you would set cfg options in the Makefile. In this case, we set the same cfg_option_1 whenever make my_target is executed:
my_target : export override THESEUS_CONFIG += cfg_option_1
my_target:
$(MAKE) run
Using cfg options in Rust source code
In Rust, you can use cfg statements in one of two main ways:
- As attributes on code blocks, which enable conditional compilation. Below, foo() will be compiled as the top block if cfg_option_1 was set; otherwise, foo() will be compiled as the bottom block.

  #[cfg(cfg_option_1)]
  fn foo() { println!("cfg_option_1 was enabled!"); }

  #[cfg(not(cfg_option_1))]
  fn foo() { println!("cfg_option_1 was disabled!"); }

- As runtime if-conditionals, which enable runtime use and checking of a statically-known cfg option via the cfg!() macro, which returns a boolean value.

  fn foo() {
      if cfg!(cfg_option_1) {
          println!("cfg_option_1 was enabled!");
      } else {
          println!("cfg_option_1 was disabled!");
      }
  }
Follow the links below to read more about how Rust supports cfg options:
How THESEUS_CONFIG and cfg work together
A single line in the top-level Makefile converts the cfg options specified by THESEUS_CONFIG into Rust-known --cfg values that can be used with the standard Rust cfg attributes. That line, shown below, simply adds the "--cfg" prefix onto each whitespace-separated value in THESEUS_CONFIG and appends them all to the RUSTFLAGS environment variable.
cargo : export override RUSTFLAGS += $(patsubst %,--cfg %, $(THESEUS_CONFIG))
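As a concrete illustration of that transformation (using placeholder option names), the invocation below would cause the patsubst rule above to append --cfg loadable --cfg my_option to RUSTFLAGS:

```sh
# Placeholder example: two whitespace-separated cfg options...
make run THESEUS_CONFIG="loadable my_option"
# ...become two separate flags passed to rustc via RUSTFLAGS:
#   RUSTFLAGS="... --cfg loadable --cfg my_option"
```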
Note: you can also add --cfg XYZ directly to RUSTFLAGS rather than use THESEUS_CONFIG.
This is a standard approach that avoids the overhead of our prior approach based on a global build script. Read more here.
Other Configuration Options
TODO: describe the new theseus_features crate and how it allows one to include optional crates in a Theseus build.
Debug vs. Release Mode
Theseus can be built in a variety of modes, but offers two presets: debug and release build modes.
By default, Theseus is built in release mode for usable performance within an emulator like QEMU.
To build in debug mode, set the BUILD_MODE environment variable when running make, like so:
make run BUILD_MODE=debug [host=yes]
As with most languages, release mode in Rust is significantly faster, but it can be more difficult to debug with GDB.
Note: Theseus runs extremely slowly when built in debug mode; only use it when necessary to debug tricky problems. You will definitely want to run debug builds of Theseus in QEMU using KVM, e.g., by using the above host=yes argument on the make command.
There is a special Makefile, cfg/Config.mk, that contains the build mode options as well as other configuration options used in the kernel Makefile.
Static Build-time Linking vs. Dynamic Runtime Linking
Theseus offers two primary forms of linking and packaging its compiled crates into an ISO image. As depicted in the image below, the first (left side) is a conventional fully statically-linked build, as used in all other OSes, while the second (right side) is a novel dynamic linking approach used for Theseus research.
Standard build-time (static) linking
By default, Theseus is built into a single kernel binary just like a regular OS, in which all kernel crates are linked into a single static library and then packaged into a bootable .iso file. This is what happens when you run make as usual.
Dynamic runtime linking (loadable mode)
However, the research version of Theseus uses full dynamic loading and linking for all crates (except the nano_core) to achieve its various goals of live evolution, availability through fault tolerance, flexibility, etc.
Loading and linking of crates at runtime is precisely how Theseus achieves runtime-persistent bounds; the crate management subsystem knows where it loaded a given crate into memory and can therefore maintain metadata about each loaded crate to track its bounds and dependencies.
We refer to this as loadable mode, because crates are loaded at runtime (as cells) instead of being linked into a single static kernel binary. To enable this, set the THESEUS_CONFIG="loadable" option (or run make loadable, which sets this for you). This causes the following to occur:
- Builds each crate into its own separate object file as normal, but does not link them all together.
- Copies each crate's object file into the top-level build directory's module subdirectory (build/grub-isofiles/modules) such that each crate is a separate object file in the final .iso image.
  - The bootloader loads these object files into memory for us, which we discover and map into memory when initializing the crate management subsystem from the nano_core. This allows Theseus to see and load available crate object files at the very beginning of the OS boot-up without needing full support for a filesystem.
- Sets the loadable config option, which, as seen in the nano_core (and other crates), will enable the #![cfg(loadable)] code blocks that dynamically load other crates rather than include them as static dependencies.
  - In loadable code blocks, the caller dynamically looks up the symbol for a given callee function and invokes it dynamically instead of directly calling it via a regular function call. This produces a "soft dependency" in the source code rather than a "hard dependency" that actually requires the callee crate to be statically linked to the caller crate; a sketch of this pattern is shown after this list.
    - This is somewhat similar to dlopen() and dlsym() for loading shared objects on Linux, at least conceptually.
  - Search the code base for cfg(loadable) and cfg(not(loadable)) to see where else it is used.
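A minimal sketch of what such a cfg(loadable)-gated call site could look like is below. The crate, type, and function names here (shutdown, LoadedSection, get_symbol_or_load) are illustrative stand-ins meant to convey the soft-vs-hard dependency idea, not Theseus's actual mod_mgmt API.

```rust
// Sketch only: the names below are illustrative stand-ins, not Theseus's real API.

/// Stand-in for the statically-linked callee crate (used when `loadable` is off).
#[cfg(not(loadable))]
mod shutdown {
    pub fn shutdown() { /* ... */ }
}

/// Stand-in for a dynamically loaded and linked section.
#[cfg(loadable)]
struct LoadedSection {
    start_address: usize,
}

/// Stand-in for a symbol lookup on the current namespace (e.g., a CrateNamespace).
#[cfg(loadable)]
fn get_symbol_or_load(_name: &str) -> Option<LoadedSection> {
    None // a real implementation would search for, or load, the backing crate
}

/// Hard dependency: the callee crate is statically linked, so just call it.
#[cfg(not(loadable))]
fn invoke_shutdown() {
    shutdown::shutdown();
}

/// Soft dependency: resolve the callee's symbol by name at runtime,
/// then invoke it through a function pointer.
#[cfg(loadable)]
fn invoke_shutdown() {
    let section = get_symbol_or_load("shutdown::shutdown")
        .expect("failed to find or load the `shutdown::shutdown` symbol");
    // Converting a section's address into a typed function pointer is unsafe;
    // Theseus encapsulates this kind of cast inside its crate management code.
    let func: fn() = unsafe { core::mem::transmute(section.start_address) };
    func();
}
```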
Built-in Rust cfg and target options
The #[cfg()] attribute and cfg!() macro can also be used with built-in cfg options set by the Rust compiler, for example, target-specific values. For example, Theseus frequently uses options like:
#[cfg(target_arch = "x86_64")]
#[cfg(target_feature = "sse2")]
The advantage of these built-in options is that they can also be used in Cargo.toml manifest files to conditionally set dependencies. For example:
## Only include the `core_simd` crate as a dependency when "sse2" is enabled.
[target.'cfg(target_feature = "sse2")'.dependencies.core_simd]
...
Unfortunately, you cannot use non-built-in cfg options, such as anything that comes from THESEUS_CONFIG values, to conditionally specify dependencies in Cargo.toml files.
Using cargo features
Another option for configuration is to expose features from a given crate; read more about features here.
Theseus does not use features extensively because it is structured as many small crates in one overarching virtual workspace. In this form, you cannot easily set one feature for a target crate across multiple dependent crates at the same time, e.g., using a single command-line argument; instead, you must individually change the Cargo.toml specification of every single crate that depends on that target crate.
Thus, we use cfg blocks instead of features.
That being said, Theseus does choose which features it wants to use when bringing in dependencies on third-party crates, but this is minimal and only occurs for a few dependencies. Typically, features are only specified in order to choose a no_std version of a crate, i.e., telling that crate to use the Rust core library instead of the standard library.
Building Out-of-Tree Rust Crates Safely
Background: The Problem
Because Rust currently lacks a stable ABI, there is no easy, stable, or safe way to integrate two or more separately-compiled Rust binaries together. By integrate, we mean the ability to have one binary depend upon or invoke another pre-built binary, such as an executable, statically-linked library, dynamically-linked shared object, etc.
There is another related problem that stems from how the Rust compiler appends unique IDs (metadata used by the compiler) to each compiled crate and each (non-mangled) symbol in those crates; this issue presents itself even in unlinked object files.
As an example, the page_allocator crate in Theseus will be compiled into an object file with a name like page_allocator-c55b593144fe8446.o, and the function page_allocator::allocate_pages_at() implemented and exposed by that crate will be emitted as the symbol _ZN14page_allocator17allocate_pages_at17heb9fd5c4948b3ccfE. The values of both the crate's unique ID (c55b593144fe8446) and every symbol's unique ID (e.g., heb9fd5c4948b3ccfE) are deterministic, but depend on many factors. Those factors include the compiler version, the source directory, the target directory, and more. We sometimes refer to both of these unique IDs as a hash value, since the compiler creates them by hashing together these various factors; how this hash is generated is considered opaque and liable to change, so we treat it as a black box.
Theseus loads and links crate object files dynamically at runtime. When we build all of the Theseus kernel crates together into a single target directory (read more here), the unique IDs/hash values appended to every crate name and symbol are based on the build machine's source and target directories (among other factors). A running instance of Theseus will have a single instance of the page_allocator crate loaded into memory and expect all other crates to depend upon that instance, meaning that they should be compiled to expect linkage against its specifically-hashed symbols, e.g., _ZN14page_allocator17allocate_pages_at17heb9fd5c4948b3ccfE.
If you separately compile another crate my_crate that depends on the exact same set of Theseus kernel crates, cargo will recompile all Theseus crates from source into that new target directory, resulting in the recompiled object files and their symbols having completely different unique ID hashes from those in the original Theseus instance. As such, when attempting to load my_crate into that already-running prebuilt instance of Theseus, it will fail to load and link because that version of my_crate will depend on differently-hashed crates/symbols, e.g., it may depend upon the symbol _ZN14page_allocator17allocate_pages_at17hd64cba3bd66ea729E instead of _ZN14page_allocator17allocate_pages_at17heb9fd5c4948b3ccfE (note the different appended hash values).
Therefore, the real problem is that there is no supported method to tell cargo that it should build a crate against a prebuilt set of dependencies. See this GitHub issue for more about why this feature would be useful, but why it still isn't supported (hint: no stable Rust ABI).
A Bad, Unsafe Solution
Technically, we could solve this by using an existing non-Rust stable ABI, like the C language ABI. This would entail defining/exposing Rust functions, data, and types in a C-compatible way such that they are compatible with the C ABI (its expected struct memory layout and calling convention). Unfortunately, this necessitates the usage of unsafe FFI code blocks (via C-style extern functions) to connect two separate bodies of fully-safe Rust code, which is both dumb and tedious.
In the above example, instead of simply invoking page_allocator::allocate_pages_at() directly, we would need to export the appropriate wrapper functions like so:
// in `page_allocator`
#[no_mangle]
pub extern "C" fn allocate_pages_at(num_pages: usize, ...) -> ... {
    page_allocator::allocate_pages_at(num_pages, ...)
    ...
}
and then invoke it using unsafe FFI code blocks like so:
// in `my_crate`
extern "C" {
    fn allocate_pages_at(num_pages: usize, ...);
}

fn main() {
    unsafe {
        allocate_pages_at(15, ...);
        ...
    }
}
Note that many details are omitted above; while these code wrappers and bindings can be autogenerated, unsafety cannot be avoided.
Surely we can do better!
Solution: theseus_cargo for out-of-tree builds
A superior solution is to "trick" the Rust compiler into using the prebuilt crates from an existing build of Theseus.
To this end, we've created theseus_cargo, a custom build tool and wrapper around cargo that resolves an out-of-tree crate's dependencies on in-tree Theseus crates using their prebuilt objects instead of rebuilding them from source.
This is realized in two parts:
- Generating the prebuilt dependencies while building Theseus's (in-tree) kernel crates,
- Correctly building the out-of-tree crate(s) against those prebuilt Theseus crates.
1. Generating the set of prebuilt dependencies
To create a set of dependency files understood by Rust's compiler toolchain, the main top-level Makefile invokes another custom build tool, a Rust program called copy_latest_crate_objects found in the tools/ directory.
It is invoked like so (details omitted):
cargo run ... tools/copy_latest_crate_objects -- \
--input "target/.../deps" \
--output-deps "build/deps/" \
--output-sysroot "build/deps/sysroot/" \
...
The arguments above specify that we wish to
- Use the Rust-produced build artifacts (compiled crates) in the target/.../deps directory as input, and then
- Copy them into the build/deps/ output directory.
- Also, copy the prebuilt Rust fundamental libraries (core, alloc) that were cross-compiled into the Theseus platform-specific sysroot folder into build/deps/sysroot/.
Afterwards, the build/deps/ directory contains all prebuilt dependencies needed to compile an out-of-tree crate against the existing build of Theseus, with all the properly versioned (correctly hashed) crates and symbols.
This tool also generates a TheseusBuild.toml file that describes the parameters of this build of Theseus, such that it can be replicated by theseus_cargo. For example:
target = "x86_64-unknown-theseus"
rustflags = "--emit=obj -C debuginfo=2 -C code-model=large -C relocation-model=static -D unused-must-use -Z merge-functions=disabled -Z share-generics=no"
cargoflags = "--release"
host_deps = "./host_deps"
2. Building other Rust code against the prebuilt Theseus dependencies
With the contents of build/deps/ described above, we can invoke theseus_cargo to build the new out-of-tree crate in a separate compilation instance.
The theseus_cargo tool is a Rust program that invokes cargo, captures its verbose output, and then modifies and re-runs the rustc commands issued by cargo so that they use the prebuilt crates to fulfill the out-of-tree crate's dependencies. Those prebuilt crates are a set of dependency files, namely .rmeta and .rlib files, that are understood by the Rust compiler's internal metadata parsers.
The main modification theseus_cargo makes to rustc commands is to replace the <values> of the following arguments with the paths and names of the prebuilt Theseus crates in build/deps/:
-L dependency=<dir>
--extern <crate_name>=<crate_file>.rmeta
If a given rustc command has any arguments that need to be changed, theseus_cargo re-issues that command.
Currently, to use theseus_cargo, it must be compiled and installed from source:
cargo install --path="tools/theseus_cargo" --root=$INSTALL_DIR
Then, it can be invoked just like cargo build, e.g., to build a crate against the prebuilt dependencies in the input folder:
$INSTALL_DIR/theseus_cargo --input "build/deps/" build
Currently, theseus_cargo prints very verbose output and will show a lot of irrelevant warning and log statements describing what it is doing. If the out-of-tree crate was successfully built, it will finally print something like "Ran rustc command (modified for Theseus) successfully" before exiting with exit code 0.
See the tools/theseus_cargo source code for more details.
The approach of capturing and modifying rustc commands using verbose output from cargo is obviously not ideal, but there is currently no other supported way to obtain this info because cargo is removing its --build-plan option.
Related Links, Discussions, Alternative Approaches
- A Stable Modular ABI for Rust (Rust Internals Forum)
- The Rust ABI wiki
- The abi_stable crate, which provides "safe" traits, macros, and wrappers around underlying Rust-to-Rust FFI.
Building and Running C programs on Theseus
Warning: Support for building C programs atop Theseus is experimental and liable to change at any moment.
As Theseus is a safe-language OS that runs all code in a single address space (SAS) and single privilege level (SPL), there is no guarantee of safety, protection, or isolation when running any other unsafe or non-Rust code directly atop Theseus.
Nevertheless, we have introduced experimental support for building C programs atop Theseus; proceed at your own risk.
Prerequisites
You must have a version of GCC and Binutils cross-compiled for Theseus, e.g., the x86_64-elf target with red-zone usage disabled.
To make things easy, we have written an automated script and a guide on how to build and install all necessary tools.
Note that the x86_64-elf-* binaries must be on your system PATH before running any of the following gcc commands.
Quickstart: Building a C program
See the c_test directory for an example dummy C program. All it does is run a simple main() function that returns a constant value.
In short, building a C program requires the following steps:
make # 1. Build Theseus OS itself
make tlibc # 2. Build tlibc, Theseus's libc
make c_test # 3. Build a sample C program
make orun # 4. Run Theseus in QEMU (without rebuilding anything)
Running a C program
Once the C program's executable ELF file has been packaged into Theseus's ISO image, you can execute it in Theseus using the loadc application. Executables are automatically placed in the _executable namespace folder by default, so run the following in Theseus's shell:
loadc /namespaces/_executable/dummy_works
You should observe the return value displayed on the shell GUI, as well as various log messages that show output from tlibc alongside those from Theseus's kernel.
How does it all work?
The following sections describe how to set up the toolchain, how tlibc is built, and how C programs are compiled and linked.
Building GCC and Binutils to target Theseus (x86_64-elf)
We provide a script that does all of this for you; see scripts/install_x86_64-elf-gcc.sh, which can be invoked like so:
./scripts/install_x86_64-elf-gcc.sh $HOME/src $HOME/opt
In this document, we refer to two directories, both of which can be set to the location of your choice:
- $SRC: the directory that contains the source for gcc and binutils
  - For example, $HOME/src/
- $DEST: the directory that will hold our compiled gcc and binutils packages and libraries (where they will be installed)
  - For example, $HOME/opt/
(Instructions were taken from this tutorial on the OS dev wiki.)
1. Build a standalone version of GCC & Binutils
Install the required packages:
sudo apt-get install gcc build-essential bison flex libgmp3-dev libmpc-dev libmpfr-dev texinfo gcc-multilib
Download and build GCC/Binutils
For this tutorial, we'll use gcc version 10.2.0, released July 20, 2020, and binutils version 2.35.1, released September 19, 2020.
You can obtain them from many mirrors online, such as these:
- gcc: https://mirrors.kernel.org/gnu/gcc/gcc-10.2.0/
- binutils: https://mirrors.kernel.org/gnu/binutils/
Create a destination directory for the newly-built packages to be installed into:
mkdir $DEST
export PREFIX="$DEST/gcc-10.2.0"
Extract each source code package into the directory of your choice ($SRC), and then build binutils:
mkdir build-binutils
cd build-binutils
../binutils-2.35.1/configure --prefix="$PREFIX" --disable-nls --disable-werror
make -j$(nproc)
make install
Then go back to the $SRC directory and build gcc:
# Use a GCC script to download all necessary prerequisites
cd gcc-10.2.0
./contrib/download_prerequisites
cd ../
mkdir build-gcc
cd build-gcc
../gcc-10.2.0/configure --prefix="$PREFIX" --disable-nls --enable-languages=c,c++
make -j$(nproc)
make install
2. Build GCC and Binutils again, to cross-target Theseus (x86_64-elf)
Now that we have a standalone build of gcc/binutils that is independent from the one installed by your host system's package manager, we can use that to build a version of gcc that inherently performs cross-compilation for a specific target, in this case, our Theseus x86_64-elf target.
Note: these instructions are based on this tutorial from the OS dev wiki.
First, create a directory for the cross compiler to be built and installed into, e.g., $DEST/cross.
mkdir $DEST/cross
export PREFIX="$DEST/cross"
export TARGET=x86_64-elf
export PATH="$PREFIX/bin:$PATH"
Second, re-build the same binutils package as above, but in a way that configures it to target Theseus.
../binutils-2.35.1/configure --target=$TARGET --prefix="$PREFIX" --with-sysroot --disable-nls --disable-werror
make -j$(nproc)
make install
Confirm that your new cross-compiler binutils package exists and is on your system PATH:
which -- $TARGET-as
should output something like:
/home/my_username/opt/cross/bin/x86_64-elf-as
Then go back to the $SRC directory and build a version of gcc that cross-compiles C/C++ programs to Theseus.
mkdir cross-build-gcc
cd cross-build-gcc
../gcc-10.2.0/configure --target=$TARGET --prefix="$PREFIX" --disable-nls --enable-languages=c,c++ --without-headers
make all-gcc -j$(nproc)
make all-target-libgcc -j$(nproc)
make install-gcc
make install-target-libgcc
Before moving on, let's check to make sure our cross-compiled gcc is working.
$DEST/cross/bin/$TARGET-gcc --version
This should print out some information about your newly-built gcc. Add the -v flag to dump out even more info.
3. Re-building GCC without the default red-zone usage
Importantly, we must disable the red zone in gcc entirely. When invoking gcc itself, we can simply pass the -mno-red-zone argument on the command line, but that doesn't affect the cross-compiled version of libgcc itself. Thus, to avoid libgcc functions invalidly using the non-existent red zone in Theseus, we have to build a no-red-zone version of libgcc; only then can we successfully build and link C programs for Theseus without libgcc's methods trying to write to the red zone.
Note: instructions were adapted from this tutorial.
Adjusting the GCC config
First, create a new file named t-x86_64-elf within the gcc source tree at $SRC/gcc-10.2.0/gcc/config/i386/.
Add the following lines to that new file and save it:
MULTILIB_OPTIONS += mno-red-zone
MULTILIB_DIRNAMES += no-red-zone
Yes, even though we're building for x86_64, we put it in the original x86 architecture config folder called i386.
Then, instruct gcc's build process to use that new multilib configuration. Open the file $SRC/gcc-10.2.0/gcc/config.gcc and search for the following configuration lines, which start around line 1867 (for gcc-10.2.0):
x86_64-*-elf*)
tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h newlib-stdint.h i386/i386elf.h i386/x86-64.h"
;;
Add a line such that it looks like this:
x86_64-*-elf*)
tmake_file="${tmake_file} i386/t-x86_64-elf"
tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h newlib-stdint.h i386/i386elf.h i386/x86-64.h"
;;
Note: the indentation before tmake_file must be a TAB, not spaces.
Building GCC again with no red zone
Go back to the build directory, then reconfigure and re-make libgcc:
cd $SRC/cross-build-gcc
../gcc-10.2.0/configure --target=$TARGET --prefix="$PREFIX" --disable-nls --enable-languages=c,c++ --without-headers
make all-gcc -j$(nproc)
make all-target-libgcc -j$(nproc)
make install-gcc
make install-target-libgcc
To check that it worked, run the following two commands:
x86_64-elf-gcc -print-libgcc-file-name
x86_64-elf-gcc -mno-red-zone -print-libgcc-file-name
The first one should output a path to libgcc.a, and the second should output a similar path with no-red-zone as the containing directory:
$DEST/cross/lib/gcc/x86_64-elf/10.2.0/libgcc.a
$DEST/cross/lib/gcc/x86_64-elf/10.2.0/no-red-zone/libgcc.a
Appendix: How to use the no-red-zone version of GCC
To properly use this new version of GCC that cross-compiles to the Theseus target and disables the red zone, make sure you:
- use the x86_64-elf-gcc executable that now resides in $DEST/cross
- specify the -mno-red-zone flag, either on the command line or as part of LDFLAGS
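For example, compiling a single C source file with this toolchain might look like the following (the file names are placeholders):

```sh
# Compile one C file for Theseus with the red zone disabled.
$DEST/cross/bin/x86_64-elf-gcc -mno-red-zone -c hello.c -o hello.o
```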
tlibc: Compiling and Linking Theseus's libc
Warning: Support for building C programs atop Theseus is experimental and liable to change at any moment.
Theseus's libc implementation, tlibc, is a work in progress and currently a proof-of-concept library that's missing most standard libc functionality.
Building tlibc in a Theseus-compatible way
Most standard library and libc implementations are built as fully-linked static or dynamic libraries; in Rust terms, this corresponds to the staticlib or cdylib crate type (see more about crate types and linkage here).
This doesn't work well for Theseus for a few reasons.
First, because Theseus runs everything in a single privilege level, there is no clear point of separation between the lowest level of user code and the highest level of kernel code.
In conventional OSes, standard libraries use the system call interface to separate their code from the rest of the OS.
Therein, building against a specific OS platform is easy -- you simply define the system call interface and compile against any necessary header files.
There is no complex linking that needs to occur, since the lowest level of the dependency chain ends at the syscall assembly instruction, which makes the library self-contained from the linker's point of view.
Second, Theseus dynamically links raw object files at runtime, so we cannot easily create a fully statically-linked binary for a standalone C library because it won't know where its dependencies will exist in memory. Again, this is not a problem for standard libc implementations since it doesn't need to directly link against each specific syscall handler function.
Thus, we use the standard rlib crate type for tlibc and perform partial linking of the raw compiled object files ourselves:
ld -r -o tlibc/target/.../tlibc.o tlibc/target/.../deps/*.o
Alternatively, we could also use ar to create an archive of all of the object files, as shown below; there's not much of a functional difference between the two approaches, but some build tools prefer .a archives instead of .o object files.
ar -rcs tlibc/target/.../libtlibc.a tlibc/target/.../deps/*.o
We use the theseus_cargo tool (as described here) to ensure that tlibc is compiled against and depends on the correct version of crates and symbols from an existing Theseus build.
Using tlibc
Once we have the tlibc.o (or .a) file, we can use it to satisfy any C program's dependencies on basic libc functions/data. The next section describes how we use the tlibc file to build a standalone C executable that can run atop Theseus.
Compiling and Linking C programs
Warning: Support for building C programs atop Theseus is experimental and liable to change at any moment.
Currently, we use a custom form of "split" linking that is both fully-static at compile time and partially-dynamic at runtime.
1. Full static linking at build time
This procedure can be broken down into the following steps:
- Compile tlibc as a standard Rust crate into a series of object files,
- Combine or link those tlibc object files into a single archive or object file,
- Compile the basic C program, e.g., dummy.c,
- Statically link the C program to the prebuilt tlibc object to produce a standalone executable.
The first two steps were described previously; the latter two are described below.
To build and link a dummy C program against tlibc, the invocation of gcc currently requires several arguments to customize the compiler's behavior as well as its usage of the ld linker. The key arguments are shown and described below:
x86_64-elf-gcc \
-mno-red-zone -nostdlib -nostartfiles \
-ffunction-sections -fdata-sections \
-mcmodel=large \
-Wl,-gc-sections \
-Wl,--emit-relocs \
-o dummy_works \
path/to/crtbegin.o \
dummy.c \
path/to/tlibc.o \
path/to/crtend.o
The most important arguments are:
- -mno-red-zone, -mcmodel=large: match Theseus's Rust-side compiler configuration so the C code can properly invoke the Rust code.
- -Wl,--emit-relocs: include details in the ELF executable (.rela sections) about the specific relocation actions the linker performed.
After the above gcc command, we have a standalone executable ELF file named dummy_works.
2. Partial dynamic re-linking at runtime
The ELF executable dummy_works can immediately be loaded and executed atop Theseus, but it won't necessarily work properly and execute as expected. This is because it was fully statically linked, meaning that the executable includes duplicate instances of the same data and function sections that already exist in the loaded instances of Theseus crates in memory (cells). Most importantly, those data sections represent system-wide singleton states (static variables in Rust) that have already been initialized and are in active use by all other Theseus components. Thus, the data instances packaged into the executable have not been initialized and can't safely be used. Using those sections would result in multiple copies of data that's supposed to be a system-wide singleton; this would be bad news for most Theseus components, e.g., the frame allocator's system-wide list of free physical memory.
To solve this problem, we re-perform (overwrite) all of the relocations in the executable ELF file such that they refer to the existing sections already loaded into Theseus instead of the new uninitialized/unused ones in the executable itself. This only applies to sections that already exist in Theseus; references to new sections that are unique to the executable are kept intact, of course. The relocation information is encoded into the ELF file itself as standard .rela.* sections via the --emit-relocs linker argument shown above.
This procedure is currently performed by the loadc application; it also handles loading the ELF executable segments (program headers) and managing their metadata.
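Conceptually, this re-linking pass boils down to walking the executable's relocation entries and redirecting the ones whose target symbols already exist in Theseus. The sketch below illustrates that idea with made-up types and function names; it is not the actual loadc implementation.

```rust
// Sketch only: illustrative types and names, not the real `loadc` code.

/// A relocation entry parsed from one of the executable's .rela.* sections.
struct RelocationEntry {
    symbol_name: String, // the symbol this relocation refers to
    offset: usize,       // where in the loaded segment to apply it
}

/// Returns the address of `symbol_name` if it is already loaded in Theseus.
fn find_loaded_symbol(_symbol_name: &str) -> Option<usize> {
    None // a real implementation would query the current CrateNamespace
}

/// Writes the resolved address into the executable's loaded segment.
fn write_relocation(_target: usize, _value: usize) {
    // elided: perform the architecture-specific relocation write
}

/// Re-perform relocations so the executable uses Theseus's existing sections.
fn relink_executable(relocations: &[RelocationEntry], segment_base: usize) {
    for rela in relocations {
        if let Some(existing_addr) = find_loaded_symbol(&rela.symbol_name) {
            // The symbol already exists in a loaded Theseus cell: overwrite
            // the relocation so it points at that existing instance.
            write_relocation(segment_base + rela.offset, existing_addr);
        }
        // Otherwise, the section is unique to the executable, so the
        // relocation the linker already performed is kept intact.
    }
}
```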
Overview of Key Subsystems
Like every OS, Theseus's functionality can be categorized into multiple subsystems, which each implement a different core OS feature. This section covers the design and implementation of key subsystems in Theseus, such as memory management, multitasking support, graphical displays, etc.
Memory Management in Theseus
Memory management is one of the most unique aspects of Theseus's design, compared to how other OSes manage memory.
Single Virtual Address Space
As previously mentioned, Theseus is a Single Address Space (SAS) OS, meaning that all kernel entities, libraries, and applications are loaded into and execute within a single address space. This is possible due to careful design combined with isolation based on Rust's type and memory safety instead of hardware-based memory protection.
That being said, Theseus's single address space is a virtual address space, not a physical address space; all Rust-level pointers that are dereferenced and addresses that are accessed are virtual addresses. Although Theseus could technically operate directly on physical memory addresses without using virtual memory at all, the use of virtual addresses affords us many benefits, e.g., easier contiguous memory allocation, guard pages to catch stack overflow, etc.
Types and Terminology
Theseus uses precise, specific terminology and dedicated types to avoid confusion and correctness errors related to mixing up physical and virtual memory. The following table concisely describes the basic memory types with links to their source-level documentation:
| Description of Type | Virtual Memory Type | Physical Memory Type |
|---|---|---|
| A memory address | VirtualAddress | PhysicalAddress |
| A chunk of memory | Page | Frame |
| A range of contiguous chunks | PageRange | FrameRange |
| Allocator for memory chunks | page_allocator | frame_allocator |
Addresses
In Theseus, virtual and physical addresses are given dedicated, separate types that are not interoperable.
This is to ensure that programmers are intentional about which type of address they are using and cannot accidentally mix them up.
The constructors for VirtualAddress
and PhysicalAddress
also ensure that you cannot create an invalid address and that all addresses used across the system are canonical in form, which is based on the hardware architecture's expectations.
For 64-bit architectures, the set of possible VirtualAddress
es goes from 0
to 0xFFFFFFFFFFFFFFFF
, and all canonical addresses in that range can be used.
However, while the set of possible PhysicalAddress
es has the same range, there are large "holes" across the physical address space that correspond to non-existent, unusable physical addresses; the locations of specific holes are hardware-dependent and generally not known until the OS asks the bootloader at runtime.
Thus, you can be guaranteed that every canonical virtual address actually exists and is usable, but not every canonical physical address.
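For illustration, a hedged sketch of constructing and using these address types follows; the exact constructor names and return types in Theseus's memory crates may differ slightly, so treat this as an assumption-laden example rather than the real API.
fn address_example() -> Result<(), &'static str> {
    // Construct addresses through checked constructors; invalid or non-canonical
    // values are rejected instead of being silently accepted.
    let vaddr = VirtualAddress::new(0xFFFF_8000_0000_1000).ok_or("non-canonical virtual address")?;
    let paddr = PhysicalAddress::new(0x10_1000).ok_or("invalid physical address")?;
    // The two types are intentionally not interchangeable; this would be a compile-time error:
    // let wrong: VirtualAddress = paddr;
    let _ = (vaddr, paddr);
    Ok(())
}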
Page
s, Frame
s, and ranges
A chunk of virtual memory is called a Page
, while a chunk of physical memory is called a Frame
.
Page
s and Frame
s have the same size, typically 4KiB (4096 bytes) but ultimately dependent upon hardware.
These chunks are the smallest elementary unit that the hardware's Memory Management Unit (MMU) can operate on, i.e., they are indivisible from the hardware's point of view.
In other words, the MMU hardware cannot map any chunk of memory smaller than a single Page
to any chunk of memory smaller than a single Frame
.
A Page
has a starting VirtualAddress
and an ending VirtualAddress
; for example, a Page
may start (inclusively) at address v0x5000
and end (exclusively) at v0x6000
. Similarly, a Frame
has a starting PhysicalAddress
and an ending PhysicalAddress
, for example from p0x101000
to p0x102000
.
A Page
can be said to contain a VirtualAddress
within its bounds, and likewise a Frame
can be said to contain a PhysicalAddress
.
Although Page
s and Frame
s have internal numbers, we typically identify them by their starting address, e.g., "the page starting at v0x9000
" instead of "page 9".
Intrinsically, Page
s have no relation to PhysicalAddress
es, and similarly, Frame
s have no relation to VirtualAddress
es.
For convenience, Theseus provides dedicated "range" types to represent a contiguous range of virtual Page
s or physical Frame
s.
They are inclusive ranges on both sides; see our custom RangeInclusive
type that replaces Rust's built-in core::ops::RangeInclusive
type for more information.
These types implement the standard Rust [IntoIterator
] trait, allowing for easy iteration over all pages or frames in a range.
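As a rough usage sketch (the constructor names here are assumptions, not necessarily the exact API), a PageRange can be built from two Pages and iterated over directly:
fn range_example() {
    // Pages containing the two bounding virtual addresses; the range is inclusive on both ends.
    let start = Page::containing_address(VirtualAddress::new_canonical(0x5000));
    let end = Page::containing_address(VirtualAddress::new_canonical(0x8000));
    let range = PageRange::new(start, end);
    for page in range {
        // each `page` is one Page whose bounds lie within [v0x5000, v0x8000]
        let _ = page;
    }
}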
Theseus employs macros to generate the implementation of the above basic types,
as they are symmetric across the virtual memory and physical memory categories.
This ensures that the VirtualAddress
and PhysicalAddress
have the same interface (public methods); the same is true for Page
and Frame
, and so on.
Page and Frame allocators
Theseus's page allocator and frame allocator are effectively identical in their design and implementation, but the former allocates virtual memory Page
s while the latter allocates physical memory Frame
s.
While the underlying implementation may change over time, the general interface for both allocators is the same.
- You can request a new allocation of one or more
Page
s or Frame
s at any address, which will only fail if there is no remaining virtual or physical memory. - You can request that allocation to start at the
Page
or Frame
containing a specific address, but it will fail if the Page
or Frame
containing that address has already been allocated. - You can also specify the minimum number of bytes that the new allocation must cover, which will be rounded up to the nearest
Page
or Frame
granularity.- These allocators cannot allocate anything smaller than a single
Page
or Frame
; for smaller allocations, you would want to use dynamically-allocated heap memory.
- Both allocators support the concept of "reserved" regions of memory, which are only usable by specific kernel entities, e.g., memory handling functions that run at early boot/init time.
- The
frame_allocator
uses reserved regions to preserve some ranges of physical memory for specific usage, e.g., memory that contains early boot information from the bootloader, or memory that contains the actual executable kernel code.
- Both allocators also support early memory allocation before the heap is set up, allowing Theseus to bootstrap its dynamic memory management system with a series of small statically-tracked allocations in a static array.
The page allocator and frame allocator in Theseus do not directly allow you to access memory; you still must map the virtual Page
(s) to the physical Frame
(s) in order to access the memory therein.
We use more dedicated types to represent this, described below.
In Theseus, like other OSes, there is a single frame allocator because there is only one set of physical memory -- the actual system RAM connected to your computer's motherboard. It would be logically invalid and unsound to have multiple frame allocators that could independently allocate chunks of physical memory from the same single physical address space. Unlike other multi-address space OSes, Theseus also has a single page allocator, because we only have one virtual address space. All pages must be allocated from that one space of virtual addresses, therefore only a single page allocator is needed.
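A hedged sketch of what requesting allocations looks like; the exact function names and signatures in the page_allocator and frame_allocator crates may differ from what is assumed here.
fn allocation_example() -> Result<(), &'static str> {
    // Any 4 virtual pages, wherever the allocator finds free ones:
    let pages = page_allocator::allocate_pages(4).ok_or("out of virtual addresses")?;
    // Enough physical frames to cover 10,000 bytes, rounded up to whole frames:
    let frames = frame_allocator::allocate_frames_by_bytes(10_000).ok_or("out of physical memory")?;
    // Neither object lets you access memory yet; the pages must still be mapped to frames,
    // as described in the next section.
    let _ = (pages, frames);
    Ok(())
}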
Advanced memory types
While the above "basic" types focus on preventing simple programmer mistakes through type-enforced clarity, the below "advanced" types strive to prevent much more complex errors through type-specific invariants.
AllocatedPages
: a range of Page
s contiguous in virtual memory that have a single exclusive owner.- This can only be obtained by requesting an allocation from the
page_allocator
.
AllocatedFrames
: a range of Frame
s contiguous in physical memory that have a single exclusive owner.- This can only be obtained by requesting an allocation from the
frame_allocator
.
MappedPages
: a range of virtually-contiguous pages that are mapped to physical frames and have a single exclusive owner.- We will discuss
MappedPages
in the next section about memory mapping.
The main difference between the basic types and advanced types is that the advanced types guarantee exclusivity.
As such, if you have a PageRange
from v0x6000
to v0x8000
, there is nothing preventing another entity from creating a similar PageRange
that overlaps it, e.g., from v0x7000 to v0x9000
.
However, if you have an AllocatedPages
from v0x6000
to v0x8000
, you are guaranteed that no other entity across the entire system has an AllocatedPages
object that contains any of those pages.
That is a powerful guarantee that allows us to build stronger isolation and safety invariants when allocating, mapping, and accessing memory, as shown in the next section.
Mapping virtual memory to physical memory
In order to actually use and access memory, you must map a virtual address to a physical address. You cannot access a physical address directly because the CPU expects to work with virtual addresses that will be auto-translated to physical addresses by the MMU and other memory controller hardware1. Also, a virtual address is useless by itself; it doesn't have any content or purpose until you first map it to a real underlying physical address.
Memory mapping involves setting up a page table and establishing page table entries to represent the virtual-to-physical address mapping in a way that the MMU can understand. As long as a mapping exists in the currently-active page table, you can access the contents in real physical memory by accessing a virtual address that is mapped to it. Accessing memory simply means dereferencing a pointer at a particular virtual address; a pointer is effectively a typed virtual address.
Attempting to access a virtual address that is not currently mapped will result in a page fault, a CPU exception that temporarily halts normal execution flow to allow a page fault handler in the OS to attempt to set up an appropriate page table entry for the non-mapped virtual address. Using page faults is how on-demand paging is realized; Theseus currently does not do this because it goes against the PHIS principle.
Theseus's memory subsystem specifies three key types for mapping memory:
Mapper
: provides functions to map virtual memory to physical memory, which create and return MappedPages
objects.
PageTable
: a top-level page table, which owns the root frame of the page table and a Mapper
instance that uses that page table frame to create new page table entries.- This auto-dereferences into a
Mapper
, allowing all Mapper
functions to be called on a PageTable
object.
MappedPages
: an object that represents a range of virtually-contiguous pages that are mapped to physical frames and have a single exclusive owner.
The Mapper
type implements the core two functions for mapping memory in Theseus:
map_allocated_pages()
: maps a specific range of AllocatedPages
to frames that are allocated on demand by the system at a random physical address.- Useful if you have some virtual pages and just want to use them for general purposes, and you don't care which physical memory gets mapped to it.
map_allocated_pages_to()
: maps a specific range of AllocatedPages
to a specific range of AllocatedFrames
.- Useful if you need to access a hardware device or other memory that exists at a specific physical address.
The astute reader will notice that you can only map a range of exclusively-owned AllocatedPages
to a range of exclusively-owned AllocatedFrames
.
You cannot simply map a raw virtual address or page to a raw physical address or frame, they have to be specifically requested from the corresponding allocator and then mapped using the current active page table.
This is part of Theseus's solution to ensure that accessing any arbitrary region of memory is guaranteed safe at compile time.
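A hedged sketch of both mapping functions; the flag type, the error type, and the exact signatures below are assumptions based on the description above, not the verbatim API.
fn mapping_example(page_table: &mut PageTable) -> Result<MappedPages, &'static str> {
    // General-purpose memory: we don't care which physical frames back these pages.
    let pages = page_allocator::allocate_pages(2).ok_or("out of virtual addresses")?;
    let mapped = page_table.map_allocated_pages(pages, PteFlags::new().writable(true))?;
    // Device (MMIO) memory at a known physical address would instead use
    // frame_allocator::allocate_frames_at(...) together with map_allocated_pages_to(...).
    Ok(mapped)
}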
This assumes the CPU is in a standard mode like protected mode or long mode with paging enabled. The MMU can be disabled, after which virtual addresses do not exist and physical addresses can be directly used, but Theseus (and most other OSes) do not disable the MMU.
The MappedPages
type
The MappedPages
type represents a region of virtually-contiguous pages that are statically guaranteed to be mapped to real physical frames (which may or may not be contiguous).
A MappedPages
instance includes the following items:
- The range of
AllocatedPages
that it owns and that are currently mapped
- The permissions flags with which it was mapped, e.g., whether it is read-only, writable, cacheable, etc.
- The root page table under which it was mapped, for extra safety and sanity checks.
MappedPages
is the fundamental, sole way to represent and access mapped memory in Theseus; it serves as the backing representation for stacks, heaps, and other arbitrary memory regions, e.g., device MMIO and loaded cells.
Invariants and Safety Guarantees at Compile-time
The design of MappedPages
empowers the compiler's type system to enforce the following key invariants, extending Rust's memory safety checks to all OS-known memory regions, not just the compiler-known stack and heap.
- The mapping from virtual pages to physical frames must be one-to-one, or bijective.
- Each
Page
can only be mapped to one Frame
, and each Frame
can only be mapped by a single Page
, system-wide.
- A memory region must be unmapped exactly once, only after no outstanding references to it remain.
- A memory region must not be accessible beyond its bounds.
- A memory region can only be referenced as mutable or executable if mapped as such.
These invariants integrate nicely with Rust's existing memory safety rules, such as preventing multiple invalid aliases (aliasing XOR mutability), out-of-bounds access, use after free and double free, and forbidden mutable access.
How to use MappedPages
A key aspect of MappedPages
is its "access methods" that allow callers to safely reinterpret the underlying mapped memory region as a particular type.
Reinterpreting untyped memory is a crucial feature for any memory management subsystem; Theseus provides fully-safe interfaces to do so, while existing OSes do not.
Reinterpretive casting is sometimes also referred to as "struct overlay", as you're overlaying a struct on top of an existing memory region.
Access method name | Return type | Description |
---|---|---|
as_type() | &T | returns an immutable reference to a generic type T starting at a particular offset into the memory region. |
as_type_mut() | &mut T | same as as_type() , but returns a mutable reference. |
as_slice() | &[T] | returns a reference to a slice (dynamic-sized array) of N elements of type T , starting at a particular offset. |
as_slice_mut() | &mut [T] | same as as_slice() , but returns a mutable reference to a slice of type T . |
LoadedSection::as_func() | & <impl Fn(...)> | returns a reference to a function that exists within a LoadedSection 's memory region, which must be an executable .text section. |
These access methods ensure the aforementioned invariants of the MappedPages
type.
- The size of the generic type
T
, which must be known at compile time (T: Sized
), plus the offset must not exceed the bounds of the memory region.- The same is true for slices: the number of elements of a sized type
T: Sized
plus the offset must not exceed the region's bounds.
- If a mutable reference is requested, the underlying memory region must have been mapped as writable.
- The same is true for functions and executable memory regions.
- These methods all return references to the requested type or slice, in which the lifetime of the returned reference (
&T
) is dependent upon the lifetime of the MappedPages
object, in order to statically prevent use-after-free errors.- One cannot obtain an owned instance of a type
T
from an underlying MappedPages
memory region, because that would remove the semantic connection between the type T
and the existence of the underlying memory mapping.
In comparison, other OSes typically return raw virtual address values from a memory mapping operation, which you must then unsafely cast to a typed pointer of your choice. With raw addresses, there is no lifetime guarantee to ensure that the mapping persists for as long as those virtual addresses are used. As such, Theseus removes at compile time the potential to easily cause unsafe, undefined behavior by using a raw virtual address after it has been unmapped.
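For example, here is a hedged sketch of overlaying a struct onto a mapped MMIO region via as_type(); the struct, the offset, and the error type are assumptions made purely for illustration.
#[repr(C)]
struct DeviceRegisters {
    control: u32,
    status: u32,
}
fn read_status(mmio: &MappedPages) -> Result<u32, &'static str> {
    // Overlay DeviceRegisters at offset 0 of the mapped region.
    // The returned reference borrows from `mmio`, so it cannot outlive the mapping.
    let regs: &DeviceRegisters = mmio.as_type(0)?;
    Ok(regs.status)
}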
For more details, see the Theseus paper from OSDI 2020, or Kevin Boos's dissertation, both available here.
The MappedPages
type also exposes other convenient utility methods:
remap()
: change the permissions flags for the virtual pages, which are still mapped to the same physical frames.
merge()
: combine multiple contiguous MappedPages
objects into one.
unmap_into_parts()
: unmap the memory region and re-take ownership of its constituent parts (AllocatedPages
, etc) for future use.- Without calling this, a
MappedPages
object will be auto-unmapped upon drop and its constituent parts deallocated for future use, but that will happen behind the scenes without you being able to directly access them.
- You can also call any of the methods from
PageRange
as well.
Deallocating frames
Deallocating virtual pages is easy because the range of AllocatedPages
is directly stored in and owned by a MappedPages
object, so it is simply a matter of deallocating them once they are dropped.
However, deallocating a range of AllocatedFrames
is much more difficult because each page in a range of virtually-contiguous pages may be mapped to a different, non-contiguous frame.
This means we may have to deallocate many sets of AllocatedFrames
, up to one per page.
In existing OSes, there is no way to easily and immediately determine which frames are mapped to which virtual pages; this requires a reverse mapping from 1
frame to N
pages, which is prohibitively expensive to maintain.
As such, OS kernels typically run a periodic "garbage collection" thread on idle CPUs that sweeps the page tables to determine which frames can be reclaimed.
However, Theseus's design vastly simplifies the procedure of reclaiming unused physical frames for deallocation. The single address space design and guaranteed bijective (1-to-1) mappings mean that a frame is mapped exclusively by a single page; when that page is no longer mapped to that frame, the frame can be deallocated. We refer to this as exclusive mappings, and they are realized via a combination of several crates:
- When frame(s) are unmapped in the
page_table_entry
crate, it creates an UnmapResult
that may contain a set of UnmappedFrames
.- The primary function of interest is
PageTableEntry::set_unmapped()
.
- Using strong type safety, the
frame_allocator
is able to accept a set of UnmappedFrames
as a trusted "token" stating that the included frames cannot possibly still be mapped by any pages. It can therefore safely deallocate them.- Deallocation occurs seamlessly because an
UnmappedFrames
object can be converted into an AllocatedFrames
object, see here for details and source.
Heap: dynamic memory allocation
Heaps are used to dynamically allocate chunks of memory smaller than the granularity of one page.
Note: One can request a large allocation from the heap, but in Theseus it will be backed by an individually-created
MappedPages
object of newly-allocated pages and frames that are mapped to one another, so it's generally less efficient to use the heap for large allocations.
In Theseus, the primary purpose of the heap is to enable the usage of Rust's alloc
types, e.g., Box
, Arc
, Vec
, and other dynamically-allocated collections types.
Heap allocators must implement Rust's GlobalAlloc
trait in order to be used as the backing allocator behind these alloc
types.
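To make the shape of that interface concrete, below is an illustrative, never-freeing bump allocator over a fixed arena. This is only a hedged sketch of what implementing GlobalAlloc entails; it is not Theseus's actual heap, block_allocator, or multiple_heaps implementation, and every name in it is invented for the example.
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicUsize, Ordering};

const ARENA_SIZE: usize = 64 * 1024;

// Illustrative only: hands out memory from a fixed arena and never reclaims it.
struct BumpHeap {
    arena: UnsafeCell<[u8; ARENA_SIZE]>,
    next: AtomicUsize, // offset of the next free byte within `arena`
}

// Sound to share across CPUs because `alloc` only performs atomic updates on `next`.
unsafe impl Sync for BumpHeap {}

unsafe impl GlobalAlloc for BumpHeap {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.arena.get() as usize;
        let mut offset = self.next.load(Ordering::Relaxed);
        loop {
            // Round the candidate start address up to the requested alignment.
            let start = (base + offset + layout.align() - 1) & !(layout.align() - 1);
            let end = start + layout.size();
            if end > base + ARENA_SIZE {
                return core::ptr::null_mut(); // arena exhausted; allocation fails
            }
            match self.next.compare_exchange_weak(offset, end - base, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return start as *mut u8,
                Err(actual) => offset = actual, // lost a race with another CPU; retry
            }
        }
    }
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never reclaims memory; a real heap returns blocks to its free lists.
    }
}

// Registering an allocator like this (done once, by the real `heap` crate in Theseus)
// is what makes Box, Arc, Vec, etc. usable in a no_std kernel:
// #[global_allocator]
// static GLOBAL_HEAP: BumpHeap = BumpHeap {
//     arena: UnsafeCell::new([0u8; ARENA_SIZE]),
//     next: AtomicUsize::new(0),
// };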
Overview of Relevant Crates
heap
: the default heap implementation that offers a static singleton fixed-size block allocator.- This is the first heap initialized and created during early OS boot.
block_allocator
: the underlying allocator implementation that optimizes allocations of common power-of-two sizes, e.g., 8 bytes, 32 bytes, etc.- Uses the
linked_list_allocator
crate as a fallback for uncommon allocation sizes.
multiple_heaps
: a more complex allocator that implements multiple heaps of arbitrary sizes and usage patterns.- Each internal heap instance is based on a zone allocator, a modified version of the slab allocator from the
slabmalloc
crate. - Unused heap space can easily be transferred among different internal heap instances for rapid, efficient heap growth.
- Currently, one internal heap is created for each CPU core, with the core ID being used to identify and select which heap should be used for allocation.
- It is trivially easy to use
multiple_heaps
in a different way, such as per-task heaps or per-namespace heaps.
One unique aspect of Theseus's "combination" heap design is that the early heap, fully-featured heap, and per-core dedicated heaps are all combined into a single heap abstraction that can be accessed via a singleton global heap instance.
It starts out with the simple block allocator described above, and then, once more key system functionality has been initialized during OS boot, the switch_to_multiple_heaps()
function is invoked to transparently activate the more complex, per-core heap allocators.
Another unique aspect of heaps in Theseus is that all entities across the system use and share the same set of global heaps. This allows allocations to seamlessly flow and be passed among applications, libraries, and kernel entities without the need for inefficient and complex exchange heaps used in other SAS OSes.
Note: Theseus's combination heap design was implemented before Rust's
alloc
types supported non-global allocators and placement constructors. We haven't yet investigated how to break these heap instances down into individual allocators that can be used with specific allocation placement functions like
Box::new_in()
. If you're interested in working on this, please file an issue on GitHub or otherwise contact the Theseus maintainers.
Tasking Subsystem in Theseus
The tasking subsystem in Theseus implements full support for multitasking, in which multiple different tasks can execute concurrently1 atop a single set of shared resources, i.e., CPUs and memory.
Because Theseus is a single address space (SAS) OS, it does not have a dedicated address space for each task, and thus does not follow the classic POSIX/Unix-like "process" abstraction where a process is a group of threads that all execute the same program in the same address space. One could consider tasks in Theseus to be the same as threads in other systems; the terms "task" and "thread" can be used interchangeably. One could also consider the entirety of Theseus to be a single "process" in that all Tasks execute within the same address space, but the similarities between a process and Theseus end there.
In general, the interfaces exposed by the task management subsystem in Theseus follow the Rust standard library's model for threading, with several similarities:
- You can spawn (create) a new task with a function or a closure as the entry point.
- You can customize a new task using a convenient builder pattern.
- You can wait on a task to exit by
join
ing it. - You can use any standard synchronization types for inter-task communication, e.g., shared memory or channels.
- You can catch the action of stack unwinding after a panic or exception occurs in a task.
In this way, tasks in Theseus are effectively a combination of the concept of language-level green threads and OS-level native threads.
The Task
struct
There is one instance of the Task
struct for each task that currently exists in the system.
A task is often thought of as an execution context, and the task struct includes key information about the execution of a given program's code.
Compared to that of other OSes, the Task
struct in Theseus is quite minimal in size and scope,
because our state management philosophy strives to keep only states relevant to a given subsystem in that subsystem.
For example, scheduler-related states are not present in Theseus's task struct; rather, they are found in the relevant scheduler crate in which they are used.
In other words, Theseus's task struct is not monolithic and all-encompassing.
Theseus's task struct includes several key items:
- The actual
Context
of the task's on-CPU execution.- This holds the values of CPU registers (e.g., the stack pointer and other registers) that are saved and restored when context switching between this task and others.
- The name and unique ID of that task.
- These are used primarily for human readability and debugging purposes.
- The runnability (
RunState
) of that task, i.e., whether it can be scheduled in, and its current running status.- This includes the task's exit value, if it has exited.
- The namespace that task is running within; see
CrateNamespace
.- Running tasks in different
CrateNamespace
s is one way to partially mimic the separation offered by the standard process abstraction; see here for more info.
- The task's stack.
- The task's environment, e.g., its current working directory.
- A variety of states related to handling execution failures and the cleanup of a failed task.
- These are used for fault tolerance purposes like unwinding, task cleanup, and auto-restarting critical tasks upon failure.
Note: the source-level documentation for the
Task
struct describes each member field of the Task struct in much greater detail.
The Task struct itself is split into two main parts:
- Immutable state: things that cannot change after the initial creation of the Task.
- Mutable state: things that can change over the lifetime of the Task.
The immutable states are contained in the Task
struct itself, while the mutable states are contained within the TaskInner
struct.
Each Task struct contains a TaskInner
instance protected by a lock.
This design serves to significantly reduce locking overhead and contention when accessing the states of a Task in a read-only manner.
Note that certain mutable states, primarily runstate information, have been moved into Task
itself because they can be safely modified atomically, but in general, all other mutable states are placed within TaskInner
.
We are also careful to correctly specify the public visibility of state items within the Task
and TaskInner
structures to ensure that they cannot be modified maliciously or accidentally by other crates.
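A hedged structural sketch of this split follows; the field names and types are illustrative placeholders rather than Theseus's exact definitions, and the lock assumes a spinlock-style Mutex such as the spin crate's.
use core::sync::atomic::AtomicU8;
use spin::Mutex;

// Placeholder types so the sketch is self-contained.
struct Context;   // saved callee-saved registers, stack pointer, etc.
struct ExitValue; // completion value or kill reason

pub struct Task {
    // Immutable after creation:
    id: usize,
    name: &'static str,
    // Frequently-read mutable state, kept lock-free via atomics:
    runstate: AtomicU8, // encodes Initializing / Runnable / Blocked / Exited / Reaped
    // Everything else that mutates lives behind a single lock:
    inner: Mutex<TaskInner>,
}

struct TaskInner {
    saved_context: Option<Context>,
    exit_value: Option<ExitValue>,
    // ... stack, environment, fault-handling state, etc.
}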
The TaskRef
type
Tasks frequently need to be shared across many entities and subsystems throughout Theseus.
To accommodate this, Theseus offers the TaskRef
type, which is effectively a shared reference to a Task
, i.e., an Arc<Task>
.
There are several reasons that we introduce a dedicated newtype
instead of using an Arc<Task>
directly:
- To clarify all code related to task management.
- To control the visibility (public vs. private) of items in the Task struct.
- To guarantee that the task-local data area (per-task data) is properly set up when a new Task is created.
- A circular reference from a
Task
to its enclosing TaskRef
is necessary to realize task-local data; see the TaskLocalData
type for more details.
- To expose a limited set of functions that allow foreign crates to modify or query only certain task states, e.g.,
join
ing a task. - To greatly optimize comparison and equality tests between two Tasks.
- Two
TaskRef
s are considered equal if they point to the same underlying Task
struct. - This avoids having to apply comparison tests against all fields in the
Task
struct, a very expensive operation.
- To prevent other entities from obtaining direct access to a
Task
struct, or wrapping a Task
in a non-Arc
shared pointer type.
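Reusing the Task sketch from the previous section, the newtype and pointer-based equality described above might look roughly like this (a hedged sketch, not Theseus's exact definition):
use alloc::sync::Arc;

#[derive(Clone)]
pub struct TaskRef(Arc<Task>);

impl PartialEq for TaskRef {
    fn eq(&self, other: &Self) -> bool {
        // Equal iff both refer to the very same Task allocation;
        // no field-by-field comparison of the whole struct is needed.
        Arc::ptr_eq(&self.0, &other.0)
    }
}
impl Eq for TaskRef {}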
Note: while
TaskLocalData
offers a very basic form of per-task data, commonly known as Thread-Local Storage (TLS), full support for standard compiler-known TLS areas in the generated object code is a work in progress.
The global task list
Like all other OSes, Theseus maintains a global list of all tasks in the system.
Currently, this task list is stored as a map from a numeric task ID to a TaskRef
.
Tasks are added to the task list when they are initially spawned, and will remain in the task list for the entirety of their lifecycle. It is important to note that the presence of a task in the task list is not indicative of that task's runnability or execution status. A task is only removed from the task list once it has been reaped, i.e., it has completely exited and its exit value has been "taken" by another task; for example, a "parent" task may reap a "child" task that it has spawned.
Context switching
In OS terminology, the term "context switch" is often incorrectly overloaded and casually used to refer to any number of related topics:
- Switching from one thread to another thread in the same address space.
- Switching from one thread to another thread in a different address space.
- Switching from user mode to kernel mode (e.g., Ring 3 to Ring 0) during a system call.
- Switching from a user thread to a kernel thread upon an interrupt being triggered.
Only number 1 above is what we consider to be a true context switch, and that is what we refer to here when we say "context switch." Number 2 above is an address space switch, e.g., switching page tables, a different action that could potentially occur when switching to a different task, if the next task is in a different process/address space than the current task. Number 3 above is a mode switch that generally does not result in the full execution context being saved or restored; only some registers may be pushed or popped onto the stack, depending on the calling convention employed by a given platform's system calls. Number 4 above is similar to number 3, but is triggered by the hardware rather than userspace software, so more execution context states may need to be saved/restored.
One key aspect of context switching is that it is transparent to the actual code that is currently executing, as the lower layers of the OS kernel will save and restore execution context as needed before resuming. Thus, context switching (with preemption) allows for multiple untrusted and uncooperative tasks to transparently share the CPU, while maintaining the idealistic model that they each are the only task executing in the entire system and have exclusive access to the CPU.
Implementing context switching
The implementation for context switching in Theseus is split across several crates, with each crate corresponding to a particular subset of SIMD instructions being enabled.
The top-level context_switch
crate automatically selects the correct version based on which subset of SIMD functionality is chosen by the target hardware platform's specification.
For example, if SSE2 was enabled, #[cfg(target_feature = "sse2")]
would be true and the context_switch_sse2
crate would be used as the context switching implementation.
Currently, one can select this target by using the x86_64-unknown-theseus-sse
target while building Theseus:
make run TARGET=x86_64-unknown-theseus-sse
Theseus supports both SSE2 and AVX, but its default target x86_64-unknown-theseus
disables both.
This tells the compiler to generate soft floating-point instructions instead of SIMD instructions, meaning that the SIMD register set is not used at all.
Thus, disabling SIMD results in the simplest and fastest version of context switching, as only the basic set of general-purpose CPU registers must be saved and restored; all SIMD registers can be ignored.
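A hedged sketch of how the top-level crate can re-export one implementation based on target features; crate names other than context_switch_sse2 are illustrative assumptions.
// In the top-level `context_switch` crate:
#[cfg(target_feature = "avx")]
pub use context_switch_avx::*;
#[cfg(all(target_feature = "sse2", not(target_feature = "avx")))]
pub use context_switch_sse2::*;
#[cfg(not(target_feature = "sse2"))]
pub use context_switch_regular::*;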
Context switching is inherently an unsafe operation, hence why the standard context_switch()
function is marked unsafe.
It must be implemented in assembly in order to ensure that the compiler doesn't insert any instructions that modify register values in between our instructions that save/restore them.
It is only invoked by the task_switch()
function, as only that function has the proper information — saved register values and the destination for the restored register values — to correctly invoke it.
Pre-emptive vs. Cooperative Multitasking
Theseus implements full support for preemptive multitasking, in which a given task is interrupted at a certain periodic time interval in order to allow other tasks to take over execution.
This prevents one greedy task from hogging all system resources and/or starving other tasks from execution time.
In Theseus, like most other systems, we implement this using a timer interrupt that fires every few milliseconds on each CPU core.
The length of this time period is called the timeslice, and is a configurable setting.
Currently, this timer interrupt is set up in the LocalApic
initialization routine, which runs when each CPU core is discovered and initialized.
The interrupt handler itself is very simple, and is currently found in a function called lapic_timer_handler()
in kernel/interrupts/src/lib.rs
.
Cooperative multitasking is also possible, but at the moment Theseus does not offer an easy way to disable preemption to use only cooperative multitasking; however, it wouldn't be difficult to add.
Currently, tasks can choose to yield the processor to other tasks by invoking schedule()
, which will select another task to run next and then switch to it.
This scheduling function is the same function that is invoked by the aforementioned timer interrupt handlers to preempt the current task every few milliseconds.
The Task lifecycle
Theseus tasks follow a typical task lifecycle, which is in-part demonstrated by the possible variants of the RunState
enum.
- Initializing: the task is being created.
- After the task is finished being created and is fully spawned, its runstate will be set to Runnable.
- Runnable: the task is able to be scheduled in (but is not necessarily currently executing).
- A runnable task may be blocked to prevent it from being scheduled in until it is subsequently unblocked.
Note: "running" and "runnable" are not the same thing.
- A task is considered runnable after it has spawned and before it has exited, as long as it is not blocked.
- A task is only said to be running while it is currently executing on a given CPU core, and is merely considered runnable when it is waiting to be scheduled in whilst other tasks execute.
- Blocked: the task is blocked on some other condition and is not able to be scheduled in.
- A blocked task may be unblocked to mark it as runnable again.
- Exited: the task is no longer executing and will never execute again.
- An exited task has an
ExitValue
, which is one of:- Completed -- the task ran to completion normally and finished as expected.
- Killed -- the task stopped executing prematurely as a result of a crash (language-level panic or machine-level exception) or as a result of a kill request.
- An exited task must not be cleaned up until it has been "reaped" (see below).
- Reaped: the task has exited and its
ExitValue
has been taken by another Task
.- A task is cleaned up and removed from the system once it has been reaped.
- An exited task will be auto-reaped by the system if no other task is waiting on it to exit.
See
JoinableTaskRef
for more info on how this works.
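A hedged sketch of these lifecycle states as a Rust enum; the naming and payload details of the real RunState and ExitValue types may differ.
use alloc::boxed::Box;
use core::any::Any;

pub enum RunState {
    Initializing,      // being created; cannot run yet
    Runnable,          // can be scheduled in (not necessarily running right now)
    Blocked,           // waiting on some condition; cannot be scheduled in
    Exited(ExitValue), // finished; holds the exit value until reaped
    Reaped,            // exit value has been taken; the task can be fully cleaned up
}

pub enum ExitValue {
    Completed(Box<dyn Any + Send>), // the entry function's return value
    Killed(KillReason),             // why the task was prematurely stopped
}

pub struct KillReason; // placeholder for panic / exception / kill-request details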
Spawning new Tasks
The functionality to spawn (create) a new task is implemented in a separate spawn
crate.
Theseus provides a TaskBuilder
interface to allow one to customize the creation of a new task, which starts with a call to new_task_builder()
.
The caller must pass a function and argument when spawning a task: the function is the entry point for the task, and the argument will be passed to that entry point function.
Once the task builder has been used to suitably customize the new task, one must invoke the spawn()
method on the TaskBuilder
object to actually create the new Task
instance and add it to one or more runqueues.
This does not immediately execute the task, but rather only makes it eligible to be scheduled in during the next task switch period.
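A hedged usage sketch of this flow; new_task_builder and spawn are named in this chapter, while the name method, the error type, and the join call are assumptions about the API.
use alloc::string::String;

fn spawn_example() -> Result<(), &'static str> {
    let worker_task = spawn::new_task_builder(worker, 21usize)
        .name(String::from("worker"))
        .spawn()?; // the task is now Runnable, but not yet running
    // Later, another task can wait for it and take its exit value:
    // let exit_value = worker_task.join()?;
    let _ = worker_task;
    Ok(())
}

fn worker(arg: usize) -> usize {
    // the entry function receives the argument passed to new_task_builder
    arg * 2
}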
The function passed into the spawn routine is actually not the first function to run when the task is first switched to.
All new tasks have the same entry point, the task_wrapper()
function, which simplifies the procedure of jumping into a new task:
- Setting up the proper stack contents
- Invoking the task's entry function with the proper argument
- Catching panics and exceptions during unwinding
- Handling the task's states after it has exited, whether a completion or failure.
Cleaning up Tasks
The Task
struct implements Rust's Drop
trait, meaning that once all references to a Task have ended, the Task object itself can be dropped and cleaned up.
Because Tasks are more complex than most structures, the drop handler alone isn't sufficient to properly clean up and remove all traces of that task and the effects that it has had on the system.
The clean up procedure begins after a task exits, either by running to completion or being killed upon request or after encountering a panic or exception.
- If the task ran to completion, the
task_wrapper()
will call task_cleanup_success()
, a function that marks the task as having successfully exited and stores the returned ExitValue
in its Task
struct. - If the task did not finish, the
task_wrapper()
will call task_cleanup_failure()
, a function that marks the task as having encountered a failure, and stores the reason that it was prematurely killed in its Task
struct.
After handling the successful or failed exit condition, the last piece of the task lifecycle is the task_cleanup_final()
function, which removes the task from any runqueues it may be on, drops the final reference to that task, and then re-enables interrupts and yields the processor.
That final task reference, when dropped, triggers the aforementioned drop handler for the Task struct, which automatically releases all of its acquired states and allocated resources.
Note that the procedure of stack unwinding accomplishes the release of most resources and allocations, since those are represented by owned values that exist on the program's stack.
Note: there are a separate set of similar task lifecycle functions for critical system tasks that were spawned as restartable. The primary difference is that after the cleanup actions are completed, a new task is spawned with the same entry point and initial argument as the failed task.
Multitasking on a uniprocessor machine (with a single CPU core) is not truly concurrent, it just appears to be concurrent over a given time interval because the execution of multiple tasks is quickly interleaved. True concurrency can be achieved on a multiprocessor machine, in which different tasks execute simultaneously on different cores.
Invariants Upheld in Task Management
Theseus enforces several key invariants related to task management in order to empower the compiler to uphold memory safety and prevent resource leaks throughout the task lifecycle.
All task lifecycle functions leverage the same set of generic type parameters.
The trait bounds on these three type parameters <F, A, R>
are a key aspect of task-related invariants.
F
: the type of the entry function, i.e., its signature including arguments and return type.A
: the type of the single argument passed into the entry functionF
.R
: the return type of the entry functionF
.
Invariant 1. Spawning a new task must not violate memory safety.
Rust already ensures this for multiple concurrent userspace threads, as long as they were created using its standard library thread type.
Instead of using the standard library, Theseus provides its own task abstraction, overcoming the standard library’s need
to extralingually accommodate unsafe, platform-specific thread interfaces, e.g. fork()
.
Theseus does not offer fork because it is known to be unsafe and unsuitable for SAS systems1,
as it extralingually duplicates task context, states, and underlying memory regions without reflecting that aliasing at the language level.
Theseus’s task abstraction preserves safety similarly to and as an extension of Rust threads.
The interfaces to spawn a new task (in the spawn crate) require specifying the exact type of the entry
function F
, its argument A
, and its return type R
, with the following constraints:
- The entry function
F
must be runnable only once, meaning it must satisfy the FnOnce
trait bound. - The argument type
A
and return type R
must be safe to transfer between threads, meaning they must satisfy the Send
trait bound. - The lifetime of all three types must outlast the duration of the task itself, meaning they must have a
'static
lifetime.
All task lifecycle management functions are fully type-safe and parameterized with the same generic type parameters, <F,A,R>
.
This ensures both the compile-time availability of type information and the provenance of that type information from head to tail (spawn to cleanup) across all stages in the task's lifecycle.
Theseus thus empowers the compiler to statically prevent invalidly-typed task entry functions with arguments, return values, or execution semantics that violate type safety or memory safety.
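A hedged sketch of a spawn-like signature carrying these bounds; the real functions in the spawn crate have richer signatures and behavior, but share the same constraints on F, A, and R.
fn spawn_task<F, A, R>(entry: F, arg: A) -> Result<(), &'static str>
where
    F: FnOnce(A) -> R + Send + 'static, // runnable exactly once, sendable to another CPU, outlives the task
    A: Send + 'static,                  // the argument crosses into the new task
    R: Send + 'static,                  // the return value crosses back out as the exit value
{
    // ... create the Task, set up its stack so that `entry(arg)` runs inside the
    // task wrapper, and add it to a runqueue ...
    let _ = (entry, arg);
    Ok(())
}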
Invariant 2: All task states must be released in all possible execution paths.
Releasing task states requires special consideration beyond simply dropping a Task object to prevent resource leakage, as mentioned in the previous chapter. There are several examples of how the multiple stages of task cleanup each permit varying levels of resource release:
- The task's stack is used during unwinding and can only be cleaned up once unwinding is complete.
- The saved execution
Context
can be released when a task has exited. - A task's runstate and exit value must persist until it has been reaped.
The task cleanup functions described in the previous chapter demonstrate the lengths to which Theseus goes to ensure that task states and resources are fully released in both normal and exceptional execution paths.
In addition, as mentioned above, all cleanup functions are parameterized with the same <F, A, R>
generic type parameters,
which is crucial for realizing restartable tasks because the failure handler for a restartable task must know its specific type parameters for the entry function, argument, and return type in order to re-spawn a new instance of the failed task.
Invariant 3: All memory transitively reachable from a task’s entry function must outlive that task.
Although all memory regions in Theseus are represented by MappedPages
, which prevents use-after-free via lifetime invariants,
it is difficult to use Rust lifetimes to sufficiently express the relationship between a task and arbitrary memory regions it accesses.
The Rust language does not and cannot specify a task-based lifetime, e.g., &'task
, to indicate that the lifetime of a given reference is tied to the lifetime of the current task.
Furthermore, a Rust program running as a task cannot specify in its code that its variables bound to objects in memory are tied to the lifetime of an underlying MappedPages
instance, as they are hidden beneath abstractions like stacks, heaps, or static program sections (.text, .rodata, etc).
Even if possible, this would be highly unergonomic, inconvenient, and render ownership useless.
For example, all local stack variables would need to be defined as borrowed references with lifetimes derived from that of the MappedPages
object representing the stack.
Thus, to uphold this invariant, we instead establish a clear chain of ownership:
each task owns the LoadedCrate
that contains its entry function,
and that LoadedCrate
owns any other LoadedCrate
s it depends on by means of the per-section dependencies in the crate metadata.
As such, the MappedPages
regions containing all functions and data reachable from a task’s entry function are guaranteed
to outlive that task itself.
This avoids the unsavory solution of littering lifetime constraints across all program variables, allowing Rust code to be written normally with the standard assumption that the
stack, heap, data, and text sections will always exist.
Andrew Baumann, Jonathan Appavoo, Orran Krieger, and Timothy Roscoe. "A fork() in the road". In Proceedings of HotOS, 2019.
Display Subsystem
Warning: the display subsystem in Theseus is in need of complete redesign. It is inefficient and poorly implemented, as it was simply a means to the end of being able to interact with the system, and unfortunately has not been a focus of any significant effort.
How the Window Manager works
Design
Typically, both the application that owns/creates a window and the window manager that controls that window need to access it jointly. The application needs to display its content into the main part of the window, and the window manager needs information about the location and depth ordering of all windows to render them.
To share a window between an application and the window manager, the application holds a strong reference (Arc
) to the window, while the window manager holds a weak reference (Weak
) to that same window. This allows the window manager to control and manage a window without conceptually owning it.
We use a Mutex
to wrap each window to allow the application task and window manager task to safely access it jointly. However, Mutex
introduces the possibility of deadlock: when an application wants to access its window, it must acquire the Mutex
lock, operate on the window, and then release the lock. If the application doesn't release the lock on its window, the window manager will be forced to block until the lock is released, preventing it from performing typical operations like switching between windows, delivering events, or deleting windows.
To solve this problem, we define two structures: Window
and WindowInner
. WindowInner
only contains the information required by the window manager. The window manager holds a list of references to WindowInner
objects, while only the application owns the outer Window
object (which itself does contain a reference to the underlying WM-owned WindowInner
object). The Window
struct also contains other application-relevant states that describe the window.
The WindowInner
structure
The window_inner
crate defines a WindowInner
structure. It holds the state and methods needed to display the window on the screen.
A WindowInner
has a framebuffer to which it displays the content of the window. The framebuffer is generic over the type of pixel it contains. When the window is rendered to the screen, a compositor may composite each pixel differently according to that pixel type. Currently, we have implemented a standard RGB pixel and a pixel with an alpha (transparency) channel.
Both an application's window and the window manager have a reference to the same WindowInner
object. The application can configure and draw in the framebuffer and the manager can display and composite the window with others.
This structure also has an event producer. The window manager gets events from I/O devices such as keyboards and pushes them to the corresponding producer.
Window
A Window
object represents a window and is owned by an application. It contains its profile, a title, a consumer and a list of displayables. The consumer can get events pushed to the producer in its profile by the manager.
A Window
provides methods to display the displayables in it and render itself to the screen. The window manager is responsible for compositing it with other windows through a framebuffer compositor.
Displayables
The displayable
crate defines a Displayable
trait. A Displayable
is an item which can display itself onto a framebuffer. It usually consists of basic graphical elements and acts as a component of a window, such as a button or a text box. Currently, we have implemented a TextDisplay
which is a block of text. In the future, we will implement other kinds of displayables.
An application can own multiple displayables and display any type of Displayable
in its window.
The WindowManager
The window_manager
crate defines a WindowManager
structure. This structure keeps track of the active window, a list of shown windows, and a list of hidden windows; the hidden ones are completely overlapped by other windows. The structure implements basic methods to manipulate these lists, such as adding or deleting a window.
The WindowManager
structure contains a bottom framebuffer, which represents the background image, and a top (final) framebuffer, which holds the floating window border and the mouse cursor. When refreshing an area, it renders the framebuffers in the order: background -> hidden list -> shown list -> active -> top. It provides several methods to update only a rectangular area or a few pixels for better performance.
The structure defines a loop for generic events, a loop for keyboard events, and a loop for mouse events. Theseus initializes them as tasks to handle inputs. The window manager structure provides methods to operate on the window list in reaction to these inputs. It can move a window when we drag it with the mouse, or pass other events to the active window. The owner application of the active window can then handle those events.
The window_manager
crate owns a WINDOW_MANAGER
instance which contains all the existing windows. It invokes the methods of WindowManager
to manage these windows.
How to Create Windows and Display Content
Create a Window
An application invokes the Window::new()
function in the window
crate to create a new window. The function creates a new Window
object and adds a weak reference to its WindowInner
to the WINDOW_MANAGER
instance in window_manager
. It then returns the window to the application. Once the application terminates, the window it owns is dropped automatically, and the weak reference held by the window manager is removed.
Display in a Window
An application can create a Displayable
and invoke Window.display()
to display it. This method is generic and works for all kinds of displayables.
After displaying a displayable in its framebuffer, the window invokes its render()
method to render the updates to the screen. A framebuffer compositor will composite a list of framebuffers and forward the result to a final framebuffer which is mapped to the screen.
Handle Key Inputs
An application invokes Window.handle_event()
to handle the events sent to it. For example, an active window will receive all the key input events. An application can invoke Window.handle_event()
in a loop to handle these inputs from the keyboard.
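Putting the above together, an application's use of the window crate might look roughly like the following. Window::new, display, render, and handle_event are named in this chapter, but their exact signatures (constructor parameters, return and error types) are assumptions made for illustration only.
fn gui_app() -> Result<(), &'static str> {
    // Create a window; the window manager keeps only a weak reference to it.
    // (The real constructor likely takes a position, size, and other parameters.)
    let mut window = Window::new()?;
    // Draw a displayable into the window's framebuffer, then render the update to the screen.
    let mut text = TextDisplay::new()?;
    window.display(&mut text)?;
    window.render()?;
    // Handle events (e.g., key presses) delivered to this window while it is active.
    loop {
        window.handle_event()?;
    }
}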
Running Theseus on Virtual or Real Hardware
We have tested that Theseus runs properly in a variety of environments, currently on x86_64 only:
- Virtual machine emulators: QEMU, bochs, VirtualBox, VMware Workstation Player.
- Real hardware: Intel NUC devices, Supermicro servers, various Thinkpad laptops, PCs with Gigabyte motherboards.
Currently, the primary limiting factor is that the device must support booting via USB or PXE using traditional BIOS rather than UEFI; support for UEFI is a work in progress.
Note that as Theseus is not fully mature, booting on your own hardware is done at your own risk. Be sure that you have a backup of all of your important files before doing so.
If you experience a problem booting Theseus on any virtual or real hardware platform, please take a look at the open issues on GitHub to see if someone has already reported your problem or attempted to fix it. If so, leave a comment describing your experience or open a new issue to help the Theseus developers work towards supporting your hardware environment!
Running Theseus in a Virtual Machine
Using a virtual machine emulator is by far the easiest way to develop, test, and run Theseus.
QEMU
Our primary test environment and recommended emulator is QEMU, which is the default choice for running Theseus using our built-in Makefile commands.
For example, the make run
target automatically runs Theseus in a QEMU virtual machine after it finishes the build process.
The top-level Makefile specifies the configuration parameters for Theseus in QEMU, such as system memory, attached storage devices, serial log output, and more.
All of these parameters start with QEMU_
and can be overridden on the command line, or indirectly by setting environment variables such as net
or host
, or by editing the Makefile itself.
Bochs
In older versions of Theseus, we used both Bochs and QEMU for testing. Bochs is supported but its configuration may be out of date; the configuration is found in the bochsrc.txt
(direct link) file in the root repository directory.
Bochs runs quite slowly and supports virtualization of far fewer hardware devices than QEMU; thus, we do not recommend using it. However, you can try running Theseus in Bochs using the Makefile target for it:
make bochs
VMware Workstation Player
We have tested Theseus on VMWare Workstation and it generally works out of the box. However, there are some options that you may wish to enable to improve performance and offer access to more devices in Theseus.
First, download VMware Workstation Player here, which can be installed and used for free for non-commercial work.
On Linux, you will download a .bundle
file, which then needs to be executed in a terminal. For example:
chmod +x VMware-Player-<...>.bundle
sudo ./VMware-Player-<...>.bundle
After opening VMware Workstation Player, do the following:
- Click
Create A New Virtual Machine
. - In the New Virtual Machine Wizard window, choose
Use ISO image:
and browse to select the Theseus ISO image, which should be located in thebuild
directory, e.g.,build/theseus-x86_64.iso
. Then, clickNext
. - Under
Guest Operating System
, choose theOther
button and thenOther 64-bit
from the drop-down menu. Then, clickNext
. - Set the
Name:
field to "Theseus" or whatever you prefer. ClickNext
. - Disk size doesn't matter; click
Next
. - Click
Customize Hardware
, and then select the following settings:- 512MB of memory (less may work, but the minimum is on the order of 10-20 MB).
- 2 or more processor cores.
- Select
Virtualize CPU performance counters
if you want to use them (not required). - If you want to obtain Theseus's log output, then you need to add a serial port connection:
- Click
Add...
on the bottom left to add a new hardware type, thenSerial Port
. - Under
Connection
, selectUse output file:
and then choose a destination file name for the serial log to be written to. For example,/home/your_user/theseus_vmware.log
. - Click
Save
.
- Click
- Click
Finish
, and thenClose
.
Theseus should boot up after a few seconds. You can view the serial log output by cat
ting or opening the file:
cat /home/your_user/theseus_vmware.log
VirtualBox
We have tested Theseus on VirtualBox and it generally works out of the box. However, there are some options that you may wish to enable to improve performance and offer access to more devices in Theseus.
First, download VirtualBox here and install it on your system. On Ubuntu and other Debian-based Linux distributions, you will download a .deb
file that you can open in the Software Installer or install on the command line like so:
sudo dpkg -i virtualbox-<...>.deb
After opening VirtualBox, do the following:
- Click
New
. - In the Create Virtual Machine window, set
Type
toOther
andVersion
toOther/Unknown (64-bit)
, choose a name, and then clickNext
. - In the next window, choose 512MB of memory (less may work, but the minimum is on the order of 10-20 MB).
- Continue clicking next through all of the storage disk options, those do not matter.
- Back at the main window, right click on your new Theseus machine and choose
Settings
. - In the left sidebar, click
Storage
and then select the💿 Empty
option to choose an image for the optical disk.
Click on the💿▾
button on the right side of theOptical Drive:
option, selectChoose a disk file
, and then navigate to the Theseus ISO image in thebuild/
directory, e.g.,build/theseus-x86_64.iso
. - Under
System
in the left sidebar, go to theProcessor
tab and select 2 (or more) processors. - If you want to obtain Theseus's log output, then you need to add a serial port connection:
- Click
Serial Ports
in the left sidebar, under thePort 1
tab, select theEnable Serial Port
checkbox. - Under the
Port Mode
drop-down menu, selectRaw File
option. - In the
Path/Address
text box, type the destination file name for the serial log to be written to. For example,/home/your_user/theseus_vbox.log
. - Click
Ok
.
- Click
- In the main window, select the Theseus VM entry from the left sidebar and then click
Start
on the top bar.
Theseus should boot up after a few seconds. You can view the serial log output by cat
ting or opening the file:
cat /home/your_user/theseus_vbox.log
PCI passthrough of devices with QEMU
PCI passthrough can be used to allow a guest OS to directly access a physical device. The following instructions are a combination of this guide on host setup for VFIO passthrough devices and this kernel documentation on VFIO.
There are three main steps to prepare a device for PCI passthrough:
- Find device information
- Detach device from current driver
- Attach device to VFIO driver
Once these steps are completed, the device slot information can be passed to QEMU using the vfio flag. For example, for device 59:00.0, we run:
make run vfio=59:00.0
Finding device information
First, run lspci -vnn
to find the slot information, the kernel driver in use for the device, and the vendor ID and device code for the device you want to use.
Below is sample output for a Mellanox ethernet card we'd like to access using PCI passthrough:
59:00.0 Ethernet controller [0200]: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [15b3:1019]
Subsystem: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [15b3:0008]
Flags: bus master, fast devsel, latency 0, IRQ 719, NUMA node 1
Memory at 39bffe000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at bf200000 [disabled] [size=1M]
Capabilities: <access denied>
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core
Detach device from current driver
To detach the device from the kernel driver, run the following command, filling in the `slot_info` and `driver_name` with the values you retrieved in the previous step:
echo $slot_info > /sys/bus/pci/drivers/$driver_name/unbind
In the above example, this would look like:
echo 0000:59:00.0 > /sys/bus/pci/drivers/mlx5_core/unbind
If you run `lspci -v` now, you'll see that a kernel driver is no longer attached to this device.
Attach device to VFIO driver
First, load the VFIO driver by running:
modprobe vfio-pci
To attach the new driver, run the following command, filling in the `vendor_id` and `device_code` with the values you retrieved in the first step:
echo $vendor_id $device_code > /sys/bus/pci/drivers/vfio-pci/new_id
e.g. echo 15b3 1019 > /sys/bus/pci/drivers/vfio-pci/new_id
Now, QEMU can be launched with direct access to the device.
Return device to the Host OS
To reset the device, you can either reboot the system or return the device to the host OS using the following commands (replacing `$slot_info` with the value previously retrieved):
echo 1 > /sys/bus/pci/devices/$slot_info/remove
echo 1 > /sys/bus/pci/rescan
Note: access for unprivileged users
To give an unprivileged user access to this VFIO device, find the IOMMU group that the device belongs to:
readlink /sys/bus/pci/devices/$slot_info/iommu_group
for example:
readlink /sys/bus/pci/devices/0000:59:00.0/iommu_group
for which we obtain the output below, in which `74` is the group number:
../../../../kernel/iommu_groups/74
Finally, give access to the current user via this command:
chown $USER /dev/vfio/$group_number
Running Theseus on Headless Systems Interactively
By default, Theseus expects to run in a standard desktop environment with a basic graphical display (monitor) and a real keyboard (and optionally, a mouse). For interacting with the system, Theseus uses the keyboard as its primary input and the graphical display as its primary output.
Theseus can also run in headless mode, in which the "head" of the computer (its monitor and keyboard) is absent and I/O is done over the network or a serial port connection. This is useful for environments like servers, embedded microcontrollers, certain virtual machine setups, and anything else without a display.
The current version of Theseus listens for incoming connections on serial ports only (COM1 and COM2). Upon receiving data (like a key press) on a serial port, Theseus will spawn a terminal emulator to handle I/O on that port.
Note: headless interactive mode can coexist simultaneously with regular graphical display mode.
TODO: describe the various options for disabling hardware graphical displays
Connecting a Terminal Emulator to Theseus
Currently, we have only tested this with virtual serial ports connected to Theseus through a VMM like QEMU, but it should also work with real hardware serial ports.
While Theseus is running in a VM or on another physical machine, we recommend using a terminal emulator program on the host machine to connect to the serial port device on the Theseus machine. Examples include:
screen
picocom
minicom
By default, the Theseus Makefile starts QEMU with two serial ports, COM1 and COM2, that the host machine can connect to and exchange data over.
The first serial port, COM1, is connected to the `stdio` streams of the terminal that spawned the QEMU process.
This stream is used for the system log and for controlling QEMU, so it is best to use COM2 to interact with Theseus separately such that the headless virtual terminal is not polluted with log statements.
The second serial port, COM2, is connected to a dynamically-allocated pseudo-terminal (PTY) that will be allocated by QEMU. To connect to it, inspect the QEMU output when it first starts to find a line like this:
char device redirected to /dev/pts/3 (label serial1-base)
This tells you that QEMU connected the second serial port (COM2) on the guest Theseus VM to the Linux PTY device at `/dev/pts/3`.
Note that QEMU uses 0-based indexing for serial ports, so its "serial1" label refers to the second serial port, our "SERIAL2" (COM2 on x86).
Now, once QEMU is running, you can connect a terminal emulator on the host to the serial port in Theseus, and Theseus will present an interactive terminal over that connection. To do so, run any of the following commands and then press any key to start the terminal prompt:
screen /dev/pts/3
picocom /dev/pts/3
minicom -D /dev/pts/3
Note that some programs (namely `minicom`) do not necessarily send the expected value when you press the Backspace key. Thus, if you experience unexpected behavior when pressing Backspace, ensure that your program sends the ASCII `DEL` (0x7F) character rather than ASCII `BS` (0x08), which will only move the cursor to the left by one character.
In our experience, `screen` and `picocom` work as expected, but `minicom` does not.
To change `minicom`'s default behavior, you can do the following:
- Press `Ctrl + A` twice to open the meta control bar at the bottom of the screen.
- Press `T` to open the "Terminal Settings" menu.
- Press `B` to toggle the "Backspace key sends" setting.
  - You want this to be `DEL`, not `BS`.
Either of these serial ports can be changed in QEMU using the environment variables `SERIAL1` and `SERIAL2`, respectively, though again, we recommend only using `SERIAL2` in virtual environments.
On real hardware, where there is only one serial port and thus COM1 must be used, you can either disable the log or initialize it with a different serial port (e.g., COM2) to avoid polluting the terminal emulator connected to COM1 with system log printouts.
Booting Theseus from a USB drive
To boot over USB, simply run
make boot usb=sdc
in which `sdc` is the device node for the USB disk itself (not a partition like `sdc2`).
The OS image (.iso file) will be written to that USB drive.
On WSL or other host environments where `/dev` device nodes don't exist, you can simply run `make iso` and burn the `.iso` file in the `build/` directory to a USB drive. For example, on Windows we recommend using Rufus to burn ISOs.
Then, once the bootable USB is ready, plug it into your PC, restart or power on the machine, and choose that USB device from the BIOS or legacy boot device screen.
Booting Theseus on Real Hardware via PXE
The following instructions are a combination of this guide on OSTechNix to set up PXE for Ubuntu and this guide by Andrew Wells on how to use any ISO with PXE.
PXE can be used to load Theseus onto a target computer that is connected by LAN to the host machine used for development. To set up the host machine for PXE, first make the Theseus ISO by navigating to the Theseus directory and running:
make iso
Then, you will need to set up a TFTP and DHCP server which the test machine will access.
Setting up the TFTP Server
First, install all necessary packages and dependencies for TFTP:
sudo apt-get install apache2 tftpd-hpa inetutils-inetd nasm
Edit the tftp-hpa configuration file:
sudo nano /etc/default/tftpd-hpa
Add the following lines:
RUN_DAEMON="yes"
OPTIONS="-l -s /var/lib/tftpboot"
Then, edit the `inetd` configuration file by opening the editor:
sudo nano /etc/inetd.conf
And adding:
tftp dgram udp wait root /usr/sbin/in.tftpd /usr/sbin/in.tftpd -s /var/lib/tftpboot
Restart the TFTP server and check to see if it's running:
sudo systemctl restart tftpd-hpa
sudo systemctl status tftpd-hpa
If the TFTP server is unable to start and mentions an in-use socket, reopen the tftp-hpa configuration file, change the `TFTP_ADDRESS=":69"` line to use port `6969` instead, and then restart the TFTP server.
Setting up the DHCP Server
First, install the DHCP server package:
sudo apt-get install isc-dhcp-server
Then run `ifconfig` to view the available network devices and find your network device's name, e.g., `eth0`.
Edit the `/etc/default/isc-dhcp-server` configuration file and add the network device name from the step above to `INTERFACES`, e.g., `INTERFACES="eth0"`.
Configure an arbitrary IP address that will be used in the next step:
sudo ifconfig <network-device-name> 192.168.1.105
This command may need to be re-run each time the computer being used as the server is restarted.
Edit the `/etc/dhcp/dhcpd.conf` file by uncommenting the line `authoritative;` and adding a subnet configuration such as the one below:
subnet 192.168.1.0 netmask 255.255.255.0 {
range 192.168.1.20 192.168.1.30;
option routers 192.168.1.1;
option broadcast-address 192.168.1.255;
default-lease-time 600;
max-lease-time 7200;
}
allow booting;
allow bootp;
option option-128 code 128 = string;
option option-129 code 129 = text;
next-server 192.168.1.105;
filename "pxelinux.0";
Restart the DHCP server and check to see if it's running:
sudo systemctl restart isc-dhcp-server
sudo systemctl status isc-dhcp-server
Loading the Theseus ISO Into the TFTP Server
In order for the TFTP server to load Theseus, we need the Theseus ISO and a memdisk file in the boot folder. To get the memdisk file, first download syslinux, which contains it:
wget https://www.kernel.org/pub/linux/utils/boot/syslinux/syslinux-5.10.tar.gz
tar -xzvf syslinux-*.tar.gz
Then navigate to the memdisk folder and compile it:
cd syslinux-*/memdisk
make memdisk
Next, make a TFTP boot folder for Theseus and copy the memdisk binary into it along with the Theseus ISO:
sudo mkdir /var/lib/tftpboot/theseus
sudo cp /root/syslinux-*/memdisk/memdisk /var/lib/tftpboot/theseus/
sudo cp /Theseus/build/theseus-x86_64.iso /var/lib/tftpboot/theseus/
Navigate to the PXE configuration file:
sudo nano /var/lib/tftpboot/pxelinux.cfg/default
And add Theseus as a menu option by adding the following:
label theseus
menu label Theseus
root (hd0,0)
kernel theseus/memdisk
append iso initrd=theseus/theseus-x86_64.iso raw
Finally, restart the DHCP server one more time and make sure it's running:
sudo systemctl restart isc-dhcp-server
sudo systemctl status isc-dhcp-server
On the target computer, boot into the BIOS, turn on Legacy boot mode, and select network booting as the top boot option. Once the target computer is restarted, it should boot into a menu which displays booting into Theseus as an option.
Subsequent PXE Uses
After setting up PXE the first time, you can run `make pxe` to make an updated ISO, remove the old one, and copy the new one into the TFTP boot folder. At that point, you should be able to boot the new version of Theseus by restarting the target computer. If there are issues restarting the DHCP server after it worked the first time, one possible solution is to confirm that the IP address is the one you intended, using the command from earlier:
sudo ifconfig <network-device-name> 192.168.1.105
The Golden Rule of Software Development
Code for others how you wish they would code for you.
What does this mean? You should adhere to the following principles.
- Good abstractions. Another developer using your code should never have to study the internals of the code itself; rather, they should be able to fully understand how to use your code simply from its struct/function names and documentation. Use intuitive names and try to design an interface that makes sense, is simple and easy to use, and doesn't surprise anyone with unnecessary trickery.
- Be clean. Write well-crafted, concise code with sufficient features to be useful, but without bloat. Adhere to code style conventions, including proper spacing, doc comments, naming conventions, etc.
- Foolproof code. Think carefully about how others will use your code, and design it thoughtfully to prevent them from making mistakes, ideally by catching misuse at compile time instead of runtime (see the sketch just after this list).
- Errors are important! Handle errors gracefully and thoroughly, and return detailed error messages that clearly describe the issue. Don't ever let something fail silently!
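To make the "foolproof code" principle concrete, here is a minimal Rust sketch of catching misuse at compile time rather than at runtime. The names (`ValidFrame`, `map_frame`, `max_frames`) are purely illustrative and are not actual Theseus APIs; the idea is that a checked constructor is the only way to build the type, so every downstream function can rely on its invariant.

```rust
/// A frame number that is guaranteed to lie within the bounds of physical memory.
/// (Illustrative only; not a real Theseus type.)
pub struct ValidFrame(usize);

impl ValidFrame {
    /// The *only* way to obtain a `ValidFrame` is through this checked constructor.
    pub fn new(frame: usize, max_frames: usize) -> Result<ValidFrame, &'static str> {
        if frame < max_frames {
            Ok(ValidFrame(frame))
        } else {
            Err("frame number exceeds the bounds of physical memory")
        }
    }
}

/// Because this function accepts a `ValidFrame` instead of a bare `usize`,
/// the compiler guarantees that every caller has already passed the bounds check.
pub fn map_frame(frame: ValidFrame) {
    let _frame_number = frame.0;
    // ... perform the mapping ...
}
```

Passing a raw, unchecked `usize` to `map_frame` simply won't compile, which is exactly the kind of mistake prevention this principle asks for.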
Below are some other good practices.
- Accurate metadata. In addition to good code and documentation, make sure to fill in additional metadata, such as the details present in each crate's `Cargo.toml` file: description, keywords, authors, etc.
- No "magic" numbers. Do not use literal number values that have no documentation or explanation of why they exist. For example, instead of just writing a value like 4096 in the code, create a `const` that accurately describes the semantic meaning of that value, e.g., `const PAGE_SIZE: usize = 4096;`. Magic numbers are terrible to maintain and don't help anyone who looks at your code in the future (a short example follows this list).
- Minimize global state. Remove static (global) state as much as possible, and rethink how the same data sharing can be done without globals.
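As a small illustration of the "magic numbers" rule, the sketch below contrasts an undocumented literal with the named `PAGE_SIZE` constant from the example above; the helper functions themselves are hypothetical.

```rust
/// The size of one page in bytes.
const PAGE_SIZE: usize = 4096;

// Unclear: why 4096 and 4095? Future readers have to guess.
pub fn bytes_to_pages_unclear(bytes: usize) -> usize {
    (bytes + 4095) / 4096
}

// Clear: the constant documents the value's meaning in one place,
// and changing it later only requires editing a single line.
pub fn bytes_to_pages(bytes: usize) -> usize {
    (bytes + PAGE_SIZE - 1) / PAGE_SIZE
}
```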
Rust-specific Guidelines
- Rust style. Follow proper Rust coding style and naming conventions. Use correct spacing, indentation, and alignment that matches the existing style. Make your code visually appealing, with spaces around operators like equals signs and plus signs, spaces after commas, etc. Punctuation is important for legibility!
- Rust documentation. Use proper rustdoc-style documentation for all structs, functions, and types. Make sure all of your documentation links are correct, and that you're using the correct rustdoc formatting for doc comments. Triple slashes `///` should be used above function and struct definitions, double slashes `//` for C-style inline comments (or block comments like `/* */`), and `//!` for crate top-level documentation. Use Markdown formatting to describe function arguments, return values, and usage examples, in a way consistent with Rust's official libraries.
- `Option`s and `Result`s. Use `Option`s and `Result`s properly. Don't use special values that have overloaded meanings, e.g., an integer in which `0` means no value, or something like that. Here's a good resource for better understanding error handling in Rust. An `Option` should be returned when an operation might fail, but that failure condition doesn't affect the rest of the system. For example, if you're searching for an element in a list, an `Option` is the suitable choice because the caller of your getter function would only call it in order to get and use the return value. A `Result` should be returned if something can fail or succeed and the caller needs to know whether it succeeded, but may not need the actual return value, e.g., an init function that returns void. In this case, `Result` is the best choice because we want to force the caller to acknowledge that the init function succeeded, or handle its error if it failed. In Theseus, `Result`s are mandatory when a function has a side effect, such as setting a parameter or value that might not exist or be initialized yet; in that case, a `Result` must be used to indicate whether the function succeeded (a sketch follows this list).
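The sketch below ties the last two points together: rustdoc comments (`//!` and `///`), an `Option` for a lookup whose "not found" case is expected, and a `Result` for an init-style function whose success the caller must acknowledge. All of the names here are hypothetical, not actual Theseus APIs, and for simplicity it uses `std` collection types.

```rust
//! A tiny, hypothetical crate used only to illustrate the guidelines above.

/// A registry that maps device names to device IDs.
pub struct DeviceRegistry {
    devices: Vec<(String, u32)>,
}

impl DeviceRegistry {
    /// Creates an empty registry.
    pub fn new() -> DeviceRegistry {
        DeviceRegistry { devices: Vec::new() }
    }

    /// Returns the ID of the device with the given `name`, if it exists.
    ///
    /// An `Option` fits here: "not found" is an expected outcome that the
    /// caller simply handles; it is not an error it must be forced to acknowledge.
    pub fn get_device_id(&self, name: &str) -> Option<u32> {
        self.devices
            .iter()
            .find(|(dev_name, _)| dev_name == name)
            .map(|(_, id)| *id)
    }

    /// Initializes the registry, e.g., by probing the system for devices.
    ///
    /// A `Result` fits here: the caller doesn't need a return value,
    /// but it must acknowledge whether initialization succeeded or failed.
    pub fn init(&mut self) -> Result<(), &'static str> {
        // (Hypothetical) device probing would populate `self.devices` here.
        self.devices.push(("example_device".to_string(), 0));
        Ok(())
    }
}
```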
Theseus-specific Guidelines
- Handle `Result`s properly and fully. Don't ignore a `Result` error; instead, log that error and then handle it if possible. If you cannot handle it, return that error to the caller so they can attempt to handle it. NEVER SILENCE OR LAZILY HIDE ERRORS. (A short sketch follows this list.)
- Never use unsafe code. If you absolutely cannot avoid it, then you should review your case on an individual basis with the maintainers of Theseus. In most cases, unsafe code is not necessary and can be rewritten in safe code.
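For example, the hedged sketch below (hypothetical function names; it assumes a dependency on the standard `log` crate for the `error!`/`info!` macros) logs an error for diagnostics and then propagates it to the caller instead of silencing it:

```rust
use log::{error, info};

/// A hypothetical fallible operation.
fn read_config() -> Result<String, &'static str> {
    Err("config device not yet initialized")
}

/// Logs failures and propagates them to the caller; never hides them.
pub fn setup() -> Result<(), &'static str> {
    // Bad:  let _ = read_config();      // silently discards the error
    // Bad:  read_config().unwrap();     // panics instead of handling it
    // Good: log the error for diagnostics, then let the caller decide what to do.
    let config = read_config().map_err(|e| {
        error!("setup(): failed to read config: {}", e);
        e
    })?;
    info!("setup(): loaded config: {}", config);
    Ok(())
}
```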
Adding New Functionality to Theseus
The easiest way to add new functionality is just to create a new crate by duplicating an existing crate and changing the details in its new `Cargo.toml` file. At the very least, you'll need to change the `name` entry under the `[package]` heading at the top of the `Cargo.toml` file, and you'll need to change the dependencies for your new crate.
If your new kernel crate needs to be initialized, you can invoke it from the `captain::init()` function, although there may be more appropriate places to do so, such as the `device_manager`'s functions for initializing device drivers.
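As a rough sketch, a new kernel crate's `lib.rs` often just exposes a fallible `init` function that such an init path can call. Everything below is hypothetical: the crate name, the `#![no_std]` attribute, and the assumed dependency on the `log` crate; the exact signature your crate needs depends on where it gets invoked.

```rust
//! `my_new_driver`: a hypothetical new kernel crate for Theseus.
#![no_std]

use log::info;

/// One-time initialization for this crate, intended to be invoked from an
/// init path such as `captain::init()` or a `device_manager` routine.
pub fn init() -> Result<(), &'static str> {
    // ... probe hardware, register handlers, spawn tasks, etc. ...
    info!("my_new_driver initialized");
    Ok(())
}
```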
If you want to create a new application for Theseus, see those instructions here.
Advice for Contributing and using git
The main Theseus repository, `theseus-os/Theseus`, is what we call the upstream. To contribute, you should create your own fork of that repository through the GitHub website, and then check out your own fork. That way, your fork will be the `origin` remote by default, and you can then add the upstream as another remote by running:
git remote add upstream https://github.com/theseus-os/Theseus
Never push to the main branch
Currently, the main branch on the upstream, `theseus-os/Theseus/theseus_main`, is protected from direct pushes.
This is true even for GitHub users who are in the `theseus-os` organization and have write access to the Theseus repo.
The only way to contribute to it is by merging a pull request into the main branch, which only authorized users can do.
Instead, check out your own fork as above, create a new branch with a descriptive name, e.g., `kevin/logging_typo`, develop your feature on that branch, and then submit a pull request.
This is a standard Git workflow that allows people to review your code, check for pitfalls and compatibility problems, and make comments and suggestions before the code makes its way into the main branch.
You must do this for all changes, even tiny ones that may seem insignificant.
Submitting a pull request
To submit a pull request (PR), go to the GitHub page of your forked Theseus repo,
select the branch that you created from the drop-down menu, and then click "New pull request". By default, GitHub will create a new PR that merges your branch into the upstream `theseus_main` branch, which is usually what you want to do.
Now, give your PR a good title and description, scroll down to review the commits and files changed,
and if everything looks good, click "Create pull request" to notify the maintainers that you have contributions that they should review.
Review your own work
Perform an initial review of your own code before submitting a pull request.
Please don't place the whole burden of fixing a bunch of tiny problems on the others who must review your code. This includes building the documentation and reviewing it in HTML form in a browser (`make view-doc`) to make sure everything is formatted correctly and that all hyperlinks work.
Double-check commit contents
When making a commit, review your changes with `git status` and `git diff`, as well as on the GitHub comparison page, to ensure that you're not committing accidental modifications or editing files that you shouldn't be.
This makes the maintainers' lives a lot easier, meaning your PR is more likely to be accepted.
You don't need to worry about having too many small commits, as we will squash (combine) all of your PR's commits into a single large commit when merging it into the upstream main branch.
Papers and Presentations about Theseus
Over the years, Kevin Boos, Ramla Ijaz, and other Theseus collaborators have given many presentations about Theseus OS and related topics. This page offers a selected collection of the slide decks from those talks (including some video recordings), as well as a list of selected peer-reviewed academic publications and theses.
Selected Papers and Theses
- [OSDI 2020] Theseus: an Experiment in Operating System Structure and State Management
- The main paper describing Theseus's design principles, implementation, evaluation, and limitations.
- Paper (PDF) — OSDI 2020 Video Talk — OSDI 2020 Short Video — Slides (PDF)
- [KISV 2023] Leveraging Rust for Lightweight OS Correctness
- A paper about extending intralingual design and applying a hybrid approach (type systems plus formal verification) for proving lightweight correctness guarantees about Theseus's memory management subsystem.
- Paper (PDF)
- Kevin Boos PhD Dissertation: Theseus: Rethinking Operating Systems Structure and State Management
- Ramla Ijaz Master's Thesis: Exploring Intralingual Design in Operating Systems
Other published works
- [OSDI 2022] Poster: Correct and Performant Device Drivers via Intralingual Design
- An overview of in-progress work to use formal verification + intralingual design for better device drivers.
- Poster PDF
- [PLOS 2017] Theseus: A State Spill-free Operating System
- Paper PDF — a shorter, outdated ideas paper about early Theseus design.
- Superseded by the OSDI 2020 paper.
- [EuroSys 2017] A Characterization of State Spill in Modern Operating Systems
- Introduces and studies the concept of state spill.
- Motivation for our future work on Theseus.
Selected Presentations and Slide Decks
- Theseus: a Rust-native OS for Safety and Reliability (Sept 2023) – [Video talk]
- How Theseus uses Rust, plus Rust challenges (early 2022)
- How Safe-language OSes work, with Theseus examples — [Video Talk]
- Overview of Theseus Design (Late 2020)
- A Programmer's Introduction to Theseus
- Kevin Boos PhD Defense: Deep Dive into Theseus OS Research — [Video Talk]
- Ramla Ijaz Master's Defense: Deep Dive into Intralingual Design in Theseus
- Crate Loading/Management and Crate Namespaces in Theseus
- Supporting Unwinding and Exceptions/Panics in Theseus (Late 2019)
- Implementing Basic Application Support for Theseus (mid-2018)
- Supporting Multicore Processors on Theseus
- Realizing Dynamic Loading and Linking of Crates (Late 2017)
Theseus README + Quick Start
Click here to see the main Theseus README for quick start instructions.