There is a moment in every systems programming project where you realise the easy path does not exist. For sh0.dev, that moment came when we tried to talk to Docker.
The Docker Engine exposes a REST API. It accepts JSON, returns JSON, and behaves like any other HTTP service -- with one critical difference. It listens on a Unix domain socket at /var/run/docker.sock, not on a TCP port. This single architectural detail forced us to write our own Docker client from the ground up, and the result became one of the most satisfying pieces of code in the entire sh0 codebase.
Why Not Just Shell Out?
The obvious approach: call the docker CLI from Rust using std::process::Command. Run docker ps, parse the output, done.
This approach is a trap.
The Docker CLI is designed for humans. Its output format changes between versions. Its JSON output mode (--format '{{json .}}') is inconsistent across commands. Error handling becomes string parsing. And every CLI invocation spawns a new process, connects to the daemon, authenticates, runs the command, serialises the output, and exits. When you are managing dozens of containers, pulling images, streaming logs, and collecting stats in real time, the overhead adds up.
More fundamentally, shelling out means your PaaS depends on the Docker CLI being installed, being the right version, and being on the PATH. sh0 is a single binary. We do not want to tell users "also install Docker CLI version 24.0.7 or later."
The Docker Engine API is the right interface. It is versioned, stable, documented, and returns structured JSON. The only question was how to call it over a Unix socket.
Why Not reqwest?
The Rust ecosystem's default HTTP client is reqwest. It is excellent for calling web APIs over TCP. But it does not support Unix domain sockets. There is no configuration option, no feature flag, no workaround. reqwest uses hyper under the hood, and its connection layer is hard-coded to TCP.
We could have used the bollard crate, a Rust Docker client library. But bollard brings its own abstraction layer, its own type system, its own opinion about how container management should work. When you are building a PaaS, you need precise control over every Docker API call -- timeouts, streaming, error handling, retry logic. Another library's abstractions become constraints.
So we went one level deeper: hyper 1.x, the HTTP implementation that reqwest itself is built on, with a custom connector that speaks Unix sockets.
The UnixConnector: 40 Lines That Made Everything Work
The entire Unix socket integration is a single struct implementing tower::Service. Here is the core of it:
```rust
use hyper::Uri;
use hyper_util::rt::TokioIo;
use tokio::net::UnixStream;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use tower::Service;

#[derive(Clone)]
pub struct UnixConnector {
    path: String,
}

impl UnixConnector {
    pub fn new(path: impl Into<String>) -> Self {
        Self { path: path.into() }
    }
}

impl Service<Uri> for UnixConnector {
    type Response = TokioIo<UnixStream>;
    type Error = std::io::Error;
    type Future =
        Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>> + Send>>;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, _uri: Uri) -> Self::Future {
        let path = self.path.clone();
        Box::pin(async move {
            let stream = UnixStream::connect(&path).await?;
            Ok(TokioIo::new(stream))
        })
    }
}
```
That is the entire connector. It ignores the URI (because there is no DNS resolution or port to deal with -- we always connect to the same socket path) and returns a tokio UnixStream wrapped in hyper's TokioIo adapter.
The DockerClient then uses this connector with hyper-util's connection pool:
```rust
use std::time::Duration;
use http_body_util::Full;
use hyper::body::Bytes;
use hyper_util::client::legacy::Client;
use hyper_util::rt::TokioExecutor;

pub struct DockerClient {
    client: Client<UnixConnector, Full<Bytes>>,
    base: String,
}

impl DockerClient {
    pub fn new() -> Self {
        let connector = UnixConnector::new("/var/run/docker.sock");
        let client = Client::builder(TokioExecutor::new())
            .pool_idle_timeout(Duration::from_secs(30))
            .build(connector);

        Self {
            client,
            base: "http://localhost/v1.44".to_string(),
        }
    }
}
```
The base URL uses http://localhost as a dummy host -- the actual routing happens through the Unix socket, so the host is irrelevant. The /v1.44 suffix pins us to Docker Engine API version 1.44, ensuring consistent behaviour regardless of the Docker version installed on the host.
The Internal HTTP Helpers
Every Docker API call goes through a small set of internal methods on DockerClient:
```rust
use http_body_util::BodyExt;
use hyper::{Method, Request};

impl DockerClient {
    async fn get(&self, path: &str) -> Result<Bytes, DockerError> {
        let uri = format!("{}{}", self.base, path).parse::<Uri>()?;
        let req = Request::builder()
            .method(Method::GET)
            .uri(uri)
            .header("Host", "localhost")
            .body(Full::new(Bytes::new()))?;

        let resp = self.client.request(req).await?;
        let status = resp.status();
        let body = resp.into_body().collect().await?.to_bytes();

        if !status.is_success() {
            return Err(DockerError::Api {
                status: status.as_u16(),
                message: String::from_utf8_lossy(&body).to_string(),
            });
        }
        Ok(body)
    }

    async fn post(&self, path: &str, body: Bytes) -> Result<Bytes, DockerError> {
        // ...build the request as above, with Method::POST and a JSON body...
        let resp = self.client.request(req).await?;
        // ... same status check pattern
    }
}
```
Simple, explicit, no magic. Every request includes the Host header (required by HTTP/1.1), and every response checks the status code before returning the body. The error type carries both the HTTP status and the Docker daemon's error message, so callers get actionable diagnostics.
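The article does not show `DockerError` itself; a minimal sketch of the shape described above -- carrying the HTTP status and the daemon's message -- might look like this (the variant names are illustrative, not the actual sh0 definition):

```rust
use std::fmt;

// Hypothetical sketch of the error type described above: it carries the
// HTTP status code and the daemon's error message so callers can act on both.
#[derive(Debug)]
pub enum DockerError {
    // The daemon answered with a non-2xx status.
    Api { status: u16, message: String },
    // The socket connection itself failed.
    Io(std::io::Error),
}

impl fmt::Display for DockerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DockerError::Api { status, message } => {
                write!(f, "Docker API error {}: {}", status, message)
            }
            DockerError::Io(e) => write!(f, "Docker socket error: {}", e),
        }
    }
}

impl From<std::io::Error> for DockerError {
    fn from(e: std::io::Error) -> Self {
        DockerError::Io(e)
    }
}
```

Keeping the daemon's raw message in the variant means a failed `POST /containers/create` surfaces as something like "Docker API error 409: conflict" rather than a generic request failure.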
Multiplexed Stream Parsing: The Hard Part
Docker's API has a subtle complexity that catches everyone. When you request container logs or exec output, the response body is not plain text. It is a multiplexed stream where stdout and stderr are interleaved, each chunk prefixed with an 8-byte header:
```
[stream_type: 1 byte] [0x00: 3 bytes] [size: 4 bytes big-endian] [payload: size bytes]
```

Stream type 1 is stdout. Stream type 2 is stderr. If you try to read this as plain text, you get binary garbage mixed in with your log output.
The parsing logic handles this frame-by-frame:
```rust
pub fn parse_multiplexed_stream(raw: &[u8]) -> (String, String) {
    let mut stdout = String::new();
    let mut stderr = String::new();
    let mut pos = 0;

    while pos + 8 <= raw.len() {
        let stream_type = raw[pos];
        let size = u32::from_be_bytes([
            raw[pos + 4],
            raw[pos + 5],
            raw[pos + 6],
            raw[pos + 7],
        ]) as usize;

        pos += 8; // skip header

        if pos + size > raw.len() {
            break; // incomplete frame
        }

        let payload = String::from_utf8_lossy(&raw[pos..pos + size]);
        match stream_type {
            1 => stdout.push_str(&payload),
            2 => stderr.push_str(&payload),
            _ => {} // ignore other stream types
        }

        pos += size;
    }

    (stdout, stderr)
}
```
This function is pure -- no I/O, no async, no state. It takes a byte slice and returns two strings. That made it trivially testable:
```rust
#[test]
fn test_parse_multiplexed_stdout_stderr() {
    let mut data = Vec::new();

    // stdout frame: "hello"
    data.push(1); // stream type
    data.extend_from_slice(&[0, 0, 0]); // padding
    data.extend_from_slice(&5u32.to_be_bytes()); // size
    data.extend_from_slice(b"hello");

    // stderr frame: "error"
    data.push(2);
    data.extend_from_slice(&[0, 0, 0]);
    data.extend_from_slice(&5u32.to_be_bytes());
    data.extend_from_slice(b"error");

    let (out, err) = parse_multiplexed_stream(&data);
    assert_eq!(out, "hello");
    assert_eq!(err, "error");
}
```
No Docker daemon required. No containers running. Just bytes in, strings out. This test runs in microseconds and will never flake.
CPU Percentage: Matching Docker's Own Formula
Container stats are another area where Docker's API is deceptively complex. The /containers/{id}/stats endpoint returns a JSON blob with CPU counters -- but these are cumulative nanosecond values since the container started, not percentages. Converting them to a human-readable CPU percentage requires the same delta-based calculation that Docker's own CLI uses:
```rust
pub fn compute_cpu_percent(stats: &ContainerStats) -> f64 {
    let cpu_delta = stats.cpu_stats.cpu_usage.total_usage as f64
        - stats.precpu_stats.cpu_usage.total_usage as f64;

    let system_delta = stats.cpu_stats.system_cpu_usage.unwrap_or(0) as f64
        - stats.precpu_stats.system_cpu_usage.unwrap_or(0) as f64;

    if system_delta <= 0.0 || cpu_delta < 0.0 {
        return 0.0;
    }

    let num_cpus = stats.cpu_stats.online_cpus
        .or_else(|| {
            stats.cpu_stats.cpu_usage.percpu_usage
                .as_ref()
                .map(|v| v.len() as u64)
        })
        .unwrap_or(1) as f64;

    (cpu_delta / system_delta) * num_cpus * 100.0
}
```
The formula is: `(cpu_delta / system_delta) * num_cpus * 100`. The deltas are between the current stats snapshot and the previous one (Docker provides both in a single response as cpu_stats and precpu_stats). The edge case handling -- zero deltas, missing online_cpus, fallback to percpu_usage length -- mirrors exactly what docker stats does internally.
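To sanity-check the arithmetic, here is a simplified, self-contained version of the formula that takes raw counters instead of the ContainerStats struct (the signature is illustrative, not the actual sh0 code):

```rust
// Simplified version of the CPU formula, taking raw nanosecond deltas
// instead of the full stats struct (illustrative signature only).
pub fn cpu_percent(cpu_delta: f64, system_delta: f64, num_cpus: f64) -> f64 {
    // Guard the degenerate cases so a fresh or idle container reports
    // 0.0 instead of NaN or a negative percentage.
    if system_delta <= 0.0 || cpu_delta < 0.0 {
        return 0.0;
    }
    (cpu_delta / system_delta) * num_cpus * 100.0
}
```

With a 2-second sampling interval on a 4-core host, a container that burned 2s of CPU time while the host accumulated 8s in total yields `(2/8) * 4 * 100 = 100%` -- one core fully busy.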
We tested the edge cases explicitly:
```rust
#[test]
fn test_cpu_percent_zero_delta() {
    // When system_delta is zero, CPU% should be 0.0, not NaN or infinity
    let stats = make_stats(1000, 1000, 5000, 5000, 4);
    assert_eq!(compute_cpu_percent(&stats), 0.0);
}
```

Division by zero in a monitoring system means NaN propagating through your dashboards. We made sure that cannot happen.
Container Lifecycle: The Full API Surface
With the low-level plumbing in place, the container management API was straightforward but thorough. The full surface:
- create -- Accept a container configuration, POST to /containers/create, return the container ID
- start / stop / restart -- POST to the appropriate endpoint with timeout parameters
- remove -- DELETE with force and volume removal options
- inspect -- GET the full container state (running, exit code, IP address, mounts)
- list -- GET all containers with optional filters (status, label)
- wait -- Block until a container exits and return the exit code
- logs -- Fetch logs with multiplexed stream parsing
- exec -- Create an exec instance, start it, capture stdout/stderr separately
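Most of these lifecycle calls differ only in the path and query string, so the request shaping reduces to pure string construction. A sketch of that idea (an illustrative helper, not the actual sh0 code):

```rust
// Illustrative sketch: lifecycle endpoints share the /containers/{id}/{action}
// shape. stop and restart take a `t` query parameter -- how many seconds the
// daemon waits after SIGTERM before sending SIGKILL.
fn lifecycle_path(id: &str, action: &str, timeout_secs: Option<u32>) -> String {
    match timeout_secs {
        Some(t) => format!("/containers/{}/{}?t={}", id, action, t),
        None => format!("/containers/{}/{}", id, action),
    }
}
```

A caller would then hand the result to the `post` helper, e.g. `lifecycle_path("abc123", "stop", Some(10))` produces `/containers/abc123/stop?t=10`.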
The sh0_container_config() helper deserves mention. It generates a Docker container configuration with sh0's standard defaults:
```rust
pub struct Sh0ContainerParams {
    pub image: String,
    pub name: String,
    pub port: u16,
    pub env: Vec<String>,
    pub network: Option<String>,
    pub memory_limit: Option<u64>,
    pub cpu_quota: Option<u64>,
}
```

This struct exists because the raw Docker API container configuration has dozens of fields, and Clippy rightly complains about functions with 8+ parameters. The struct is the builder pattern without the ceremony.
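To illustrate how such a parameter struct gets consumed, here is a hedged sketch (a trimmed-down struct and a hypothetical helper, not the real sh0_container_config()): Docker's create payload keys exposed ports by `"<port>/tcp"`, and a params struct makes deriving that trivial.

```rust
// Trimmed-down illustration of the params struct and one piece of the
// config it would feed: the exposed-port key Docker expects ("<port>/tcp").
// The helper name and reduced field set are hypothetical.
pub struct Sh0ContainerParams {
    pub image: String,
    pub name: String,
    pub port: u16,
    pub env: Vec<String>,
}

fn exposed_port_key(params: &Sh0ContainerParams) -> String {
    format!("{}/tcp", params.port)
}
```

Adding a ninth setting later means adding one field, not rippling a new parameter through every call site.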
Network and Volume Management
Two smaller but essential modules rounded out the Docker client.
The network module's centrepiece is ensure_sh0_network() -- an idempotent function that creates the sh0 bridge network if it does not already exist. Every sh0-managed container joins this network, giving them DNS-based service discovery (container A can reach container B by name) without exposing anything to the host network.
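The idempotency can be modelled without any I/O. This sketch stands in for the real thing, which does the same check-then-create dance over the socket via GET /networks and POST /networks/create (the function name and HashSet-as-daemon are illustrative):

```rust
use std::collections::HashSet;

// Illustrative model of ensure_sh0_network()'s idempotency: look the
// network up by name first, create it only if missing. Returns true if
// this call actually created it. The HashSet stands in for the daemon.
fn ensure_network(existing: &mut HashSet<String>, name: &str) -> bool {
    if existing.contains(name) {
        return false; // already there -- nothing to do
    }
    existing.insert(name.to_string()); // "create" it
    true
}
```

Calling it twice with the same name creates once and no-ops the second time, which is what lets sh0 run it unconditionally on every startup.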
The volume module manages persistent storage with sh0.managed labels, so sh0 can distinguish its own volumes from user-created ones and clean up appropriately.
The Type System: 40 Serde Structs
The Docker Engine API returns deeply nested JSON. We defined approximately 40 Rust structs with serde's derive macros to deserialise every response type we needed:
```rust
#[derive(Debug, Deserialize)]
#[serde(rename_all = "PascalCase")]
pub struct ContainerInspect {
    pub id: String,
    pub name: String,
    pub state: ContainerState,
    pub config: ContainerConfig,
    pub network_settings: NetworkSettings,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "PascalCase")]
pub struct ContainerState {
    pub status: String,
    pub running: bool,
    pub exit_code: i64,
    pub started_at: String,
    pub finished_at: String,
}
```
The #[serde(rename_all = "PascalCase")] attribute handles Docker's convention of PascalCase JSON keys (an artefact of Docker being written in Go, where exported fields are capitalised). Without it, every field would need an individual #[serde(rename = "...")] annotation.
Testing Without Docker
A key design goal: the unit tests must run without Docker installed. We achieved this by keeping pure logic -- stream parsing, CPU computation, configuration generation -- in separate functions from I/O operations.
The six unit tests cover:
1. Multiplexed stream parsing (stdout/stderr separation)
2. Log stream parsing
3. CPU percentage computation (normal case)
4. CPU percentage with zero delta (edge case)
5. Network I/O aggregation across multiple interfaces
6. sh0_container_config default values
These tests run in cargo test on any machine, including CI environments without Docker.
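The network I/O aggregation in test 5 follows the same pure-function pattern. Docker's stats payload has a networks map keyed by interface name (eth0, eth1, ...), and sh0 reports one total; a sketch of that reduction, with a tuple standing in for the real deserialised struct (the signature is illustrative):

```rust
use std::collections::HashMap;

// Sketch of the per-interface aggregation: sum rx and tx bytes across
// every interface in the stats payload's networks map. The (rx, tx)
// tuple stands in for the real deserialized interface-stats struct.
fn aggregate_network_io(networks: &HashMap<String, (u64, u64)>) -> (u64, u64) {
    networks
        .values()
        .fold((0, 0), |(rx, tx), (r, t)| (rx + r, tx + t))
}
```

Like the stream parser, this is bytes-of-state in, totals out -- no daemon, no async, no flakes.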
The five integration test files are feature-gated behind #[cfg(feature = "docker-tests")] and require a running Docker daemon. They test the real API: pinging the daemon, creating and destroying containers, pulling images, creating and removing networks, managing volumes. You run them explicitly with cargo test --features docker-tests when you have Docker available.
What We Learned
Writing a Docker client from scratch taught us three things.
Unix sockets are not special. The UnixConnector is 40 lines because connecting to a Unix socket is, fundamentally, the same as connecting to a TCP port -- you get a bidirectional byte stream. hyper does not care what kind of stream it writes HTTP frames to. The abstraction boundary is exactly right.
Docker's API is better than its CLI. Structured JSON responses, proper HTTP status codes, streaming support, versioned endpoints. The CLI is a convenience layer for humans; the API is the real interface for machines.
Multiplexed streams are a real problem. Every Docker client library has to solve this. Many get it wrong, especially around incomplete frames and error stream separation. By writing our own parser, we understood exactly what the bytes meant, and we could test every edge case.
The Docker client was the hardest piece of code we wrote on Day Zero. But it was also the foundation for everything that followed: the deploy pipeline builds images through it, the monitoring system collects stats through it, the proxy manager inspects container networks through it. Every container sh0 manages passes through these 40 lines of Unix socket code.
---
This is Part 2 of the "How We Built sh0.dev" series.
Series Navigation:

- [1] Day Zero: 10 Rust Crates in 24 Hours
- [2] Writing a Docker Engine Client from Scratch in Rust (you are here)
- [3] Auto-Detecting 19 Tech Stacks from Source Code
- [4] 34 Rules to Catch Deployment Mistakes Before They Happen