#185 -- Integration Tests Complete

Unit tests verify that individual functions work correctly in isolation. Integration tests verify that the functions work correctly together. The distinction matters because most production bugs are not caused by individual functions failing -- they are caused by the interactions between functions producing unexpected results. A string parser works perfectly. A database serializer works perfectly. But when the parser's output is fed to the serializer, an edge case in Unicode handling corrupts the data.
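A minimal sketch of that failure mode, using hypothetical function names (none of this is FLIN's actual code): a parser and a fixed-width serializer that each pass their own ASCII unit tests, yet corrupt data once a multi-byte character flows from one into the other.

```rust
/// Parser: trims surrounding whitespace from a raw field. Correct in isolation.
fn parse_title(raw: &str) -> String {
    raw.trim().to_string()
}

/// Serializer: stores a title in a fixed-width column of `width` bytes.
/// It cuts by byte count, which is fine for ASCII but can split a
/// multi-byte UTF-8 character in half.
fn serialize_fixed(title: &str, width: usize) -> Vec<u8> {
    let bytes = title.as_bytes();
    let mut column = bytes[..bytes.len().min(width)].to_vec();
    column.resize(width, b' '); // pad short values with spaces
    column
}

/// Loader: reads the column back, as a storage layer would.
fn load_fixed(column: &[u8]) -> Result<String, std::string::FromUtf8Error> {
    String::from_utf8(column.to_vec()).map(|s| s.trim_end().to_string())
}

#[test]
fn parser_and_serializer_disagree_on_unicode() {
    // ASCII round-trips, so unit tests for each piece stay green.
    let ascii = serialize_fixed(&parse_title("  Buy milk  "), 10);
    assert_eq!(load_fixed(&ascii).unwrap(), "Buy milk");

    // "é" is two bytes in UTF-8; a 4-byte column keeps only the first one.
    let unicode = serialize_fixed(&parse_title(" Café au lait "), 4);
    assert!(load_fixed(&unicode).is_err());
}
```

Only a test that drives both functions together, with realistic input, reaches the second assertion.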

Session 199 marked the completion of FLIN's integration test suite: 617 end-to-end tests that exercise every major subsystem interaction. Combined with the 2,608 unit tests that existed at that point, the total reached 3,225 tests, all passing.

Why Integration Tests Are Different

A unit test for FLIN's entity save operation might look like this:

```rust
#[test]
fn test_save_entity_basic() {
    let mut store = EntityStore::new_in_memory();
    store.register_entity("Todo", vec![
        Field::new("title", FieldType::Text),
        Field::new("done", FieldType::Bool),
    ]);

    let values = map! {
        "title" => Value::Text("Buy milk".into()),
        "done" => Value::Bool(false),
    };

    let id = store.save("Todo", &values).unwrap();
    assert!(id > 0);

    let loaded = store.get("Todo", id).unwrap();
    assert_eq!(loaded.get("title"), Some(&Value::Text("Buy milk".into())));
}
```

This test verifies that saving and loading an entity works. But it tests the storage engine in isolation -- no HTTP server, no VM, no template rendering. In production, an entity is saved through a chain of subsystems:

1. An HTTP request arrives with a JSON body.
2. The HTTP parser extracts the body.
3. The VM executes the route handler.
4. The route handler validates the input.
5. The entity is saved to the database.
6. Foreign key constraints are checked.
7. Search indexes are updated.
8. Cache is invalidated.
9. A response is constructed and sent.

An integration test exercises this entire chain:

```rust
#[test]
fn test_create_todo_via_http() {
    let app = TestApp::new("tests/fixtures/todo-app/");

    // `send()` and `json()` return `Result`s on reqwest's blocking client;
    // unwrapping them makes a transport failure fail the test.
    let response = app.post("/api/todos")
        .json(&json!({
            "title": "Buy milk",
            "done": false,
        }))
        .send()
        .unwrap();

    assert_eq!(response.status(), 201);

    let body: serde_json::Value = response.json().unwrap();
    assert_eq!(body["title"], "Buy milk");
    assert_eq!(body["done"], false);
    assert!(body["id"].is_number());

    // Verify the entity was actually persisted
    let get_response = app.get(&format!("/api/todos/{}", body["id"])).send().unwrap();
    assert_eq!(get_response.status(), 200);
    let fetched: serde_json::Value = get_response.json().unwrap();
    assert_eq!(fetched["title"], "Buy milk");
}
```

This test starts a real FLIN HTTP server, sends a real HTTP request, and verifies the response. If any subsystem in the chain fails -- HTTP parsing, VM execution, entity validation, database save, response serialization -- the test fails.
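The "any link can fail the test" property can be sketched with a toy chain of stages. This is an illustrative model under stated assumptions, not FLIN's request pipeline: the stage names, the `title=<text>` wire format, and the `Request` struct are all hypothetical.

```rust
/// Shared state handed from one stage to the next.
#[derive(Default)]
struct Request {
    body: String,
    title: Option<String>,
    saved_id: Option<u64>,
}

type Stage = fn(&mut Request) -> Result<(), String>;

fn parse(req: &mut Request) -> Result<(), String> {
    // Hypothetical wire format: "title=<text>"
    let title = req.body.strip_prefix("title=").ok_or("malformed body")?;
    req.title = Some(title.to_string());
    Ok(())
}

fn validate(req: &mut Request) -> Result<(), String> {
    match req.title.as_deref() {
        Some(t) if !t.trim().is_empty() => Ok(()),
        _ => Err("title is required".to_string()),
    }
}

fn save(req: &mut Request) -> Result<(), String> {
    req.saved_id = Some(1); // stand-in for the real database write
    Ok(())
}

/// Runs the stages in order and reports the first one that fails.
fn run_chain(body: &str) -> Result<u64, (&'static str, String)> {
    let stages = [
        ("parse", parse as Stage),
        ("validate", validate as Stage),
        ("save", save as Stage),
    ];
    let mut req = Request { body: body.to_string(), ..Default::default() };
    for (name, stage) in stages {
        stage(&mut req).map_err(|e| (name, e))?;
    }
    req.saved_id.ok_or(("save", "no id assigned".to_string()))
}
```

A single end-to-end assertion on `run_chain` covers all three stages at once, which is the trade an integration test makes: broad coverage per test, at the cost of a less precise failure location.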

The Test Infrastructure

Running integration tests requires a complete FLIN environment: a compiler, a VM, an HTTP server, and a database. We built a TestApp harness that automates this setup:

```rust
use std::net::SocketAddr;
use std::path::Path;

use reqwest::blocking::RequestBuilder;
use tempfile::TempDir; // assumed: TempDir comes from the `tempfile` crate

pub struct TestApp {
    server_addr: SocketAddr,
    db_dir: TempDir,
    client: reqwest::blocking::Client,
}

impl TestApp {
    pub fn new(project_dir: &str) -> Self {
        let db_dir = TempDir::new().unwrap();

        // Compile the FLIN project
        let compiled = compile_project(Path::new(project_dir))
            .expect("Test project compilation failed");

        // Start the server on a random port
        let server = HttpServer::new(compiled, ServerConfig {
            host: "127.0.0.1".into(),
            port: 0,  // OS assigns a random available port
            db_path: db_dir.path().to_path_buf(),
        });

        let addr = server.start_background();

        Self {
            server_addr: addr,
            db_dir,
            client: reqwest::blocking::Client::new(),
        }
    }

    pub fn get(&self, path: &str) -> RequestBuilder {
        self.client.get(format!("http://{}{}", self.server_addr, path))
    }

    pub fn post(&self, path: &str) -> RequestBuilder {
        self.client.post(format!("http://{}{}", self.server_addr, path))
    }

    // Used by the cascade-delete test; mirrors `get` and `post`.
    pub fn delete(&self, path: &str) -> RequestBuilder {
        self.client.delete(format!("http://{}{}", self.server_addr, path))
    }
}

impl Drop for TestApp {
    fn drop(&mut self) {
        // Server shuts down when TestApp is dropped
        // TempDir is automatically cleaned up
    }
}
```

Each integration test gets its own server instance on a random port, its own temporary database, and its own compiled project. Tests run in parallel without interfering with each other. The TempDir is cleaned up when the test completes, leaving no artifacts.
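The port-0 trick is what makes the parallelism safe, and it can be shown with the standard library alone. The sketch below is a stand-in, not FLIN's `HttpServer`: `MiniServer` and `fetch` are hypothetical names, and the server answers exactly one request before its thread exits.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Minimal per-test server: binds to port 0 so the OS picks a free port,
/// then answers a single request on a background thread.
struct MiniServer {
    addr: SocketAddr,
}

impl MiniServer {
    fn start(body: &'static str) -> std::io::Result<Self> {
        let listener = TcpListener::bind("127.0.0.1:0")?;
        let addr = listener.local_addr()?; // the port the OS actually assigned
        thread::spawn(move || {
            if let Ok((mut stream, _)) = listener.accept() {
                let mut buf = [0u8; 1024];
                let _ = stream.read(&mut buf); // read and ignore the request
                let response = format!(
                    "HTTP/1.1 200 OK\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
                    body.len(),
                    body
                );
                let _ = stream.write_all(response.as_bytes());
            }
        });
        Ok(Self { addr })
    }
}

/// Sends a bare GET request and returns the raw response text.
fn fetch(addr: SocketAddr) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")?;
    let mut text = String::new();
    stream.read_to_string(&mut text)?;
    Ok(text)
}

#[test]
fn two_servers_never_share_a_port() {
    let todo = MiniServer::start("todo").unwrap();
    let blog = MiniServer::start("blog").unwrap();
    assert_ne!(todo.addr, blog.addr);
    assert!(fetch(todo.addr).unwrap().ends_with("todo"));
    assert!(fetch(blog.addr).unwrap().ends_with("blog"));
}
```

Because each listener asks the OS for a port instead of hard-coding one, two tests running at the same moment cannot collide, and each client only ever talks to the server its own test started.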

Test Fixture Applications

Integration tests need FLIN applications to test against. We created a set of fixture applications in tests/fixtures/, each designed to exercise specific subsystem interactions:

todo-app/ -- A basic CRUD application with entities, routes, and templates. Tests entity lifecycle, HTTP routing, and response formatting.

blog-app/ -- A content application with foreign key relationships (Post belongs to Author), full-text search, and pagination. Tests relational queries, search indexing, and multi-entity operations.

auth-app/ -- An application with login, registration, guards, and JWT tokens. Tests the security subsystem end-to-end.

file-app/ -- An application with file uploads, image processing, and blob storage. Tests the file management pipeline.

search-app/ -- An application with semantic search, hybrid search, and analytics. Tests the search subsystem with real embedding generation and vector indexing.

Each fixture application is a minimal but complete FLIN project -- entities, routes, and templates that together exercise the subsystems under test.

The Test Categories

The 617 integration tests were organized into categories:

HTTP Routing Tests (89 tests)

Testing every route pattern: static paths, dynamic parameters, query parameters, nested routes, wildcard routes, and method-based dispatching.

```rust
#[test]
fn test_dynamic_route_parameter() {
    let app = TestApp::new("tests/fixtures/todo-app/");

    // Create a todo first
    let create = app.post("/api/todos")
        .json(&json!({ "title": "Test" }))
        .send()
        .unwrap();
    let id = create.json::<Value>().unwrap()["id"].as_i64().unwrap();

    // Access via dynamic route
    let response = app.get(&format!("/api/todos/{}", id)).send().unwrap();
    assert_eq!(response.status(), 200);
    assert_eq!(response.json::<Value>().unwrap()["title"], "Test");
}

#[test]
fn test_route_not_found() {
    let app = TestApp::new("tests/fixtures/todo-app/");
    let response = app.get("/nonexistent").send().unwrap();
    assert_eq!(response.status(), 404);
}
```
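For readers unfamiliar with dynamic route parameters, here is a small matcher that shows what these tests are probing. The `:name` and `*` pattern syntax and the `match_route` function are assumptions made for illustration; FLIN's router and its route syntax may differ.

```rust
use std::collections::HashMap;

/// Matches `path` against `pattern`. A segment starting with ':' captures a
/// dynamic parameter; a '*' segment matches whatever remains (even nothing).
fn match_route(pattern: &str, path: &str) -> Option<HashMap<String, String>> {
    let mut params = HashMap::new();
    let mut pat = pattern.trim_matches('/').split('/');
    let mut segs = path.trim_matches('/').split('/');
    loop {
        match (pat.next(), segs.next()) {
            // Both exhausted at the same time: full match.
            (None, None) => return Some(params),
            // Wildcard swallows the rest of the path.
            (Some("*"), _) => return Some(params),
            // Dynamic segment: record the value under the parameter name.
            (Some(p), Some(s)) if p.starts_with(':') && !s.is_empty() => {
                params.insert(p[1..].to_string(), s.to_string());
            }
            // Static segment: must be identical.
            (Some(p), Some(s)) if p == s => {}
            // Length mismatch or differing static segment.
            _ => return None,
        }
    }
}

#[test]
fn dynamic_segment_is_captured() {
    let params = match_route("/api/todos/:id", "/api/todos/42").unwrap();
    assert_eq!(params.get("id").map(String::as_str), Some("42"));
    assert!(match_route("/api/todos/:id", "/nonexistent").is_none());
}
```

The `None` result corresponds to the 404 case in `test_route_not_found`: no registered pattern accepts the path.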

Entity CRUD Tests (142 tests)

Testing create, read, update, delete, and destroy operations through the HTTP layer, including validation errors, constraint violations, and concurrent access.

```rust
#[test]
fn test_entity_validation_error() {
    let app = TestApp::new("tests/fixtures/todo-app/");

    // Missing required field
    let response = app.post("/api/todos")
        .json(&json!({ "done": false }))
        .send()
        .unwrap();

    assert_eq!(response.status(), 400);
    let body = response.json::<Value>().unwrap();
    assert!(body["error"].as_str().unwrap().contains("title"));
}
```
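The contract this test pins down is small: a missing required field yields a 400 whose message names the field. A sketch of that check, assuming a hypothetical `FieldSpec` schema type and error wording (FLIN's validator is not shown in this post):

```rust
use std::collections::HashMap;

/// Hypothetical schema entry: a field name plus whether it is required.
struct FieldSpec {
    name: &'static str,
    required: bool,
}

/// Returns (status, message) for the first missing required field,
/// or None when every required field is present.
fn validate(schema: &[FieldSpec], payload: &HashMap<&str, &str>) -> Option<(u16, String)> {
    schema
        .iter()
        .find(|f| f.required && !payload.contains_key(f.name))
        .map(|f| (400, format!("missing required field: {}", f.name)))
}

#[test]
fn missing_title_is_rejected() {
    let schema = [
        FieldSpec { name: "title", required: true },
        FieldSpec { name: "done", required: false },
    ];
    let mut payload = HashMap::new();
    payload.insert("done", "false");

    let (status, message) = validate(&schema, &payload).unwrap();
    assert_eq!(status, 400);
    assert!(message.contains("title"));
}
```

Asserting on `contains("title")` instead of the full message keeps the test stable if the wording changes while still proving the error points at the right field.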

Foreign Key Tests (73 tests)

Testing relational operations: eager loading with .with(), cascading deletes, restrict constraints, and set-null behavior.

```rust
#[test]
fn test_cascade_delete_removes_children() {
    let app = TestApp::new("tests/fixtures/blog-app/");

    // Create author with posts
    let author = app.post("/api/authors")
        .json(&json!({ "name": "Juste" }))
        .send().unwrap()
        .json::<Value>().unwrap();

    app.post("/api/posts")
        .json(&json!({
            "title": "First Post",
            "author_id": author["id"]
        }))
        .send()
        .unwrap();

    // Delete author (cascade)
    let delete = app.delete(&format!("/api/authors/{}", author["id"])).send().unwrap();
    assert_eq!(delete.status(), 200);

    // Posts should be gone
    let posts = app.get("/api/posts").send().unwrap().json::<Value>().unwrap();
    assert_eq!(posts["data"].as_array().unwrap().len(), 0);
}
```
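The three delete behaviors named above differ only in what happens to the child rows. An in-memory model makes the distinction concrete; the `Blog`, `Post`, and `OnDelete` types here are hypothetical and stand in for FLIN's storage engine.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy)]
enum OnDelete {
    Cascade,  // delete the children too
    Restrict, // refuse while children exist
    SetNull,  // keep the children, clear their reference
}

struct Post {
    title: String,
    author_id: Option<u64>,
}

#[derive(Default)]
struct Blog {
    authors: HashMap<u64, String>,
    posts: Vec<Post>,
}

impl Blog {
    /// Deletes an author, applying `rule` to the posts that reference it.
    fn delete_author(&mut self, id: u64, rule: OnDelete) -> Result<(), String> {
        if !self.authors.contains_key(&id) {
            return Err(format!("author {} not found", id));
        }
        let has_children = self.posts.iter().any(|p| p.author_id == Some(id));
        match rule {
            OnDelete::Restrict if has_children => {
                return Err(format!("author {} still has posts", id));
            }
            OnDelete::Restrict => {}
            OnDelete::Cascade => self.posts.retain(|p| p.author_id != Some(id)),
            OnDelete::SetNull => {
                for post in self.posts.iter_mut().filter(|p| p.author_id == Some(id)) {
                    post.author_id = None;
                }
            }
        }
        self.authors.remove(&id);
        Ok(())
    }
}

/// One author ("Juste") with one post, mirroring the HTTP test above.
fn sample_blog() -> Blog {
    let mut blog = Blog::default();
    blog.authors.insert(1, "Juste".to_string());
    blog.posts.push(Post { title: "First Post".to_string(), author_id: Some(1) });
    blog
}
```

With `sample_blog()`, `Cascade` leaves zero posts, `Restrict` returns an error and changes nothing, and `SetNull` keeps "First Post" with `author_id` set to `None`.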

Search Tests (94 tests)

Testing BM25 keyword search, semantic search, hybrid search, and search analytics through the HTTP layer.
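As background for the keyword-search tests, this is the standard Okapi BM25 scoring formula in a few lines of Rust. It is a textbook sketch over pre-tokenized documents, not FLIN's index; the parameter values `k1 = 1.2` and `b = 0.75` used in the example are common defaults and are assumed here, not taken from FLIN.

```rust
/// Scores one document against a query with Okapi BM25.
/// `docs` is the whole (non-empty) corpus as token lists; `doc` indexes into it.
fn bm25(docs: &[Vec<&str>], doc: usize, query: &[&str], k1: f64, b: f64) -> f64 {
    let n = docs.len() as f64;
    let avg_len = docs.iter().map(|d| d.len()).sum::<usize>() as f64 / n;
    let len = docs[doc].len() as f64;
    query
        .iter()
        .map(|term| {
            // Term frequency in this document.
            let tf = docs[doc].iter().filter(|t| *t == term).count() as f64;
            // Number of documents containing the term.
            let df = docs.iter().filter(|d| d.contains(term)).count() as f64;
            // Inverse document frequency, kept non-negative by the "+ 1".
            let idf = ((n - df + 0.5) / (df + 0.5) + 1.0).ln();
            // Saturating term frequency with document-length normalization.
            idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * len / avg_len))
        })
        .sum()
}

#[test]
fn shorter_document_with_more_hits_ranks_first() {
    let docs = vec![
        vec!["rust", "test", "rust"],
        vec!["rust", "test", "suite", "with", "many", "extra", "words"],
        vec!["unrelated", "text"],
    ];
    let query = ["rust"];
    let focused = bm25(&docs, 0, &query, 1.2, 0.75);
    let diluted = bm25(&docs, 1, &query, 1.2, 0.75);
    assert!(focused > diluted);
    assert_eq!(bm25(&docs, 2, &query, 1.2, 0.75), 0.0);
}
```

An end-to-end search test checks the same ordering property, but through the HTTP layer and against an index that the entity lifecycle has to keep current.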

Authentication Tests (67 tests)

Testing login flows, JWT token generation and validation, guard enforcement, and role-based access control.

File Upload Tests (48 tests)

Testing multipart file uploads, storage backend selection, file retrieval, and garbage collection.

WebSocket Tests (34 tests)

Testing WebSocket connection, room joining, message broadcasting, and binary frame handling.

Template Rendering Tests (41 tests)

Testing server-side template rendering with dynamic data, conditional blocks, loops, component composition, and slot injection.

Error Handling Tests (29 tests)

Testing error responses for every category: 400 (bad request), 401 (unauthorized), 403 (forbidden), 404 (not found), 409 (conflict), 422 (validation), 500 (internal error).

What the Tests Found

Integration testing is not just verification -- it is discovery. Running the full suite against real HTTP traffic exposed bugs that unit tests could never catch:

Bug 1: JSON serialization of nested entities. When an entity had a foreign key relationship and was loaded with .with(), the JSON response included the related entity as a nested object. But the nested entity's id field was serialized as a string instead of a number, because the value passed through a template rendering path that converts all values to strings.

Bug 2: Search index not updated on entity update. The BM25 index was updated on entity creation and deletion, but not on update. Editing a blog post's title would not update its search index until the server restarted.
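The shape of this bug is easy to reproduce with a toy inverted index. The sketch below is a hypothetical model (`SearchIndex` and `tokenize` are illustrative names), showing the step an update path needs: drop the stale postings, then index the new text.

```rust
use std::collections::{HashMap, HashSet};

/// Lowercased whitespace tokens; a deliberately naive tokenizer.
fn tokenize(text: &str) -> Vec<String> {
    text.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// Toy inverted index: token -> ids of the documents containing it.
#[derive(Default)]
struct SearchIndex {
    postings: HashMap<String, HashSet<u64>>,
}

impl SearchIndex {
    fn add(&mut self, id: u64, text: &str) {
        for token in tokenize(text) {
            self.postings.entry(token).or_default().insert(id);
        }
    }

    fn remove(&mut self, id: u64) {
        for ids in self.postings.values_mut() {
            ids.remove(&id);
        }
        self.postings.retain(|_, ids| !ids.is_empty());
    }

    /// Update = remove stale postings, then index the new text.
    /// Skipping this on entity update leaves old titles searchable.
    fn update(&mut self, id: u64, new_text: &str) {
        self.remove(id);
        self.add(id, new_text);
    }

    fn search(&self, term: &str) -> Vec<u64> {
        let mut ids: Vec<u64> = self
            .postings
            .get(&term.to_lowercase())
            .map(|set| set.iter().copied().collect())
            .unwrap_or_default();
        ids.sort();
        ids
    }
}

#[test]
fn edited_title_replaces_old_terms() {
    let mut index = SearchIndex::default();
    index.add(1, "Hello World");
    index.update(1, "Goodbye World");
    assert!(index.search("hello").is_empty());
    assert_eq!(index.search("goodbye"), vec![1]);
}
```

A unit test of `add` and `remove` alone would pass; the regression only shows up when a test edits an entity and then searches for it.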

Bug 3: File upload race condition. When two concurrent requests uploaded files to the same entity field, the second upload could overwrite the first without triggering garbage collection for the orphaned blob.

Bug 4: WebSocket room cleanup. When a WebSocket connection was closed by the client, the connection was removed from the active connections list but not from its room membership. Subsequent broadcasts to the room would attempt to send to a closed connection, causing a logged error on every broadcast.
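In data-structure terms, the connection lived in two places and the close handler cleaned up only one. A minimal model of the corrected behavior, using a hypothetical `Hub` type with integer connection ids:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Default)]
struct Hub {
    connections: HashSet<u32>,
    rooms: HashMap<String, HashSet<u32>>,
}

impl Hub {
    fn join(&mut self, conn: u32, room: &str) {
        self.connections.insert(conn);
        self.rooms.entry(room.to_string()).or_default().insert(conn);
    }

    /// Removes the connection from the active set AND from every room,
    /// then drops rooms that became empty.
    fn disconnect(&mut self, conn: u32) {
        self.connections.remove(&conn);
        for members in self.rooms.values_mut() {
            members.remove(&conn);
        }
        self.rooms.retain(|_, members| !members.is_empty());
    }

    /// The connections a broadcast to `room` would be sent to, sorted.
    fn recipients(&self, room: &str) -> Vec<u32> {
        let mut ids: Vec<u32> = self
            .rooms
            .get(room)
            .map(|members| members.iter().copied().collect())
            .unwrap_or_default();
        ids.sort();
        ids
    }
}

#[test]
fn closed_connection_leaves_its_rooms() {
    let mut hub = Hub::default();
    hub.join(1, "lobby");
    hub.join(2, "lobby");
    hub.disconnect(1);
    assert_eq!(hub.recipients("lobby"), vec![2]);
}
```

The invariant worth asserting in a regression test is that every broadcast recipient is still an active connection.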

Bug 5: Guard ordering sensitivity. When multiple guards were applied to a route (guard auth and guard role("admin")), the execution order depended on the file system's directory iteration order, which is not guaranteed to be deterministic. On some operating systems, the role guard would execute before the auth guard, causing a confusing "role check failed" error instead of "authentication required."
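One way to remove that sensitivity is to give guards an explicit rank and sort before running them, so discovery order no longer matters. This post does not say how the fix was implemented, so treat the following as one possible approach with hypothetical types, not a description of FLIN's guard system.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Guard {
    Auth,
    Role(String),
}

impl Guard {
    /// Lower rank runs first: authentication before authorization.
    fn rank(&self) -> u8 {
        match self {
            Guard::Auth => 0,
            Guard::Role(_) => 1,
        }
    }
}

/// Runs guards in rank order, whatever order they were discovered in.
fn check(
    mut guards: Vec<Guard>,
    user: Option<&str>,
    role: Option<&str>,
) -> Result<(), &'static str> {
    // Stable sort: guards of equal rank keep their relative order.
    guards.sort_by_key(|g| g.rank());
    for guard in &guards {
        match guard {
            Guard::Auth if user.is_none() => return Err("authentication required"),
            Guard::Role(required) if role != Some(required.as_str()) => {
                return Err("role check failed")
            }
            _ => {}
        }
    }
    Ok(())
}

#[test]
fn anonymous_request_gets_auth_error_in_any_order() {
    let role_first = vec![Guard::Role("admin".into()), Guard::Auth];
    assert_eq!(check(role_first, None, None), Err("authentication required"));
}
```

The regression test deliberately passes the guards in the "wrong" order, which is the situation the nondeterministic directory iteration produced.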

Each bug was fixed and a regression test was added. The integration test suite grew not just by design but by discovery -- every bug found in integration testing became a permanent test case.

Running the Suite

The complete integration test suite runs in approximately 45 seconds on an 8-core machine. Each test starts its own server, compiles the fixture application, and makes HTTP requests. Parallelism is managed by Rust's test runner, which distributes tests across threads.

```bash
# Run all integration tests
cargo test --test integration_e2e

# Run a specific category
cargo test --test integration_e2e -- test_entity

# Run with output (see request/response details)
cargo test --test integration_e2e -- --nocapture
```

The 45-second runtime was intentional. Integration tests that take minutes to run do not get run. Developers skip them, CI pipelines time out, and bugs slip through. We invested significant effort in test startup optimization -- pre-compiling fixture applications, reusing compiled bytecode across tests in the same category, and minimizing database initialization time.
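Reusing compiled bytecode across tests amounts to a compile-once cache shared between test threads. A sketch of that idea with the standard library's `OnceLock`, assuming a hypothetical `Bytecode` type and `compile` function (the real harness code is not shown in this post):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Counts how many times the expensive compile step actually runs.
static COMPILE_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a compiled fixture application.
struct Bytecode {
    source_dir: String,
}

/// Stand-in for the real compiler invocation.
fn compile(dir: &str) -> Bytecode {
    COMPILE_RUNS.fetch_add(1, Ordering::SeqCst);
    Bytecode { source_dir: dir.to_string() }
}

/// Returns the shared bytecode for the todo fixture.
/// The first caller compiles; every later caller gets the cached copy.
fn todo_fixture() -> Arc<Bytecode> {
    static CACHE: OnceLock<Arc<Bytecode>> = OnceLock::new();
    CACHE
        .get_or_init(|| Arc::new(compile("tests/fixtures/todo-app/")))
        .clone()
}

#[test]
fn fixture_is_compiled_once_across_threads() {
    let handles: Vec<_> = (0..8).map(|_| std::thread::spawn(todo_fixture)).collect();
    for handle in handles {
        assert_eq!(handle.join().unwrap().source_dir, "tests/fixtures/todo-app/");
    }
    assert_eq!(COMPILE_RUNS.load(Ordering::SeqCst), 1);
}
```

Sharing is only safe because the bytecode is read-only; each test still needs its own server and its own database directory to stay isolated.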

The Confidence Multiplier

After Session 199, the test suite stood at 3,225 tests: 2,608 unit tests and 617 integration tests. The ratio -- roughly 4:1 unit to integration -- reflects the testing pyramid: many fast, focused unit tests at the base, and fewer but more comprehensive integration tests at the top.

The integration tests provided something that unit tests alone could not: confidence that the system works as a whole. When all 617 integration tests pass, it means that a FLIN application can be compiled, served over HTTP, queried via the database, searched via BM25 and semantic indexes, protected by authentication guards, and rendered through templates -- all working together correctly.

That confidence is what separates a project from a product.

This is Part 185 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO designed and built a programming language from scratch.
