#019 -- Error Diagnostics That Actually Help

Most compiler errors are written for compiler engineers. FLIN's are written for humans.

Here is a typical error message from a C compiler:

error: use of undeclared identifier 'count'

Here is the same error from FLIN:

error[E0001]: undefined variable 'count'
  --> app.flin:12:15
   |
12 |     total = count + tax
   |             ^^^^^ variable not declared in this scope
   |
   = note: Variables must be declared before use in FLIN
   = help: Did you mean 'cost'? (declared on line 3)

The difference is not cosmetic. The C error tells you what happened. The FLIN error tells you what happened, where it happened, what you probably meant, and how the language works. One stops you. The other teaches you.

This article covers Session 018, where we built FLIN's error diagnostic system: the Diagnostic struct, source context formatting, colored terminal output, the suggestion engine, and the DiagnosticBag for collecting multiple errors.

The Diagnostic Architecture

The diagnostic system has four components: the Diagnostic struct (the error data), the DiagnosticReporter (the formatter), the DiagnosticBag (the collector), and the integration points with the lexer, parser, and type checker.

The Diagnostic struct captures everything about a single error:

pub struct Diagnostic {
    pub level: DiagnosticLevel,
    pub code: Option<String>,
    pub message: String,
    pub span: Option<Span>,
    pub details: Vec<String>,
    pub suggestions: Vec<String>,
    pub notes: Vec<Diagnostic>,
}

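// Copy and PartialEq are required: levels are passed by value and compared with ==.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagnosticLevel {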
    Error,
    Warning,
    Note,
    Help,
}

The design is recursive: a Diagnostic can contain sub-diagnostics in its notes field. This enables rich error chains. A type mismatch error can include a Note diagnostic pointing to the original declaration ("declared here"), and a Help diagnostic suggesting the fix.

Construction uses a builder pattern:

let diagnostic = Diagnostic::error("undefined variable 'foo'")
    .with_code("E0001")
    .at_span(span)
    .with_detail("Variables must be declared before use")
    .with_suggestion("Did you mean 'bar'?")
    .with_note(Diagnostic::note("'bar' declared here").at_span(bar_span));

Each method returns self, allowing chained construction. This is deliberately verbose -- every piece of information in the diagnostic is explicitly specified, and the compiler's code makes clear what information the user will see. There is no magic formatting, no implicit message generation. If a suggestion appears in the output, some code path called .with_suggestion().
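
To make the chaining concrete, here is a minimal sketch of how such builder methods could be written. The field layout follows the struct shown above, but the Span type (with start_line, start_col, end_line, end_col fields) and the private new helper are assumptions made for illustration, not FLIN's actual definitions.

```rust
// Hypothetical span type; the field names are assumptions for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start_line: usize,
    pub start_col: usize,
    pub end_line: usize,
    pub end_col: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagnosticLevel {
    Error,
    Warning,
    Note,
    Help,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub level: DiagnosticLevel,
    pub code: Option<String>,
    pub message: String,
    pub span: Option<Span>,
    pub details: Vec<String>,
    pub suggestions: Vec<String>,
    pub notes: Vec<Diagnostic>,
}

impl Diagnostic {
    // Shared constructor (hypothetical helper): everything optional starts empty.
    fn new(level: DiagnosticLevel, message: &str) -> Self {
        Diagnostic {
            level,
            code: None,
            message: message.to_string(),
            span: None,
            details: Vec::new(),
            suggestions: Vec::new(),
            notes: Vec::new(),
        }
    }

    pub fn error(message: &str) -> Self {
        Self::new(DiagnosticLevel::Error, message)
    }

    pub fn note(message: &str) -> Self {
        Self::new(DiagnosticLevel::Note, message)
    }

    // Each builder method takes `self` by value and returns it, which enables chaining.
    pub fn with_code(mut self, code: &str) -> Self {
        self.code = Some(code.to_string());
        self
    }

    pub fn at_span(mut self, span: Span) -> Self {
        self.span = Some(span);
        self
    }

    pub fn with_detail(mut self, detail: &str) -> Self {
        self.details.push(detail.to_string());
        self
    }

    pub fn with_suggestion(mut self, suggestion: &str) -> Self {
        self.suggestions.push(suggestion.to_string());
        self
    }

    pub fn with_note(mut self, note: Diagnostic) -> Self {
        self.notes.push(note);
        self
    }
}
```

Taking self by value rather than by mutable reference means the whole chain is a single expression that yields the finished Diagnostic, with no intermediate mutable binding.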

Source Context Formatting

The most useful part of an error message is the source context -- the line of code where the error occurred, with the problematic span underlined. Producing this output requires three pieces of information: the source text, the span (start line/column and end line/column), and the file name.

The DiagnosticReporter handles the formatting:

pub struct DiagnosticReporter<'a> {
    source: &'a str,
    file_name: String,
    lines: Vec<&'a str>,
}

impl<'a> DiagnosticReporter<'a> {
    pub fn new(source: &'a str) -> Self {
        let lines: Vec<&str> = source.lines().collect();
        DiagnosticReporter {
            source,
            file_name: String::from("<input>"),
            lines,
        }
    }

    pub fn with_file_name(mut self, name: &str) -> Self {
        self.file_name = name.to_string();
        self
    }

    pub fn report(&self, diagnostic: &Diagnostic) -> String {
        let mut output = String::new();

        // Level and message
        output.push_str(&format!(
            "{}{}: {}\n",
            self.format_level(diagnostic.level),
            diagnostic.code.as_ref()
                .map(|c| format!("[{}]", c))
                .unwrap_or_default(),
            diagnostic.message
        ));

        // Source location and context
        if let Some(span) = &diagnostic.span {
            self.format_source_context(&mut output, span);
        }

        // Details, suggestions, notes
        for detail in &diagnostic.details {
            output.push_str(&format!("   = note: {}\n", detail));
        }
        for suggestion in &diagnostic.suggestions {
            output.push_str(&format!("   = help: {}\n", suggestion));
        }
        for note in &diagnostic.notes {
            output.push_str(&self.report(note));
        }

        output
    }
}

The source context formatter extracts the relevant line, prints it with its line number, and draws a caret or underline beneath the error span:

  --> app.flin:12:15
   |
12 |     total = count + tax
   |             ^^^^^ variable not declared in this scope

The alignment is precise. The --> arrow points to the file, line, and column. The pipe characters create a visual gutter. The underline uses ^ characters matching the exact width of the problematic span. If the span is a single character, there is one ^. If it spans an entire identifier, the underline covers the whole name.

For multi-line spans, the formatter shows the start and end lines with a vertical bar connecting them. This handles cases like unterminated strings or mismatched brackets that span multiple lines.
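
The single-line case can be sketched as a small free function. This is an illustration under stated assumptions rather than FLIN's actual format_source_context: the signature is hypothetical, line and column are assumed to be 1-based and counted in characters, the gutter is assumed to be at least two columns wide, and tabs and multi-line spans are not handled.

```rust
/// Render a single-line span in the arrow / gutter / underline layout.
/// `line` and `column` are assumed 1-based; `len` is the span width in characters.
pub fn format_source_context(
    source: &str,
    file_name: &str,
    line: usize,
    column: usize,
    len: usize,
    label: &str,
) -> String {
    // Keep the gutter at least two columns wide so single-digit line numbers align.
    let width = line.to_string().len().max(2);
    let pad = " ".repeat(width);
    let mut out = format!("{}--> {}:{}:{}\n", pad, file_name, line, column);

    // Look up the source line; if the span points past the end, emit the location only.
    let text = match line.checked_sub(1).and_then(|i| source.lines().nth(i)) {
        Some(text) => text,
        None => return out,
    };

    // Empty gutter line, then the numbered source line.
    out.push_str(&format!("{} |\n", pad));
    out.push_str(&format!("{:<width$} | {}\n", line, text, width = width));

    // Underline: indent to the column, then one caret per character in the span.
    let indent = " ".repeat(column.saturating_sub(1));
    let carets = "^".repeat(len.max(1));
    out.push_str(&format!("{} | {}{} {}\n", pad, indent, carets, label));
    out
}
```

Called with line 7, column 12, and a length of 1 on a source whose seventh line is an indented name = "hello, this produces the same four-line layout as the unterminated-string example later in this article.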

Colored Terminal Output

Color transforms a wall of text into a scannable report. The diagnostic system uses the colored crate for terminal output:

Error labels are red and bold
Warning labels are yellow and bold
Note labels are blue
Help labels are green
Source code is displayed in default color
Underlines use the same color as their diagnostic level

The colors are applied at the formatting stage, not stored in the Diagnostic struct. This separation matters because the same diagnostic might be rendered to a terminal (with colors), to a log file (without colors), or to a JSON API response (with semantic markup instead of ANSI codes).

When the terminal does not support color (detected via isatty()), the colored crate automatically falls back to plain text. The diagnostic output remains readable without color -- the structural formatting (arrows, pipes, underlines) conveys the hierarchy even in monochrome.
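
A sketch of what "color at the formatting stage" can look like. To stay self-contained it writes raw ANSI escape sequences with the standard library instead of going through the colored crate, so it is a stand-in for the real formatter, not a copy of it. The use_color parameter and the stdout_supports_color helper are hypothetical names; checking the NO_COLOR environment variable is one common convention, assumed here.

```rust
use std::io::IsTerminal;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagnosticLevel {
    Error,
    Warning,
    Note,
    Help,
}

/// Returns the level label, wrapped in ANSI escape codes only when `use_color` is true.
/// The Diagnostic itself never stores color; the decision is made here, at render time.
pub fn format_level(level: DiagnosticLevel, use_color: bool) -> String {
    let (label, style) = match level {
        DiagnosticLevel::Error => ("error", "1;31"),     // bold red
        DiagnosticLevel::Warning => ("warning", "1;33"), // bold yellow
        DiagnosticLevel::Note => ("note", "34"),         // blue
        DiagnosticLevel::Help => ("help", "32"),         // green
    };
    if use_color {
        // ESC[<style>m switches the style on, ESC[0m resets it.
        format!("\x1b[{}m{}\x1b[0m", style, label)
    } else {
        label.to_string()
    }
}

/// One possible policy (an assumption): color only when stdout is a terminal
/// and the NO_COLOR environment variable is not set.
pub fn stdout_supports_color() -> bool {
    std::io::stdout().is_terminal() && std::env::var_os("NO_COLOR").is_none()
}
```

Because the color decision is a plain boolean passed to the formatter, the same Diagnostic can be rendered once for the terminal and once for a log file without touching the data.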

The Suggestion Engine

Suggestions are the difference between an error message that stops you and one that helps you. FLIN's diagnostic system supports three kinds of suggestions:

Textual suggestions are plain-text hints appended to the diagnostic:

   = help: Did you mean 'cost'? (declared on line 3)

These are generated by the compiler phase that detected the error. The type checker, for example, maintains a set of all declared variable names. When it encounters an undefined variable, it computes the Levenshtein distance between the undefined name and all known names, and suggests the closest match if the distance is below a threshold.
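
The "did you mean" lookup described above can be sketched in a few lines. This is the textbook dynamic-programming Levenshtein distance, not necessarily the exact code in FLIN's type checker. The function names and the max_distance parameter are assumptions, and the sketch accepts a match at or below the threshold.

```rust
/// Levenshtein distance over Unicode scalar values, using two rolling rows.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Returns the known name closest to `unknown`, if any lies within `max_distance`.
/// On ties, the name that appears first in `known` wins.
pub fn closest_match<'a>(
    unknown: &str,
    known: &[&'a str],
    max_distance: usize,
) -> Option<&'a str> {
    known
        .iter()
        .map(|name| (levenshtein(unknown, name), *name))
        .filter(|(distance, _)| *distance <= max_distance)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, name)| name)
}
```

With a threshold of 2, closest_match("count", &["cost", "tax", "total"], 2) yields Some("cost"), since 'count' and 'cost' are two edits apart and the other names are further away. A tight threshold matters: suggesting a distant name is worse than suggesting nothing.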

Contextual suggestions provide information about the language rule that was violated:

   = note: Variables must be declared before use in FLIN
   = help: Declare 'count' with: count = 0

These are particularly valuable for new FLIN developers who are still learning the language. The error does not just say "this is wrong" -- it says "this is how FLIN works" and "here is what you should write instead."

Related-location suggestions point to other places in the code that are relevant to understanding the error:

error[E0003]: type mismatch
  --> app.flin:12:15
   |
12 |     count = "hello"
   |             ^^^^^^^ expected int, found text
   |
note: 'count' was previously declared as int
  --> app.flin:5:1
   |
5  |     count = 0
   |     ^^^^^^^^^ first assignment (inferred type: int)

The nested note diagnostic has its own span pointing to line 5, where count was first assigned. The user sees both locations at once and understands immediately why the reassignment is invalid.

Common Error Patterns

The diagnostic system handles errors from every compiler phase. Here are the patterns for each:

Lexer errors catch malformed input before parsing begins:

error[L0001]: unterminated string literal
  --> app.flin:7:12
   |
7  |     name = "hello
   |            ^ string starts here but never ends
   |
   = help: Add a closing " at the end of the string

error[L0002]: unexpected character '#'
  --> app.flin:3:1
   |
3  |     #comment
   |     ^ unexpected character
   |
   = help: FLIN uses // for comments, not #

Parser errors catch syntactic problems:

error[P0001]: expected '}' to close entity declaration
  --> app.flin:4:1
   |
1  |     entity User {
   |                 - opening brace here
...
4  |     save user
   |     ^^^^ expected '}', found keyword 'save'
   |
   = help: Add '}' after the last field declaration

error[P0003]: mismatched closing tag
  --> app.flin:8:3
   |
5  |     <div>
   |      --- opening tag
...
8  |     </span>
   |       ^^^^ expected </div>, found </span>

Type checker errors catch semantic problems:

error[T0001]: cannot add text and int
  --> app.flin:6:15
   |
6  |     result = name + count
   |              ^^^^^^^^^^^^ operator '+' cannot be applied to text and int
   |
   = help: Convert count to text first: name + to_text(count)

error[T0004]: unknown entity 'Usr'
  --> app.flin:10:1
   |
10 |     save Usr { name: "Juste" }
   |          ^^^ entity 'Usr' not declared
   |
   = help: Did you mean 'User'? (declared on line 1)

Each error code (L0001, P0001, T0001) is stable and documented. Users can search for the code to find detailed explanations, and tooling can programmatically categorize errors.

The DiagnosticBag: Multiple Errors at Once

A compiler that stops at the first error forces the user into a frustrating cycle: fix one error, recompile, see the next error, fix it, recompile, see another error. Modern compilers report as many errors as they can find in a single pass.

FLIN uses a DiagnosticBag to collect diagnostics during compilation:

pub struct DiagnosticBag {
    diagnostics: Vec<Diagnostic>,
}

impl DiagnosticBag {
    pub fn new() -> Self {
        DiagnosticBag { diagnostics: Vec::new() }
    }

    pub fn add(&mut self, diagnostic: Diagnostic) {
        self.diagnostics.push(diagnostic);
    }

    pub fn has_errors(&self) -> bool {
        self.diagnostics.iter().any(|d| d.level == DiagnosticLevel::Error)
    }

    pub fn error_count(&self) -> usize {
        self.diagnostics.iter()
            .filter(|d| d.level == DiagnosticLevel::Error)
            .count()
    }

    pub fn report_all(&self, reporter: &DiagnosticReporter) -> String {
        let mut output = String::new();
        for diagnostic in &self.diagnostics {
            output.push_str(&reporter.report(diagnostic));
            output.push('\n');
        }

        let errors = self.error_count();
        let warnings = self.diagnostics.iter()
            .filter(|d| d.level == DiagnosticLevel::Warning)
            .count();

        if errors > 0 {
            output.push_str(&format!(
                "error: compilation failed with {} error{} and {} warning{}\n",
                errors,
                if errors == 1 { "" } else { "s" },
                warnings,
                if warnings == 1 { "" } else { "s" },
            ));
        }

        output
    }
}

The bag collects errors, warnings, notes, and help messages. At the end of compilation, report_all formats them in order of occurrence and appends a summary line ("compilation failed with 3 errors and 1 warning").

The key interaction is with error recovery. When the parser encounters an unexpected token, it adds a diagnostic to the bag and then calls synchronize() to skip tokens until it reaches a statement boundary (a newline followed by a keyword, or a < that starts a view element). It then resumes parsing from the recovered position. This means a FLIN program with five syntax errors will typically report all five in a single compilation attempt, rather than forcing five compile-fix-compile cycles.
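
A rough sketch of that recovery step, operating on a flat token slice. The Token enum is a hypothetical stand-in for FLIN's real token type, and the sketch treats every < as the start of a view element, which a real parser would need to distinguish from a less-than operator.

```rust
// Hypothetical, heavily simplified token type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Newline,
    Keyword(&'static str),
    Ident(&'static str),
    Less, // `<`, assumed here to open a view element
    Other(&'static str),
}

/// Starting at `pos`, skips tokens and returns the index where parsing can resume.
/// The caller is expected to have already advanced past the offending token,
/// otherwise an error at a `<` would resynchronize on the same position.
pub fn synchronize(tokens: &[Token], mut pos: usize) -> usize {
    while pos < tokens.len() {
        match (&tokens[pos], tokens.get(pos + 1)) {
            // A newline followed by a keyword: resume at the keyword.
            (Token::Newline, Some(Token::Keyword(_))) => return pos + 1,
            // A `<`: resume at the tag itself.
            (Token::Less, _) => return pos,
            // Anything else is part of the broken statement; skip it.
            _ => pos += 1,
        }
    }
    // No boundary found: resume at end of input.
    tokens.len()
}
```

Note that a newline followed by a plain identifier is not treated as a boundary in this sketch, so a broken statement can swallow the next line. That trade-off reduces cascading errors at the price of occasionally skipping a valid statement.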

Integration with the Compilation Pipeline

Each compiler phase owns its error types and converts them to diagnostics at the boundary:

pub fn compile(source: &str) -> Result<Chunk, DiagnosticBag> {
    let mut bag = DiagnosticBag::new();

    // Phase 1: Lex
    let tokens = match Lexer::new(source).tokenize() {
        Ok(tokens) => tokens,
        Err(lex_errors) => {
            for err in lex_errors {
                bag.add(err.into_diagnostic());
            }
            return Err(bag);
        }
    };

    // Phase 2: Parse (with recovery)
    let (ast, parse_errors) = Parser::new(tokens).parse_with_recovery();
    for err in parse_errors {
        bag.add(err.into_diagnostic());
    }
    if bag.has_errors() {
        return Err(bag);
    }

    // Phase 3: Type check
    let typed_ast = match TypeChecker::new().check(ast) {
        Ok(typed) => typed,
        Err(type_errors) => {
            for err in type_errors {
                bag.add(err.into_diagnostic());
            }
            return Err(bag);
        }
    };

    // Phase 4: Generate
    let chunk = CodeGenerator::new().generate(&typed_ast)
        .map_err(|e| {
            bag.add(e.into_diagnostic());
            bag
        })?;

    Ok(chunk)
}

Each error type (LexError, ParseError, TypeError, CodeGenError) implements an into_diagnostic() method that converts the internal error representation into a Diagnostic with the appropriate level, code, message, span, and suggestions. This keeps the conversion logic close to the code that generates the error -- the parser knows best what suggestion to give for a parse error.
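
As an illustration of the boundary conversion, here is what into_diagnostic() might look like for a lexer error. The LexError variants, the Span fields, and the trimmed-down Diagnostic struct are all assumptions made for this sketch; only the error codes and message texts are taken from the examples earlier in this article.

```rust
// Hypothetical span type; end_col is assumed to be exclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start_line: usize,
    pub start_col: usize,
    pub end_line: usize,
    pub end_col: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagnosticLevel {
    Error,
    Warning,
    Note,
    Help,
}

// Trimmed-down Diagnostic: only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub level: DiagnosticLevel,
    pub code: Option<String>,
    pub message: String,
    pub span: Option<Span>,
    pub suggestions: Vec<String>,
}

// Hypothetical internal error type owned by the lexer.
#[derive(Debug, Clone, PartialEq)]
pub enum LexError {
    UnterminatedString { span: Span },
    UnexpectedChar { ch: char, span: Span },
}

impl LexError {
    /// Converts the lexer's internal error into a user-facing Diagnostic.
    pub fn into_diagnostic(self) -> Diagnostic {
        match self {
            LexError::UnterminatedString { span } => Diagnostic {
                level: DiagnosticLevel::Error,
                code: Some("L0001".to_string()),
                message: "unterminated string literal".to_string(),
                span: Some(span),
                suggestions: vec!["Add a closing \" at the end of the string".to_string()],
            },
            LexError::UnexpectedChar { ch, span } => {
                // The lexer knows the likely cause, so it can tailor the hint.
                let suggestions = if ch == '#' {
                    vec!["FLIN uses // for comments, not #".to_string()]
                } else {
                    Vec::new()
                };
                Diagnostic {
                    level: DiagnosticLevel::Error,
                    code: Some("L0002".to_string()),
                    message: format!("unexpected character '{}'", ch),
                    span: Some(span),
                    suggestions,
                }
            }
        }
    }
}
```

The method consumes the error (self by value), which fits the pipeline above: once an error has been turned into a diagnostic and added to the bag, the internal representation is no longer needed.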

The Design Philosophy

Three principles guided the diagnostic system.

Show, do not tell. Every error includes the source line where the problem occurred. The user never has to open a file and count line numbers -- the error message shows them the exact code and points to the exact character. This seems obvious, but many compilers still output errors like "line 42: type error" without the source context.

Suggest, do not just reject. When the compiler can infer what the user probably intended, it says so. An undefined variable with a close match gets a "did you mean?" suggestion. A missing closing brace gets an "add } after line N" suggestion. A type mismatch gets a "convert with to_text()" suggestion. These suggestions are not always correct, but even when they are wrong, they give the user a starting point for understanding the error.

Respect the user's time. Report all errors at once. Use color to make the output scannable. Put the most important information (the error message and the source line) at the top, and supplementary information (notes, suggestions) below. A developer should be able to glance at the output and know what to fix, without reading every word.

Testing the Diagnostic System

Session 018 added 68 new tests, bringing the total from 522 to 590. The diagnostic system's tests covered:

Source context formatting (correct line extraction, underline width, multi-line spans)
Colored output (ANSI escape codes for each diagnostic level)
Suggestion formatting (single suggestions, multiple suggestions, nested notes)
DiagnosticBag behavior (error counting, has_errors check, ordering)
Integration with the compilation pipeline (lex errors produce diagnostics, parse errors produce diagnostics with recovery, type errors include source context)

The tests use snapshot-style assertions: the expected output is a multi-line string that matches the formatter's output exactly, including whitespace alignment. This ensures that formatting changes are caught by the test suite -- a shifted underline or a missing pipe character will cause a test failure.

What Good Diagnostics Cost

The diagnostic system added approximately 570 lines of Rust to the codebase. It required no changes to the lexer, parser, or type checker implementations -- only the addition of into_diagnostic() methods on existing error types and the new diagnostic.rs module.

The runtime cost is negligible. Diagnostics are only constructed on the error path -- successful compilation never allocates a Diagnostic. On the error path, the cost of formatting a few error messages is invisible compared to the compilation work that preceded the error.

The development cost was one session -- roughly 30 minutes. But the ongoing value is measured in every error message that every FLIN developer will ever see. An unhelpful error message is a tax on every future user. A helpful error message is an investment that pays off thousands of times.

The Rust compiler team understood this. Elm understood this. Now FLIN understands this. Error messages are not an afterthought. They are a product feature.

This is Part 19 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO built a programming language compiler in sessions measured in minutes, not months.

Next in the series: The complete compilation pipeline from end to end -- how source code enters, how six phases transform it, and how a running application exits.
