#015 -- Hindley-Milner Type Inference in a Custom Language

FLIN infers types. You write count = 0 and the compiler knows it is an integer. You write name = "Juste" and the compiler knows it is text. You write items = [] and the compiler assigns it a polymorphic type that will be resolved when you first add an element. The algorithm behind this is Hindley-Milner -- a type inference system invented independently by Roger Hindley in 1969 and Robin Milner in 1978, and used in some form by Haskell, OCaml, F#, Rust (partially), and now FLIN.

Implementing Hindley-Milner in a custom language is one of those tasks that sounds intimidating and turns out to be surprisingly compact. FLIN's entire type inference engine -- unification, substitution, generalization, instantiation, and let-polymorphism -- fits in about 200 lines of Rust. It was built in a single 25-minute session. And it works.

This article explains the algorithm, walks through the implementation, and shows how FLIN uses it to catch errors at compile time without burdening developers with type annotations.

Why Type Inference Matters

FLIN is designed for developers who want to build applications quickly. Forcing them to write explicit types on every variable, every function parameter, every return value, would slow them down and add visual noise to an otherwise clean language.

But FLIN is also a compiled language with static type checking. The compiler needs to know the type of every expression to verify that operations are valid (you cannot add a boolean to a string), to generate correct bytecode (integer addition and float addition are different instructions), and to catch errors before the program runs.

Type inference resolves this tension. The developer writes unannotated code. The compiler figures out the types. If the types are inconsistent -- if you try to use the same variable as both an integer and a string -- the compiler reports an error at the exact location where the inconsistency arises.

The question is: which type inference algorithm? The simplest approach, local inference, just looks at the right-hand side of each assignment and infers the type from the literal. count = 0 is int because 0 is an integer. This works for simple cases but fails for generic data structures: what is the type of items = []? Local inference cannot answer this because the right-hand side contains no type information.

Hindley-Milner handles this by introducing type variables -- placeholders for unknown types that are resolved through a process called unification. It also supports let-polymorphism, which allows a variable to have different types in different contexts. These two features make it the right choice for FLIN.

The Three Pillars: Unification, Substitution, and Type Variables

The foundation of Hindley-Milner is the type variable. When the compiler encounters an expression whose type is not immediately known, it creates a fresh type variable -- a placeholder like ?T0, ?T1, ?T2. These variables are not types themselves; they are promises that a type will be determined later.

Type variables are resolved through unification. Unification takes two types and attempts to make them equal by binding type variables. For example:

Unify ?T0 with int -- bind ?T0 to int. Success.
Unify int with int -- already equal. Success.
Unify int with text -- cannot be made equal. Error.
Unify [?T0] with [int] -- recursively unify ?T0 with int. Bind ?T0 to int. Success.

The unification algorithm in FLIN:

rustpub fn unify(&mut self, a: &FlinType, b: &FlinType) -> Result<(), TypeError> {
    let a = self.substitute(a);
    let b = self.substitute(b);

    match (&a, &b) {
        // Same concrete types
        (FlinType::Int, FlinType::Int) => Ok(()),
        (FlinType::Text, FlinType::Text) => Ok(()),
        (FlinType::Bool, FlinType::Bool) => Ok(()),
        (FlinType::Float, FlinType::Float) => Ok(()),
        (FlinType::Time, FlinType::Time) => Ok(()),
        (FlinType::Money, FlinType::Money) => Ok(()),

        // Type variable: bind it
        (FlinType::TypeVar(id), other) | (other, FlinType::TypeVar(id)) => {
            if let FlinType::TypeVar(other_id) = other {
                if id == other_id {
                    return Ok(()); // Same variable, nothing to do
                }
            }
            // Occurs check: prevent infinite types
            if self.occurs_in(*id, other) {
                return Err(TypeError::InfiniteType(*id, other.clone()));
            }
            self.substitutions.insert(*id, other.clone());
            Ok(())
        }

        // Structural types: recurse
        (FlinType::List(inner_a), FlinType::List(inner_b)) => {
            self.unify(inner_a, inner_b)
        }
        (FlinType::Optional(inner_a), FlinType::Optional(inner_b)) => {
            self.unify(inner_a, inner_b)
        }

        // Int can coerce to Float
        (FlinType::Int, FlinType::Float) | (FlinType::Float, FlinType::Int) => Ok(()),

        // Mismatch
        _ => Err(TypeError::Mismatch(a.clone(), b.clone())),
    }
}

Two things worth noting. First, every call to unify starts by applying existing substitutions to both types. If ?T0 was previously bound to int, then unifying ?T0 with text actually unifies int with text, which correctly produces an error. Without this substitution step, the algorithm would silently accept contradictory bindings.

Second, the occurs check: before binding a type variable ?T0 to a type, we verify that ?T0 does not appear inside that type. Without this check, unifying ?T0 with [?T0] would create an infinite type ([[[[[...]]]]]), which would cause the substitution algorithm to loop forever. The occurs check prevents this by rejecting the binding with a clear error.

Substitution: Applying What We Know

Substitution replaces type variables with their bound types throughout an expression:

rustpub fn substitute(&self, ty: &FlinType) -> FlinType {
    match ty {
        FlinType::TypeVar(id) => {
            if let Some(bound) = self.substitutions.get(id) {
                self.substitute(bound) // Chase the chain
            } else {
                ty.clone()
            }
        }
        FlinType::List(inner) => {
            FlinType::List(Box::new(self.substitute(inner)))
        }
        FlinType::Optional(inner) => {
            FlinType::Optional(Box::new(self.substitute(inner)))
        }
        _ => ty.clone(),
    }
}

The recursive call in the TypeVar case is important. Substitutions can chain: ?T0 -> ?T1 -> int. The substitute method follows the chain until it reaches a concrete type or an unbound variable. This is sometimes called "path compression" by analogy with union-find data structures, though our implementation does not actually compress the path in place.

Let-Polymorphism: The Key Feature

Local type inference cannot handle this code:

flinx = []
y = x
z = x

What is the type of x? It is an empty list, but a list of what? With local inference, you would have to pick a type or report an error. With Hindley-Milner, you can give x a polymorphic type: "for any type a, x has type [a]." Then y and z each get their own fresh copy of this polymorphic type, and they can independently be constrained to different element types.

This is let-polymorphism: variables bound by let (or in FLIN's case, by =) can have polymorphic types that are instantiated fresh at each use site.

The implementation has two parts: generalization and instantiation.

Generalization happens when a variable is defined. After inferring the type of the initializer, the type checker finds all type variables that are free (not bound by any enclosing scope) and quantifies over them:

rustpub fn generalize(&self, ty: &FlinType) -> TypeScheme {
    let env_free_vars = self.env.free_type_vars();
    let ty_free_vars = self.free_vars_in(ty);

    // Variables free in the type but not in the environment
    let quantified: Vec<TypeVar> = ty_free_vars
        .difference(&env_free_vars)
        .cloned()
        .collect();

    if quantified.is_empty() {
        TypeScheme::mono(ty.clone())
    } else {
        TypeScheme::poly(quantified, ty.clone())
    }
}

For x = [], the inferred type is [?T0] where ?T0 is free. Generalization produces the type scheme forall T0. [T0] -- for any type T0, x is a list of T0.

Instantiation happens when a variable is used. The type checker replaces each quantified variable with a fresh type variable:

rustpub fn instantiate(&mut self, scheme: &TypeScheme) -> FlinType {
    if scheme.is_mono() {
        return scheme.inner().clone();
    }

    let mut substitution = HashMap::new();
    for var in &scheme.vars {
        substitution.insert(*var, FlinType::TypeVar(self.fresh_var()));
    }

    self.apply_scheme_substitution(&scheme.ty, &substitution)
}

When the type checker encounters y = x, it looks up x's type scheme (forall T0. [T0]), instantiates it with a fresh variable ([?T5]), and assigns that to y. When it encounters z = x, it instantiates again with a different fresh variable ([?T6]). Now y and z can independently be constrained to different list element types.

The TypeScheme Struct

The TypeScheme struct bridges the gap between monomorphic types (a single concrete type) and polymorphic types (a type with quantified variables):

rustpub struct TypeScheme {
    pub vars: Vec<TypeVar>,  // Bound (quantified) variables
    pub ty: FlinType,        // The underlying type expression
}

impl TypeScheme {
    pub fn mono(ty: FlinType) -> Self {
        TypeScheme { vars: Vec::new(), ty }
    }

    pub fn poly(vars: Vec<TypeVar>, ty: FlinType) -> Self {
        TypeScheme { vars, ty }
    }

    pub fn is_mono(&self) -> bool {
        self.vars.is_empty()
    }

    pub fn inner(&self) -> &FlinType {
        &self.ty
    }
}

The type environment stores TypeScheme values, not FlinType values. This means every variable binding can potentially be polymorphic, even though most will be monomorphic in practice (their vars list will be empty). The uniform representation simplifies the lookup logic: you always instantiate, but for monomorphic schemes, instantiation is a no-op.

Type Checking Statements

The type checker walks the AST statement by statement, maintaining a type environment that maps variable names to type schemes:

For variable declarations, it infers the type of the initializer, unifies it with any explicit annotation, generalizes the result, and stores the scheme in the environment.

For entity declarations, it registers each field's type in the environment, creating a new scope for the entity. Entity field types are always explicit (the FLIN syntax requires type annotations on fields), so no inference is needed -- but the type checker still verifies that default values are compatible with their annotated types. If a field is count: int = "hello", the type checker reports a mismatch.

For if statements, it checks that the condition has type bool and recursively checks both branches. For for loops, it checks that the iterable has a list type and binds the loop variable to the list's element type.

For save and delete statements, it verifies that the expression has an entity type. You cannot save 42 -- the type checker catches this.

Type Checking Expressions

Expression type checking is where unification does most of its work:

Binary expressions: the type checker infers the types of both operands, then checks that the operator is valid for those types. + works on int + int (producing int), float + float (producing float), int + float (producing float, with coercion), and text + text (producing text, for concatenation). Any other combination is a type error.

Member access: user.name requires that user has an entity type and that the entity has a field named name. The result type is the field's type.

Method calls: items.count() requires that items has a list type. The method count is a built-in that returns int. The type checker has a table of built-in methods for each type.

Temporal access: user.name @ yesterday requires that user.name has a type and that yesterday is a temporal expression. The result type is the same as the operand's type -- temporal access changes when, not what.

Ask expressions: ask "query" requires that the query is a text expression. The result type is text -- the AI's response.

Entity construction: User { name: "Juste", age: 30 } requires that User is a defined entity, that all required fields are provided, and that each field's value matches the field's declared type. The result type is Entity("User").

Putting It Together: A Complete Example

Consider this FLIN program:

flinentity User {
    name: text
    age: int
}

user = User { name: "Thales", age: 28 }
greeting = "Hello, " + user.name

The type checker processes it step by step:

Entity declaration. Register User with fields name: text and age: int.

Variable user. Infer the type of User { name: "Thales", age: 28 }. The entity User exists. Field name expects text, and "Thales" is text -- match. Field age expects int, and 28 is int -- match. All fields provided. Result type: Entity("User"). Generalize: user is monomorphic (Entity("User")).

Variable greeting. Infer the type of "Hello, " + user.name. Left operand: "Hello, " is text. Right operand: user.name -- look up user (type Entity("User")), access field name (type text). Operator + on text + text produces text. Result type: text. Generalize: greeting is monomorphic (text).

No type annotations written. No errors. Every expression has a known type. The code generator can now emit the correct bytecode: StoreGlobal for variables, CreateEntity for entity construction, GetField for member access, StringConcat for text addition.

Error Messages

A type inference engine is only as good as its error messages. When unification fails, the type checker needs to explain what went wrong in terms the developer understands, not in terms of type variables and substitution chains.

FLIN's type checker produces errors like:

"Type mismatch: expected int, found text in assignment to count"
"Cannot apply operator + to types bool and int"
"Entity User does not have a field named email"
"Cannot save a value of type int -- save requires an entity type"

Each error includes the span of the problematic expression, so the developer knows exactly where to look. The error messages use the FLIN type names (text, int, bool) rather than internal representations (FlinType::Text, TypeVar(3)), because the developer wrote FLIN code, not Rust code.

The Session in Numbers

Session 8 -- the session that implemented Hindley-Milner -- took approximately 25 minutes and produced these results:

Metric	Value
Duration	~25 minutes
Lines added	~320 (120 types, 200 checker)
New tests	17
Total tests passing	193
Type system completion	100%
Part 1 completion	92%

Full Hindley-Milner polymorphism in about 200 lines of type-checking logic. This is not because the algorithm is trivial -- it is because Rust's type system does so much of the work. The FlinType enum with TypeVar carries polymorphism in the data structure itself. Pattern matching makes the unification cases exhaustive and clear. The HashMap<u32, FlinType> substitution map is all the state the algorithm needs.

Limitations and Future Work

FLIN's current Hindley-Milner implementation has deliberate limitations:

No higher-kinded types. FLIN cannot express "a container of any type" as a first-class concept. Lists are parameterized, but the parameterization is built into the FlinType::List variant, not expressed through a general type-constructor mechanism. This is sufficient for FLIN's domain -- web applications do not typically need higher-kinded polymorphism.

No row polymorphism. Entities have fixed field sets. You cannot write a function that accepts "any entity with a name field." This is a deliberate simplification: FLIN entities are closer to database tables than to structural types, and database tables have fixed schemas.

Value restriction. Generalization only happens at let-bindings (variable declarations), not at arbitrary expression positions. This is the standard value restriction from ML, which prevents unsound polymorphism in the presence of mutable state. Since FLIN variables are mutable, the value restriction is necessary for soundness.

These limitations are trade-offs, not oversights. Each one keeps the type system simpler, the implementation smaller, and the error messages clearer. If FLIN's user base grows to the point where higher-kinded types or row polymorphism are needed, the foundation is there -- the Hindley-Milner core is extensible.

What Came Next

With the type checker complete, every expression in the AST has a known type. The next phase -- code generation -- walks the typed AST and emits bytecode instructions for FLIN's virtual machine. Session 9 implemented the entire code generator in 30 minutes: 75 opcodes, a constant pool with deduplication, jump patching for conditionals and loops, and a disassembler for debugging.

The code generator is covered in the next batch of articles. But first, let us acknowledge what we achieved in the first nine sessions: a lexer, a Pratt parser, an AST, a Hindley-Milner type checker, and a code generator. A complete compilation pipeline from source code to bytecode. Built by a CEO in Abidjan and an AI CTO, in roughly four hours of total session time.

That pipeline compiled the counter example -- count = 0 / <button click={count++}>{count}</button> -- into 26 bytes of bytecode. Twenty-six bytes that represent a reactive, stateful user interface with event handling and automatic DOM updates. From four lines of FLIN to twenty-six bytes of instructions. That is what a compiler does.

This is Part 15 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO built a programming language from scratch.

Series Navigation: - [11] Session 1: Project Setup and 42 Keywords - [12] Building a Lexer From Scratch in Rust - [13] Pratt Parsing: How FLIN Reads Your Code - [14] The Abstract Syntax Tree: FLIN's Internal Representation - [15] Hindley-Milner Type Inference in a Custom Language (you are here)