#017 -- The Bytecode Format Explained

FLIN compiles to bytecode. Not JavaScript. Not WASM. Its own instruction set, designed for a language that remembers everything.

Most new programming languages avoid this decision. They compile to JavaScript (for the browser), to LLVM IR (for native code), or to WASM (for portability). Each of these targets is a reasonable choice, and each comes with an ecosystem of tools, optimizers, and runtimes that would take years to replicate.

We built our own instruction set anyway.

The reason is that FLIN is not a general-purpose language. It is a language with built-in data persistence, time travel, reactive views, and AI-powered queries. No existing bytecode format has opcodes for "save this entity to the database," "retrieve this value as it existed yesterday," or "create a DOM element and bind a reactive text node to it." We could have compiled these operations to function calls in someone else's VM, but that would have made FLIN's core features second-class citizens -- library calls instead of native instructions.

This article describes the .flinc binary format: how instructions are encoded, how the constant pool is structured, how view operations work, and how the format supports debugging.

The .flinc File Format

A compiled FLIN program is stored in a .flinc file (FLIN Compiled). The format starts with a 64-byte header, followed by variable-length sections:

Offset    Size    Field
------    ----    -----
0x0000    4       Magic number: 0x464C494E ("FLIN" in ASCII)
0x0004    1       Major version
0x0005    1       Minor version
0x0006    1       Patch version
0x0007    1       Flags
0x0008    4       Constant pool offset
0x000C    4       Constant pool size (entry count)
0x0010    4       Code section offset
0x0014    4       Code section size (bytes)
0x0018    4       Debug info offset (0 if absent)
0x001C    4       Debug info size (bytes)
0x0020    4       Entity schema offset
0x0024    4       Entity schema size
0x0028    4       View section offset
0x002C    4       View section size
0x0030    16      Reserved (zeroed)

The header is exactly 64 bytes -- 0x40 in hexadecimal. This alignment is deliberate. The constant pool begins at offset 0x40, making it easy to locate by inspection when examining a hex dump.

The Rust representation mirrors this layout:

#[repr(C)]
pub struct FlincHeader {
    pub magic: [u8; 4],           // "FLIN"
    pub version_major: u8,
    pub version_minor: u8,
    pub version_patch: u8,
    pub flags: u8,
    pub const_pool_offset: u32,
    pub const_pool_count: u32,
    pub code_offset: u32,
    pub code_size: u32,
    pub debug_offset: u32,
    pub debug_size: u32,
    pub entity_offset: u32,
    pub entity_size: u32,
    pub view_offset: u32,
    pub view_size: u32,
    pub reserved: [u8; 16],
}

The magic number serves two purposes: file type identification (so tools can quickly reject non-FLIN files) and byte order detection. The bytes 0x46 0x4C 0x49 0x4E spell "FLIN" in ASCII -- if you open a .flinc file in a hex editor, the first four characters immediately tell you what you are looking at.

The flags byte encodes metadata about the compilation:

pub mod Flags {
    pub const DEBUG_INFO: u8      = 0b0000_0001;  // Has debug information
    pub const HAS_VIEWS: u8       = 0b0000_0010;  // Has view definitions
    pub const HAS_ENTITIES: u8    = 0b0000_0100;  // Has entity schemas
    pub const HAS_ROUTES: u8      = 0b0000_1000;  // Has HTTP routes
    pub const OPTIMIZED: u8       = 0b0001_0000;  // Optimizations applied
    pub const WASM_TARGET: u8     = 0b0010_0000;  // Built for WASM
}

A typical development build has flags 0x03 (debug info and views). A production build might have 0x16 (views, entities, optimized). The VM checks these flags at load time to determine which sections are present and how to initialize its subsystems.
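As a quick check of those two values, here is a minimal sketch that turns a flags byte back into flag names. The bit values are the ones listed above; the helper `describe_flags` is a hypothetical name used only for illustration, not part of the FLIN codebase.

```rust
// Flag bits as listed in the table above.
pub const DEBUG_INFO: u8 = 0b0000_0001;
pub const HAS_VIEWS: u8 = 0b0000_0010;
pub const HAS_ENTITIES: u8 = 0b0000_0100;
pub const HAS_ROUTES: u8 = 0b0000_1000;
pub const OPTIMIZED: u8 = 0b0001_0000;
pub const WASM_TARGET: u8 = 0b0010_0000;

/// Hypothetical helper: list the names of the flags set in a header byte,
/// from the lowest bit to the highest.
pub fn describe_flags(flags: u8) -> Vec<&'static str> {
    const NAMES: [(u8, &str); 6] = [
        (DEBUG_INFO, "DEBUG_INFO"),
        (HAS_VIEWS, "HAS_VIEWS"),
        (HAS_ENTITIES, "HAS_ENTITIES"),
        (HAS_ROUTES, "HAS_ROUTES"),
        (OPTIMIZED, "OPTIMIZED"),
        (WASM_TARGET, "WASM_TARGET"),
    ];
    let mut set = Vec::new();
    for (bit, name) in NAMES {
        if (flags & bit) != 0 {
            set.push(name);
        }
    }
    set
}
```

Calling `describe_flags(0x03)` yields DEBUG_INFO and HAS_VIEWS, and `describe_flags(0x16)` yields HAS_VIEWS, HAS_ENTITIES, and OPTIMIZED, matching the two builds described above.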

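All multi-byte integers in the header -- and throughout the entire format -- are stored in little-endian byte order. The one exception is the magic number, which is written as the literal byte sequence 46 4C 49 4E so that it reads "FLIN" in a hex dump.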
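To show how little work the fixed layout asks of a loader, here is a rough sketch of reading a few header fields from a byte slice. The field offsets come from the table above; `HeaderInfo`, `read_u32_le`, and `parse_header` are hypothetical names for this illustration, and the real loader may be structured differently.

```rust
/// A few fields pulled from the fixed 64-byte header (illustrative subset).
#[derive(Debug, PartialEq)]
pub struct HeaderInfo {
    pub version: (u8, u8, u8),
    pub flags: u8,
    pub const_pool_offset: u32,
    pub const_pool_count: u32,
    pub code_offset: u32,
    pub code_size: u32,
}

/// Read a little-endian u32 at `at`. The caller guarantees `at + 4 <= bytes.len()`.
fn read_u32_le(bytes: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]])
}

/// Hypothetical loader step: check length and magic, then read fields at fixed offsets.
pub fn parse_header(bytes: &[u8]) -> Result<HeaderInfo, String> {
    if bytes.len() < 64 {
        return Err(format!("header needs 64 bytes, got {}", bytes.len()));
    }
    if !bytes.starts_with(b"FLIN") {
        return Err("bad magic: not a .flinc file".to_string());
    }
    Ok(HeaderInfo {
        version: (bytes[4], bytes[5], bytes[6]),
        flags: bytes[7],
        const_pool_offset: read_u32_le(bytes, 0x08),
        const_pool_count: read_u32_le(bytes, 0x0C),
        code_offset: read_u32_le(bytes, 0x10),
        code_size: read_u32_le(bytes, 0x14),
    })
}
```

The remaining section fields (debug, entity schema, view) would be read the same way at offsets 0x18 through 0x2C.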

The Constant Pool

The constant pool sits immediately after the header and stores every value that cannot be encoded directly into an instruction operand. Each entry is tagged with a one-byte type identifier followed by the value data:

Tag 0x00 (Null):     [0x00]                          -- 1 byte
Tag 0x01 (Bool):     [0x01] [val]                     -- 2 bytes
Tag 0x02 (Int):      [0x02] [i64, little-endian]      -- 9 bytes
Tag 0x03 (Float):    [0x03] [f64, little-endian]      -- 9 bytes
Tag 0x04 (String):   [0x04] [len: u32] [UTF-8 bytes]  -- 5 + len bytes
Tag 0x05 (Identifier): [0x05] [len: u16] [UTF-8 bytes] -- 3 + len bytes
Tag 0x06 (EntityName): [0x06] [len: u16] [UTF-8 bytes] -- 3 + len bytes
Tag 0x07 (Function): [0x07] [arity: u8] [addr: u16] [name_idx: u16] -- 6 bytes
Tag 0x08 (Time):     [0x08] [timestamp: i64]          -- 9 bytes
Tag 0x09 (Money):    [0x09] [amount: i64] [currency: u8] -- 10 bytes

The distinction between String (tag 0x04) and Identifier (tag 0x05) is subtle but critical. Strings use a u32 length prefix because they can be arbitrarily long -- a user might write a multi-kilobyte string literal. Identifiers use a u16 length prefix because variable names are short by convention, and saving two bytes per identifier adds up when a program has hundreds of variable references.

The Money type (tag 0x09) deserves special mention. FLIN is designed for applications in Cote d'Ivoire and across West Africa, where currency handling is a first-class concern. The constant pool encodes monetary values as a 64-bit integer amount (in the smallest unit of the currency -- centimes for XOF, cents for USD) plus a one-byte currency code. No floating-point representation, no rounding errors, no "you owe 0.30000000000000004 dollars" bugs.

Here is how a simple program looks in the constant pool:

Source: count = 42, name = "Juste"

Constant Pool:
  [0] Int(42)             -> 02 2A 00 00 00 00 00 00 00
  [1] String("Juste")     -> 04 05 00 00 00 4A 75 73 74 65
  [2] Identifier("count") -> 05 05 00 63 6F 75 6E 74
  [3] Identifier("name")  -> 05 04 00 6E 61 6D 65

The code generator deduplicates constants at compile time. If count appears in ten instructions, the constant pool contains one Identifier("count") entry, and all ten instructions reference index 2.
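The encoding and the deduplication can be sketched together. This is an illustration under assumptions, not the actual code generator: it covers only three of the ten tags, the names `Constant`, `ConstantPool`, and `intern` are hypothetical, and the lookup is a linear scan for brevity.

```rust
/// A subset of the constant kinds from the tag table above.
#[derive(Debug, Clone, PartialEq)]
pub enum Constant {
    Int(i64),
    Str(String),
    Identifier(String),
}

impl Constant {
    /// Encode as one tag byte followed by a little-endian payload.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        match self {
            Constant::Int(v) => {
                out.push(0x02);
                out.extend_from_slice(&v.to_le_bytes());
            }
            Constant::Str(s) => {
                out.push(0x04);
                // Strings carry a u32 length prefix.
                out.extend_from_slice(&(s.len() as u32).to_le_bytes());
                out.extend_from_slice(s.as_bytes());
            }
            Constant::Identifier(s) => {
                out.push(0x05);
                // Identifiers carry a u16 length prefix.
                out.extend_from_slice(&(s.len() as u16).to_le_bytes());
                out.extend_from_slice(s.as_bytes());
            }
        }
        out
    }
}

/// Hypothetical pool that hands out one index per distinct constant.
#[derive(Default)]
pub struct ConstantPool {
    entries: Vec<Constant>,
}

impl ConstantPool {
    /// Return the index of `c`, appending it only if it is not already present.
    pub fn intern(&mut self, c: Constant) -> u16 {
        if let Some(i) = self.entries.iter().position(|e| *e == c) {
            return i as u16;
        }
        self.entries.push(c);
        (self.entries.len() - 1) as u16
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Interning Int(42), String("Juste"), Identifier("count"), and Identifier("name") in that order gives indices 0 through 3, and interning Identifier("count") again returns 2 without growing the pool.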

Instruction Encoding

FLIN bytecode uses variable-length instructions. Each instruction begins with a one-byte opcode, optionally followed by operands. There are five instruction formats:

Format 0 -- No operands (1 byte):
  [opcode]
  Examples: Add, Sub, Mul, Div, Pop, Dup, Return, Halt

Format 1 -- One u8 operand (2 bytes):
  [opcode] [u8]
  Examples: LoadLocal, StoreLocal (slots 0-255)

Format 2 -- One u16 operand (3 bytes):
  [opcode] [u16 low] [u16 high]
  Examples: LoadConst, LoadGlobal, Jump, JumpIfFalse

Format 3 -- One u32 operand (5 bytes):
  [opcode] [u32 byte0] [u32 byte1] [u32 byte2] [u32 byte3]
  Examples: JumpFar, CallNative

Format 4 -- Two u8 operands (3 bytes):
  [opcode] [u8] [u8]
  Examples: Call (arity, const_idx)

The variable-length encoding is a conscious trade-off. Fixed-length instructions (like ARM's 4-byte instructions) simplify the decoder and enable random access into the instruction stream. Variable-length instructions (like x86 or JVM bytecode) produce more compact output at the cost of sequential decoding.

We chose variable-length for two reasons. First, FLIN programs are small -- a typical application fits in a few kilobytes of bytecode, and saving bytes matters when the entire program is loaded into memory at startup. Second, the most common instructions are the shortest. Add, Sub, Pop, Dup, LoadInt0, LoadTrue are all single-byte instructions. In a typical FLIN program, over 60% of instructions are Format 0 (one byte). Even if every remaining instruction used a three-byte format, the average would be 0.6 × 1 + 0.4 × 3 = 1.8 bytes, which keeps the average instruction length under two bytes.
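The five formats map naturally onto a small enum. The following is a sketch of an encoder under that assumption; `Operand` and `encode_instruction` are hypothetical names, and the only opcode values used in the note below are ones that appear in the worked example later in this article.

```rust
/// Operand shapes for the five instruction formats described above.
#[derive(Debug, Clone, Copy)]
pub enum Operand {
    Empty,          // Format 0: opcode only
    U8(u8),         // Format 1: one u8
    U16(u16),       // Format 2: one u16, little-endian
    U32(u32),       // Format 3: one u32, little-endian
    U8Pair(u8, u8), // Format 4: two u8 values
}

/// Append one instruction to `out`: the opcode byte, then its operand bytes.
pub fn encode_instruction(opcode: u8, operand: Operand, out: &mut Vec<u8>) {
    out.push(opcode);
    match operand {
        Operand::Empty => {}
        Operand::U8(v) => out.push(v),
        Operand::U16(v) => out.extend_from_slice(&v.to_le_bytes()),
        Operand::U32(v) => out.extend_from_slice(&v.to_le_bytes()),
        Operand::U8Pair(a, b) => {
            out.push(a);
            out.push(b);
        }
    }
}
```

Encoding opcode 0x10 (LoadConst) with `Operand::U16(0)` appends three bytes, 10 00 00, while opcode 0x12 (Dup) with `Operand::Empty` appends a single byte.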

The Opcode Space

The 256 possible opcodes are divided into 16 ranges of 16 opcodes each:

0x00 - 0x0F : Control flow     (Halt, Jump, JumpIfFalse, Call, Return)
0x10 - 0x1F : Stack operations  (LoadConst, Pop, Dup, LoadNone, LoadTrue)
0x20 - 0x2F : Local variables   (LoadLocal, StoreLocal, IncrLocal)
0x30 - 0x3F : Global variables  (LoadGlobal, StoreGlobal)
0x40 - 0x4F : Arithmetic        (Add, Sub, Mul, Div, Neg, Incr)
0x50 - 0x5F : Comparison        (Eq, NotEq, Lt, Gt, IsNone)
0x60 - 0x6F : Logic             (And, Or, Not, BitAnd, ShiftLeft)
0x70 - 0x7F : Objects and fields (CreateObject, GetField, SetField)
0x80 - 0x8F : Lists and maps    (CreateList, GetIndex, MapGet)
0x90 - 0x9F : Entity operations  (Save, Delete, QueryAll, QueryFind)
0xA0 - 0xAF : View operations   (CreateElement, BindText, CreateHandler)
0xB0 - 0xBF : Intent operations  (Ask, Search, Embed)
0xC0 - 0xCF : Temporal operations (AtVersion, AtTime, History, LoadNow)
0xD0 - 0xDF : Built-in functions  (Print, ToString, Len, Split, Trim)
0xE0 - 0xEF : Reserved for extensions
0xF0 - 0xFF : Debug and special   (DebugBreak, SourceLoc, Trace)

This layout reveals the character of the language. Ranges 0x00 through 0x8F are standard fare -- any stack-based bytecode has control flow, stack operations, variables, arithmetic, and data structures. But ranges 0x90 through 0xCF are unique to FLIN. Entity operations, view operations, intent (AI) operations, and temporal operations are first-class instruction categories, each with their own 16-opcode range.

Most importantly, range 0xE0-0xEF is reserved. This gives us 16 opcodes for future language features without breaking binary compatibility. If we add pattern matching, concurrency primitives, or new data types, they have a home.
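Because every range is exactly 16 opcodes wide, the category of an opcode is simply its high nibble. A small sketch, with `opcode_category` as a hypothetical helper name:

```rust
/// Map an opcode to its category using the high nibble (opcode >> 4),
/// following the range table above.
pub fn opcode_category(opcode: u8) -> &'static str {
    match opcode >> 4 {
        0x0 => "Control flow",
        0x1 => "Stack operations",
        0x2 => "Local variables",
        0x3 => "Global variables",
        0x4 => "Arithmetic",
        0x5 => "Comparison",
        0x6 => "Logic",
        0x7 => "Objects and fields",
        0x8 => "Lists and maps",
        0x9 => "Entity operations",
        0xA => "View operations",
        0xB => "Intent operations",
        0xC => "Temporal operations",
        0xD => "Built-in functions",
        0xE => "Reserved for extensions",
        _ => "Debug and special", // 0xF
    }
}
```

A disassembler or tracing tool could use a lookup like this to label instructions without knowing every individual opcode.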

View Instructions in Detail

The view instruction range (0xA0-0xAF) implements FLIN's reactive UI system at the bytecode level:

Opcode	Mnemonic	Description
0xA0	`CreateElement`	Create a DOM element and push it onto the element stack
0xA1	`CloseElement`	Pop the current element from the element stack
0xA2	`SetAttribute`	Set a static attribute on the current element
0xA3	`BindText`	Bind a reactive text node to the current element
0xA4	`BindAttr`	Bind a reactive attribute
0xA5	`CreateHandler`	Begin an event handler block
0xA6	`EndHandler`	End an event handler block
0xA7	`BindHandler`	Attach the handler to the current element
0xA8	`TriggerUpdate`	Signal that reactive state has changed
0xA9	`StartIf`	Conditional rendering block start
0xAA	`EndIf`	Conditional rendering block end
0xAB	`StartFor`	List rendering block start
0xAC	`NextFor`	Advance to next iteration
0xAD	`EndFor`	List rendering block end
0xAE	`AddText`	Add static text content
0xAF	`SelfClose`	Self-closing element

The CreateHandler/EndHandler/BindHandler triad is the mechanism for event handling. CreateHandler marks the start of a handler body -- the VM captures the instructions between CreateHandler and EndHandler as a callable unit, similar to a closure. BindHandler attaches this captured handler to the current element for the specified event.

TriggerUpdate (0xA8) is the bridge between imperative code and reactive rendering. When a click handler increments a counter, TriggerUpdate tells the VM's reactivity system to re-evaluate all bindings that depend on the changed value. This single instruction replaces the entire "virtual DOM diff" approach used by frameworks like React -- FLIN knows exactly which bindings need updating because the dependencies are recorded at the bytecode level.

Temporal and Intent Instructions

The temporal range (0xC0-0xCF) implements FLIN's time-travel capability:

AtVersion (0xC0): entity, version -> entity    -- entity @ -1
AtTime    (0xC1): entity -> entity              -- entity @ yesterday
AtDate    (0xC2): entity, date_str -> entity    -- entity @ "2024-01-01"
History   (0xC3): entity -> list                -- entity.history
LoadNow   (0xC4): -> time                       -- current timestamp
LoadToday (0xC5): -> time                       -- start of today

AtTime uses a single-byte time code operand to encode named time references:

0x01: now        0x04: tomorrow
0x02: today      0x05: last_week
0x03: yesterday  0x06: last_month
                 0x07: last_year

This means user @ yesterday compiles to two instructions: LoadGlobal (to load the user) and AtTime 0x03 (to request the yesterday version). The VM resolves this by querying FlinDB's temporal index -- every entity mutation is versioned, and the @ operator retrieves historical versions without the programmer writing a single line of query logic.
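As a sketch of that two-instruction sequence: the time codes and the AtTime opcode (0xC1) come from the tables above, and LoadGlobal's value (0x30) and u16 operand come from the worked example and Format 2. The names `TimeCode`, `from_keyword`, and `emit_at_time` are hypothetical, and the real code generator may resolve the global differently.

```rust
pub const LOAD_GLOBAL: u8 = 0x30; // value taken from the worked example in this article
pub const AT_TIME: u8 = 0xC1;

/// Named time references and their single-byte codes.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
pub enum TimeCode {
    Now = 0x01,
    Today = 0x02,
    Yesterday = 0x03,
    Tomorrow = 0x04,
    LastWeek = 0x05,
    LastMonth = 0x06,
    LastYear = 0x07,
}

impl TimeCode {
    /// Map a source keyword to its time code, if it is one of the named references.
    pub fn from_keyword(word: &str) -> Option<TimeCode> {
        match word {
            "now" => Some(TimeCode::Now),
            "today" => Some(TimeCode::Today),
            "yesterday" => Some(TimeCode::Yesterday),
            "tomorrow" => Some(TimeCode::Tomorrow),
            "last_week" => Some(TimeCode::LastWeek),
            "last_month" => Some(TimeCode::LastMonth),
            "last_year" => Some(TimeCode::LastYear),
            _ => None,
        }
    }
}

/// Emit `LoadGlobal <idx>` followed by `AtTime <code>` for `<global> @ <keyword>`.
pub fn emit_at_time(global_idx: u16, keyword: &str) -> Option<Vec<u8>> {
    let code = TimeCode::from_keyword(keyword)?;
    let mut out = vec![LOAD_GLOBAL];
    out.extend_from_slice(&global_idx.to_le_bytes());
    out.push(AT_TIME);
    out.push(code as u8);
    Some(out)
}
```

With the global at constant index 1, `emit_at_time(1, "yesterday")` produces five bytes: 30 01 00 C1 03.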

The intent range (0xB0-0xBF) handles AI-powered operations:

Ask         (0xB0): query -> list               -- ask "users from last week"
Search      (0xB1): query, limit -> list        -- search "chair" in Products
SearchMulti (0xB2): query, limit, fields -> list -- multi-field search
Embed       (0xB3): text -> embedding           -- vector embedding

These opcodes defer to the runtime's AI subsystem, which communicates with embedding models and vector search indexes. From the bytecode perspective, they are ordinary instructions that consume stack values and produce results. The complexity is hidden behind the opcode boundary.

Debug Information

Development builds include a debug section that maps bytecode offsets back to source locations:

pub struct LineTable {
    entries: Vec<LineEntry>,
}

pub struct LineEntry {
    pub bytecode_offset: u32,
    pub source_line: u32,
    pub source_column: u16,
}

pub struct LocalVarTable {
    entries: Vec<LocalVar>,
}

pub struct LocalVar {
    pub name_idx: u16,       // Index into constant pool
    pub slot: u8,            // Stack slot
    pub start_offset: u32,   // Scope start (bytecode offset)
    pub end_offset: u32,     // Scope end (bytecode offset)
    pub type_info: u8,       // Type tag
}

The line table enables error messages that point to the correct source line when a runtime error occurs. The local variable table allows the debugger to display variable names and values when stopped at a breakpoint, even though the bytecode itself only uses numeric slot indices.
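One plausible way to use the line table at runtime is a binary search for the last entry at or before the failing bytecode offset. The sketch below assumes the entries are sorted by `bytecode_offset`, which the article does not state; `lookup_line` is a hypothetical helper name.

```rust
/// Same shape as the `LineEntry` struct above, with derives added for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LineEntry {
    pub bytecode_offset: u32,
    pub source_line: u32,
    pub source_column: u16,
}

/// Find the entry covering `pc`: the last entry whose offset is <= pc.
/// Assumes `entries` is sorted by `bytecode_offset` in ascending order.
pub fn lookup_line(entries: &[LineEntry], pc: u32) -> Option<LineEntry> {
    // partition_point returns how many leading entries satisfy the predicate.
    let n = entries.partition_point(|e| e.bytecode_offset <= pc);
    if n == 0 {
        None
    } else {
        Some(entries[n - 1])
    }
}
```

If a runtime error occurs at an offset that falls between two entries, the lookup returns the earlier one, which is the source position that produced the instruction.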

The debug section is optional. Production builds omit it entirely -- the DEBUG_INFO flag in the header is cleared, and the debug_offset and debug_size fields are zero. This keeps production bytecode minimal.

A Complete Binary Example

To make all of this concrete, here is the complete .flinc output for the counter example:

count = 0
<button click={count++}>{count}</button>

In hexadecimal:

; Header (64 bytes)
464C 494E           ; magic: "FLIN"
00 01 00            ; version: 0.1.0
03                  ; flags: DEBUG_INFO | HAS_VIEWS
4000 0000           ; const pool offset: 0x40
0400 0000           ; const pool count: 4
6200 0000           ; code offset: 0x62 (0x40 + 34 bytes of constants)
1D00 0000           ; code size: 29 bytes
...                 ; remaining header fields

; Constant Pool (at 0x40, 34 bytes)
02 0000 0000 0000 0000     ; [0] Int(0)
05 0500 636F 756E 74       ; [1] Identifier("count")
05 0600 6275 7474 6F6E     ; [2] Identifier("button")
05 0500 636C 6963 6B       ; [3] Identifier("click")

; Code Section (at 0x62)
10 0000             ; LoadConst 0
31 0100             ; StoreGlobal 1
A0 0200             ; CreateElement 2
A5 0300             ; CreateHandler 3
30 0100             ; LoadGlobal 1
12                  ; Dup
46                  ; Incr
31 0100             ; StoreGlobal 1
A8                  ; TriggerUpdate
A6                  ; EndHandler
A7                  ; BindHandler
30 0100             ; LoadGlobal 1
A3                  ; BindText
A1                  ; CloseElement
00                  ; Halt

Twenty-nine bytes of code. Four constants. A reactive counter application with click handling, state management, and automatic UI updates. Header, constant pool, and code together come to 64 + 34 + 29 = 127 bytes, so the compiled output fits in less than 150 bytes before any debug section is counted.
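The byte count is easy to verify by walking the code section. Here is a sketch of a tiny disassembler that knows only the opcodes used in this example; operand widths are inferred from the hex listing above, and `operand_len`, `disassemble`, and `COUNTER_CODE` are hypothetical names for this illustration.

```rust
/// Mnemonic and operand byte count for the opcodes used in the counter example.
/// This is not the full FLIN opcode map.
fn operand_len(opcode: u8) -> Option<(&'static str, usize)> {
    Some(match opcode {
        0x00 => ("Halt", 0),
        0x10 => ("LoadConst", 2),
        0x12 => ("Dup", 0),
        0x30 => ("LoadGlobal", 2),
        0x31 => ("StoreGlobal", 2),
        0x46 => ("Incr", 0),
        0xA0 => ("CreateElement", 2),
        0xA1 => ("CloseElement", 0),
        0xA3 => ("BindText", 0),
        0xA5 => ("CreateHandler", 2),
        0xA6 => ("EndHandler", 0),
        0xA7 => ("BindHandler", 0),
        0xA8 => ("TriggerUpdate", 0),
        _ => return None,
    })
}

/// Decode the instruction stream sequentially into one line of text per instruction.
pub fn disassemble(code: &[u8]) -> Result<Vec<String>, String> {
    let mut out = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let (name, n) = operand_len(code[pc])
            .ok_or_else(|| format!("unknown opcode 0x{:02X} at {}", code[pc], pc))?;
        let operands = code
            .get(pc + 1..pc + 1 + n)
            .ok_or_else(|| format!("truncated {} at {}", name, pc))?;
        let text = match operands {
            [lo, hi] => format!("{} {}", name, u16::from_le_bytes([*lo, *hi])),
            _ => name.to_string(),
        };
        out.push(text);
        pc += 1 + n;
    }
    Ok(out)
}

/// The code section from the listing above, byte for byte.
pub const COUNTER_CODE: [u8; 29] = [
    0x10, 0x00, 0x00, 0x31, 0x01, 0x00, 0xA0, 0x02, 0x00, 0xA5, 0x03, 0x00,
    0x30, 0x01, 0x00, 0x12, 0x46, 0x31, 0x01, 0x00, 0xA8, 0xA6, 0xA7,
    0x30, 0x01, 0x00, 0xA3, 0xA1, 0x00,
];
```

Running `disassemble(&COUNTER_CODE)` walks all 29 bytes and returns 15 instructions, from "LoadConst 0" through "Halt".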

This is what it means to design a bytecode format for a specific language instead of targeting a general-purpose VM. Every instruction is relevant. There is no impedance mismatch between the language's semantics and the VM's capabilities. The bytecode is a direct, compact encoding of the programmer's intent.

Design Decisions and Their Consequences

Three decisions shaped this format and will have lasting consequences.

Stack-based rather than register-based. Register-based VMs (like Lua 5 or Dalvik) can be faster because they avoid redundant push/pop sequences. Stack-based VMs (like the JVM or CPython) produce simpler, more compact bytecode and are easier to generate code for. We chose stack-based because the code generator was written in a single session, and simplicity of code generation was more valuable than a 10-15% runtime performance improvement that we could pursue later.

Domain-specific opcodes rather than library calls. Making Save, QueryAll, CreateElement, AtTime, and Ask into opcodes rather than function calls means the VM can dispatch them with a single match arm rather than a call frame setup. It also means the bytecode is self-documenting -- a disassembly of a FLIN program reads like a description of what the program does, not like a stream of generic operations interspersed with library calls.

Fixed 64-byte header rather than a flexible container format. Formats like ELF or Mach-O use complex section tables that allow arbitrary sections. We used a fixed header with known section offsets. This means adding a new section type requires a format version bump. But it also means that loading a .flinc file is a single 64-byte read followed by direct offset lookups -- no parsing of section tables, no variable-length headers, no ambiguity.
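To make the first of these decisions concrete, here is a toy stack interpreter for the handler body from the counter example (LoadGlobal, Dup, Incr, StoreGlobal). It is a sketch under several assumptions that the article does not confirm: globals are keyed by constant-pool index, values are plain integers, and StoreGlobal pops the value it stores. The function `run` is a hypothetical name.

```rust
use std::collections::HashMap;

// Opcode values taken from the counter example.
const LOAD_GLOBAL: u8 = 0x30;
const STORE_GLOBAL: u8 = 0x31;
const DUP: u8 = 0x12;
const INCR: u8 = 0x46;

/// Run a straight-line snippet and return the final operand stack.
/// Globals are keyed by constant-pool index and hold integers only;
/// both are simplifications for this sketch.
pub fn run(code: &[u8], globals: &mut HashMap<u16, i64>) -> Result<Vec<i64>, String> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        pc += 1;
        match op {
            LOAD_GLOBAL | STORE_GLOBAL => {
                // Both instructions carry a little-endian u16 operand.
                let bytes = code.get(pc..pc + 2).ok_or("truncated operand")?;
                let idx = u16::from_le_bytes([bytes[0], bytes[1]]);
                pc += 2;
                if op == LOAD_GLOBAL {
                    let v = *globals.get(&idx).ok_or("undefined global")?;
                    stack.push(v);
                } else {
                    // Assumption: StoreGlobal pops the value it stores.
                    let v = stack.pop().ok_or("stack underflow")?;
                    globals.insert(idx, v);
                }
            }
            DUP => {
                let top = *stack.last().ok_or("stack underflow")?;
                stack.push(top);
            }
            INCR => {
                let top = stack.last_mut().ok_or("stack underflow")?;
                *top += 1;
            }
            other => return Err(format!("unsupported opcode 0x{:02X}", other)),
        }
    }
    Ok(stack)
}
```

Starting with global 1 set to 0, the sequence 30 01 00 12 46 31 01 00 leaves global 1 at 1 and the old value 0 on the stack, which is what a postfix `count++` would be expected to produce. Each opcode is one match arm with no operand registers to allocate, which is the code-generation simplicity the stack-based choice buys.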

This is Part 17 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO built a programming language compiler in sessions measured in minutes, not months.

Next in the series: The sprint that built the entire compiler -- ten sessions, two days, from zero to a working lexer, parser, type checker, code generator, and virtual machine.
