
Semantic Search and Vector Storage

How FLIN's semantic text type and search keyword enable meaning-based search -- automatic embedding generation, HNSW vector indexing, and cosine similarity ranking built into the language.

Thales & Claude | March 25, 2026 | 8 min read

Tags: flin, semantic-search, vectors, embeddings

Traditional text search matches words. You search for "chair" and find documents containing the word "chair." But you miss "seat," "stool," "armchair," and "seating" -- different words that mean the same thing. Keyword search is fast and predictable, but it does not understand meaning.

Semantic search matches meaning. You search for "comfortable seating for office" and find products described as "ergonomic desk chair," "adjustable swivel stool," and "lumbar support office seat" -- even though none of these descriptions contain the words "comfortable" or "seating." The search understands concepts, not just characters.

FLIN builds semantic search into the type system. A field declared as semantic text is automatically embedded as a vector, indexed for fast similarity search, and queryable with the search keyword. No Pinecone. No Elasticsearch. No vector database to configure and maintain.

The semantic text Type

The semantic modifier on a text field tells FLIN to generate embeddings automatically:

entity Product {
    name: text                    // Regular text -- exact match
    description: semantic text    // Semantic -- meaning-based search
    sku: text                     // Regular text -- identifiers
}

When a semantic text field is saved, FLIN:

  1. Stores the original text in FlinDB.
  2. Generates a vector embedding using the configured AI model.
  3. Indexes the embedding in an HNSW (Hierarchical Navigable Small World) index.
  4. Associates the embedding with the entity instance.

All four steps happen atomically on save. The developer writes save product and the embedding is generated. There is no separate indexing step, no batch job, no sync process.
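The four save steps can be condensed into a small sketch. The `Store`, `Record`, and `embed` names below are hypothetical stand-ins -- the article does not show FLIN's runtime types -- and the toy letter-frequency embedder exists only to keep the sketch runnable:

```rust
use std::collections::HashMap;

struct Record { id: u64, text: String }

struct Store {
    rows: HashMap<u64, String>,       // step 1: original text
    vectors: HashMap<u64, Vec<f32>>,  // steps 3-4: embedding keyed by entity id
}

// Stand-in for the configured AI model (step 2).
fn embed(text: &str) -> Vec<f32> {
    let mut v = vec![0.0f32; 26];
    for b in text.bytes().filter(|b| b.is_ascii_lowercase()) {
        v[(b - b'a') as usize] += 1.0;
    }
    v
}

impl Store {
    // All steps happen in one call, mirroring the atomic `save`.
    fn save(&mut self, rec: &Record) {
        self.rows.insert(rec.id, rec.text.clone()); // store the text
        let vector = embed(&rec.text);              // generate the embedding
        self.vectors.insert(rec.id, vector);        // index it under the entity id
    }
}
```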

The search Keyword

The search keyword performs semantic similarity search:

results = search "comfortable seating for office"
          in Product
          by description
          limit 10

The syntax is: search "query" in Entity by field [limit N]

The query string is embedded using the same model as the field, then compared against all stored embeddings using cosine similarity. Results are ranked by similarity score and the top N are returned as typed FLIN entities.
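The query path can be sketched as a brute-force scan over hypothetical stored vectors. FLIN's engine swaps the scan for an HNSW index, but the cosine scoring and top-N ranking are the same idea:

```rust
// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

// Score every stored vector against the query embedding, return the top N ids.
fn top_n(query: &[f32], stored: &[(u64, Vec<f32>)], n: usize) -> Vec<(u64, f32)> {
    let mut scored: Vec<(u64, f32)> = stored
        .iter()
        .map(|(id, v)| (*id, cosine(query, v)))
        .collect();
    // Rank by similarity, highest first, and keep the top N.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(n);
    scored
}
```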

entity Article {
    title: text
    content: semantic text
    summary: semantic text
}

// Search by different semantic fields
by_content = search "machine learning tutorials" in Article by content
by_summary = search "AI beginner guide" in Article by summary

An entity can have multiple semantic fields, each with its own index. Searching by content compares against the full article text. Searching by summary compares against the shorter summaries. The choice depends on the use case -- full-content search for accuracy, summary search for speed.

How Embeddings Work

An embedding is a vector of floating-point numbers (typically 384 to 1536 dimensions) that represents the meaning of a text. Similar texts produce similar vectors. The distance between vectors correlates with semantic distance.

"comfortable office chair"    -> [0.12, -0.45, 0.78, 0.33, ...]
"ergonomic desk seating"      -> [0.11, -0.42, 0.76, 0.35, ...]
"kitchen table with drawers"  -> [-0.55, 0.23, 0.09, -0.41, ...]

The first two vectors are close together (similar meaning). The third is far away (different concept). Cosine similarity quantifies this closeness, where scores near 1 mean near-identical meaning and scores near 0 mean unrelated:

similarity("comfortable office chair", "ergonomic desk seating") = 0.92
similarity("comfortable office chair", "kitchen table with drawers") = 0.23

The HNSW Index

Brute-force similarity search compares the query vector against every stored vector. This is O(n) and becomes unusable at scale. FLIN uses HNSW (Hierarchical Navigable Small World) for approximate nearest neighbor search:

pub struct HnswIndex {
    layers: Vec<Vec<Node>>,
    entry_point: usize,
    max_connections: usize,
    ef_construction: usize,
}

impl HnswIndex {
    pub fn search(
        &self,
        query: &[f32],
        k: usize,
        ef_search: usize,
    ) -> Vec<(usize, f32)> {
        // Start from top layer, descend through layers
        let mut current = self.entry_point;

        for layer in (1..self.layers.len()).rev() {
            current = self.greedy_search(query, current, layer);
        }

        // Beam search in bottom layer with beam width ef_search
        self.beam_search(query, current, 0, ef_search, k)
    }
}

HNSW properties:

  • Query time: O(log n) average
  • Insert time: O(log n) average
  • Memory: O(n * d), where d is the embedding dimension
  • Accuracy: >95% recall at search speeds of millions per second

For a database of 100,000 products, a semantic search query completes in under 5 milliseconds.

Embedding Generation

Embeddings are generated using the configured AI model. FLIN supports multiple embedding providers:

// flin.config
{
    "ai": {
        "provider": "anthropic",
        "model": "claude-3-haiku",
        "embedding_model": "text-embedding-3-small"
    }
}

The embedding generation happens in the FLIN runtime:

use serde_json::json;

pub async fn generate_embedding(
    text: &str,
    model: &str,
    api_key: &str,
) -> Result<Vec<f32>, EmbeddingError> {
    let response = reqwest::Client::new()
        .post("https://api.openai.com/v1/embeddings")
        .bearer_auth(api_key)
        .json(&json!({
            "input": text,
            "model": model
        }))
        .send()
        .await?
        .error_for_status()?; // surface HTTP-level failures as errors

    let data: EmbeddingResponse = response.json().await?;
    Ok(data.data[0].embedding.clone())
}

For offline or low-latency use cases, FLIN supports local embedding generation through FastEmbed (covered in article 119).

Automatic Embedding on Save

When an entity with a semantic text field is saved, the embedding is generated automatically:

product = Product {
    name: "Ergonomic Office Chair",
    description: "A comfortable chair designed for long work sessions with adjustable lumbar support, breathable mesh back, and 360-degree swivel base.",
    sku: "CHAIR-ERG-001"
}

save product  // Embedding generated here for description field

If the description is updated, the embedding is regenerated:

product = Product.find(42)
product.description = "Updated description with new features..."
save product  // Embedding regenerated

The old embedding is replaced atomically. There is no stale index problem.
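A minimal sketch of why the replacement leaves no stale entry, assuming the vector index is keyed by entity id (an assumption; the article does not specify the key layout):

```rust
use std::collections::HashMap;

// Re-saving writes the new vector at the same key, so the index never
// accumulates a second, stale entry for the same entity.
fn upsert(index: &mut HashMap<u64, Vec<f32>>, id: u64, vector: Vec<f32>) {
    index.insert(id, vector); // HashMap::insert replaces any existing value
}
```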

Multiple Semantic Fields

Entities can have multiple semantic fields, each independently searchable:

entity Job {
    title: text
    description: semantic text
    requirements: semantic text
    benefits: semantic text
}

// Search different aspects of the same entity
by_desc = search "remote engineering position" in Job by description
by_reqs = search "python and kubernetes experience" in Job by requirements
by_perks = search "health insurance and remote work" in Job by benefits

Each semantic field has its own HNSW index. The indices are independent -- updating the description does not affect the requirements index.
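One plausible layout for per-field indices is a map from field name to its own vector index, so an update touches only that field's index. This is a sketch under that assumption, not FLIN's actual data structure:

```rust
use std::collections::HashMap;

// Hypothetical: one independent vector index per semantic field,
// looked up by field name at save and query time.
struct SemanticIndexes {
    by_field: HashMap<String, HashMap<u64, Vec<f32>>>,
}

impl SemanticIndexes {
    fn new() -> Self {
        SemanticIndexes { by_field: HashMap::new() }
    }

    // Insert or update one entity's vector in one field's index only.
    fn upsert(&mut self, field: &str, id: u64, vector: Vec<f32>) {
        self.by_field.entry(field.to_string()).or_default().insert(id, vector);
    }

    fn get(&self, field: &str, id: u64) -> Option<&Vec<f32>> {
        self.by_field.get(field)?.get(&id)
    }
}
```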

Search in Templates

Semantic search integrates naturally with FLIN's view templates:

// app/products.flin

query = ""

<input placeholder="Search products..."
       value={query}
       input={results = search query in Product by description limit 20}>

{if query.len > 3}
    {for product in results}
        <div class="product-card">
            <h3>{product.name}</h3>
            <p>{product.description}</p>
            <span class="price">${product.price}</span>
        </div>
    {/for}
{/if}

As the user types, the search fires with each input event. Results update reactively. The entire search experience -- from keystroke to rendered results -- is built with FLIN's standard reactivity system.

Performance Characteristics

Operation                    | Time       | Notes
Embedding generation (API)   | 100-300 ms | Depends on text length and provider
Embedding generation (local) | 10-50 ms   | With FastEmbed
HNSW insert                  | < 1 ms     | Per document
HNSW search (10K docs)       | < 2 ms     | Top 10 results
HNSW search (100K docs)      | < 5 ms     | Top 10 results
HNSW search (1M docs)        | < 15 ms    | Top 10 results

The bottleneck is embedding generation, not search. For interactive use cases (like the search-as-you-type example above), local embedding generation with FastEmbed is recommended to keep latency under 50 ms.

When to Use semantic vs Regular text

Use semantic                | Use regular text
Long descriptions           | Short names and labels
User-generated content      | Structured identifiers (SKU, ID)
Search-heavy fields         | Exact-match fields
Natural language content    | Codes, enums, status values
Multiple possible phrasings | Canonical values

The semantic modifier adds storage overhead (embedding vector per record) and write latency (embedding generation on save). Use it for fields where meaning-based search provides value, not for every text field.
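As a rough sense of the storage cost, assuming uncompressed f32 vectors at the 1536-dimension upper end quoted earlier (an assumption; FLIN's on-disk format is not described here), each semantic field adds about 6 KB per record, or just under 600 MB for 100,000 records:

```rust
// Bytes of embedding storage added per record by one semantic field,
// assuming uncompressed f32 vectors (4 bytes per dimension).
fn embedding_overhead_bytes(dims: usize) -> usize {
    dims * std::mem::size_of::<f32>()
}
```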

Privacy Considerations

Embedding generation requires sending text to an AI provider (unless using local embeddings). FLIN addresses this:

  • Only semantic fields are sent for embedding. Regular text fields, passwords, and other entity data stay local.
  • Text, not context. The field value is sent for embedding, not the entity structure or related data.
  • Local option. FastEmbed generates embeddings locally without any network call.

For applications with strict data residency requirements, local embedding with FastEmbed eliminates all external data transmission.

Semantic search transforms how users interact with data. Instead of requiring them to know the exact keywords, categories, or tags, they describe what they want in natural language and FLIN finds the closest matches. In the next article, we explore the AI Gateway -- how FLIN connects to eight different AI providers through a single, unified API.


This is Part 117 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO designed and built a programming language from scratch.

Series Navigation:

  • [116] The Intent Engine: Natural Language Database Queries
  • [117] Semantic Search and Vector Storage (you are here)
  • [118] AI Gateway: 8 Providers, One API
  • [119] FastEmbed Integration for Embeddings
