Typesense Search

Typesense provides instant full-text search with typo tolerance and faceted filtering. It is the user-facing search engine — when a user types in the search bar, Typesense is what responds.

Engine: Typesense 26
Port: 8108

Typesense vs FalkorDB

| Capability | FalkorDB | Typesense |
| --- | --- | --- |
| Graph traversal (cross-references, relationships) | ✅ Primary | |
| Full-text search with typo tolerance | | ✅ Primary |
| Faceted filtering (by book, testament, language) | | ✅ Primary |
| Autocomplete / instant results | | ✅ Primary |
| Structured queries (Cypher) | ✅ Primary | |

FalkorDB is the source of truth for all scripture data. Typesense is a read-optimized index synced from FalkorDB during the ingest pipeline (Stage 7).

Collections

Passages

{
"name": "passages",
"fields": [
{ "name": "id", "type": "string" },
{ "name": "book_id", "type": "string", "facet": true },
{ "name": "book_title", "type": "string", "facet": true },
{ "name": "chapter", "type": "int32" },
{ "name": "verse", "type": "int32" },
{ "name": "text", "type": "string" },
{ "name": "translation", "type": "string", "facet": true }
],
"default_sorting_field": "chapter"
}

Facets on book_id, book_title, and translation enable filtered search like "search 'faith' only in Book of Mormon".
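As a rough sketch of how such a filtered search might be expressed, the Python snippet below assembles Typesense search parameters for the passages collection. `q`, `query_by`, `filter_by`, and `facet_by` are standard Typesense search parameters; the helper names are hypothetical, and the `gen` and `exod` book IDs are borrowed from the filter example in the Search API section rather than from a confirmed ID list.

```python
from typing import Optional


def build_filter(field: str, values: list[str]) -> str:
    """Build a Typesense filter_by clause matching any of the given values."""
    if not values:
        raise ValueError("at least one value is required")
    if len(values) == 1:
        return f"{field}:={values[0]}"
    return f"{field}:=[{','.join(values)}]"


def passage_search_params(query: str, book_ids: Optional[list[str]] = None) -> dict[str, str]:
    """Assemble search parameters for the passages collection (illustrative helper)."""
    params = {
        "q": query,
        "query_by": "text",  # search the verse text
        "facet_by": "book_id,book_title,translation",  # fields declared with "facet": true
    }
    if book_ids:
        params["filter_by"] = build_filter("book_id", book_ids)
    return params


params = passage_search_params("faith", book_ids=["gen", "exod"])
print(params["filter_by"])  # book_id:=[gen,exod]
```

Requesting `facet_by` alongside the filter lets the response carry per-facet counts, which is what a UI would typically use to render the filter sidebar.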

Topics

{
"name": "topics",
"fields": [
{ "name": "id", "type": "string" },
{ "name": "title", "type": "string" },
{ "name": "description", "type": "string" },
{ "name": "source", "type": "string", "facet": true }
]
}

The source facet distinguishes between Topical Guide (tg), Bible Dictionary (bd), and Scripture Index (si) entries.

Lexicon

{
"name": "lexicon",
"fields": [
{ "name": "id", "type": "string" },
{ "name": "lemma", "type": "string" },
{ "name": "language", "type": "string", "facet": true },
{ "name": "strongs_id", "type": "string" },
{ "name": "gloss", "type": "string" },
{ "name": "definition", "type": "string" }
]
}

The language facet enables Hebrew-only or Greek-only searches.

Indexing Strategy

Typesense is populated during the ingest pipeline's Stage 7 (search index sync):

graph LR
Corpus["JSON Corpus"] --> Ingest["Ingest Pipeline<br/>(Stages 1-6)"]
Ingest --> FDB["FalkorDB<br/>(graph)"]
FDB --> Stage7["Stage 7:<br/>Search Index Sync"]
Stage7 --> TS["Typesense<br/>(search index)"]

  1. The ingest pipeline writes all data to FalkorDB first (Stages 1-6)
  2. Stage 7 queries FalkorDB for passages, topics, and lexicon entries
  3. Each result is formatted and upserted into the appropriate Typesense collection
  4. Upserts are batched for performance
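
Steps 3 and 4 above might look roughly like the following Python sketch. It assumes Stage 7 has already fetched passage rows from FalkorDB as plain dictionaries; the row keys, the ID scheme, the `kjv` translation code, and the batch size are illustrative assumptions, not the pipeline's actual values. Each batch is serialized as JSON Lines, the format accepted by Typesense's bulk import endpoint (`POST /collections/passages/documents/import?action=upsert`).

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def format_passage(row: dict) -> dict:
    """Shape a graph row (hypothetical keys) into a passages document."""
    return {
        # Hypothetical ID scheme: translation, book, chapter, verse.
        "id": f"{row['translation']}-{row['book_id']}-{row['chapter']}-{row['verse']}",
        "book_id": row["book_id"],
        "book_title": row["book_title"],
        "chapter": int(row["chapter"]),
        "verse": int(row["verse"]),
        "text": row["text"],
        "translation": row["translation"],
    }


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def to_jsonl(documents: list[dict]) -> str:
    """Serialize one batch as JSON Lines, one document per line."""
    return "\n".join(json.dumps(doc, ensure_ascii=False) for doc in documents)


# Placeholder rows standing in for a FalkorDB query result.
rows = [
    {"book_id": "gen", "book_title": "Genesis", "chapter": 1, "verse": v,
     "text": f"placeholder text {v}", "translation": "kjv"}
    for v in range(1, 6)
]
payloads = [to_jsonl(batch) for batch in batched(map(format_passage, rows), size=2)]
print(len(payloads))  # 5 rows in batches of 2 -> 3 payloads
```

Because every document carries a deterministic `id`, re-running the sync with `action=upsert` would overwrite existing documents instead of duplicating them.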

Search API

A single search endpoint queries one collection or all of them; the type parameter takes exactly one of the values listed:

GET /api/v1/search?q=faith&type=passages|topics|lexicon|all

Search parameters:

  • q — search query (typo-tolerant)
  • type — which collections to search (passages, topics, lexicon, all)
  • filter_by — faceted filter (e.g., book_id:=[gen,exod])
  • sort_by — sorting field
  • page / per_page — pagination
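
A minimal client-side sketch of composing such a request in Python, using only the standard library. The parameter names come from the list above; the base URL, the helper name, and the default page size are placeholders rather than documented values.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SEARCH_TYPES = {"passages", "topics", "lexicon", "all"}


def build_search_url(
    base_url: str,
    q: str,
    type_: str = "all",
    filter_by: Optional[str] = None,
    sort_by: Optional[str] = None,
    page: int = 1,
    per_page: int = 20,  # assumed default, not taken from the API
) -> str:
    """Compose a GET /api/v1/search URL with percent-encoded parameters."""
    if type_ not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {type_!r}")
    params = {"q": q, "type": type_, "page": page, "per_page": per_page}
    if filter_by:
        params["filter_by"] = filter_by
    if sort_by:
        params["sort_by"] = sort_by
    return f"{base_url.rstrip('/')}/api/v1/search?{urlencode(params)}"


url = build_search_url(
    "http://localhost:3000",  # placeholder host
    q="faith",
    type_="passages",
    filter_by="book_id:=[gen,exod]",
)
print(url)
# Decoding the query string recovers the original filter expression.
print(parse_qs(urlsplit(url).query)["filter_by"])  # ['book_id:=[gen,exod]']
```

Encoding the parameters with `urlencode` matters here because filter expressions contain characters such as `:`, `=`, `[`, and `,` that are not safe to paste into a query string verbatim.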

Hosting Progression

| Phase | Provider | Notes |
| --- | --- | --- |
| Phase 1 | Docker container | Single node, sufficient for up to 10M documents |
| Phase 2 | K8s StatefulSet | Dedicated node for search workload |
| Phase 3 | K8s cluster | Multi-node for redundancy |

Typesense scales well on a single node to millions of documents. Clustering is needed for redundancy, not performance, at GospeLib's projected scale.