
GospeLib JSON Schema Specification

Version: 2.0.0
Status: Active


Table of Contents

  1. Design Principles
  2. Schema Families Overview
  3. Shared Types
  4. Scripture Text
  5. Lexicon
  6. Topical Guide
  7. Bible Dictionary
  8. Scripture Index
  9. Verse Commentary
  10. Scholarly Commentary
  11. Conference Talk
  12. Curriculum Lesson
  13. Church Publication
  14. Cross-References
  15. Proper Names
  16. Versification
  17. Morphology Codes
  18. Theographic
  19. File Layout & Naming Conventions
  20. Controlled Vocabularies
  21. Validation Rules

1. Design Principles

1. Every file declares its own type. The schema field is required on every file. No heuristic detection is needed.

2. All cross-references are structured objects. No human-formatted reference strings anywhere in the data. Every reference to a passage is a PassageRef object: { bookId, chapter, verse?, verseEnd?, chapterEnd? }.

3. bookId is always a canonical slug. The canonical slug (e.g. "gen", "1-enoch", "dc") is the only identifier used for books throughout all schemas. No integers, no display titles, no abbreviations as identifiers.

4. seeAlso arrays are typed discriminated unions. Every element has an explicit type field. No stateful parsing, no positional inference.

5. Optional fields are absent, not null or empty string. A field that has no value is omitted. Consumers test with field !== undefined, not field !== null && field !== "".

6. Presentation concerns are excluded. No page numbers, URL slugs, typesetting size tokens, or InDesign-specific markup. Those concerns belong in the rendering layer.

7. camelCase throughout. No snake_case. No mixed conventions.

8. Arrays contain only populated items. Commentary notes arrays exclude empty notes. Sections arrays exclude null-content sections.

9. Controlled enumerations for categorical fields. pos, corpus, language, witnessLanguage are all enums defined in §20. Free-form originals are preserved in *Raw sibling fields for forward compatibility.

10. Verse enrichments are additive and co-resident. A single Verse object can carry witnesses, words, and notes simultaneously. These are orthogonal enrichments of one concept.
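Principle 5 can be enforced at write time. The following is a minimal sketch, assuming a hypothetical helper named omitEmpty that is not part of this specification; it drops null, undefined, and empty-string properties from a flat record before serialization.

```typescript
// Hypothetical helper (not defined by this spec): remove properties whose
// value is null, undefined, or "". Shallow by design; nested objects would
// need their own pass.
function omitEmpty(obj: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    if (value === null || value === undefined || value === "") continue;
    out[key] = value;
  }
  return out;
}

// Only "title" survives, so a consumer can test fullTitle !== undefined.
const cleanedRoot = omitEmpty({ title: "Genesis", fullTitle: "", subtitle: null });
```

Values such as 0 and false are kept, since they are populated values rather than placeholders.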


2. Schema Families Overview

The corpus uses fifteen schema families.

Schema                  File pattern
scripture-text          corpus/{bookId}.json
lexicon                 lexicon/{range}.json
topical-guide           tg/{letter}.json
bible-dictionary        bd/{letter}.json
scripture-index         index/{letter}.json
verse-commentary        commentary/{commentaryId}/{bookId}.json
scholarly-commentary    scholarly/{commentaryId}.json
conference-talk         conference/{year}-{month}/{talkId}.json
curriculum-lesson       curriculum/{curriculumId}.json
church-publication      publications/{publicationId}.json
cross-references        cross-references/{bookId}.json
proper-names            proper-names/{letter}.json
versification           versification/{schemeId}.json
morphology-codes        morphology-codes/{language}.json
theographic             theographic/{category}.json
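Because every file declares its schema, a loader can dispatch on that one field. The sketch below assumes a hypothetical readSchema function (the name and error messages are illustrative only); the list of values is taken from the table above.

```typescript
// The fifteen schema values listed in this section.
const SCHEMA_VALUES = [
  "scripture-text", "lexicon", "topical-guide", "bible-dictionary",
  "scripture-index", "verse-commentary", "scholarly-commentary",
  "conference-talk", "curriculum-lesson", "church-publication",
  "cross-references", "proper-names", "versification",
  "morphology-codes", "theographic",
] as const;

type SchemaValue = (typeof SCHEMA_VALUES)[number];

// Hypothetical loader step: return the declared schema or throw.
// No heuristic detection is attempted.
function readSchema(file: unknown): SchemaValue {
  if (typeof file !== "object" || file === null) {
    throw new Error("file root must be an object");
  }
  const declared = (file as { schema?: unknown }).schema;
  for (const known of SCHEMA_VALUES) {
    if (known === declared) return known;
  }
  throw new Error("missing or unknown schema value: " + String(declared));
}
```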

3. Shared Types

These types appear across multiple schemas. They are defined once here and referenced by name throughout.

3.1 PassageRef

The canonical representation of a reference to a location in any text.

interface PassageRef {
bookId: string; // Canonical book slug: "gen", "1-enoch", "dc"
chapter: number; // 1-based
verse?: number; // 1-based; absent for chapter-only refs
verseEnd?: number; // Inclusive end verse for ranges: verse 3–7 → verse: 3, verseEnd: 7
chapterEnd?: number; // Inclusive end chapter for cross-chapter ranges
}

Constraints:

  • verseEnd requires verse to be present
  • verseEnd must be ≥ verse when both verses fall in the same chapter (chapterEnd absent or equal to chapter)
  • chapterEnd must be ≥ chapter
  • If chapterEnd is present, verse and verseEnd refer to the start/end verse within their respective chapters

Examples:

{ "bookId": "gen", "chapter": 1, "verse": 1 }
{ "bookId": "dc", "chapter": 84, "verse": 26, "verseEnd": 27 }
{ "bookId": "heb", "chapter": 7, "verse": 1, "verseEnd": 3 }
{ "bookId": "dc", "chapter": 13 }
{ "bookId": "num", "chapter": 16, "chapterEnd": 18 }
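The constraints above can be checked mechanically. This is a sketch under stated assumptions: checkPassageRef is a hypothetical name, and the verseEnd ≥ verse rule is applied only when the range stays within one chapter, which is how the cross-chapter constraint reads.

```typescript
interface PassageRef {
  bookId: string;
  chapter: number;
  verse?: number;
  verseEnd?: number;
  chapterEnd?: number;
}

// Returns a list of constraint violations; an empty list means the ref passes.
function checkPassageRef(ref: PassageRef): string[] {
  const errors: string[] = [];
  if (ref.verseEnd !== undefined && ref.verse === undefined) {
    errors.push("verseEnd requires verse");
  }
  if (ref.chapterEnd !== undefined && ref.chapterEnd < ref.chapter) {
    errors.push("chapterEnd must be >= chapter");
  }
  // Assumption: in a cross-chapter range, verse and verseEnd belong to
  // different chapters, so they are not compared with each other.
  const sameChapter = ref.chapterEnd === undefined || ref.chapterEnd === ref.chapter;
  if (
    sameChapter &&
    ref.verse !== undefined &&
    ref.verseEnd !== undefined &&
    ref.verseEnd < ref.verse
  ) {
    errors.push("verseEnd must be >= verse");
  }
  return errors;
}
```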

3.2 SeeAlsoLink

A typed link to a related resource. Used in TG, BD, and Index entries.

type SeeAlsoLink = TopicLink | ArticleLink | PassageLink | PersonLink | PlaceLink;

interface TopicLink {
type: 'topic';
topicId: string; // "tg:angels"
title: string; // "Angels" — denormalized for display without graph lookup
}

interface ArticleLink {
type: 'article';
articleId: string; // "bd:angels"
title: string; // "Angels"
}

interface PassageLink {
type: 'passage';
ref: PassageRef;
}

interface PersonLink {
type: 'person';
personId: string; // "person:aaron.1"
name: string; // "Aaron"
}

interface PlaceLink {
type: 'place';
placeId: string; // "place:ammonihah"
name: string; // "Ammonihah"
}

Why title/name is denormalized on TopicLink/ArticleLink/PersonLink/PlaceLink: The linked node's display name must be available without a graph query for rendering hover cards, autocomplete, and inline tooltips. The title/name fields are always the primary display string from the target node; they are redundant with the target node but intentionally so.
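A consumer can render any link by switching on the type field, with no graph lookup. The following sketch assumes a hypothetical displayLabel helper; the label format for passages is illustrative and not defined by this specification.

```typescript
interface PassageRef {
  bookId: string;
  chapter: number;
  verse?: number;
  verseEnd?: number;
  chapterEnd?: number;
}

type SeeAlsoLink =
  | { type: "topic"; topicId: string; title: string }
  | { type: "article"; articleId: string; title: string }
  | { type: "passage"; ref: PassageRef }
  | { type: "person"; personId: string; name: string }
  | { type: "place"; placeId: string; name: string };

// Hypothetical helper: pick the display string for a link using only the
// discriminant and the denormalized title/name fields.
function displayLabel(link: SeeAlsoLink): string {
  switch (link.type) {
    case "topic":
    case "article":
      return link.title;
    case "person":
    case "place":
      return link.name;
    case "passage": {
      const { bookId, chapter, verse } = link.ref;
      return verse === undefined
        ? bookId + " " + chapter
        : bookId + " " + chapter + ":" + verse;
    }
  }
}
```

Because the switch covers every member of the union, the compiler reports an error if a new link type is added without a matching case.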

3.3 NoteAnchor

Locates a footnote's attachment point within a verse.

interface NoteAnchor {
wordIndex: number; // 0-based word position in verse.text (split on whitespace)
charOffset: number; // 0-based character offset of the anchor word in verse.text
word: string; // The anchor word/phrase as it appears in verse.text
}

Both wordIndex and charOffset are provided for robustness: wordIndex is the primary locator (simpler for most consumers); charOffset is a secondary verification point. If verse.text at charOffset does not begin with word, the anchor should be flagged as potentially stale.
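The staleness check described above might look like the sketch below. The function names are hypothetical, and the code assumes charOffset counts UTF-16 code units (JavaScript string indices); the specification does not state which unit is intended.

```typescript
interface NoteAnchor {
  wordIndex: number;
  charOffset: number;
  word: string;
}

// True when the verse text at charOffset does not begin with the anchor word.
// Assumption: charOffset is a UTF-16 code unit index.
function isAnchorStale(verseText: string, anchor: NoteAnchor): boolean {
  const end = anchor.charOffset + anchor.word.length;
  return verseText.slice(anchor.charOffset, end) !== anchor.word;
}

// Primary locator: the whitespace-delimited token at wordIndex, if any.
function tokenAt(verseText: string, anchor: NoteAnchor): string | undefined {
  return verseText.split(/\s+/)[anchor.wordIndex];
}

const sampleVerse = "The words of the blessing of Enoch";
const sampleAnchor: NoteAnchor = { wordIndex: 4, charOffset: 17, word: "blessing" };
const sampleIsStale = isAnchorStale(sampleVerse, sampleAnchor); // false for this input
```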

3.4 MediaRef

A reference to an audio or video resource.

interface MediaRef {
url: string; // Media URL
format?: string; // "mp3", "m3u8", "mp4"
variant?: string; // "audio", "video-720p", etc.
}

3.5 Paragraph

A single paragraph of prose content, used across conference talks, curriculum lessons, and church publications.

interface Paragraph {
pid: string; // Paragraph ID (from data-aid)
text: string; // Plain text content
scriptureRefs?: InlineRef[]; // Scripture references within this paragraph
}

pid corresponds to the data-aid attribute from Church website HTML content. It serves as a stable, paragraph-level identifier for cross-referencing and annotation.

3.6 InlineRef

A scripture reference appearing inline within prose text, paired with its display form.

interface InlineRef {
ref: PassageRef; // The structured scripture reference
displayText: string; // How it appears in text, e.g. "Alma 30:44"
}

3.7 ImageRef

A reference to an image asset, typically from the Church image library.

interface ImageRef {
assetId: string; // Church image asset ID
alt: string; // Alt text
title?: string; // Public title
description?: string; // Public description
creditLine?: string; // Attribution
}

4. Scripture Text

Schema value: "scripture-text"
File pattern: corpus/{bookId}.json

This is the primary schema for any biblical, deuterocanonical, or pseudepigraphical text. All three enrichment types — manuscript witnesses, interlinear word alignment, and scholarly footnotes — are optional and co-resident on the same verse object.

4.1 File Root

interface ScriptureTextFile {
schema: 'scripture-text';
version: string; // semver, e.g. "2.0.0"
bookId: string; // Canonical slug: "gen", "1-enoch", "apoc-ab"
title: string; // Primary display title: "Genesis", "1 Enoch"
fullTitle?: string; // Long form if different: "The First Book of Enoch"
subtitle?: string; // Recension/language qualifier: "Ethiopic", "Slavonic"
abbreviation: string; // Citation form: "Gen.", "1 En.", "Apoc. Ab."
corpus: CorpusType; // See §20
language: string; // ISO 639 code of the text's language: "en" (639-1), "gez" (639-3; no 639-1 code exists)
translation?: string; // Translation credit: "Charles 1912", "KJV"
introduction?: string; // Markdown prose; absent if none
chapters: Chapter[];
}

corpus (see §20 for the full enum) distinguishes where in the broader library this text lives: "ot", "nt", "bom", "dc", "pgp", "pseudepigrapha", "apocrypha", "deuterocanonical".

language is the language of this file's text fields (the English translation, "en"). Source-language text is carried in witnesses[].text and words[].token.

fullTitle is only present when the primary title is an abbreviated or conventional form. For 1 Enoch, title: "1 Enoch" and fullTitle: "The First Book of Enoch". For Genesis, title: "Genesis" and fullTitle is absent.

subtitle captures the recension identifier. For 1 Enoch: subtitle: "Ethiopic". For most books: absent.

4.2 Chapter

interface Chapter {
chapter: number;
heading?: string; // Summary heading; absent if none
verses: Verse[];
}

Verse numbering within a chapter is 1-based. Chapter introductions are not modeled as verse 0; they belong in Verse Commentary files (§9).

4.3 Verse

interface Verse {
verse: number;
text: string; // English translation
sourceText?: string; // Full source-language verse
sourceTranslit?: string; // Full transliteration
witnesses?: Witness[]; // Manuscript evidence; absent if none
words?: WordAlignment[]; // Interlinear; absent if none
notes?: VerseNote[]; // Scholarly footnotes; absent if none
}

sourceText and sourceTranslit are verse-level aggregates (the full Hebrew/Greek/Ethiopic verse as a single string). They are present when word-level alignment exists (words) and may also be present independently when only a source-text string (not full alignment) is available.

Co-residence of enrichments: All three of witnesses, words, and notes may be present on the same verse. For example, a verse in 1 Enoch's Book of Watchers can have Ethiopic, Aramaic, and Greek entries in witnesses, a scholarly footnote in notes, and (if alignment data becomes available) words.

4.4 Witness

A manuscript witness in a specific language attesting this verse.

interface Witness {
language: WitnessLanguage; // See §20
script: ScriptType; // See §20
text: string; // Raw script text; preserve all diacritics, cantillation
witness?: string; // Manuscript sigla: "4QEn^a", "Sinaiticus", "Synkellos"
edition?: string; // Critical edition: "Milik 1976", "Charles 1906"
hasLacunae?: boolean; // true if text contains DSS gap markers (…, ])
isPartial?: boolean; // true if fragment doesn't cover the full verse
notes?: string; // Free-form textual note about this witness
}

On hasLacunae: DSS Aramaic fragments use … for gaps and ] for manuscript-edge boundaries. The hasLacunae flag is set to true when either marker is present in text. The markers themselves are preserved verbatim in the text; do not normalize or remove them.
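A producer could derive the flag directly from the text. This is a minimal sketch, assuming the gap marker is the single-character ellipsis U+2026 (as in the Witness comment, which lists "…" and "]") and using a hypothetical function name.

```typescript
// Assumed marker set: U+2026 ("…") for gaps, "]" for manuscript edges.
const LACUNA_MARKERS = ["\u2026", "]"];

// Hypothetical helper: true when either marker occurs anywhere in the text.
// The text itself is left untouched.
function detectLacunae(witnessText: string): boolean {
  for (const marker of LACUNA_MARKERS) {
    if (witnessText.indexOf(marker) !== -1) return true;
  }
  return false;
}
```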

On isPartial: Some witnesses cover only part of a verse (e.g., a DSS fragment begins mid-verse). This flag distinguishes complete verse attestation from partial attestation.

Witness examples:

{
"language": "ethiopic",
"script": "ethiopic",
"text": "ቃለ፡​በረከት፡​ዘሄኖከ፡",
"witness": "Tana 9",
"edition": "Charles 1906"
}
{
"language": "aramaic",
"script": "hebrew",
"text": "…]חנך לבח֯ירין …",
"witness": "4QEn^a",
"edition": "Milik 1976",
"hasLacunae": true,
"isPartial": true
}
{
"language": "greek",
"script": "greek",
"text": "Λόγος εὐλογίας Ἑνώχ, καθὼς εὐλόγησεν ἐκλεκτοὺς δικαίους",
"witness": "Synkellos",
"edition": "Dindorf 1829"
}

4.5 WordAlignment

A single word token aligned between source language and English gloss.

interface WordAlignment {
order: number; // 0-based index in source-language word sequence
gloss: string; // English gloss: "In the beginning", "God", "created"
strongs: string; // Normalized Strong's: "H0430" (always letter + 4-digit zero-padded)
token: string; // Source-language token with all diacritics/cantillation: "אֱלֹהִ֑ים"
}

strongs normalization: Always H or G prefix + 4-digit zero-padded number. Input H430 normalizes to H0430. Input G56 normalizes to G0056.
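The normalization rule can be sketched as follows. normalizeStrongs is a hypothetical name; accepting a lowercase prefix and throwing on anything else are assumptions, not requirements stated here.

```typescript
// Normalize a Strong's number to prefix + 4-digit zero-padded form.
// Assumption: a lowercase prefix is accepted and uppercased; input that is
// not H/G followed by digits (at most four after leading zeros) is rejected.
function normalizeStrongs(input: string): string {
  const match = /^([HG])0*(\d{1,4})$/i.exec(input.trim());
  if (match === null) {
    throw new Error("not a Strong's number: " + input);
  }
  const prefix = match[1].toUpperCase();
  const digits = ("0000" + match[2]).slice(-4);
  return prefix + digits;
}
```

The function is idempotent, so already-normalized values such as "H0430" pass through unchanged.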

Word order: order follows the source-language reading sequence (right-to-left for Hebrew, left-to-right for Greek). The sequence of English glosses in order may therefore differ from the word order of the English translation when the translation reorders words.

Example (Genesis 1:1):

{
"verse": 1,
"text": "In the beginning God created the heaven and the earth.",
"sourceText": "בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃",
"sourceTranslit": "bə·rê·šîṯ bā·rā ʾĕ·lō·hîm; ʾêṯ haš·šā·ma·yim wə·ʾêṯ hā·ʾā·reṣ.",
"words": [
{ "order": 0, "gloss": "In the beginning", "strongs": "H7225", "token": "בְּרֵאשִׁ֖ית" },
{ "order": 1, "gloss": "created", "strongs": "H1254", "token": "בָּרָ֣א" },
{ "order": 2, "gloss": "God", "strongs": "H0430", "token": "אֱלֹהִ֑ים" },
{ "order": 3, "gloss": "and", "strongs": "H0853", "token": "אֵ֥ת" },
{ "order": 4, "gloss": "the heaven", "strongs": "H8064", "token": "הַשָּׁמַ֖יִם" },
{ "order": 5, "gloss": "and", "strongs": "H0853", "token": "וְאֵ֥ת" },
{ "order": 6, "gloss": "the earth", "strongs": "H0776", "token": "הָאָֽרֶץ׃" }
]
}

4.6 VerseNote

A scholarly footnote anchored to a word or phrase in the verse.

interface VerseNote {
anchor: NoteAnchor; // See §3.3
content: NoteBlock[]; // One or more typed content blocks
}
type NoteBlock = AnnotationBlock | ReferenceBlock;

4.6.1 AnnotationBlock

A linguistic, textual, or explanatory annotation.

interface AnnotationBlock {
type: 'annotation';
kind:
| 'gloss' // Explanatory definition or equivalent phrase
| 'alternate' // Alternative textual reading
| 'hebrew' // Hebrew original, cognate, or literal meaning
| 'greek'; // Greek original, cognate, or transliteration
text: string;
}

4.6.2 ReferenceBlock

A cross-reference to another passage.

interface ReferenceBlock {
type: 'reference';
ref: PassageRef;
}

Complete VerseNote example:

{
"anchor": {
"wordIndex": 13,
"charOffset": 67,
"word": "Egypt,"
},
"content": [
{ "type": "annotation", "kind": "hebrew", "text": "Mitsraim" },
{ "type": "reference", "ref": { "bookId": "gen", "chapter": 37, "verse": 25 } },
{ "type": "reference", "ref": { "bookId": "4-ezra", "chapter": 1, "verse": 3 } }
]
}

4.7 Complete Verse Example (all enrichments co-resident)

{
"verse": 1,
"text": "The words of the blessing of Enoch, wherewith he blessed the elect and righteous, who will be living in the day of tribulation, when all the wicked and godless are to be removed.",
"witnesses": [
{
"language": "ethiopic",
"script": "ethiopic",
"text": "ቃለ፡​በረከት፡​ዘሄኖከ፡​ዘከመ፡​ባረከ፡​ኅሩያነ፡​ወጻድቃነ፡",
"witness": "Tana 9",
"edition": "Charles 1906"
},
{
"language": "aramaic",
"script": "hebrew",
"text": "…]חנך לבח֯ירין …",
"witness": "4QEn^a",
"edition": "Milik 1976",
"hasLacunae": true,
"isPartial": true
},
{
"language": "greek",
"script": "greek",
"text": "Λόγος εὐλογίας Ἑνώχ, καθὼς εὐλόγησεν ἐκλεκτοὺς δικαίους",
"witness": "Synkellos",
"edition": "Dindorf 1829"
}
],
"notes": [
{
"anchor": { "wordIndex": 4, "charOffset": 17, "word": "blessing" },
"content": [
{ "type": "annotation", "kind": "greek", "text": "εὐλογία (eulogia)" },
{ "type": "reference", "ref": { "bookId": "gen", "chapter": 27, "verse": 38 } }
]
}
]
}

5. Lexicon

Schema value: "lexicon"
File pattern: lexicon/{range}.json (e.g. lexicon/H0001-H1000.json)

5.1 File Root

interface LexiconFile {
schema: 'lexicon';
version: string;
language: LexiconLanguage; // "hebrew" | "greek" | "aramaic"
range: LexiconRange;
entries: { [strongs: string]: LexiconEntry };
}

interface LexiconRange {
from: string; // "H0001"
to: string; // "H1000"
}

The entries object is keyed by normalized Strong's number (H0430, G3056). Keys use zero-padded 4-digit format.

5.2 LexiconEntry

interface LexiconEntry {
strongs: string; // "H0430" — matches the key
original: string; // Hebrew/Greek Unicode: "אֱלֹהִים"
translit: string; // Transliteration: "elohiym"
pronunciation: string; // Stress-marked pronunciation: "eh-lo-HEEM"
pos: PartOfSpeech; // Controlled enum; see §20
posRaw: string; // Original source POS string, preserved verbatim
glosses: string[]; // KJV translation equivalents: ["God", "gods", "divine beings"]
definition: LexiconDefinition;
derivation: LexiconDerivation;
related?: string[]; // Strong's numbers of related entries: ["H0410"]
occurrences?: number; // Total OT/NT occurrence count: 2604
translations?: Translation[]; // NASB frequency table
hasAramaicCognate?: boolean; // Aramaic cognate exists
wordOrigin?: string; // Human-readable etymology note
}

glosses is an array of distinct English equivalents. Empty strings and duplicates are excluded.

pos is a controlled enum (§20). posRaw preserves the original uncontrolled string (e.g. "noun masculine and feminine; feminine plural noun; ...") for forward compatibility and audit.

pronunciation is the stress-marked form (e.g. "eh-lo-HEEM").

5.3 LexiconDefinition

interface LexiconDefinition {
short: string; // One-line definition
senses: DefinitionSense[]; // Semantic sense tree
}

interface DefinitionSense {
text: string; // Top-level sense description
subsenses?: string[]; // Sub-senses; always a flat array (max one level of nesting)
}

Each DefinitionSense has a text (the sense) and optional subsenses[] (the sub-items). Sub-sub-senses are not modeled; they are flattened into subsenses.

Example (H0430, Elohim):

"definition": {
"short": "gods in the ordinary sense; used especially of the supreme God; occasionally of magistrates",
"senses": [
{
"text": "plural uses",
"subsenses": ["rulers, judges", "divine ones", "angels", "gods"]
},
{
"text": "plural intensive — singular meaning",
"subsenses": ["god, goddess", "godlike one", "works or special possessions of God", "the true God"]
}
]
}

5.4 LexiconDerivation

interface LexiconDerivation {
description: string; // Human-readable: "plural of H0433"
roots: string[]; // Pre-extracted Strong's cross-refs: ["H0433"]
}

roots is pre-extracted from the description string. Consumers can traverse the etymology graph using roots directly without regex parsing. If description contains no Strong's cross-references, roots is an empty array [].

Example (H0430):

"derivation": {
"description": "plural of H0433",
"roots": ["H0433"]
}

Example (H0216, from H0215):

"derivation": {
"description": "from H0215",
"roots": ["H0215"]
}

Example (primitive word with no root):

"derivation": {
"description": "a primitive word",
"roots": []
}
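A producer might populate roots with a pass like the one below. This is a sketch only: extractRoots is a hypothetical name, and both the zero-padding of matches and the removal of duplicates are assumptions about how the ingest step behaves.

```typescript
// Hypothetical ingest step: collect Strong's cross-references from a
// derivation description, normalized to 4-digit form, in order of first
// appearance and without duplicates.
function extractRoots(description: string): string[] {
  const matches = description.match(/\b[HG]\d{1,4}\b/g) || [];
  const roots: string[] = [];
  for (const raw of matches) {
    const normalized = raw.charAt(0) + ("0000" + raw.slice(1)).slice(-4);
    if (roots.indexOf(normalized) === -1) roots.push(normalized);
  }
  return roots;
}
```

Consumers never need this function; it only illustrates why roots can be trusted as a ready-made edge list for the etymology graph.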

5.5 Translation

interface Translation {
word: string; // English rendering: "God"
count: number; // Occurrences in NASB: 2326
}

translations is ordered descending by count.

Example (H0430):

"translations": [
{ "word": "God", "count": 2326 },
{ "word": "gods", "count": 204 },
{ "word": "god", "count": 45 },
{ "word": "judges", "count": 3 },
{ "word": "great", "count": 2 }
]

5.6 Complete Entry Example

{
"strongs": "H0430",
"original": "אֱלֹהִים",
"translit": "elohiym",
"pronunciation": "eh-lo-HEEM",
"pos": "noun.masculine",
"posRaw": "Noun Masculine",
"glosses": ["God", "gods", "divine beings", "angels", "judges", "mighty"],
"definition": {
"short": "gods in the ordinary sense; specifically the supreme God; occasionally magistrates",
"senses": [
{
"text": "plural uses",
"subsenses": ["rulers, judges", "divine ones", "angels", "gods"]
},
{
"text": "plural intensive — singular meaning",
"subsenses": [
"god, goddess",
"godlike one",
"works or special possessions of God",
"the true God"
]
}
]
},
"derivation": {
"description": "plural of H0433",
"roots": ["H0433"]
},
"occurrences": 2604,
"translations": [
{ "word": "God", "count": 2326 },
{ "word": "gods", "count": 204 },
{ "word": "god", "count": 45 },
{ "word": "judges", "count": 3 }
]
}

6. Topical Guide

Schema value: "topical-guide"
File pattern: tg/{letter}.json

6.1 File Root

interface TopicalGuideFile {
schema: 'topical-guide';
version: string;
letter: string; // "A"
entries: TGEntry[];
}

6.2 TGEntry

interface TGEntry {
id: string; // "tg:angels" — stable slug, unique across all TG entries
title: string; // "Angels"
passages: TGPassage[]; // Curated scripture citations with quotes
seeAlso: SeeAlsoLink[]; // Typed links — see §3.2
}

id derivation: Lowercase, spaces to hyphens, punctuation removed, prefixed with "tg:". Examples: "Angels" → "tg:angels", "Aaronic Priesthood" → "tg:aaronic-priesthood", "Abomination of Desolation" → "tg:abomination-of-desolation".
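The derivation can be sketched in a few lines. tgId is a hypothetical name, and the character class assumes ASCII titles; letters outside a-z would be stripped by this version.

```typescript
// Derive a Topical Guide entry ID from its title.
// Assumption: titles are ASCII; anything other than letters, digits,
// whitespace, and hyphens is treated as punctuation and removed.
function tgId(title: string): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "")
    .trim()
    .replace(/\s+/g, "-");
  return "tg:" + slug;
}
```

With this rule, a title containing a comma such as "Angels, Ministering" yields "tg:angels-ministering".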

6.3 TGPassage

interface TGPassage {
ref: PassageRef;
quote: string; // "three men stood by him" — the curated quote from the passage
}

ref is always a PassageRef object.

On parallel refs: When a passage parallels another (e.g. Heb. 12:22 and D&C 76:67), this becomes two separate TGPassage entries: one for each passage, both carrying the same quote. This is cleaner than a parallel field — both passages genuinely have the same quotation attached.

6.4 seeAlso Resolution

All seeAlso entries are fully resolved typed links:

Example input     SeeAlsoLink
"Messenger"       { type: "topic", topicId: "tg:messenger", title: "Messenger" }
"BD Angels"       { type: "article", articleId: "bd:angels", title: "Angels" }
"Gen. 16:9"       { type: "passage", ref: { bookId: "gen", chapter: 16, verse: 9 } }
"D&C 45:28–33"    { type: "passage", ref: { bookId: "dc", chapter: 45, verse: 28, verseEnd: 33 } }

Continuation refs (bare chapter:verse entries that inherit book from a preceding seeAlso entry) are pre-resolved. No consumer needs to implement stateful parsing.

6.5 Complete Entry Example

{
"id": "tg:angels",
"title": "Angels",
"passages": [
{ "ref": { "bookId": "gen", "chapter": 18, "verse": 2 }, "quote": "three men stood by him" },
{
"ref": { "bookId": "gen", "chapter": 19, "verse": 1 },
"quote": "there came two angels to Sodom at even"
},
{ "ref": { "bookId": "dc", "chapter": 13 }, "quote": "keys of the ministering of angels" },
{
"ref": { "bookId": "dc", "chapter": 129, "verse": 1 },
"quote": "Angels, who are resurrected personages, having bodies"
}
],
"seeAlso": [
{ "type": "topic", "topicId": "tg:angels-ministering", "title": "Angels, Ministering" },
{ "type": "topic", "topicId": "tg:messenger", "title": "Messenger" },
{ "type": "article", "articleId": "bd:angels", "title": "Angels" },
{ "type": "passage", "ref": { "bookId": "gen", "chapter": 16, "verse": 9 } },
{ "type": "passage", "ref": { "bookId": "mosiah", "chapter": 27, "verse": 18 } }
]
}

7. Bible Dictionary

Schema value: "bible-dictionary"
File pattern: bd/{letter}.json

7.1 File Root

interface BibleDictionaryFile {
schema: 'bible-dictionary';
version: string;
letter: string;
entries: BDEntry[];
}

7.2 BDEntry

interface BDEntry {
id: string; // "bd:aaron" — stable slug prefixed with "bd:"
title: string; // "Aaron"
body: string; // Article prose in Markdown
seeAlso: SeeAlsoLink[]; // Typed links to TG entries, passages, other BD entries
}

body is the article body in Markdown.

Prose refs are not pre-extracted. The body prose contains inline scripture references (e.g. "Ex. 4:10–16") as human-readable text. These are not converted to structured objects within the body — the article prose must remain readable as-is. The ingest script extracts them with the Commentary Reference Extractor. This is intentional: pre-extracting from prose is lossy (context is lost) and transforms a human-readable document into a parse artifact.

seeAlso uses the same SeeAlsoLink union as TG. BD entries may link to other BD entries (type: "article"), TG topics (type: "topic"), or specific passages (type: "passage").

7.3 Complete Entry Example

{
"id": "bd:apocrypha",
"title": "Apocrypha",
"body": "Secret or hidden. By this word is generally meant those sacred books of the Jewish people that were not included in the Hebrew Bible (see Canon). They are valuable as forming a link connecting the Old and New Testaments and are regarded in the Church as useful reading, although not all the books are of equal value. They are the subject of a revelation recorded in D&C 91...",
"seeAlso": [
{ "type": "topic", "topicId": "tg:canon", "title": "Canon" },
{ "type": "article", "articleId": "bd:canon", "title": "Canon" }
]
}

8. Scripture Index

Schema value: "scripture-index"
File pattern: index/{letter}.json

8.1 File Root

interface ScriptureIndexFile {
schema: 'scripture-index';
version: string;
letter: string;
entries: IndexEntry[];
}

8.2 IndexEntry (discriminated union)

type IndexEntry = PersonEntry | PlaceEntry | TopicEntry;

All three types share a common base shape:

interface IndexEntryBase {
id: string; // See §8.7
title: string; // Full display title, normalized (see §8.6)
description?: string; // Short identifier phrase
appearances: Appearance[];
seeAlso: SeeAlsoLink[];
}

8.3 PersonEntry

interface PersonEntry extends IndexEntryBase {
type: 'person';
name: string; // Base name without disambiguator: "Aaron"
disambiguator?: number; // 1, 2, 3 — distinguishes persons with the same name
dates?: string; // Temporal context extracted from description: "c. 100 B.C."
}

disambiguator is present when multiple persons share the same name. It is absent for persons with no ambiguity.

dates is extracted from the description field when it contains a bracketed date phrase. Example: "son of Mosiah2 [c. 100 B.C.]" → dates: "c. 100 B.C.". The description field retains the full original string.

8.4 PlaceEntry

interface PlaceEntry extends IndexEntryBase {
type: 'place';
name: string;
uncertain?: boolean; // true when identification is uncertain
uncertaintyNote?: string; // "possibly two different cities" — extracted from brackets
}

8.5 TopicEntry

interface TopicEntry extends IndexEntryBase {
type: 'topic';
}

8.6 Title Normalization

Title strings may embed structured data. These are extracted into explicit fields:

Source title                                        title               Extracted fields
"Aaron1"                                            "Aaron (1)"         name: "Aaron", disambiguator: 1
"Aaron4"                                            "Aaron (4)"         name: "Aaron", disambiguator: 4
"Aaron, City of [possibly two different cities]"    "Aaron, City of"    uncertain: true, uncertaintyNote: "possibly two different cities"
"Aiath [possibly Ai]"                               "Aiath"             uncertain: true, uncertaintyNote: "possibly Ai"
"Abide"                                             "Abide"             (no change)

The title field is always clean for display. Structural information lives in typed sibling fields.
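The extraction described in this section might be implemented along these lines. It is a sketch under assumptions: the function and type names are hypothetical, trailing digits are always read as a disambiguator, and a trailing bracketed phrase is always read as an uncertainty note.

```typescript
interface NormalizedTitle {
  title: string;
  name?: string;
  disambiguator?: number;
  uncertain?: boolean;
  uncertaintyNote?: string;
}

// Hypothetical ingest step: split a source title into a clean display title
// plus the typed sibling fields.
function normalizeIndexTitle(source: string): NormalizedTitle {
  let rest = source.trim();
  const result: NormalizedTitle = { title: rest };

  // A trailing "[...]" carries an uncertainty note.
  const bracket = /^(.*?)\s*\[([^\]]+)\]$/.exec(rest);
  if (bracket !== null) {
    rest = bracket[1];
    result.uncertain = true;
    result.uncertaintyNote = bracket[2];
  }

  // Trailing digits are a disambiguator: "Aaron1" becomes "Aaron (1)".
  const numbered = /^(.*\D)(\d+)$/.exec(rest);
  if (numbered !== null) {
    result.name = numbered[1];
    result.disambiguator = Number(numbered[2]);
    result.title = numbered[1] + " (" + numbered[2] + ")";
  } else {
    result.title = rest;
  }
  return result;
}
```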

8.7 ID Derivation

PersonEntry: "person:{normalized_name}.{disambiguator}"
"Aaron1" → "person:aaron.1"
"Alma3" → "person:alma.3"
"Abel" → "person:abel"

PlaceEntry: "place:{normalized_name}"
"Ammonihah" → "place:ammonihah"

TopicEntry: "topic:{normalized_title}"
"Abide" → "topic:abide"
"Aaronic Priesthood" → "topic:aaronic-priesthood"

Normalization: lowercase, trim, strip punctuation except hyphens, collapse spaces to hyphens. For a PersonEntry without a disambiguator, the ".{disambiguator}" suffix is omitted.
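The three ID forms can be produced from one shared normalizer. The sketch below uses hypothetical function names and assumes ASCII input; it is checked only against the examples listed in this section.

```typescript
// Shared normalizer: lowercase, trim, strip punctuation except hyphens,
// collapse whitespace runs to single hyphens. Assumes ASCII input.
function normalizeName(value: string): string {
  return value
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, "")
    .replace(/\s+/g, "-");
}

// The ".n" suffix is added only when a disambiguator is present.
function personId(name: string, disambiguator?: number): string {
  const base = "person:" + normalizeName(name);
  return disambiguator === undefined ? base : base + "." + disambiguator;
}

function placeId(name: string): string {
  return "place:" + normalizeName(name);
}

function topicId(title: string): string {
  return "topic:" + normalizeName(title);
}
```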

8.8 Appearance

interface Appearance {
ref: PassageRef;
summary: string; // Narrative description: "leads brethren to land of Nephi"
}

summary is a narrative phrase (Index style) not a quotation (TG style). It describes what happens in the passage, not what the passage says.

8.9 Complete Entry Examples

{
"id": "person:aaron.1",
"type": "person",
"title": "Aaron (1)",
"name": "Aaron",
"disambiguator": 1,
"description": "brother of Moses",
"appearances": [
{
"ref": { "bookId": "dc", "chapter": 84, "verse": 18 },
"summary": "the Lord confirmed priesthood on Aaron and his seed"
},
{
"ref": { "bookId": "dc", "chapter": 84, "verse": 27 },
"summary": "law of carnal commandments continues with house of Aaron"
}
],
"seeAlso": [
{ "type": "topic", "topicId": "tg:priesthood-aaronic", "title": "Priesthood, Aaronic" },
{ "type": "article", "articleId": "bd:aaron", "title": "Aaron" }
]
}
{
"id": "place:aaron-city",
"type": "place",
"title": "Aaron, City of",
"uncertain": true,
"uncertaintyNote": "possibly two different cities",
"appearances": [
{
"ref": { "bookId": "alma", "chapter": 8, "verse": 13, "verseEnd": 14 },
"summary": "was in vicinity of Ammonihah"
}
],
"seeAlso": []
}

9. Verse Commentary

Schema value: "verse-commentary"
File pattern: commentary/{commentaryId}/{bookId}.json

9.1 File Root

interface VerseCommentaryFile {
schema: 'verse-commentary';
version: string;
commentaryId: string; // "clarke" — stable identifier for this commentary
commentaryTitle: string; // "Adam Clarke's Commentary"
bookId: string; // "gen"
bookTitle: string; // "Genesis"
testament: 'ot' | 'nt';
sourceUrl?: string; // Base attribution URL; may be overridden per chapter
chapters: CommentaryChapter[];
}

commentaryId is the stable identifier for the commentary series.

sourceUrl is the file-level source URL. If all chapters share the same base URL, it lives here; a chapter that needs a different URL overrides it with its own sourceUrl (§9.2).

9.2 CommentaryChapter

interface CommentaryChapter {
chapter: number;
sourceUrl?: string; // Chapter-specific URL override
introduction?: string; // Chapter introduction note (Markdown)
notes: CommentaryNote[]; // Only non-empty notes; empty-text notes are excluded
}

notes only contains populated entries. A verse absent from notes simply has no commentary — no empty sentinel needed.

introduction is a first-class named field for chapter-level introductory notes.

9.3 CommentaryNote

interface CommentaryNote {
verse: number;
body: string; // Commentary prose
}

body is the commentary text. Named body for consistency with BDEntry.body and ScholarlySection.body.
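Two common consumer operations follow from these shapes: looking up the note for a verse and resolving which source URL applies to a chapter. The helpers below are hypothetical, and the sketch assumes a chapter-level sourceUrl replaces the file-level value outright rather than being appended to it.

```typescript
interface CommentaryNote {
  verse: number;
  body: string;
}

interface CommentaryChapter {
  chapter: number;
  sourceUrl?: string;
  introduction?: string;
  notes: CommentaryNote[];
}

// A verse with no commentary simply has no entry, so undefined is the
// normal "nothing here" result rather than an error.
function findNote(chapter: CommentaryChapter, verse: number): CommentaryNote | undefined {
  for (const note of chapter.notes) {
    if (note.verse === verse) return note;
  }
  return undefined;
}

// Assumption: the chapter URL, when present, fully replaces the file URL.
function effectiveSourceUrl(
  fileSourceUrl: string | undefined,
  chapter: CommentaryChapter
): string | undefined {
  return chapter.sourceUrl !== undefined ? chapter.sourceUrl : fileSourceUrl;
}
```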

9.4 Complete File Example (abbreviated)

{
"schema": "verse-commentary",
"version": "2.0.0",
"commentaryId": "clarke",
"commentaryTitle": "Adam Clarke's Commentary",
"bookId": "gen",
"bookTitle": "Genesis",
"testament": "ot",
"sourceUrl": "https://biblehub.com/commentaries/clarke/genesis/",
"chapters": [
{
"chapter": 1,
"introduction": "Preface to the Book of Genesis. Every believer in Divine revelation...",
"notes": [
{
"verse": 1,
"body": "God in the beginning created the heavens and the earth — בראשית ברא אלהים... The original word אלהים Elohim, God, is certainly the plural form of אל El..."
},
{
"verse": 2,
"body": "The earth was without form and void — The original term תהו tohu and בהו bohu, which we translate without form and void, are of uncertain etymology..."
}
]
}
]
}

10. Scholarly Commentary

Schema value: "scholarly-commentary"
File pattern: scholarly/{commentaryId}.json

10.1 File Root

interface ScholarlyCommentaryFile {
schema: 'scholarly-commentary';
version: string;
commentaryId: string; // "byu-ntc-revelation"
title: string; // "The Revelation of John the Apostle"
bookId: string; // "rev" — the biblical book being commented on
authors?: string[]; // ["John W. Welch", "Loren D. Lybarger"]
publisher?: string; // "BYU Studies"
year?: number; // 2012
topics: ScholarlyTopic[];
}

bookId is always a canonical slug string.

topics contains the thematic divisions of a scholarly book commenting on a biblical text, not biblical chapters.

10.2 ScholarlyTopic

interface ScholarlyTopic {
title: string; // "The Dragon, the Woman, and the Child"
subtitle?: string;
sections: ScholarlySection[]; // Only sections with content are included
}

sections only contains entries with content. Headings without body content are omitted.

10.3 ScholarlySection

interface ScholarlySection {
heading?: string; // Section heading; absent if not labeled
body: string; // Prose content — always present (null-content sections excluded)
}

10.4 Complete File Example (abbreviated)

{
"schema": "scholarly-commentary",
"version": "2.0.0",
"commentaryId": "byu-ntc-revelation",
"title": "The Revelation of John the Apostle",
"bookId": "rev",
"topics": [
{
"title": "Introduction to the Vision",
"sections": [
{
"heading": "Introduction",
"body": "In Revelation, John finishes what Luke set out to do in Acts. In his history, the Evangelist indicated that what motivated him to write his Gospel was his desire to preserve all that 'Jesus began both to do and teach' (Acts 1:1)..."
},
{
"heading": "Conclusion",
"body": "The central message in chapter 1 hinges on the authenticity of John's call and underpins the entire book of Revelation. It is summarized in the shared title of both the Father and the Son, namely, 'the Almighty'..."
}
]
}
]
}

11. Conference Talk

Schema value: "conference-talk"
File pattern: conference/{year}-{month}/{talkId}.json

This schema captures General Conference talks and similar addresses. Content is organized as prose sections (not chapter-verse). Scripture references are extracted from body text and collected at both the paragraph level and the file level.

11.1 File Root

interface ConferenceTalkFile {
schema: 'conference-talk';
version: string; // semver, e.g. "2.0.0"
talkId: string; // Canonical ID, e.g. "2024-10-57nelson"
title: string; // Talk title
speaker: Speaker; // Who delivered the talk
conference: ConferenceRef; // Which conference
dateDelivered: string; // ISO 8601 date
session: string; // e.g. "saturday-morning", "sunday-afternoon"
summary?: string; // Brief description/subtitle
audio?: MediaRef; // Audio URL + format (§3.4)
video?: MediaRef[]; // Video sources (HLS, MP4)
pdf?: string; // PDF URL
sections: TalkSection[]; // Ordered prose sections
footnotes?: Footnote[]; // Numbered footnotes
scriptureRefs: PassageRef[]; // All scripture references mentioned (extracted from body)
seeAlso?: SeeAlsoLink[]; // Related talks, topics, etc.
}

talkId is the canonical slug derived from the Church website URI (e.g. "2024-10-57nelson" from /general-conference/2024/10/57nelson).
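
One possible way to perform this derivation is sketched below. The function name talkIdFromUri and the regular expression are assumptions based on the single example above; real URIs may need additional handling.

```typescript
// Derive a talkId from a Church website path such as
// "/general-conference/2024/10/57nelson".
// Assumption: the path always has the form /general-conference/YYYY/MM/slug.
function talkIdFromUri(uri: string): string | undefined {
  const m = /\/general-conference\/(\d{4})\/(\d{2})\/([^/?#]+)/.exec(uri);
  if (m === null) return undefined; // not a recognizable conference talk URI
  const [, year, month, slug] = m;
  return `${year}-${month}-${slug}`;
}
```

For the example above, talkIdFromUri("/general-conference/2024/10/57nelson") yields "2024-10-57nelson".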

scriptureRefs is a file-level aggregate of all PassageRef objects mentioned anywhere in the talk — in paragraphs, footnotes, or other body text. This provides a quick lookup without traversing the section tree.
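
A minimal sketch of how this aggregate could be built follows. The Paragraph and InlineRef shapes are inferred from the examples in this section (they are defined in §3.5 and §3.6), and de-duplicating repeated references is an assumption; the specification does not say whether duplicates are kept.

```typescript
interface PassageRef {
  bookId: string;
  chapter: number;
  verse?: number;
  verseEnd?: number;
  chapterEnd?: number;
}
interface InlineRef { ref: PassageRef; displayText: string; }
interface Paragraph { pid?: string; text: string; scriptureRefs?: InlineRef[]; }
interface TalkSection { heading?: string; paragraphs: Paragraph[]; }
interface Footnote { marker: string; pid?: string; text: string; scriptureRefs?: InlineRef[]; }

// Stable string key used only for de-duplication (hypothetical helper).
function refKey(r: PassageRef): string {
  return [r.bookId, r.chapter, r.chapterEnd ?? '', r.verse ?? '', r.verseEnd ?? ''].join(':');
}

// Collect every PassageRef from paragraphs and footnotes, in document order.
function collectScriptureRefs(sections: TalkSection[], footnotes: Footnote[] = []): PassageRef[] {
  const seen = new Set<string>();
  const out: PassageRef[] = [];
  const add = (refs: InlineRef[] | undefined): void => {
    for (const { ref } of refs ?? []) {
      const key = refKey(ref);
      if (!seen.has(key)) {
        seen.add(key);
        out.push(ref);
      }
    }
  };
  for (const s of sections) for (const p of s.paragraphs) add(p.scriptureRefs);
  for (const f of footnotes) add(f.scriptureRefs);
  return out;
}
```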

11.2 Speaker

interface Speaker {
name: string; // "Russell M. Nelson"
title?: string; // "President of The Church of Jesus Christ of Latter-day Saints"
uri?: string; // Church website URI for the speaker
}

11.3 ConferenceRef

interface ConferenceRef {
year: number; // 2024
month: number; // 4 or 10
name?: string; // "194th Annual General Conference"
}

11.4 TalkSection

interface TalkSection {
heading?: string; // Section heading (if any)
paragraphs: Paragraph[]; // Ordered paragraphs (§3.5)
}

11.5 Footnote

interface Footnote {
marker: string; // "1", "2", etc.
pid?: string; // Paragraph ID this footnote belongs to
text: string; // Footnote text
scriptureRefs?: InlineRef[]; // Scripture refs within footnote (§3.6)
}

11.6 Complete File Example (abbreviated)

{
"schema": "conference-talk",
"version": "2.0.0",
"talkId": "2024-10-57nelson",
"title": "The Lord Jesus Christ Will Come Again",
"speaker": {
"name": "Russell M. Nelson",
"title": "President of The Church of Jesus Christ of Latter-day Saints"
},
"conference": { "year": 2024, "month": 10 },
"dateDelivered": "2024-10-06",
"session": "sunday-afternoon",
"audio": { "url": "https://media2.ldscdn.org/assets/...", "format": "mp3" },
"sections": [
{
"paragraphs": [
{
"pid": "p1",
"text": "My dear brothers and sisters, ...",
"scriptureRefs": [
{ "ref": { "bookId": "isa", "chapter": 2, "verse": 2 }, "displayText": "Isaiah 2:2" }
]
}
]
}
],
"footnotes": [
{
"marker": "1",
"pid": "p3",
"text": "See Doctrine and Covenants 1:38.",
"scriptureRefs": [
{
"ref": { "bookId": "dc", "chapter": 1, "verse": 38 },
"displayText": "Doctrine and Covenants 1:38"
}
]
}
],
"scriptureRefs": [
{ "bookId": "isa", "chapter": 2, "verse": 2 },
{ "bookId": "dc", "chapter": 1, "verse": 38 }
]
}

12. Curriculum Lesson

Schema value: "curriculum-lesson"
File pattern: curriculum/{curriculumId}.json

This schema captures Come, Follow Me and other structured lesson/manual content. Each file contains an entire curriculum year's worth of lessons.

12.1 File Root

interface CurriculumLessonFile {
schema: 'curriculum-lesson';
version: string; // semver, e.g. "2.0.0"
curriculumId: string; // e.g. "cfm-bofm-2024"
title: string; // Curriculum name
year?: number; // Curriculum year
bookId?: string; // Associated book of scripture (if any)
lessons: Lesson[]; // Ordered lessons
}

curriculumId is a kebab-case slug combining the curriculum type, its subject, and the year (e.g. "cfm-bofm-2024" for the 2024 Come, Follow Me manual on the Book of Mormon).

bookId is the canonical book slug when the curriculum covers a single book of scripture. For curricula covering multiple books, this field is absent.

12.2 Lesson

interface Lesson {
lessonId: string; // e.g. "02" or "week-02"
title: string; // "1 Nephi 1–5"
subtitle?: string; // Week/date labeling
dateStart?: string; // ISO 8601 start date
dateEnd?: string; // ISO 8601 end date
passageRange?: PassageRef; // Scripture range covered
sections: LessonSection[]; // Lesson content sections
scriptureRefs: PassageRef[]; // All scripture references
seeAlso?: SeeAlsoLink[];
}

dateStart and dateEnd correspond to data-date-start and data-date-end attributes from the Church website's CFM manifest.

passageRange uses the chapterEnd field on PassageRef to represent multi-chapter ranges (e.g. { bookId: "1-ne", chapter: 1, chapterEnd: 5 }).
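
As an illustration, a consumer might test whether a given chapter falls inside a lesson's passageRange. The helper name coversChapter is hypothetical, and the sketch assumes a range never spans more than one book.

```typescript
interface PassageRef {
  bookId: string;
  chapter: number;
  verse?: number;
  verseEnd?: number;
  chapterEnd?: number;
}

// True when `chapter` of `bookId` lies within the (possibly multi-chapter) range.
function coversChapter(range: PassageRef, bookId: string, chapter: number): boolean {
  if (range.bookId !== bookId) return false;
  // Without chapterEnd, the range covers only its starting chapter.
  const last = range.chapterEnd ?? range.chapter;
  return chapter >= range.chapter && chapter <= last;
}
```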

12.3 LessonSection

interface LessonSection {
heading?: string; // Section heading
paragraphs: Paragraph[]; // Reuses Paragraph type (§3.5)
}

12.4 Complete File Example (abbreviated)

{
"schema": "curriculum-lesson",
"version": "2.0.0",
"curriculumId": "cfm-bofm-2024",
"title": "Come, Follow Me — Book of Mormon 2024",
"year": 2024,
"lessons": [
{
"lessonId": "02",
"title": "1 Nephi 1–5",
"dateStart": "2024-01-08",
"dateEnd": "2024-01-14",
"passageRange": { "bookId": "1-ne", "chapter": 1, "chapterEnd": 5 },
"sections": [
{
"heading": "Ideas for Personal Scripture Study",
"paragraphs": [{ "pid": "p1", "text": "Nephi kept a record of his people..." }]
}
],
"scriptureRefs": [{ "bookId": "1-ne", "chapter": 1, "verse": 1 }]
}
]
}

13. Church Publication

Schema value: "church-publication"
File pattern: publications/{publicationId}.json

A flexible schema for manual chapters, magazine articles, Teachings of Presidents chapters, and other authored prose content from the Church of Jesus Christ.

13.1 File Root

interface ChurchPublicationFile {
schema: 'church-publication';
version: string; // semver, e.g. "2.0.0"
publicationId: string; // e.g. "gospel-principles", "teachings-joseph-smith"
title: string; // Publication title
publicationType: PublicationType; // See §20.6
authors?: string[]; // Author names (if applicable)
publisher?: string; // "The Church of Jesus Christ of Latter-day Saints"
datePublished?: string; // ISO 8601
chapters?: PublicationChapter[];
}

publicationType is a controlled enumeration (§20.6) distinguishing the kind of publication.

13.2 PublicationChapter

interface PublicationChapter {
chapterId: string; // "chapter-1", or a numeric string such as "1"
title: string; // Chapter title
subtitle?: string; // Chapter subtitle
sections: PublicationSection[]; // Ordered sections
discussionQuestions?: string[]; // "Suggestions for Study and Teaching"
additionalScriptures?: InlineRef[]; // "Additional Scriptures" list (§3.6)
footnotes?: Footnote[]; // Numbered notes (§11.5)
scriptureRefs: PassageRef[]; // All scripture references in this chapter
images?: ImageRef[]; // Embedded images (§3.7)
audio?: MediaRef; // Audio narration (§3.4)
}

13.3 PublicationSection

interface PublicationSection {
heading?: string; // Section heading
paragraphs: Paragraph[]; // Reuses Paragraph type (§3.5)
}

13.4 Complete File Example (abbreviated)

{
"schema": "church-publication",
"version": "2.0.0",
"publicationId": "gospel-principles",
"title": "Gospel Principles",
"publicationType": "manual-chapter",
"publisher": "The Church of Jesus Christ of Latter-day Saints",
"datePublished": "2011-01-01",
"chapters": [
{
"chapterId": "chapter-1",
"title": "Our Heavenly Father",
"sections": [
{
"heading": "There Is a God",
"paragraphs": [
{
"pid": "128041362",
"text": "Alma, a Book of Mormon prophet, wrote, \"All things denote there is a God...\"",
"scriptureRefs": [
{
"ref": { "bookId": "alma", "chapter": 30, "verse": 44 },
"displayText": "Alma 30:44"
}
]
}
]
}
],
"discussionQuestions": [
"What are some things that testify to you that there is a God?",
"What are some of God's attributes?",
"How can we come to know God?"
],
"additionalScriptures": [
{
"ref": { "bookId": "acts", "chapter": 7, "verse": 55, "verseEnd": 56 },
"displayText": "Acts 7:55–56"
}
],
"scriptureRefs": [
{ "bookId": "alma", "chapter": 30, "verse": 44 },
{ "bookId": "dc", "chapter": 20, "verse": 17 }
]
}
]
}

14. Cross-References

Schema value: "cross-references"
File pattern: cross-references/{bookId}.json

A standalone schema for cross-references between passages. Each file contains all cross-references originating from a single book of scripture, drawn from one or more external reference sources (e.g., Treasury of Scripture Knowledge, LDS edition footnotes). References are typed by kind, scored by relevance, optionally anchored to a specific word in the source verse, and may be unidirectional or bidirectional.

14.1 CrossRefKind

type CrossRefKind =
| 'thematic' // Shared theme or doctrine across passages
| 'parallel' // Parallel accounts of the same event or teaching
| 'quotation' // Direct quotation of one passage by another
| 'allusion' // Indirect reference or echo of another passage
| 'typology' // Type-antitype relationship (OT event foreshadowing NT fulfillment)
| 'prophecy' // Prophetic utterance and its fulfillment
| 'commentary' // One passage provides commentary or explanation of another
| 'linguistic'; // Shared key term, phrase, or word root across passages

14.2 CrossRefSource

interface CrossRefSource {
sourceId: string; // Unique within this file: "tsk", "lds-footnotes"
name: string; // Human-readable: "Treasury of Scripture Knowledge"
version?: string; // Source edition/version
url?: string; // Attribution URL
}

14.3 CrossRefAnchor

interface CrossRefAnchor {
wordIndex: number; // 0-based position in the verse's word array
word: string; // The anchor word as it appears in the verse text
}

wordIndex is the 0-based index into the verse text split on whitespace. This locates which word in the source verse the cross-reference is anchored to (e.g., a quotation anchored to the word "Egypt" in Matthew 2:15).
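
The check below is a small sketch of how an anchor could be verified against verse text. The names splitVerse and anchorMatches are hypothetical, and stripping leading and trailing punctuation before comparing is an assumption (and an ASCII-only simplification); the specification only states that the verse text is split on whitespace.

```typescript
interface CrossRefAnchor {
  wordIndex: number; // 0-based position in the whitespace-split verse text
  word: string;
}

// Split verse text on runs of whitespace.
function splitVerse(text: string): string[] {
  return text.trim().split(/\s+/);
}

// True when the token at anchor.wordIndex equals anchor.word,
// ignoring ASCII punctuation attached to either end of the token.
function anchorMatches(verseText: string, anchor: CrossRefAnchor): boolean {
  const token = splitVerse(verseText)[anchor.wordIndex];
  if (token === undefined) return false; // index out of range
  const bare = token.replace(/^[^A-Za-z0-9]+|[^A-Za-z0-9]+$/g, '');
  return bare === anchor.word;
}
```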

14.4 CrossReference

interface CrossReference {
source: PassageRef; // The originating passage
target: PassageRef; // The referenced passage
sourceId: string; // Must match a CrossRefSource.sourceId in the same file
relevance: number; // 0–100 integer score
anchor?: CrossRefAnchor; // Word-level anchor in the source verse
kind: CrossRefKind; // Relationship type
bidirectional: boolean; // If true, the reference applies in both directions
}

relevance is an integer from 0 (weakest association) to 100 (strongest / direct quotation). This score is source-dependent — different reference sources may use different scoring methodologies.

bidirectional controls whether the ingest pipeline creates CROSS_REF edges in both directions. A thematic parallel (e.g., Isaiah 53:5 ↔ 2 Nephi 9:21) is typically bidirectional; a quotation (e.g., Matthew 2:15 quoting Hosea 11:1) is unidirectional from the quoting passage to the quoted passage.
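
The expansion could look roughly like the sketch below. The CrossRefEdge shape and the toEdges name are hypothetical; the actual CROSS_REF edge format used by the ingest pipeline is not defined in this section.

```typescript
interface PassageRef {
  bookId: string;
  chapter: number;
  verse?: number;
  verseEnd?: number;
  chapterEnd?: number;
}

// Only the CrossReference fields this sketch needs (§14.4).
interface CrossReference {
  source: PassageRef;
  target: PassageRef;
  kind: string;
  relevance: number;
  bidirectional: boolean;
}

// Hypothetical edge shape for illustration only.
interface CrossRefEdge {
  from: PassageRef;
  to: PassageRef;
  kind: string;
  relevance: number;
}

// One edge for a unidirectional reference, two for a bidirectional one.
function toEdges(ref: CrossReference): CrossRefEdge[] {
  const forward: CrossRefEdge = {
    from: ref.source,
    to: ref.target,
    kind: ref.kind,
    relevance: ref.relevance,
  };
  if (!ref.bidirectional) return [forward];
  return [forward, { ...forward, from: ref.target, to: ref.source }];
}
```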

14.5 File Root

interface CrossReferenceFile {
schema: 'cross-references';
version: string; // semver, e.g. "2.0.0"
bookId: string; // Canonical book slug — all references in this file originate from this book
sources: CrossRefSource[]; // Reference sources used in this file
references: CrossReference[]; // All cross-references from this book
}

14.6 Complete File Example

{
"schema": "cross-references",
"version": "2.0.0",
"bookId": "isa",
"sources": [
{
"sourceId": "tsk",
"name": "Treasury of Scripture Knowledge",
"version": "1.0",
"url": "https://www.biblestudytools.com/tsk/"
},
{
"sourceId": "lds-footnotes",
"name": "LDS Edition Footnotes",
"version": "2013"
}
],
"references": [
{
"source": { "bookId": "isa", "chapter": 53, "verse": 5 },
"target": { "bookId": "2-ne", "chapter": 9, "verse": 21 },
"sourceId": "lds-footnotes",
"relevance": 85,
"kind": "thematic",
"bidirectional": true
},
{
"source": { "bookId": "isa", "chapter": 7, "verse": 14 },
"target": { "bookId": "matt", "chapter": 1, "verse": 23 },
"sourceId": "tsk",
"relevance": 95,
"anchor": { "wordIndex": 5, "word": "virgin" },
"kind": "prophecy",
"bidirectional": false
}
]
}

A second example showing a quotation with anchor:

{
"schema": "cross-references",
"version": "2.0.0",
"bookId": "matt",
"sources": [
{
"sourceId": "tsk",
"name": "Treasury of Scripture Knowledge",
"version": "1.0",
"url": "https://www.biblestudytools.com/tsk/"
},
{
"sourceId": "lds-footnotes",
"name": "LDS Edition Footnotes",
"version": "2013"
}
],
"references": [
{
"source": { "bookId": "matt", "chapter": 2, "verse": 15 },
"target": { "bookId": "hosea", "chapter": 11, "verse": 1 },
"sourceId": "tsk",
"relevance": 98,
"anchor": { "wordIndex": 5, "word": "Egypt" },
"kind": "quotation",
"bidirectional": false
}
]
}

15. Proper Names

Schema value: "proper-names"
File pattern: proper-names/{letter}.json

People, places, objects, and deities referenced across scripture. Each file groups entries by the first letter of the canonical name.

interface ProperNameEntry {
id: string;
name: string;
alternateNames: string[];
description: string;
passages: PassageRef[];
category: ProperNameCategory; // See §20.8
}

interface ProperNamesFile {
schema: 'proper-names';
version: string;
language: string;
entries: Record<string, ProperNameEntry>;
}

Example (proper-names/a.json):

{
"schema": "proper-names",
"version": "2.0.0",
"language": "en",
"entries": {
"abraham": {
"id": "abraham",
"name": "Abraham",
"alternateNames": ["Abram"],
"description": "Patriarch of the covenant, father of Isaac, called out of Ur of the Chaldees.",
"passages": [
{ "bookId": "gen", "chapter": 12, "verse": 1 },
{ "bookId": "abr", "chapter": 1, "verse": 1 },
{ "bookId": "2-ne", "chapter": 8, "verse": 2 }
],
"category": "PERSON"
},
"ararat": {
"id": "ararat",
"name": "Ararat",
"alternateNames": [],
"description": "Mountain range where Noah's ark rested after the flood.",
"passages": [{ "bookId": "gen", "chapter": 8, "verse": 4 }],
"category": "PLACE"
}
}
}

Field reference:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| schema | "proper-names" | Yes | Discriminator |
| version | string | Yes | Semver version |
| language | string | Yes | ISO language code (e.g. "en", "he") |
| entries | Record<string, Entry> | Yes | Map of entry ID → entry |
| entry.id | string | Yes | Canonical slug identifier |
| entry.name | string | Yes | Display name |
| entry.alternateNames | string[] | Yes | Alternate spellings or names (empty array if none) |
| entry.description | string | Yes | Brief description |
| entry.passages | PassageRef[] | Yes | Key scripture references |
| entry.category | ProperNameCategory | Yes | One of PERSON, PLACE, OBJECT, DEITY; see §20.8 |

16. Versification

Schema value: "versification"
File pattern: versification/{schemeId}.json

Maps verse-level differences between versification schemes (e.g. Masoretic → LXX, KJV → Vulgate). Each file contains one or more named schemes with their full set of mappings.

interface VersificationMapping {
sourceRef: PassageRef;
targetRef: PassageRef;
mappingType: VersificationType; // See §20.9
}

interface VersificationScheme {
id: string;
name: string;
description: string;
mappings: VersificationMapping[];
}

interface VersificationFile {
schema: 'versification';
version: string;
schemes: VersificationScheme[];
}

Example (versification/masoretic-to-lxx.json):

{
"schema": "versification",
"version": "2.0.0",
"schemes": [
{
"id": "masoretic-to-lxx",
"name": "Masoretic → LXX",
"description": "Verse mapping from the Masoretic Text to the Septuagint.",
"mappings": [
{
"sourceRef": { "bookId": "gen", "chapter": 31, "verse": 55 },
"targetRef": { "bookId": "gen", "chapter": 32, "verse": 1 },
"mappingType": "REORDER"
},
{
"sourceRef": { "bookId": "ps", "chapter": 9, "verse": 1, "verseEnd": 20 },
"targetRef": { "bookId": "ps", "chapter": 9, "verse": 1, "verseEnd": 39 },
"mappingType": "MERGE"
},
{
"sourceRef": { "bookId": "mal", "chapter": 4, "verse": 1, "verseEnd": 6 },
"targetRef": { "bookId": "mal", "chapter": 3, "verse": 19, "verseEnd": 24 },
"mappingType": "REORDER"
},
{
"sourceRef": { "bookId": "dan", "chapter": 3, "verse": 24 },
"targetRef": { "bookId": "dan", "chapter": 3, "verse": 24 },
"mappingType": "ABSENT"
}
]
}
]
}

Field reference:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| schema | "versification" | Yes | Discriminator |
| version | string | Yes | Semver version |
| schemes | VersificationScheme[] | Yes | One or more named versification schemes |
| scheme.id | string | Yes | Kebab-case identifier (e.g. "masoretic-to-lxx") |
| scheme.name | string | Yes | Human-readable name |
| scheme.description | string | Yes | Prose description of the scheme |
| scheme.mappings | VersificationMapping[] | Yes | Full set of verse-level mappings |
| mapping.sourceRef | PassageRef | Yes | Source-scheme passage reference |
| mapping.targetRef | PassageRef | Yes | Target-scheme passage reference |
| mapping.mappingType | VersificationType | Yes | One of SPLIT, MERGE, REORDER, ABSENT; see §20.9 |

Mapping type semantics:

| Type | Meaning |
| --- | --- |
| SPLIT | One source verse maps to multiple target verses |
| MERGE | Multiple source verses are combined into one target verse |
| REORDER | Verse numbering differs but content is the same |
| ABSENT | Verse is present in source scheme but absent (or relocated) in target scheme |
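
To show how a consumer might use these mappings, the sketch below looks up every mapping whose sourceRef covers a given verse. The helper names are hypothetical. It assumes that a sourceRef without a verse covers the whole chapter and it ignores multi-chapter ranges (chapterEnd).

```typescript
interface PassageRef {
  bookId: string;
  chapter: number;
  verse?: number;
  verseEnd?: number;
  chapterEnd?: number;
}
type VersificationType = 'SPLIT' | 'MERGE' | 'REORDER' | 'ABSENT';
interface VersificationMapping {
  sourceRef: PassageRef;
  targetRef: PassageRef;
  mappingType: VersificationType;
}

// True when a single-chapter ref covers the given verse.
function coversVerse(ref: PassageRef, bookId: string, chapter: number, verse: number): boolean {
  if (ref.bookId !== bookId || ref.chapter !== chapter) return false;
  if (ref.verse === undefined) return true; // assumption: whole chapter
  const last = ref.verseEnd ?? ref.verse;
  return verse >= ref.verse && verse <= last;
}

// All mappings in a scheme that apply to one source-scheme verse.
function findMappings(
  mappings: VersificationMapping[],
  bookId: string,
  chapter: number,
  verse: number,
): VersificationMapping[] {
  return mappings.filter((m) => coversVerse(m.sourceRef, bookId, chapter, verse));
}
```

A verse with no matching mapping is numbered identically in both schemes.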

17. Morphology Codes

Schema value: "morphology-codes"
File pattern: morphology-codes/{language}.json

Grammatical analysis codes for Hebrew and Greek tokens. Each file covers one language and maps opaque morphology codes to structured grammatical features.

interface MorphCodeEntry {
code: string;
pos: PartOfSpeech; // See §20.4
description: string;
features: Record<string, string>;
}

interface MorphologyCodesFile {
schema: 'morphology-codes';
version: string;
language: WitnessLanguage; // See §20.2
entries: Record<string, MorphCodeEntry>;
}

Example (morphology-codes/hebrew.json):

{
"schema": "morphology-codes",
"version": "2.0.0",
"language": "hebrew",
"entries": {
"HNcfsa": {
"code": "HNcfsa",
"pos": "noun.feminine",
"description": "Hebrew noun, common, feminine, singular, absolute",
"features": {
"language": "hebrew",
"type": "common",
"gender": "feminine",
"number": "singular",
"state": "absolute"
}
},
"HVqp3ms": {
"code": "HVqp3ms",
"pos": "verb",
"description": "Hebrew verb, qal, perfect, third person, masculine, singular",
"features": {
"language": "hebrew",
"stem": "qal",
"aspect": "perfect",
"person": "third",
"gender": "masculine",
"number": "singular"
}
},
"HPp3ms": {
"code": "HPp3ms",
"pos": "preposition",
"description": "Hebrew preposition with pronominal suffix, third person, masculine, singular",
"features": {
"language": "hebrew",
"suffixPerson": "third",
"suffixGender": "masculine",
"suffixNumber": "singular"
}
}
}
}

Field reference:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| schema | "morphology-codes" | Yes | Discriminator |
| version | string | Yes | Semver version |
| language | WitnessLanguage | Yes | Language these codes apply to (see §20.2) |
| entries | Record<string, Entry> | Yes | Map of morph code → entry |
| entry.code | string | Yes | The morphology code (must match its key) |
| entry.pos | PartOfSpeech | Yes | Part of speech (see §20.4) |
| entry.description | string | Yes | Human-readable description of the code |
| entry.features | Record<string, string> | Yes | Key-value pairs of grammatical features |

features dictionary: Keys are grammatical categories (e.g. "gender", "number", "person", "stem", "aspect", "state"). Values are the specific feature values (e.g. "masculine", "singular", "qal"). The set of keys varies by part of speech and language.
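
Because the feature keys are open-ended, a consumer may want to inspect which keys a file actually uses and to confirm that each entry's code matches its map key. The sketch below shows one way to do both. The function names are hypothetical, and pos is widened to string only to keep the example self-contained.

```typescript
interface MorphCodeEntry {
  code: string;
  pos: string; // PartOfSpeech (§20.4), widened to string for this sketch
  description: string;
  features: Record<string, string>;
}

// Keys whose entry.code differs from the map key (should be empty in a valid file).
function mismatchedCodes(entries: Record<string, MorphCodeEntry>): string[] {
  return Object.keys(entries).filter((key) => entries[key].code !== key);
}

// Sorted list of every feature key used anywhere in the file.
function featureKeys(entries: Record<string, MorphCodeEntry>): string[] {
  const keys = new Set<string>();
  for (const code of Object.keys(entries)) {
    for (const k of Object.keys(entries[code].features)) keys.add(k);
  }
  return Array.from(keys).sort();
}
```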


18. Theographic

Schema value: "theographic"
File pattern: theographic/{category}.json

Biblical events and people groups with their associated passages and relationships. A single TheographicFile contains both events and peopleGroups arrays; either may be empty depending on the category file.

interface TheographicEvent {
id: string;
name: string;
description: string;
date?: string; // Free-text date notation: "c. 2000 BCE", "c. 30 CE"; absent when unknown
passages: PassageRef[];
participants: string[]; // IDs referencing proper-name entries or other theographic entities
}

interface TheographicPeopleGroup {
id: string;
name: string;
description: string;
passages: PassageRef[];
members: string[]; // IDs referencing proper-name entries
}

interface TheographicFile {
schema: 'theographic';
version: string;
events: TheographicEvent[];
peopleGroups: TheographicPeopleGroup[];
}

18.1 Events

Events are notable occurrences in scripture with temporal and relational context.

date format: Free-text strings such as "c. 2000 BCE", "c. 30 CE", "c. 1400 BCE". Absent when the date is unknown or uncertain beyond estimation.

participants: Array of string IDs referencing proper-name entries (§15) or other theographic entities. Empty array when no participants are identified.

18.2 People Groups

People groups are named collectives (nations, tribes, ethnic groups) with associated passages and member references.

members: Array of string IDs referencing proper-name entries (§15). Empty array when individual members are not catalogued.
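
Since participants and members are plain ID strings, a consumer may want to confirm that they resolve. The sketch below is one hedged way to do that; unresolvedIds is a hypothetical helper, and it assumes the caller has already merged the entries maps from the per-letter proper-names files (§15) into one object.

```typescript
// Return the IDs that have no matching key in a merged proper-names `entries` map.
function unresolvedIds(ids: string[], entries: Record<string, unknown>): string[] {
  return ids.filter((id) => !Object.prototype.hasOwnProperty.call(entries, id));
}
```

For participants, an unresolved ID is not necessarily an error, because it may refer to another theographic entity rather than a proper-name entry.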

Example (theographic/ot-events.json):

{
"schema": "theographic",
"version": "2.0.0",
"events": [
{
"id": "creation",
"name": "The Creation",
"description": "God creates the heavens, the earth, and all living things in six days.",
"date": "c. 4000 BCE",
"passages": [
{ "bookId": "gen", "chapter": 1, "verse": 1 },
{ "bookId": "gen", "chapter": 2, "verse": 4 },
{ "bookId": "moses", "chapter": 2, "verse": 1 },
{ "bookId": "abr", "chapter": 4, "verse": 1 }
],
"participants": ["adam", "eve"]
},
{
"id": "exodus-from-egypt",
"name": "The Exodus from Egypt",
"description": "The Israelites depart Egypt under the leadership of Moses after the tenth plague.",
"date": "c. 1446 BCE",
"passages": [
{ "bookId": "ex", "chapter": 12, "verse": 31 },
{ "bookId": "ex", "chapter": 14, "verse": 21 },
{ "bookId": "1-ne", "chapter": 4, "verse": 2 }
],
"participants": ["moses", "aaron", "pharaoh"]
},
{
"id": "crucifixion",
"name": "The Crucifixion of Jesus Christ",
"description": "Jesus is crucified at Golgotha outside Jerusalem.",
"date": "c. 30 CE",
"passages": [
{ "bookId": "matt", "chapter": 27, "verse": 33 },
{ "bookId": "john", "chapter": 19, "verse": 17 },
{ "bookId": "3-ne", "chapter": 8, "verse": 5 }
],
"participants": ["jesus"]
}
],
"peopleGroups": [
{
"id": "twelve-tribes",
"name": "Twelve Tribes of Israel",
"description": "The twelve tribes descended from the sons of Jacob (Israel).",
"passages": [
{ "bookId": "gen", "chapter": 49, "verse": 1, "verseEnd": 28 },
{ "bookId": "rev", "chapter": 7, "verse": 4, "verseEnd": 8 }
],
"members": [
"reuben",
"simeon",
"levi",
"judah",
"dan",
"naphtali",
"gad",
"asher",
"issachar",
"zebulun",
"joseph",
"benjamin"
]
},
{
"id": "nephites",
"name": "Nephites",
"description": "Descendants of Nephi, son of Lehi, who kept the Nephite record.",
"passages": [
{ "bookId": "2-ne", "chapter": 5, "verse": 9 },
{ "bookId": "jacob", "chapter": 1, "verse": 13 }
],
"members": ["nephi", "jacob", "enos"]
}
]
}

Event field reference:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| event.id | string | Yes | Kebab-case identifier |
| event.name | string | Yes | Display name |
| event.description | string | Yes | Prose description |
| event.date | string | No | Free-text date (e.g. "c. 2000 BCE"); absent when unknown |
| event.passages | PassageRef[] | Yes | Key scripture references |
| event.participants | string[] | Yes | IDs of participants (proper-name entry IDs) |

People group field reference:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| peopleGroup.id | string | Yes | Kebab-case identifier |
| peopleGroup.name | string | Yes | Display name |
| peopleGroup.description | string | Yes | Prose description |
| peopleGroup.passages | PassageRef[] | Yes | Key scripture references |
| peopleGroup.members | string[] | Yes | IDs referencing proper-name entries |

File-level field reference:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| schema | "theographic" | Yes | Discriminator |
| version | string | Yes | Semver version |
| events | TheographicEvent[] | Yes | Array of events (may be empty) |
| peopleGroups | TheographicPeopleGroup[] | Yes | Array of people groups (may be empty) |

19. File Layout & Naming Conventions

19.1 Directory Structure

data/
  corpus/
    gen.json ← scripture-text: Genesis (interlinear)
    rev.json ← scripture-text: Revelation
    apoc-ab.json ← scripture-text: Apocalypse of Abraham (annotated)
    1-enoch.json ← scripture-text: 1 Enoch (original language)
    ...
  tg/
    a.json ← topical-guide: letter A
    b.json
    ...
  bd/
    a.json ← bible-dictionary: letter A
    ...
  index/
    a.json ← scripture-index: letter A
    ...
  lexicon/
    H0001-H1000.json ← lexicon: Hebrew H0001–H1000
    H1001-H2000.json
    ...
    G0001-G1000.json ← lexicon: Greek G0001–G1000
    ...
  commentary/
    clarke/
      gen.json ← verse-commentary: Clarke on Genesis
      ex.json
      ...
  scholarly/
    byu-ntc-revelation.json ← scholarly-commentary: BYU NTC on Revelation
    ...

19.2 Schema Routing

The schema field on every file is the authoritative discriminator. The directory path is a secondary confirmation. Any file whose schema field conflicts with its directory path is an error.

| schema value | Expected directory |
| --- | --- |
| "scripture-text" | corpus/ |
| "lexicon" | lexicon/ |
| "topical-guide" | tg/ |
| "bible-dictionary" | bd/ |
| "scripture-index" | index/ |
| "verse-commentary" | commentary/{commentaryId}/ |
| "scholarly-commentary" | scholarly/ |

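The routing rule above could be checked roughly as follows. This is a sketch only: routingError and EXPECTED_DIR are hypothetical names, the path is assumed to be relative to data/, and only the top-level directory is compared. Schemas not listed in the table are reported as having no routing rule.

```typescript
// Expected top-level directory for each schema value listed in the routing table.
const EXPECTED_DIR = new Map<string, string>([
  ['scripture-text', 'corpus'],
  ['lexicon', 'lexicon'],
  ['topical-guide', 'tg'],
  ['bible-dictionary', 'bd'],
  ['scripture-index', 'index'],
  ['verse-commentary', 'commentary'],
  ['scholarly-commentary', 'scholarly'],
]);

// Returns an error message, or undefined when schema and path agree.
// `relPath` is assumed to be relative to data/, e.g. "commentary/clarke/gen.json".
function routingError(relPath: string, schema: string): string | undefined {
  const expected = EXPECTED_DIR.get(schema);
  if (expected === undefined) return `no routing rule listed for schema "${schema}"`;
  const topDir = relPath.split('/')[0];
  return topDir === expected
    ? undefined
    : `schema "${schema}" expects ${expected}/, found ${topDir}/`;
}
```
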
19.3 File Naming

  • corpus/: {bookId}.json — exact canonical slug
  • tg/, bd/, index/: {letter}.json — single lowercase letter
  • lexicon/: {lang}{start}-{lang}{end}.json — e.g. H0001-H1000.json
  • commentary/: {commentaryId}/{bookId}.json
  • scholarly/: {commentaryId}.json
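
As an example of the lexicon pattern, the sketch below maps a Strong's number to the file that would contain it. The 1000-entry bucket size is inferred from the two file names shown in §19.1 and is an assumption, as is the function name lexiconFileName.

```typescript
// Map a Strong's number such as "H0430" to its lexicon file name.
// Assumption: buckets are 0001-1000, 1001-2000, and so on.
function lexiconFileName(strongs: string): string | undefined {
  const m = /^([HG])(\d{4})$/.exec(strongs);
  if (m === null) return undefined; // not a valid Strong's key
  const lang = m[1];
  const n = parseInt(m[2], 10);
  if (n < 1) return undefined; // numbering starts at 0001
  const start = Math.floor((n - 1) / 1000) * 1000 + 1;
  const end = start + 999;
  // Left-pad to four digits; longer numbers are left unchanged.
  const pad = (v: number): string => {
    const s = String(v);
    return s.length >= 4 ? s : ('000' + s).slice(-4);
  };
  return `${lang}${pad(start)}-${lang}${pad(end)}.json`;
}
```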

19.4 Canonical Book ID List

These are the only valid bookId values. All other references must map to one of these.

Old Testament: gen, ex, lev, num, deut, josh, judg, ruth, 1-sam, 2-sam, 1-kgs, 2-kgs, 1-chr, 2-chr, ezra, neh, esth, job, ps, prov, eccl, song, isa, jer, lam, ezek, dan, hosea, joel, amos, obad, jonah, micah, nahum, hab, zeph, hag, zech, mal

New Testament: matt, mark, luke, john, acts, rom, 1-cor, 2-cor, gal, eph, philip, col, 1-thes, 2-thes, 1-tim, 2-tim, titus, philem, heb, james, 1-pet, 2-pet, 1-jn, 2-jn, 3-jn, jude, rev

Book of Mormon: 1-ne, 2-ne, jacob, enos, jarom, omni, w-of-m, mosiah, alma, hel, 3-ne, 4-ne, morm, ether, moro

Doctrine & Covenants / Pearl of Great Price: dc, moses, abr, js-m, js-h, a-of-f

Deuterocanonical / Apocrypha: tob, jdt, wis, sirach, bar, 1-macc, 2-macc, 1-esd, 2-esd, pr-man, ps-151, 3-macc, 4-macc

Pseudepigrapha (known corpus): 1-enoch, 2-enoch, 3-enoch, book-of-giants, jub, t-12-patr, t-reuben, t-simeon, t-levi, t-judah, t-issachar, t-zebulun, t-dan, t-naphtali, t-gad, t-asher, t-joseph, t-benjamin, t-ab, life-adam-eve, apoc-moses, apoc-adam, apocalypse-of-elijah, asc-is, apoc-ab, apoc-zeph, apoc-elijah, 4-ezra, 2-bar, 3-bar, ps-sol, odes-sol, pss-sol, ep-barnabas, 1-clem, 2-clem, didache, shepherd-hermas, ignatius-eph, ignatius-magn, ignatius-tral, ignatius-rom, ignatius-phil, ignatius-smyrn, ignatius-polyc, polycarp-phil, mart-polyc, diogn, frag-papias, gosp-thomas, gosp-philip, gosp-truth, gosp-egyptians, pist-sophia, acts-john, acts-paul, acts-thom, acts-pet, acts-andr, naz-zos, cave-of-treasures, conflict-adam-eve, gen-apoc, book-jasher, m-abodah-zarah, ep-jer, pr-azar, sg-three, sus, bel


20. Controlled Vocabularies

20.1 CorpusType

type CorpusType =
| 'ot' // Old Testament / Hebrew Bible
| 'nt' // New Testament
| 'bom' // Book of Mormon
| 'dc' // Doctrine & Covenants
| 'pgp' // Pearl of Great Price
| 'deuterocanonical' // Catholic/Orthodox canon (Tobit, Judith, Wisdom, etc.)
| 'pseudepigrapha' // Texts attributed to ancient figures (1 Enoch, Jubilees, etc.)
| 'apocrypha'; // General category for non-canonical Jewish/early Christian texts

20.2 WitnessLanguage

type WitnessLanguage =
| 'ethiopic' // Ge'ez
| 'aramaic' // Jewish Aramaic / DSS Aramaic
| 'greek' // Koine Greek
| 'latin' // Latin
| 'hebrew' // Biblical Hebrew
| 'coptic' // Coptic (Nag Hammadi etc.)
| 'slavonic' // Old Church Slavonic
| 'armenian' // Classical Armenian
| 'syriac'; // Syriac / Peshitta

20.3 ScriptType

type ScriptType =
| 'ethiopic' // Ge'ez script
| 'hebrew' // Hebrew / Aramaic square script
| 'greek' // Greek alphabet (polytonic)
| 'latin' // Latin alphabet
| 'cyrillic' // Cyrillic
| 'armenian' // Armenian script
| 'coptic'; // Coptic alphabet

20.4 PartOfSpeech

type PartOfSpeech =
| 'noun.masculine'
| 'noun.feminine'
| 'noun.common' // Unspecified or both genders
| 'verb'
| 'adjective'
| 'adjective.masculine'
| 'adverb'
| 'preposition'
| 'conjunction'
| 'particle'
| 'interjection'
| 'pronoun'
| 'pronoun.masculine'
| 'pronoun.feminine'
| 'proper-name.masculine'
| 'proper-name.feminine'
| 'proper-name.location'
| 'proper-name.people'
| 'proper-name' // Ambiguous or uncategorized proper name
| 'other'; // Anything not fitting above; consult posRaw

Normalization table:

| Source string | pos |
| --- | --- |
| "Noun Masculine" | "noun.masculine" |
| "Noun Feminine" | "noun.feminine" |
| "Verb" | "verb" |
| "Adjective" | "adjective" |
| "Adjective Masculine" | "adjective.masculine" |
| "Adverb" | "adverb" |
| "Preposition" | "preposition" |
| "Conjunction" | "conjunction" |
| "Interjection" | "interjection" |
| "Proper Name Masculine" | "proper-name.masculine" |
| "Proper Name Feminine" | "proper-name.feminine" |
| "Proper Name Location" | "proper-name.location" |
| "proper name, of a people and territory" | "proper-name.people" |
| Compound strings containing ; | "other" |
| Everything else | "other" |

The original string is always preserved in posRaw.
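
The normalization table can be read as a simple lookup with two fallbacks. The sketch below follows the table literally, using exact, case-sensitive matches; POS_TABLE and normalizePos are hypothetical names, and pos is typed as string rather than PartOfSpeech to keep the example self-contained.

```typescript
// Source string → pos, exactly as listed in the normalization table.
const POS_TABLE = new Map<string, string>([
  ['Noun Masculine', 'noun.masculine'],
  ['Noun Feminine', 'noun.feminine'],
  ['Verb', 'verb'],
  ['Adjective', 'adjective'],
  ['Adjective Masculine', 'adjective.masculine'],
  ['Adverb', 'adverb'],
  ['Preposition', 'preposition'],
  ['Conjunction', 'conjunction'],
  ['Interjection', 'interjection'],
  ['Proper Name Masculine', 'proper-name.masculine'],
  ['Proper Name Feminine', 'proper-name.feminine'],
  ['Proper Name Location', 'proper-name.location'],
  ['proper name, of a people and territory', 'proper-name.people'],
]);

// Normalize a raw part-of-speech string, always keeping the original in posRaw.
function normalizePos(posRaw: string): { pos: string; posRaw: string } {
  // Compound strings (containing ";") and unknown strings both map to "other".
  const isCompound = posRaw.indexOf(';') !== -1;
  const pos = isCompound ? 'other' : (POS_TABLE.get(posRaw) ?? 'other');
  return { pos, posRaw };
}
```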

20.5 LexiconLanguage

type LexiconLanguage = 'hebrew' | 'greek' | 'aramaic';

20.6 PublicationType

type PublicationType =
| 'manual-chapter' // Gospel Principles, Preach My Gospel, etc.
| 'magazine-article' // Liahona, Ensign articles
| 'teachings-chapter' // Teachings of Presidents of the Church series
| 'seminary-manual' // Seminary teacher/student manuals
| 'institute-manual'; // Institute course manuals

20.7 CrossRefKind

type CrossRefKind =
| 'thematic' // Shared theme or doctrine across passages
| 'parallel' // Parallel accounts of the same event or teaching
| 'quotation' // Direct quotation of one passage by another
| 'allusion' // Indirect reference or echo of another passage
| 'typology' // Type-antitype relationship (OT event foreshadowing NT fulfillment)
| 'prophecy' // Prophetic utterance and its fulfillment
| 'commentary' // One passage provides commentary or explanation of another
| 'linguistic'; // Shared key term, phrase, or word root across passages

20.8 ProperNameCategory

type ProperNameCategory =
| 'PERSON' // Individual human being
| 'PLACE' // Geographic location (city, mountain, river, region)
| 'OBJECT' // Named artifact or object (Urim and Thummim, Liahona, Ark of the Covenant)
| 'DEITY'; // Divine being (God, Jehovah, Elohim)

20.9 VersificationType

type VersificationType =
| 'SPLIT' // One source verse maps to multiple target verses
| 'MERGE' // Multiple source verses combined into one target verse
| 'REORDER' // Verse numbering differs but content is the same
| 'ABSENT'; // Verse present in source but absent/relocated in target

21. Validation Rules

These rules must be enforced by the Zod schemas and the ingest validator.

21.1 Universal Rules (all files)

  • schema must be one of the fifteen defined values
  • version must match semver pattern /^\d+\.\d+\.\d+$/
  • All bookId values must appear in the canonical book ID list (§19.4) or be flagged as __unresolved__
  • chapter must be ≥ 1
  • verse must be ≥ 1 when present
  • verseEnd must be ≥ verse when present
  • chapterEnd must be ≥ chapter when present
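
The numeric rules above could be expressed in plain TypeScript along these lines. This is a sketch, not the project's Zod schemas; passageRefErrors and isValidVersion are hypothetical names, and the bookId check against §19.4 is left out.

```typescript
interface PassageRef {
  bookId: string;
  chapter: number;
  verse?: number;
  verseEnd?: number;
  chapterEnd?: number;
}

const SEMVER = /^\d+\.\d+\.\d+$/;

// True when `version` matches the semver pattern required of every file.
function isValidVersion(version: string): boolean {
  return SEMVER.test(version);
}

// Collect every universal-rule violation for one PassageRef.
function passageRefErrors(ref: PassageRef): string[] {
  const errors: string[] = [];
  if (ref.chapter < 1) errors.push('chapter must be >= 1');
  if (ref.verse !== undefined && ref.verse < 1) errors.push('verse must be >= 1');
  if (ref.verse !== undefined && ref.verseEnd !== undefined && ref.verseEnd < ref.verse) {
    errors.push('verseEnd must be >= verse');
  }
  if (ref.chapterEnd !== undefined && ref.chapterEnd < ref.chapter) {
    errors.push('chapterEnd must be >= chapter');
  }
  return errors;
}
```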

21.2 Scripture Text

  • bookId must not be empty
  • title must not be empty
  • abbreviation must not be empty
  • corpus must be a valid CorpusType
  • Each chapter's chapter number must be unique within the file
  • Each verse's verse number must be unique within its chapter
  • words[].order values must be unique within a verse
  • words[].strongs must match /^[HG]\d{4}$/
  • witnesses[] must not contain duplicate language values per verse
  • NoteAnchor.charOffset must be within bounds of verse.text.length

21.3 Lexicon

  • All entry keys must match /^[HG]\d{4}$/
  • strongs on each entry must match its key
  • glosses must be a non-empty array
  • definition.senses may be empty array but must be present
  • derivation.roots must be present (empty array if no roots)
  • Each root in derivation.roots must match /^[HG]\d{4}$/
  • related entries must match /^[HG]\d{4}$/
  • translations must be sorted descending by count when present

21.4 Topical Guide

  • Each TGEntry.id must be unique within the file and must start with "tg:"
  • passages may be empty array (redirect-only entries)
  • SeeAlsoLink must have a valid type
  • TopicLink.topicId must start with "tg:"
  • ArticleLink.articleId must start with "bd:"

21.5 Bible Dictionary

  • Each BDEntry.id must start with "bd:"
  • body must be non-empty

21.6 Scripture Index

  • Each IndexEntry.id must start with "person:", "place:", or "topic:"
  • PersonEntry.disambiguator must be ≥ 1 when present
  • PlaceEntry.uncertaintyNote must be present when uncertain: true
  • appearances may be empty array

21.7 Verse Commentary

  • commentaryId must be non-empty, kebab-case
  • bookId must be in canonical list
  • notes must contain only entries with non-empty body
  • note.verse must be ≥ 1
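
This section does not define "kebab-case" formally. One plausible reading, offered here purely as an assumption, is lowercase ASCII letters and digits in groups joined by single hyphens. The sketch also assumes that a note body is a plain string and that a whitespace-only body counts as empty. CommentaryNote is a hypothetical subset of the real note type.

```typescript
// Assumed pattern: "ot-student-manual" passes; "Bad_Id", "a--b" and "-a" fail.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isKebabCase(id: string): boolean {
  return KEBAB_CASE.test(id);
}

// Hypothetical subset of a verse commentary note.
interface CommentaryNote {
  verse: number;
  body: string;
}

function checkNotes(notes: CommentaryNote[]): string[] {
  const errors: string[] = [];
  notes.forEach((note, i) => {
    if (note.verse < 1) errors.push("notes[" + i + "].verse must be >= 1");
    if (note.body.trim() === "") errors.push("notes[" + i + "].body must be non-empty");
  });
  return errors;
}
```

Because the pattern requires at least one character, isKebabCase also covers the "non-empty" half of the commentaryId rule.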

21.8 Scholarly Commentary

  • commentaryId must be non-empty
  • bookId must be in canonical list
  • topics must contain only topics with at least one section
  • sections must contain only entries with non-empty body

21.9 Conference Talk

  • talkId must be non-empty, kebab-case
  • speaker.name must be non-empty
  • conference.year must be ≥ 1971 (the earliest year in the Church's online General Conference archive)
  • conference.month must be 4 or 10
  • dateDelivered must match ISO 8601 date format /^\d{4}-\d{2}-\d{2}$/
  • session must be non-empty
  • sections must be a non-empty array
  • Paragraph.pid must be non-empty when present
  • Footnote.marker must be non-empty
  • scriptureRefs must be a valid array (may be empty)
  • All MediaRef.url values must be valid URLs
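
The year, month and date-format rules could be sketched as follows. TalkHeader is a hypothetical subset of the talk type, and where dateDelivered actually sits in the file is an assumption. Note that the regex in the rule checks shape only: a string such as "2024-13-40" would match it. The isRealCalendarDate helper is an optional stricter check that goes beyond what the rules above require.

```typescript
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical subset of a conference talk file.
interface TalkHeader {
  conference: { year: number; month: number };
  dateDelivered: string;
}

function checkTalkHeader(talk: TalkHeader): string[] {
  const errors: string[] = [];
  if (talk.conference.year < 1971) errors.push("conference.year must be >= 1971");
  if (talk.conference.month !== 4 && talk.conference.month !== 10) {
    errors.push("conference.month must be 4 or 10");
  }
  if (!ISO_DATE.test(talk.dateDelivered)) {
    errors.push("dateDelivered must look like YYYY-MM-DD");
  }
  return errors;
}

// Optional and stricter than the spec: rejects dates that do not exist.
// Builds a UTC date and checks that no component rolled over.
function isRealCalendarDate(iso: string): boolean {
  if (!ISO_DATE.test(iso)) return false;
  const [y, m, d] = iso.split("-").map(Number);
  const date = new Date(Date.UTC(y, m - 1, d));
  return (
    date.getUTCFullYear() === y &&
    date.getUTCMonth() === m - 1 &&
    date.getUTCDate() === d
  );
}
```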

21.10 Curriculum Lesson

  • curriculumId must be non-empty, kebab-case
  • title must be non-empty
  • lessons must be a non-empty array
  • Each Lesson.lessonId must be unique within the file
  • dateStart must match ISO 8601 date format when present
  • dateEnd must match ISO 8601 date format when present
  • dateEnd must be ≥ dateStart when both are present
  • passageRange must have valid bookId when present
  • scriptureRefs must be a valid array (may be empty)
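
For the date-ordering rule, a useful property of zero-padded YYYY-MM-DD strings is that lexicographic order equals chronological order, so no Date parsing is needed. The sketch below relies on that. DateRange is a hypothetical shape, and whether the dates live on the lesson or on the file is not stated in this section.

```typescript
const ISO_DATE_RE = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical holder for the two optional date fields.
interface DateRange {
  dateStart?: string;
  dateEnd?: string;
}

function checkDateRange(range: DateRange): string[] {
  const errors: string[] = [];
  const { dateStart, dateEnd } = range;
  if (dateStart !== undefined && !ISO_DATE_RE.test(dateStart)) {
    errors.push("dateStart must look like YYYY-MM-DD");
  }
  if (dateEnd !== undefined && !ISO_DATE_RE.test(dateEnd)) {
    errors.push("dateEnd must look like YYYY-MM-DD");
  }
  // Compare only when both are present and well-formed; plain string
  // comparison is chronological for zero-padded YYYY-MM-DD values.
  if (
    dateStart !== undefined &&
    dateEnd !== undefined &&
    ISO_DATE_RE.test(dateStart) &&
    ISO_DATE_RE.test(dateEnd) &&
    dateEnd < dateStart
  ) {
    errors.push("dateEnd must be >= dateStart");
  }
  return errors;
}
```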

21.11 Church Publication

  • publicationId must be non-empty, kebab-case
  • title must be non-empty
  • publicationType must be a valid PublicationType (§20.6)
  • chapters must contain only chapters with at least one section
  • Each PublicationChapter.chapterId must be unique within the file
  • PublicationSection paragraphs must not be empty
  • discussionQuestions must contain only non-empty strings when present
  • additionalScriptures must contain valid InlineRef objects when present
  • ImageRef.assetId must be non-empty
  • ImageRef.alt must be non-empty

21.12 Cross-References

  • bookId must be non-empty and in canonical list (§19.4)
  • sources must be a non-empty array
  • CrossRefSource.sourceId must be non-empty and unique within the file's sources array
  • CrossReference.sourceId must reference an existing sources[].sourceId in the same file
  • CrossReference.relevance must be an integer in range 0–100 (inclusive)
  • CrossReference.kind must be a valid CrossRefKind value
  • CrossRefAnchor.wordIndex must be ≥ 0 (0-based index into the verse's word array)
  • CrossRefAnchor.word must be non-empty
  • When bidirectional is true, the ingest pipeline creates CROSS_REF edges in both directions
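
The sourceId rules amount to a referential-integrity check within a single file. A minimal sketch follows, assuming the file's cross-references can be gathered into one flat array. The two interfaces are hypothetical subsets of the real types.

```typescript
// Hypothetical subsets of the real types.
interface CrossRefSource {
  sourceId: string;
}
interface CrossReference {
  sourceId: string;
  relevance: number;
}

function checkCrossRefs(sources: CrossRefSource[], refs: CrossReference[]): string[] {
  const errors: string[] = [];
  if (sources.length === 0) errors.push("sources must be a non-empty array");

  // Pass 1: collect declared source IDs, flagging empties and duplicates.
  const known = new Set<string>();
  for (const source of sources) {
    if (source.sourceId === "") {
      errors.push("sourceId must be non-empty");
    } else if (known.has(source.sourceId)) {
      errors.push("duplicate sourceId " + source.sourceId);
    }
    known.add(source.sourceId);
  }

  // Pass 2: every reference must point at a declared source.
  refs.forEach((ref, i) => {
    if (!known.has(ref.sourceId)) {
      errors.push(`refs[${i}]: unknown sourceId ${ref.sourceId}`);
    }
    if (!Number.isInteger(ref.relevance) || ref.relevance < 0 || ref.relevance > 100) {
      errors.push(`refs[${i}]: relevance must be an integer in 0-100`);
    }
  });
  return errors;
}
```

Building the set of known IDs first keeps the check linear in the number of sources and references.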

21.13 Proper Names

  • language must be a non-empty string
  • entries must be a valid object (key=entry ID, value=entry)
  • Each entry key must match its entry.id
  • entry.name must be non-empty
  • entry.alternateNames must be present (empty array if none)
  • entry.description must be non-empty
  • entry.passages must be a valid array of PassageRef (may be empty)
  • entry.category must be a valid ProperNameCategory value (PERSON, PLACE, OBJECT, DEITY)

21.14 Versification

  • schemes must be a non-empty array
  • scheme.id must be non-empty, kebab-case
  • scheme.name must be non-empty
  • scheme.description must be non-empty
  • scheme.mappings must be a valid array (may be empty)
  • mapping.sourceRef must be a valid PassageRef
  • mapping.targetRef must be a valid PassageRef
  • mapping.mappingType must be a valid VersificationType value (SPLIT, MERGE, REORDER, ABSENT)
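
Enum membership rules such as the mappingType check can be written as a TypeScript type guard, so that a successful check also narrows the static type. The sketch below assumes the four values listed above are the exact string literals stored in the data. The authoritative enum definitions are in §20.

```typescript
// Assumption: the stored values are exactly these uppercase strings.
const VERSIFICATION_TYPES = ["SPLIT", "MERGE", "REORDER", "ABSENT"] as const;
type VersificationType = (typeof VERSIFICATION_TYPES)[number];

// Type guard: true only for one of the four listed string values.
function isVersificationType(value: unknown): value is VersificationType {
  return (
    typeof value === "string" &&
    (VERSIFICATION_TYPES as readonly string[]).indexOf(value) !== -1
  );
}
```

Deriving the union type from the array keeps the runtime list and the static type in one place, so adding a value later needs a change in only one spot.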

21.15 Morphology Codes

  • language must be a valid WitnessLanguage value (§20.2)
  • entries must be a valid object
  • Each entry key must match its entry.code
  • entry.pos must be a valid PartOfSpeech value (§20.4)
  • entry.description must be non-empty
  • entry.features must be a valid object (key=feature name, value=feature value)
  • Morph codes referenced by WordAlignment.morphCode must exist in the corresponding language file

21.16 Theographic

  • events must be a valid array (may be empty)
  • peopleGroups must be a valid array (may be empty)
  • event.id must be non-empty, kebab-case
  • event.name must be non-empty
  • event.description must be non-empty
  • event.date, when present, must be a non-empty free-text date string (e.g. "c. 2000 BCE")
  • event.passages must be a valid array of PassageRef (may be empty)
  • event.participants must be a valid array of strings (may be empty)
  • peopleGroup.id must be non-empty, kebab-case
  • peopleGroup.name must be non-empty
  • peopleGroup.description must be non-empty
  • peopleGroup.passages must be a valid array of PassageRef (may be empty)
  • peopleGroup.members must be a valid array of strings (may be empty)

End of specification.