Data Sources

This section documents the external Bible data sources evaluated for filling content gaps in GospeLib's scripture content layer.

Executive Summary

GospeLib currently holds 9 Bible translations, LDS canonical texts, interlinear Hebrew/Greek (without morphology or source tokens), two lexicons, topical/dictionary references, and 61 pseudepigrapha texts — all stored in a FalkorDB graph database. Despite this breadth, critical scholarly features remain missing: morphological analysis, the Septuagint, cross-references, person/place databases, and versification mapping across traditions.

A comprehensive evaluation identified 15 external open-data sources that can fill most of these gaps. Of the 16 gaps identified, only 3 remain unfilled by any evaluated source: Synoptic Parallels, Textual Apparatus (manuscript variants), and Audio Pronunciation.

The strongest sources are STEPBible-Data (CC BY 4.0) for morphological tagging, proper names, versification mapping, and extended lexicons; Clear-Bible MACULA (CC BY 4.0) for syntax trees and semantic roles; scrollmapper/bible_databases (MIT) for cross-references and bulk translations; and lxx-swete for immediate Septuagint text.

Current State

| Content Area | Details |
| --- | --- |
| Bible translations | 9 translations (KJV + 8 others) in corpus/ as scripture-text v2.0.0 JSON |
| LDS canon | Book of Mormon, Doctrine & Covenants, Pearl of Great Price |
| JST | Joseph Smith Translation: 65 books (NT + OT) |
| Interlinear Hebrew OT | 39 books; verse-aligned, but no morphology or source tokens |
| Interlinear Greek NT | 27 books; verse-aligned, but no morphology or source tokens |
| Hebrew Lexicon | ~8,500 entries (Strong's-keyed) |
| Greek Lexicon | ~5,600 entries (Strong's-keyed) |
| Reference works | Topical Guide (TG), Bible Dictionary (BD), Study Index (SI) |
| Pseudepigrapha | 61 texts (1 Enoch, Jubilees, Testaments of the Twelve Patriarchs, etc.) |
| Storage | FalkorDB graph database with :Passage, :Book, :Translation nodes |
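
To make the storage row concrete, the following is a minimal sketch of how a single-verse lookup might be expressed against the :Passage, :Book, and :Translation nodes. Only the three node labels come from this page. The relationship types (`IN_BOOK`, `IN_TRANSLATION`), the property names (`code`, `name`, `chapter`, `verse`, `text`), and the helper `build_passage_query` are hypothetical and would need to be replaced with whatever the real schema uses.

```python
# Node labels as listed in the Current State table.
PASSAGE = "Passage"
BOOK = "Book"
TRANSLATION = "Translation"

# ASSUMPTION: these relationship types are illustrative only;
# the actual GospeLib schema may name or direct them differently.
REL_IN_BOOK = "IN_BOOK"
REL_IN_TRANSLATION = "IN_TRANSLATION"


def build_passage_query(translation, book, chapter, verse):
    """Return a parameterized Cypher query and its parameter dict.

    ASSUMPTION: the property names (code, name, chapter, verse, text)
    are hypothetical placeholders, not the documented schema.
    """
    query = (
        f"MATCH (p:{PASSAGE})-[:{REL_IN_BOOK}]->(b:{BOOK}), "
        f"(p)-[:{REL_IN_TRANSLATION}]->(t:{TRANSLATION}) "
        "WHERE t.code = $translation AND b.name = $book "
        "AND p.chapter = $chapter AND p.verse = $verse "
        "RETURN p.text AS text"
    )
    params = {
        "translation": translation,
        "book": book,
        "chapter": chapter,
        "verse": verse,
    }
    return query, params


if __name__ == "__main__":
    q, params = build_passage_query("KJV", "Genesis", 1, 1)
    print(q)
    print(params)
```

The function only builds the query text and parameters, so it runs without a database connection; the pair could then be handed to whichever graph client the project uses. Keeping user-supplied values in parameters rather than interpolating them into the query string avoids quoting problems with book names such as "Song of Solomon".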