Data Sources
This section documents the external Bible data sources evaluated for filling gaps in GospeLib's scripture content layer.
Executive Summary
GospeLib currently holds 9 Bible translations, LDS canonical texts, interlinear Hebrew/Greek (without morphology or source tokens), two lexicons, topical/dictionary references, and 61 pseudepigrapha texts — all stored in a FalkorDB graph database. Despite this breadth, critical scholarly features remain missing: morphological analysis, the Septuagint, cross-references, person/place databases, and versification mapping across traditions.
A comprehensive evaluation identified 15 external open-data sources that can fill these gaps. Of the 16 gaps identified, only 3 remain entirely unfilled: Synoptic Parallels, Textual Apparatus (manuscript variants), and Audio Pronunciation.
The strongest sources are STEPBible-Data (CC BY 4.0) for morphological tagging, proper names, versification mapping, and extended lexicons; Clear-Bible MACULA (CC BY 4.0) for syntax trees and semantic roles; scrollmapper/bible_databases (MIT) for cross-references and bulk translations; and lxx-swete for immediate Septuagint text.
Current State
| Content Area | Details |
|---|---|
| Bible translations | 9 translations (KJV + 8 others) in corpus/ as scripture-text v2.0.0 JSON |
| LDS canon | Book of Mormon, Doctrine & Covenants, Pearl of Great Price |
| JST | Joseph Smith Translation — 65 books (NT + OT) |
| Interlinear Hebrew OT | 39 books — verse-aligned, but no morphology or source tokens |
| Interlinear Greek NT | 27 books — verse-aligned, but no morphology or source tokens |
| Hebrew Lexicon | ~8,500 entries (Strong's-keyed) |
| Greek Lexicon | ~5,600 entries (Strong's-keyed) |
| Reference works | Topical Guide (TG), Bible Dictionary (BD), Study Index (SI) |
| Pseudepigrapha | 61 texts (1 Enoch, Jubilees, Testaments of the Twelve Patriarchs, etc.) |
| Storage | FalkorDB graph database with :Passage, :Book, :Translation nodes |
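To make the storage row concrete, here is a minimal sketch of how a single-verse lookup against the `:Passage`, `:Book`, and `:Translation` nodes might be phrased as a parameterized Cypher query. Only the three node labels come from the table above. The relationship types (`IN_BOOK`, `IN_TRANSLATION`), the property names (`code`, `name`, `chapter`, `verse`, `text`), and the helper names are assumptions made for illustration and may not match the actual GospeLib schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

# Node labels taken from the "Storage" row of the table above.
NODE_LABELS = ("Passage", "Book", "Translation")


@dataclass(frozen=True)
class PassageQuery:
    """Parameters for looking up one verse in one translation (hypothetical helper)."""
    translation: str  # e.g. "KJV"
    book: str         # e.g. "Genesis"
    chapter: int
    verse: int


def build_cypher(q: PassageQuery) -> Tuple[str, Dict[str, Any]]:
    """Return a parameterized Cypher query string and its parameter map.

    ASSUMPTION: the relationship types (IN_BOOK, IN_TRANSLATION) and the
    property names (code, name, chapter, verse, text) are hypothetical;
    substitute whatever the real graph schema uses.
    """
    query = (
        "MATCH (p:Passage)-[:IN_BOOK]->(b:Book), "
        "(p)-[:IN_TRANSLATION]->(t:Translation) "
        "WHERE t.code = $translation AND b.name = $book "
        "AND p.chapter = $chapter AND p.verse = $verse "
        "RETURN p.text"
    )
    params = {
        "translation": q.translation,
        "book": q.book,
        "chapter": q.chapter,
        "verse": q.verse,
    }
    return query, params


if __name__ == "__main__":
    cypher, params = build_cypher(PassageQuery("KJV", "Genesis", 1, 1))
    print(cypher)
    print(params)
```

Passing values as query parameters rather than interpolating them into the string keeps the query text constant across lookups and avoids quoting problems with book names. The resulting pair can be handed to whichever FalkorDB client the project uses.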
Navigation
- Gap Inventory — What data gaps exist
- Sources — Detailed analysis of 15 external sources
- Gap-to-Source Matrix — Which sources fill which gaps
- Adoption Plan — Three-phase integration roadmap
- Format Compatibility — How sources map to FalkorDB
- Licensing — License compatibility analysis
- Remaining Gaps — Unfilled gaps and future research
- BLB Investigation — Blue Letter Bible scraping decision