scrollmapper/bible_databases

  • Repository: scrollmapper/bible_databases
  • Maintainer: Community-maintained open-source project
  • License: MIT (permissive). Use, modification, and distribution are allowed, provided the copyright and license notice are retained.
  • Suitability Score: ⭐⭐⭐⭐ (4/5)

Coverage

Formats: MySQL dumps, SQLite databases, CSV, JSON, YAML, TXT, and Markdown. The same data is published in each format; SQLite is the most convenient for bulk processing.

  • 140 Bible translations across 30+ languages, including:
    • 5 Vulgate variants: VulgClementine, VulgConte, VulgHetzenauer, VulgSistine, Vulgate (generic)
    • WLC (Westminster Leningrad Codex) Hebrew
    • Peshitta (Syriac)
    • Byzantine Greek text
    • Textus Receptus
    • KJV with Strong's numbers and morphology codes
    • Dozens of English translations (ASV, BBE, Darby, Geneva, Webster, YLT, etc.)
    • Spanish, French, German, Portuguese, Korean, Chinese, Arabic, Russian, and more
  • Cross-references: ~340,000 entries derived from Treasury of Scripture Knowledge (TSK), with community relevance votes (integer ranking)
  • Schema: each translation has <translation>_books (book metadata) and <translation>_verses (verse text); two shared tables, translations (master list) and cross_references (from→to with votes), cover the whole dataset
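
The table layout above can be explored with Python's standard sqlite3 module. The following is a minimal sketch, not the project's own tooling: it builds a tiny in-memory database that imitates the described naming pattern, and the exact column names (id, name, translation, title) are assumptions that should be checked against the real dump before use.

```python
import sqlite3

# Hypothetical miniature of the layout described above; the real column
# names and types in the scrollmapper dumps may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translations (translation TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE KJV_books (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE KJV_verses (
        id INTEGER PRIMARY KEY,
        book_id INTEGER, chapter INTEGER, verse INTEGER, text TEXT
    );
    INSERT INTO translations VALUES ('KJV', 'King James Version');
    INSERT INTO KJV_books VALUES (1, 'Genesis');
    INSERT INTO KJV_verses VALUES
        (1, 1, 1, 1, 'In the beginning God created the heaven and the earth.');
""")


def fetch_verse(conn, translation, book, chapter, verse):
    """Return the text of one verse, or None if it is missing."""
    # Table names cannot be bound as SQL parameters, so the prefix is
    # validated against the master list before it is interpolated.
    known = {row[0] for row in conn.execute("SELECT translation FROM translations")}
    if translation not in known:
        raise ValueError(f"unknown translation: {translation}")
    row = conn.execute(
        f'SELECT v.text FROM "{translation}_verses" AS v '
        f'JOIN "{translation}_books" AS b ON b.id = v.book_id '
        "WHERE b.name = ? AND v.chapter = ? AND v.verse = ?",
        (book, chapter, verse),
    ).fetchone()
    return row[0] if row else None


print(fetch_verse(conn, "KJV", "Genesis", 1, 1))
```

Checking the translation name against the master list keeps the per-translation table prefix from becoming an injection point, since it has to be spliced into the SQL text.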

Quality

Moderate to good. The project is community-maintained, with contributions accumulated over many years. The Vulgate texts come from established scholarly editions, and the cross-references derive from TSK, a well-known reference work. The relevance votes add a signal for filtering.
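
As an illustration of how the relevance votes could be used for filtering, here is a small sketch. The CrossRef record, the reference strings, the vote values, and the threshold are all made up for the example; a suitable cutoff would have to be chosen by inspecting the real vote distribution.

```python
from typing import NamedTuple


class CrossRef(NamedTuple):
    """Hypothetical in-memory form of one cross_references row."""
    from_ref: str
    to_ref: str
    votes: int


# Illustrative rows only; these are not values taken from the dataset.
SAMPLE = [
    CrossRef("genesis.1.1", "john.1.1", 12),
    CrossRef("genesis.1.1", "hebrews.11.3", 3),
    CrossRef("genesis.1.1", "psalms.1.1", -2),
]


def filter_by_votes(refs, min_votes):
    """Keep references at or above the threshold, strongest first."""
    kept = [r for r in refs if r.votes >= min_votes]
    return sorted(kept, key=lambda r: r.votes, reverse=True)


for ref in filter_by_votes(SAMPLE, min_votes=3):
    print(ref.from_ref, "->", ref.to_ref, ref.votes)
```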

Gaps Filled

  • ✅ Vulgate (Latin) — 5 distinct variants
  • ✅ Cross-references — ~340,000 entries with relevance ranking
  • ✅ Additional translations — 140 translations in 30+ languages
  • 🔶 Morphological tags (partial: present only in the KJV+Strong's variant, so coverage is not comprehensive)

Integration Notes

  • SQLite/JSON formats parse easily into GospeLib's Pydantic models
  • Verse schema (book_id, chapter, verse, text) maps to scripture-text v2.0.0 with minimal transformation
  • Cross-references create :CROSS_REFERENCES edges between :Passage nodes in FalkorDB
  • Vulgate texts would be new :Translation nodes (e.g., vulg-clementine, vulg-conte)
  • Relevance votes on cross-references can be stored as edge properties for weighted retrieval
  • Book ID mapping needed: scrollmapper uses integer IDs, whereas GospeLib uses canonical slugs
  • New ingest pipeline stage needed for cross-references; translation ingestion fits existing TranslationPipeline
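
The notes above can be sketched roughly as follows. This is an assumption-heavy illustration, not GospeLib code: the Passage dataclass stands in for the real Pydantic model, the BOOK_SLUGS table, the ref property, and the "genesis.1.1" reference format are hypothetical, and the Cypher statement is only built as a string with parameters; executing it through a FalkorDB client is left out.

```python
from dataclasses import dataclass

# Hypothetical slug table; a real mapping would cover every book ID used
# by the source and follow GospeLib's own slug conventions.
BOOK_SLUGS = {1: "genesis", 2: "exodus", 43: "john"}


@dataclass(frozen=True)
class Passage:
    """Stand-in for a GospeLib Pydantic model (field names are assumed)."""
    translation: str
    book: str
    chapter: int
    verse: int
    text: str


def to_passage(translation, row):
    """Convert a (book_id, chapter, verse, text) row into a Passage."""
    book_id, chapter, verse, text = row
    try:
        slug = BOOK_SLUGS[book_id]
    except KeyError:
        raise ValueError(f"no slug mapped for book_id {book_id}") from None
    return Passage(translation, slug, chapter, verse, text)


def cross_reference_edge(from_ref, to_ref, votes):
    """Build a parameterized Cypher statement for one weighted edge."""
    query = (
        "MATCH (a:Passage {ref: $from_ref}), (b:Passage {ref: $to_ref}) "
        "MERGE (a)-[r:CROSS_REFERENCES]->(b) "
        "SET r.votes = $votes"
    )
    return query, {"from_ref": from_ref, "to_ref": to_ref, "votes": votes}


# Usage with illustrative values (the vote count is invented).
passage = to_passage(
    "vulg-clementine", (1, 1, 1, "In principio creavit Deus caelum et terram.")
)
query, params = cross_reference_edge("genesis.1.1", "john.1.1", 5)
print(passage.book, params["votes"])
```

Failing loudly on an unmapped book ID is a deliberate choice in this sketch: silently skipping rows would hide gaps in the slug table during ingestion. Storing votes on the edge keeps weighted retrieval a matter of filtering or ordering on r.votes.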