OpenScriptures morphhb
- Repository: openscriptures/morphhb
- Maintainer: OpenScriptures community project
- License: CC BY 4.0 — permissive, compatible with open-source distribution.
- Suitability Score: ⭐⭐⭐⭐ (4/5)
Coverage
Complete Hebrew Old Testament based on the Westminster Leningrad Codex (WLC). Version 2.2. All 39 books with word-level morphological analysis.
Format: OSIS XML files. One file per OT book (39 files). Each `<w>` element contains:
- `lemma`: Strong's number(s)
- `morph`: morphological code (OSHM scheme)
- `n`: unique word ID
- Text content is the Hebrew word form
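A minimal sketch of reading the per-word data, using Python's standard-library `xml.etree.ElementTree` (`lxml` exposes a very similar API). The XML fragment is a hypothetical sample shaped after the attribute description in this section, not text copied from morphhb. The attribute values, the `w1`/`w2` IDs, and the `iter_words` helper are illustrative assumptions, so check attribute names against the actual files before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the description in this section.
# Attribute values and word IDs are illustrative, not real morphhb data.
SAMPLE = """<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace">
  <verse osisID="Gen.1.1">
    <w lemma="b/7225" morph="HR/Ncfsa" n="w1">בְּרֵאשִׁית</w>
    <w lemma="1254 a" morph="HVqp3ms" n="w2">בָּרָא</w>
  </verse>
</osis>"""

# OSIS elements live in this XML namespace, so tag lookups must include it.
OSIS_NS = "{http://www.bibletechnologies.net/2003/OSIS/namespace}"


def iter_words(xml_text):
    """Yield one dict per <w> element found in an OSIS XML string."""
    root = ET.fromstring(xml_text)
    for w in root.iter(OSIS_NS + "w"):
        yield {
            "id": w.get("n"),          # word ID, per the description in this section
            "lemma": w.get("lemma"),   # Strong's number(s), possibly with prefixes
            "morph": w.get("morph"),   # OSHM morphology code
            "form": (w.text or "").strip(),  # Hebrew word form
        }


words = list(iter_words(SAMPLE))
for word in words:
    print(word["id"], word["lemma"], word["morph"], word["form"])
```

For whole book files, `ET.parse(path).getroot()` could replace `ET.fromstring`, with the same namespace-qualified tag lookup.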
Quality
Good. The WLC is the standard scholarly Hebrew text. Morphological tagging has been refined through community contributions. The OSHM (Open Scriptures Hebrew Morphology) coding scheme is well-documented.
Gaps Filled
- ✅ Morphological tags (Hebrew OT) — per-word with Strong's alignment
- ✅ Source tokens (Hebrew OT) — individual word forms with lemmas
- 🔶 Unique word IDs enable precise cross-referencing
Integration Notes
- XML parsing requires `lxml` or similar; slightly more complex than TSV but well understood
- Strong's numbers in morphhb align directly with GospeLib's existing Strong's-keyed Hebrew lexicon
- OSIS XML is a recognized biblical text standard — mature parsing libraries exist
- Serves as alternative/complement to STEPBible TAHOT for Hebrew morphology
- Unique word IDs (`n` attribute) could anchor a word-level graph
- Would extend the existing interlinear pipeline stage with a morphology enrichment sub-stage
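One possible shape for the morphology enrichment sub-stage, sketched under stated assumptions: the `LEXICON` dict is a hypothetical stand-in for GospeLib's Strong's-keyed Hebrew lexicon, the `H<number>` key format is assumed rather than confirmed, and `strongs_keys` and `enrich` are illustrative names. The token is a hand-written example, not real pipeline output.

```python
import re

# Hypothetical stand-in for a Strong's-keyed Hebrew lexicon.
# The "H<number>" key format is an assumption, not GospeLib's confirmed schema.
LEXICON = {"H7225": "beginning", "H1254": "to create"}


def strongs_keys(lemma):
    """Extract Strong's numbers from a lemma string as 'H<number>' keys.

    Assumes the numbers appear as runs of digits; any prefix letters,
    slashes, or trailing letter suffixes in the lemma are ignored.
    """
    return ["H" + num for num in re.findall(r"\d+", lemma)]


def enrich(token):
    """Return a copy of the token with Strong's keys and any matching glosses."""
    keys = strongs_keys(token["lemma"])
    glosses = [LEXICON[k] for k in keys if k in LEXICON]
    return {**token, "strongs": keys, "glosses": glosses}


# Hand-written example token carrying the fields described in this section.
token = {"id": "w1", "lemma": "b/7225", "morph": "HR/Ncfsa", "form": "בְּרֵאשִׁית"}
enriched = enrich(token)
print(enriched["strongs"], enriched["glosses"])
```

Keeping the original `lemma` and `morph` strings alongside the derived keys leaves the morphology code available to later stages. Keying each enriched record by its word ID would let it serve as a node in a word-level graph.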