M00: Tech Prep
Version tag: None (prerequisite milestone, no release artifact)
Phase: P0: Foundation
Target: Weeks 1–2
Sprint: S0
Phase Context
Goal: Corpus data flows from JSON files through the ingest pipeline into FalkorDB, and the content service serves passage data over HTTP.
Key constraint: Everything downstream depends on this. No reader, no interlinear, no graph view without a working data layer.
ZenHub Configuration
| Field | Value |
|---|---|
| Milestone | M00: Tech Prep |
| Due Date | 2026-03-22 |
| Default Pipeline | Product Backlog |
| Primary Epic(s) | Tech Prep Foundation |
Prerequisites
None — M00 is the first milestone. All downstream milestones depend on artifacts produced here.
Epic: Tech Prep Foundation
Establish cross-service patterns, tooling, and testing infrastructure before business logic begins.
| Story Area | Scope | Spec Reference |
|---|---|---|
| Dev environment validation | Verify pnpm infra:up works, all services start with go run/uv run, setup.sh is current, document issues | TECH-SPEC.md § Development Workflow |
| PostgreSQL migration framework | Install golang-migrate, create migrations/ directory structure in auth + billing, add Nx migration targets (nx run auth:migrate-up), add migration CI step | TECH-SPEC.md § Data Architecture |
| Initial PostgreSQL schemas | gl_users, gl_user_devices, gl_subscriptions, gl_stripe_events, gl_highlights, gl_notes, gl_bookmarks, gl_reading_progress, gl_audit_log — all tables from tech spec | TECH-SPEC.md § PostgreSQL Schema |
| Typesense collection schemas | Define passages, topics, lexicon collection schemas per tech spec | TECH-SPEC.md § Typesense Collections |
| Response envelope utilities | Implement SuccessResponse/ErrorResponse helpers in Go (gateway, auth, billing) and Python (content, ai) matching the API contract | TECH-SPEC.md § Response Envelope |
| Structured logging configuration | Configure zerolog (Go) and structlog (Python) with JSON format, request_id, service name, version fields across all services | TECH-SPEC.md § Observability |
| Health check contract | Implement /health (alive) and /ready (dependency check) on all services with consistent JSON response shape | TECH-SPEC.md § Health Endpoints |
| Request ID propagation | Gateway generates X-Request-ID; downstream services extract and include in logs and error responses | TECH-SPEC.md § Communication |
| Testing infrastructure | Create per-project vitest.config.ts for TS packages, implement conftest.py fixtures for Python services, create one reference Go table-driven test per service, validate all test runners in CI | TECH-SPEC.md § Testing |
| OpenAPI spec workflow | Content service auto-generates OpenAPI via FastAPI; set up gen-types.sh to pull /openapi.json and generate TS types, validate generation pipeline | TECH-SPEC.md § API Design |
| Env validation schemas | Implement Zod env schemas in @gospelib/config for web app env vars | TECH-SPEC.md § Shared Packages |
| Error code catalog | Define UPPER_SNAKE error codes as constants in Go and Python (PASSAGE_NOT_FOUND, INVALID_REQUEST, UNAUTHORIZED, etc.) | TECH-SPEC.md § Response Envelope |
| Corpus v1→v2 migration | Run convert_corpus.py --input corpus --output data, validate output against v2 schemas, confirm data/ is canonical for ingest | GOSPELIB-MIGRATION-SPEC.md, REPO-MAP.md § Data & Corpus |
⚠️ Note:
GOSPELIB-MIGRATION-SPEC.md does not yet exist. The corpus v1→v2 migration is currently handled by convert_corpus.py at the repo root. The migration spec should be written before or during this milestone to document conversion rules and validation criteria.
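The response envelope and error code stories in the table above could look roughly like the following on the Python side. This is a minimal sketch only: the authoritative envelope shape lives in TECH-SPEC.md § Response Envelope, so the field names (`success`, `data`, `error`, `request_id`) and the helper names `success_response` / `error_response` are assumptions, and plain dicts stand in for the Pydantic models the services actually use. Only the three error codes named in the table are listed.

```python
from enum import Enum
from typing import Any, Optional


class ErrorCode(str, Enum):
    """UPPER_SNAKE error codes. Only the codes named in the story table
    are listed here; the full catalog is defined by the tech spec."""

    PASSAGE_NOT_FOUND = "PASSAGE_NOT_FOUND"
    INVALID_REQUEST = "INVALID_REQUEST"
    UNAUTHORIZED = "UNAUTHORIZED"


def success_response(data: Any, request_id: Optional[str] = None) -> dict:
    # Assumed envelope shape: every response carries the same top-level keys.
    return {"success": True, "data": data, "error": None, "request_id": request_id}


def error_response(
    code: ErrorCode, message: str, request_id: Optional[str] = None
) -> dict:
    # The error body carries the catalog code as a plain string so that
    # Go and Python services can emit identical JSON.
    return {
        "success": False,
        "data": None,
        "error": {"code": code.value, "message": message},
        "request_id": request_id,
    }
```

Keeping the codes in one enum per language means a typo in a code name fails at import or compile time rather than reaching a client.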
Issues
| Issue | Title | Status | Notes |
|---|---|---|---|
| M00-001 | Dev Environment Validation | ✅ Done | setup.sh, health-check.sh, Docker infrastructure present |
| M00-002 | PostgreSQL Migration Framework | ✅ Done | golang-migrate in go.mod; auth/billing have migration dirs |
| M00-003 | Initial PostgreSQL Schemas | ✅ Done | Migrations 001-005 with CREATE TABLE + rollback files |
| M00-004 | Typesense Collection Schemas | ✅ Done | typesense.py (108 lines) defines all 3 collection schemas (passages, topics, lexicon) with idempotent create_collections(); handles ObjectAlreadyExists |
| M00-005 | Response Envelope Utilities | ✅ Done | response.go in gateway; Pydantic models in content/ai; TS envelope types in @gospelib/types |
| M00-006 | Structured Logging Configuration | ✅ Done | logger.go middleware in gateway; structlog in Python services |
| M00-007 | Health Check Contract | ✅ Done | health.go handlers in all Go services with tests |
| M00-008 | Request ID Propagation | ✅ Done | request_id.go middleware; all services bind to logging context |
| M00-009 | Testing Infrastructure | ✅ Done | vitest.config.ts in all TS packages; Go table-driven tests; Python conftest.py |
| M00-010 | OpenAPI Spec Workflow | ✅ Done | gen-types.sh + generated TS types at packages/types/src/api/generated.d.ts |
| M00-011 | Env Validation Schemas | ✅ Done | packages/config/src/env.ts with Zod validators |
| M00-012 | Error Code Catalog | ✅ Done | errors.py in content/ai (18+ codes); Go error codes in gateway |
| M00-013 | Corpus v1→v2 Migration | ✅ Done | convert_corpus.py executed; corpus v2 structure validated |
Progress: 13 Done · 0 Partial · 0 To Do (100%)
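The health check contract (M00-007) separates liveness (`/health`) from readiness (`/ready`, which checks dependencies). The sketch below shows one way that split could be expressed, independent of any web framework. The JSON keys (`status`, `dependencies`), the status strings, and the use of HTTP 503 for a failed readiness check are assumptions for illustration; the real response shape is defined in TECH-SPEC.md § Health Endpoints.

```python
from typing import Callable, Dict, Tuple

# A dependency check is any zero-argument callable returning True when healthy.
Check = Callable[[], bool]


def liveness() -> Tuple[int, dict]:
    """/health: the process is up; no dependencies are consulted."""
    return 200, {"status": "ok"}


def readiness(checks: Dict[str, Check]) -> Tuple[int, dict]:
    """/ready: run every dependency check and report each result.

    A check that raises is treated the same as one that returns False,
    so a broken probe can never make the service look ready.
    """
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "unavailable"
        except Exception:
            results[name] = "unavailable"
    ready = all(state == "ok" for state in results.values())
    status_code = 200 if ready else 503
    body = {"status": "ready" if ready else "not_ready", "dependencies": results}
    return status_code, body
```

A service would register one check per dependency (for example a FalkorDB ping and a PostgreSQL ping) and return the tuple from its `/ready` handler.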
Document References
| Doc | Contains | Use When Writing Stories For |
|---|---|---|
| MVP.md | Feature scope, tier breakdown, success criteria, budget | Acceptance criteria, scope boundaries |
| TECH-SPEC.md | Architecture, service boundaries, data stores, API catalog | Technical implementation details |
| GOSPELIB-SCHEMAS.md | All 7 schema families, node/edge types, validation rules | Data models, Pydantic models, graph schema |
| GOSPELIB-INGEST-SPEC.md | 7-stage pipeline, Cypher templates, batch strategy, CLI | Ingest pipeline stories |
| DESIGN-SYSTEM.md | Visual identity, component catalog, reader modes, tokens | UI component stories |
| Deployment & Operations | Environments, K8s, CI/CD, migrations, secrets, DR | Infrastructure and deployment stories |
| REPO-MAP.md | Directory structure, naming conventions, dependency rules | All stories (coding standards) |
| GOSPELIB-MIGRATION-SPEC.md | v1→v2 corpus migration, conversion rules — doc not yet written; migration handled via convert_corpus.py | M00 data preparation stories |
| Business | LEGAL.md, POLICY-TERMS.md, executive summary, market research, GTM | Launch readiness, legal/compliance stories |
Sprint Mapping
| Sprint | Weeks | Primary Focus |
|---|---|---|
| S0 | 1–2 | Tech Prep: Dev environment validation, PostgreSQL migration framework + initial schemas, response envelope utilities, structured logging, health check contracts, testing infrastructure, OpenAPI workflow |
Sprint Load Warnings
No load warnings apply to S0. However, from the risk registry:
Tech prep scope creep — Strict 2-week timebox for Sprint 0; defer nice-to-haves; only implement what's blocking downstream work.
Release Info
No release tag — M00 is a prerequisite milestone with no release artifact. It produces verified readiness for all downstream work.
Relevant Risks
| Risk | Impact | Mitigation |
|---|---|---|
| Tech prep scope creep | Delays P0 ingest work | Strict 2-week timebox for Sprint 0; defer nice-to-haves; only implement what's blocking downstream work |
| Missing specification documents | GOSPELIB-MIGRATION-SPEC.md not written, so v1→v2 conversion rules and validation criteria are undocumented | Document corpus migration process before M01 |
Cross-Cutting Concerns
Testing
| Layer | Framework | When | Spec Reference |
|---|---|---|---|
| Python unit/integration | pytest + testcontainers | Every PR | GOSPELIB-INGEST-SPEC.md § Testing |
| Go unit | go test -race + table-driven | Every PR | TECH-SPEC.md § Testing |
| TypeScript unit | Vitest | Every PR | TECH-SPEC.md § Testing |
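The table-driven style called out for Go carries over to the Python services as a list of (input, expected) cases run through one assertion. As a hedged illustration, the sketch below applies it to the UPPER_SNAKE rule from the error code catalog. The helper `is_upper_snake` and its regular expression are hypothetical and not taken from the repository; under pytest the same case list would normally be fed to `pytest.mark.parametrize`.

```python
import re

# Assumed definition of UPPER_SNAKE: uppercase letters and digits in
# underscore-separated segments, starting with a letter.
UPPER_SNAKE = re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*")


def is_upper_snake(code: str) -> bool:
    return UPPER_SNAKE.fullmatch(code) is not None


# Each row is one test case: (input, expected result).
CASES = [
    ("PASSAGE_NOT_FOUND", True),
    ("INVALID_REQUEST", True),
    ("UNAUTHORIZED", True),
    ("passage_not_found", False),
    ("PassageNotFound", False),
    ("TRAILING_", False),
    ("", False),
]


def test_is_upper_snake() -> None:
    for code, expected in CASES:
        assert is_upper_snake(code) is expected, f"{code!r}: expected {expected}"
```

Adding a new edge case is a one-line change to `CASES`, which is the main reason to prefer this layout for reference tests.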
Documentation
| Doc | Update Trigger |
|---|---|
| ADRs | Each major technical decision |
CI/CD
| Addition | Detail |
|---|---|
| Python test containers in CI | FalkorDB + PostgreSQL service containers for ingest integration tests |
Dependencies
Upstream (what M00 needs)
- Functioning monorepo tooling (Nx, pnpm, CI, Docker Compose) — already operational
- Corpus data files in corpus/: already complete (~966 files, ~215 MB)
Downstream (what depends on M00)
- M01: Data Pipeline — depends on testing infrastructure, corpus v1→v2 migration, dev environment validation
- M02: Content API — depends on response envelope utilities, structured logging, health check contract, error code catalog, request ID propagation
- M04: Annotations — depends on PostgreSQL schemas (annotation tables created here)
- All milestones — depend on structured logging, response envelope, health check, and testing patterns established here
Issue Dependency Graph
M00-001 ──► M00-002 ──► M00-003
M00-001 ──► M00-004
M00-001 ──► M00-005 ──► M00-008
M00-001 ──► M00-006 ──► M00-008
M00-001 ──► M00-007
M00-001 ──► M00-009
M00-001 ──► M00-010
M00-001 ──► M00-013
M00-005 ──► M00-012
M00-011 (independent)
Legend:
A ──► B means A blocks B (B is blocked by A)
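The graph above can be checked mechanically. The sketch below transcribes its edges and uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to derive one valid work order; `static_order()` raises `graphlib.CycleError` if the graph ever becomes cyclic. The function name `build_order` is illustrative. The edge list has 12 entries, which matches the blocking-dependency count in the Summary table.

```python
from graphlib import TopologicalSorter

# Edges transcribed from the dependency graph: (blocker, blocked).
EDGES = [
    ("M00-001", "M00-002"), ("M00-002", "M00-003"),
    ("M00-001", "M00-004"),
    ("M00-001", "M00-005"), ("M00-005", "M00-008"),
    ("M00-001", "M00-006"), ("M00-006", "M00-008"),
    ("M00-001", "M00-007"),
    ("M00-001", "M00-009"),
    ("M00-001", "M00-010"),
    ("M00-001", "M00-013"),
    ("M00-005", "M00-012"),
]


def build_order(edges, independent=()):
    """Return one valid ordering in which every blocker precedes what it blocks."""
    sorter = TopologicalSorter()
    for issue in independent:
        sorter.add(issue)  # issues with no edges still need to appear
    for blocker, blocked in edges:
        sorter.add(blocked, blocker)  # add(node, *predecessors)
    return list(sorter.static_order())


order = build_order(EDGES, independent=["M00-011"])
```

Any ordering it returns starts with M00-001 or the independent M00-011, and places M00-008 after both M00-005 and M00-006.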
Summary
| Metric | Count |
|---|---|
| Total Issues | 13 |
| Sub-Issues | 3 |
| Total Estimate (pts) | 54 |
| Sprints | S0 |
| Dependencies (blocking) | 12 |
| Dependencies (blocked by) | 12 |