
M00: Tech Prep

Version tag: None (prerequisite milestone, no release artifact) · Phase: P0: Foundation · Target: Weeks 1–2 · Sprint: S0


Phase Context

Goal: Corpus data flows from JSON files through the ingest pipeline into FalkorDB, and the content service serves passage data over HTTP.

Key constraint: Everything downstream depends on this. No reader, no interlinear, no graph view without a working data layer.


ZenHub Configuration

| Field | Value |
| --- | --- |
| Milestone | M00: Tech Prep |
| Due Date | 2026-03-22 |
| Default Pipeline | Product Backlog |
| Primary Epic(s) | Tech Prep Foundation |

Prerequisites

None — M00 is the first milestone. All downstream milestones depend on artifacts produced here.


Epic: Tech Prep Foundation

Establish cross-service patterns, tooling, and testing infrastructure before business logic begins.

| Story Area | Scope | Spec Reference |
| --- | --- | --- |
| Dev environment validation | Verify pnpm infra:up works, all services start with go run/uv run, setup.sh is current, document issues | TECH-SPEC.md § Development Workflow |
| PostgreSQL migration framework | Install golang-migrate, create migrations/ directory structure in auth + billing, add Nx migration targets (nx run auth:migrate-up), add migration CI step | TECH-SPEC.md § Data Architecture |
| Initial PostgreSQL schemas | gl_users, gl_user_devices, gl_subscriptions, gl_stripe_events, gl_highlights, gl_notes, gl_bookmarks, gl_reading_progress, gl_audit_log (all tables from tech spec) | TECH-SPEC.md § PostgreSQL Schema |
| Typesense collection schemas | Define passages, topics, lexicon collection schemas per tech spec | TECH-SPEC.md § Typesense Collections |
| Response envelope utilities | Implement SuccessResponse/ErrorResponse helpers in Go (gateway, auth, billing) and Python (content, ai) matching the API contract | TECH-SPEC.md § Response Envelope |
| Structured logging configuration | Configure zerolog (Go) and structlog (Python) with JSON format, request_id, service name, version fields across all services | TECH-SPEC.md § Observability |
| Health check contract | Implement /health (alive) and /ready (dependency check) on all services with consistent JSON response shape | TECH-SPEC.md § Health Endpoints |
| Request ID propagation | Gateway generates X-Request-ID; downstream services extract and include in logs and error responses | TECH-SPEC.md § Communication |
| Testing infrastructure | Create per-project vitest.config.ts for TS packages, implement conftest.py fixtures for Python services, create one reference Go table-driven test per service, validate all test runners in CI | TECH-SPEC.md § Testing |
| OpenAPI spec workflow | Content service auto-generates OpenAPI via FastAPI; set up gen-types.sh to pull /openapi.json and generate TS types, validate generation pipeline | TECH-SPEC.md § API Design |
| Env validation schemas | Implement Zod env schemas in @gospelib/config for web app env vars | TECH-SPEC.md § Shared Packages |
| Error code catalog | Define UPPER_SNAKE error codes as constants in Go and Python (PASSAGE_NOT_FOUND, INVALID_REQUEST, UNAUTHORIZED, etc.) | TECH-SPEC.md § Response Envelope |
| Corpus v1→v2 migration | Run convert_corpus.py --input corpus --output data, validate output against v2 schemas, confirm data/ is canonical for ingest | GOSPELIB-MIGRATION-SPEC.md, REPO-MAP.md § Data & Corpus |

⚠️ Note: GOSPELIB-MIGRATION-SPEC.md does not yet exist. The corpus v1→v2 migration is currently handled by convert_corpus.py at the repo root. The migration spec should be written before or during this milestone to document conversion rules and validation criteria.
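
The response envelope and error code catalog stories in the table above can be pictured with a minimal Python sketch. The envelope field names (`success`, `data`, `error`, `request_id`) and the helper names are assumptions for illustration only; the real contract lives in TECH-SPEC.md § Response Envelope, and the content and ai services use Pydantic models rather than plain dicts. Only the three error codes named in the story scope are included.

```python
from enum import Enum
from typing import Any, Optional


class ErrorCode(str, Enum):
    """UPPER_SNAKE error codes; these three names come from the story scope."""

    PASSAGE_NOT_FOUND = "PASSAGE_NOT_FOUND"
    INVALID_REQUEST = "INVALID_REQUEST"
    UNAUTHORIZED = "UNAUTHORIZED"


def success_response(data: Any, request_id: Optional[str] = None) -> dict:
    """Wrap a payload in a success envelope (field names are assumed)."""
    return {"success": True, "data": data, "error": None, "request_id": request_id}


def error_response(
    code: ErrorCode, message: str, request_id: Optional[str] = None
) -> dict:
    """Wrap an error in the same envelope shape so clients parse one format."""
    return {
        "success": False,
        "data": None,
        "error": {"code": code.value, "message": message},
        "request_id": request_id,
    }
```

Keeping both helpers on one shape means a client can branch on a single field and always find the error code in the same place, which is the point of defining the catalog as constants.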

Issues

| Issue | Title | Status | Notes |
| --- | --- | --- | --- |
| M00-001 | Dev Environment Validation | ✅ Done | setup.sh, health-check.sh, Docker infrastructure present |
| M00-002 | PostgreSQL Migration Framework | ✅ Done | golang-migrate in go.mod; auth/billing have migration dirs |
| M00-003 | Initial PostgreSQL Schemas | ✅ Done | Migrations 001-005 with CREATE TABLE + rollback files |
| M00-004 | Typesense Collection Schemas | ✅ Done | typesense.py (108 lines) defines all 3 collection schemas (passages, topics, lexicon) with idempotent create_collections(); handles ObjectAlreadyExists |
| M00-005 | Response Envelope Utilities | ✅ Done | response.go in gateway; Pydantic models in content/ai; TS envelope types in @gospelib/types |
| M00-006 | Structured Logging Configuration | ✅ Done | logger.go middleware in gateway; structlog in Python services |
| M00-007 | Health Check Contract | ✅ Done | health.go handlers in all Go services with tests |
| M00-008 | Request ID Propagation | ✅ Done | request_id.go middleware; all services bind to logging context |
| M00-009 | Testing Infrastructure | ✅ Done | vitest.config.ts in all TS packages; Go table-driven tests; Python conftest.py |
| M00-010 | OpenAPI Spec Workflow | ✅ Done | gen-types.sh + generated TS types at packages/types/src/api/generated.d.ts |
| M00-011 | Env Validation Schemas | ✅ Done | packages/config/src/env.ts with Zod validators |
| M00-012 | Error Code Catalog | ✅ Done | errors.py in content/ai (18+ codes); Go error codes in gateway |
| M00-013 | Corpus v1→v2 Migration | ✅ Done | convert_corpus.py executed; corpus v2 structure validated |

Progress: 13 Done · 0 Partial · 0 To Do (100%)
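
For M00-006 and M00-008, the idea of binding the request ID to the logging context can be sketched with the standard library alone. This is not the project's implementation: the Python services use structlog, and the names `bind_request_id` and `JsonFormatter` are hypothetical. Minting a fresh ID when the header is missing is also an assumption, since the spec says the gateway generates X-Request-ID.

```python
import contextvars
import json
import logging
import uuid
from typing import Mapping

REQUEST_ID_HEADER = "X-Request-ID"

# Context variable so every log line in the same request sees the same ID.
_request_id = contextvars.ContextVar("request_id", default="-")


def bind_request_id(headers: Mapping[str, str]) -> str:
    """Reuse the inbound X-Request-ID, or mint one if it is missing (assumed fallback)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    rid = lowered.get(REQUEST_ID_HEADER.lower()) or str(uuid.uuid4())
    _request_id.set(rid)
    return rid


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record with the fields named in the logging story."""

    def __init__(self, service: str, version: str) -> None:
        super().__init__()
        self.service = service
        self.version = version

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname.lower(),
                "message": record.getMessage(),
                "service": self.service,
                "version": self.version,
                "request_id": _request_id.get(),
            }
        )
```

A service would call `bind_request_id(request.headers)` at the start of each request and attach `JsonFormatter("content", "0.1.0")` to its log handler; the service name and version shown here are placeholders.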


Document References

| Doc | Contains | Use When Writing Stories For |
| --- | --- | --- |
| MVP.md | Feature scope, tier breakdown, success criteria, budget | Acceptance criteria, scope boundaries |
| TECH-SPEC.md | Architecture, service boundaries, data stores, API catalog | Technical implementation details |
| GOSPELIB-SCHEMAS.md | All 7 schema families, node/edge types, validation rules | Data models, Pydantic models, graph schema |
| GOSPELIB-INGEST-SPEC.md | 7-stage pipeline, Cypher templates, batch strategy, CLI | Ingest pipeline stories |
| DESIGN-SYSTEM.md | Visual identity, component catalog, reader modes, tokens | UI component stories |
| Deployment & Operations | Environments, K8s, CI/CD, migrations, secrets, DR | Infrastructure and deployment stories |
| REPO-MAP.md | Directory structure, naming conventions, dependency rules | All stories (coding standards) |
| GOSPELIB-MIGRATION-SPEC.md | v1→v2 corpus migration, conversion rules (doc not yet written; migration handled via convert_corpus.py) | M00 data preparation stories |
| Business | LEGAL.md, POLICY-TERMS.md, executive summary, market research, GTM | Launch readiness, legal/compliance stories |

Sprint Mapping

| Sprint | Weeks | Primary Focus |
| --- | --- | --- |
| S0 | 1–2 | Tech Prep: Dev environment validation, PostgreSQL migration framework + initial schemas, response envelope utilities, structured logging, health check contracts, testing infrastructure, OpenAPI workflow |

Sprint Load Warnings

No load warnings apply to S0. However, from the risk registry:

Tech prep scope creep — Strict 2-week timebox for Sprint 0; defer nice-to-haves; only implement what's blocking downstream work.


Release Info

No release tag — M00 is a prerequisite milestone with no release artifact. It produces verified readiness for all downstream work.


Relevant Risks

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Tech prep scope creep | Delays P0 ingest work | Strict 2-week timebox for Sprint 0; defer nice-to-haves; only implement what's blocking downstream work |
| Missing specification documents | GOSPELIB-MIGRATION-SPEC.md not written | Document corpus migration process before M01 |

Cross-Cutting Concerns

Testing

| Layer | Framework | When | Spec Reference |
| --- | --- | --- | --- |
| Python unit/integration | pytest + testcontainers | Every PR | GOSPELIB-INGEST-SPEC.md § Testing |
| Go unit | go test -race + table-driven | Every PR | TECH-SPEC.md § Testing |
| TypeScript unit | Vitest | Every PR | TECH-SPEC.md § Testing |
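
The table-driven style named for the Go layer carries over to the Python services as well. The sketch below applies it to a hypothetical `readiness_payload` helper for the /ready endpoint. The payload shape (`status`, `checks`) and the dependency names are assumptions, not the contract from TECH-SPEC.md § Health Endpoints, and the plain loop stands in for what would normally be a pytest parametrized test.

```python
from typing import Callable, Dict


def readiness_payload(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each dependency check; the service is ready only if all pass."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "fail"
        except Exception:
            # A check that raises is treated the same as a failed check.
            results[name] = "fail"
    ready = all(value == "ok" for value in results.values())
    return {"status": "ready" if ready else "not_ready", "checks": results}


def _raises() -> bool:
    raise ConnectionError("dependency unreachable")


# Each row: (case name, dependency checks, expected status).
CASES = [
    ("all dependencies up", {"falkordb": lambda: True, "typesense": lambda: True}, "ready"),
    ("one dependency down", {"falkordb": lambda: True, "typesense": lambda: False}, "not_ready"),
    ("check raises", {"falkordb": _raises}, "not_ready"),
    ("no dependencies", {}, "ready"),
]


def test_readiness_payload() -> None:
    for name, checks, want in CASES:
        got = readiness_payload(checks)["status"]
        assert got == want, f"{name}: got {got!r}, want {want!r}"
```

Adding a new scenario means adding one row to `CASES`, which keeps the reference test short as more dependencies are introduced.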

Documentation

| Doc | Update Trigger |
| --- | --- |
| ADRs | Each major technical decision |

CI/CD

| Addition | Detail |
| --- | --- |
| Python test containers in CI | FalkorDB + PostgreSQL service containers for ingest integration tests |

Dependencies

Upstream (what M00 needs)

  • Functioning monorepo tooling (Nx, pnpm, CI, Docker Compose) — already operational
  • Corpus data files in corpus/ — already complete (~966 files, ~215 MB)

Downstream (what depends on M00)

  • M01: Data Pipeline — depends on testing infrastructure, corpus v1→v2 migration, dev environment validation
  • M02: Content API — depends on response envelope utilities, structured logging, health check contract, error code catalog, request ID propagation
  • M04: Annotations — depends on PostgreSQL schemas (annotation tables created here)
  • All milestones — depend on structured logging, response envelope, health check, and testing patterns established here

Issue Dependency Graph

M00-001 ──► M00-002 ──► M00-003
M00-001 ──► M00-004
M00-001 ──► M00-005 ──► M00-008
M00-001 ──► M00-006 ──► M00-008
M00-001 ──► M00-007
M00-001 ──► M00-009
M00-001 ──► M00-010
M00-001 ──► M00-013
M00-005 ──► M00-012
M00-011 (independent)

Legend: A ──► B means A blocks B (B is blocked by A)
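
The graph above can be checked mechanically. The following sketch transcribes its edges and uses the standard-library `graphlib` module (Python 3.9+) to confirm that the graph has no cycles and to derive one valid work order. The `build_order` helper is a hypothetical name; the edges are copied from the graph as written.

```python
from graphlib import TopologicalSorter

# (blocker, blocked) pairs transcribed from the issue dependency graph.
EDGES = [
    ("M00-001", "M00-002"), ("M00-002", "M00-003"),
    ("M00-001", "M00-004"),
    ("M00-001", "M00-005"), ("M00-005", "M00-008"),
    ("M00-001", "M00-006"), ("M00-006", "M00-008"),
    ("M00-001", "M00-007"),
    ("M00-001", "M00-009"),
    ("M00-001", "M00-010"),
    ("M00-001", "M00-013"),
    ("M00-005", "M00-012"),
]

# M00-001 through M00-013, including the independent M00-011.
ALL_ISSUES = [f"M00-{n:03d}" for n in range(1, 14)]


def build_order(edges, nodes):
    """Return one valid work order; raises graphlib.CycleError on a cycle."""
    sorter = TopologicalSorter({node: set() for node in nodes})
    for blocker, blocked in edges:
        sorter.add(blocked, blocker)  # blocked depends on blocker
    return list(sorter.static_order())


order = build_order(EDGES, ALL_ISSUES)
```

The edge list has 12 entries, and every issue other than M00-001 and M00-011 is blocked by at least one other issue.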


Summary

| Metric | Count |
| --- | --- |
| Total Issues | 13 |
| Sub-Issues | 3 |
| Total Estimate (pts) | 54 |
| Sprints | S0 |
| Dependencies (blocking) | 12 |
| Dependencies (blocked by) | 12 |