Service Communication
All external traffic enters through the API gateway. Internally, services communicate synchronously over HTTP and asynchronously over Redis Streams.
Communication Patterns
graph LR
Client["Client App"] -->|HTTPS| GW["API Gateway<br/>(port 8080)"]
GW -->|HTTP + Headers| Content["Content<br/>(8100)"]
GW -->|HTTP + Headers| Auth["Auth<br/>(8200)"]
GW -->|HTTP + Headers| Billing["Billing<br/>(8300)"]
GW -->|HTTP + Headers| AI["AI<br/>(8400)"]
GW -->|HTTP + Headers| Notif["Notifications<br/>(8500)"]
Stripe["Stripe"] -->|Webhook| GW
Clerk["Clerk"] -->|Webhook| GW
Auth -->|Redis Stream| Notif
Billing -->|Redis Stream| Notif
External API
All external APIs are REST + JSON, versioned at /api/v1/. No GraphQL on the external boundary — the query surface of a scripture graph is complex enough that a carefully designed REST API performs better and is easier to document, cache, and rate-limit.
Gateway Responsibilities
The gateway (Go/Chi) is the single ingress point. It handles:
- JWT validation against the Clerk JWKS endpoint (keys cached, refreshed every 5 min)
- Rate limiting per user and per IP (token bucket, stored in Redis)
- Request routing to downstream services via reverse proxy
- Header injection: X-Request-ID, X-User-Id, X-User-Plan
- CORS configuration
- Access logging (structured JSON)
The gateway does not contain business logic or access databases directly.
Middleware Stack
Middleware executes in this order:
RequestID → RealIP → OpenTelemetry → Logger → Recoverer → CORS → RateLimiter
→ (auth group: ValidateJWT → InjectUserClaims → Entitlement checks)
Entitlement Middleware
The gateway reads X-User-Plan (from validated JWT claims) and enforces feature access before proxying. Entitlement definitions are cached in Redis with a 60-second TTL — the gateway never calls the billing service in the hot path.
Entitlement check: O(1) Redis GET — never a downstream service call
Internal HTTP
Services communicate via direct HTTP within the cluster. The gateway injects headers that downstream services trust without re-validating the JWT (services are not publicly reachable).
| Header | Injected By | Purpose |
|---|---|---|
| X-Request-ID | Gateway | Correlation ID for distributed tracing |
| X-User-Id | Gateway | Authenticated user ID |
| X-User-Plan | Gateway | User's subscription plan (for plan-gated logic) |
Redis Streams (Async)
For fire-and-forget events (notifications, user sync), services publish to Redis Streams. Consumer groups with explicit acknowledgement (XACK) provide at-least-once delivery, so consumers must tolerate the occasional duplicate message.
| Stream | Publisher | Consumer | Purpose |
|---|---|---|---|
| gl:events:notifications | Any service | Notifications | Push/email dispatch |
| gl:events:users | Auth | Multiple | User lifecycle events |
| gl:events:ingest | Ingest | Content | Index refresh triggers |
sequenceDiagram
participant Auth
participant Redis as Redis Stream
participant Notifications
Auth->>Redis: XADD gl:events:users {user.created}
Notifications->>Redis: XREADGROUP notifications-workers
Redis-->>Notifications: {user.created}
Notifications->>Notifications: Send welcome email
Notifications->>Redis: XACK
Webhooks
External webhooks from Stripe and Clerk enter via the gateway's public routes, which skip JWT validation, and are forwarded to their respective services. Authenticity is established by the provider's signature instead.
| Provider | Endpoint | Service | Idempotency |
|---|---|---|---|
| Stripe | POST /api/v1/billing/webhook | Billing | stripe_event_id in gl_stripe_events table |
| Clerk | POST /api/v1/auth/webhook | Auth | Clerk event ID checked before processing |
Both webhook handlers:
- Verify the provider's signature header
- Check idempotency key in PostgreSQL
- Process the event
- Mark the event as processed
- Return 200
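The five steps can be sketched as follows. Several parts are assumptions: Stripe and Clerk each define their own signature header format, so the plain hex HMAC-SHA256 check here is a generic stand-in, and a real handler would more likely use the provider's SDK. `EventStore` is a hypothetical in-memory replacement for the PostgreSQL idempotency table. The 400 and 500 status codes, and returning 200 for a duplicate, are also assumptions, since only the final 200 is documented.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
)

var ErrBadSignature = errors.New("webhook: invalid signature")

// sign returns the hex HMAC-SHA256 of payload (generic scheme, assumption).
func sign(secret, payload []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the MAC and compares it in constant time.
func verify(secret, payload []byte, sigHex string) bool {
	got, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	return hmac.Equal(got, mac.Sum(nil))
}

// EventStore is a hypothetical stand-in for the idempotency table.
// It is not safe for concurrent use.
type EventStore struct{ processed map[string]bool }

// HandleWebhook runs the steps in order and returns an HTTP status code.
func HandleWebhook(store *EventStore, secret, payload []byte, sigHex, eventID string, process func([]byte) error) (int, error) {
	if !verify(secret, payload, sigHex) { // 1. verify signature
		return 400, ErrBadSignature
	}
	if store.processed[eventID] { // 2. idempotency check: already handled
		return 200, nil
	}
	if err := process(payload); err != nil { // 3. process the event
		return 500, err
	}
	store.processed[eventID] = true // 4. mark as processed
	return 200, nil                 // 5. acknowledge
}
```

Marking the event only after processing succeeds means a failed attempt can be retried by the provider. A database-backed version would also need the check and the mark to be atomic, so that concurrent deliveries of the same event are not both processed.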
Response Envelope
All API responses use a standard envelope format.
Success
{
"data": {
/* typed response */
},
"meta": {
"total": 150,
"next_cursor": "eyJpZCI6MTUwfQ=="
}
}
Error
{
"error": {
"code": "PASSAGE_NOT_FOUND",
"message": "No passage found with ID gen.99.99",
"request_id": "req_abc123"
}
}
Error codes are UPPER_SNAKE_CASE and machine-readable. The request_id is always included for support reference.
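One way to model the envelope in Go is sketched below. The type and function names (`Success`, `Failure`, `Meta`, `ErrorBody`, `encodeSuccess`, `encodeError`) are hypothetical; only the JSON field names come from the examples above.

```go
package main

import "encoding/json"

// Meta carries pagination details. With omitempty, a zero total or an
// empty cursor is left out of the output.
type Meta struct {
	Total      int    `json:"total,omitempty"`
	NextCursor string `json:"next_cursor,omitempty"`
}

// Success is the envelope for successful responses.
type Success struct {
	Data any   `json:"data"`
	Meta *Meta `json:"meta,omitempty"`
}

// ErrorBody is the payload under the "error" key.
type ErrorBody struct {
	Code      string `json:"code"` // UPPER_SNAKE_CASE, machine-readable
	Message   string `json:"message"`
	RequestID string `json:"request_id"`
}

// Failure is the envelope for error responses.
type Failure struct {
	Error ErrorBody `json:"error"`
}

// encodeSuccess serializes a success envelope.
func encodeSuccess(data any, meta *Meta) ([]byte, error) {
	return json.Marshal(Success{Data: data, Meta: meta})
}

// encodeError serializes an error envelope.
func encodeError(code, message, requestID string) ([]byte, error) {
	return json.Marshal(Failure{Error: ErrorBody{Code: code, Message: message, RequestID: requestID}})
}
```

Keeping the two envelopes as separate types makes it impossible to emit a response that contains both data and error.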
Cursor-Based Pagination
All list endpoints use cursor-based pagination via next_cursor in the meta object. This avoids the performance issues of offset-based pagination on large datasets.
GET /api/v1/passages/gen/1?cursor=eyJpZCI6MTB9
→ { data: [...], meta: { next_cursor: "eyJpZCI6MjB9" } }
When next_cursor is absent, the client has reached the end of the result set.
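The example cursors decode as base64-encoded JSON of the form {"id":N}, for instance eyJpZCI6MTB9 is {"id":10}. Assuming that shape holds in general, which the examples suggest but this page does not state, encoding and decoding can be sketched as follows. `encodeCursor` and `decodeCursor` are hypothetical names.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
)

// cursor is the assumed payload: the ID of the last item already returned.
type cursor struct {
	ID int `json:"id"`
}

// encodeCursor turns the last-seen ID into an opaque cursor string.
func encodeCursor(lastID int) (string, error) {
	raw, err := json.Marshal(cursor{ID: lastID})
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

// decodeCursor recovers the last-seen ID, rejecting malformed input.
func decodeCursor(s string) (int, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return 0, err
	}
	var c cursor
	if err := json.Unmarshal(raw, &c); err != nil {
		return 0, err
	}
	return c.ID, nil
}
```

A handler would then fetch rows with an ID greater than the decoded value, which lets the database use an index instead of scanning past skipped rows as an offset does. Clients should still treat the cursor as opaque.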
Rate Limiting
Rate limits are tier-based and endpoint-specific:
| Endpoint | Free Tier | Paid Tier | AI Tier |
|---|---|---|---|
| Search | 20/min | 200/min | — |
| Passages | 60/min | 600/min | — |
| Lexicon | 40/min | 400/min | — |
| AI requests | 5/hour | 50/hour | 200/hour |
Rate limit state is stored in Redis under keys of the form gl:ratelimit:<userId>:<endpoint>:<window>.
Related Pages
- Architecture Overview — Core principles including "data at the boundary"
- Redis & Caching — Cache key format and TTLs
- Authentication Flow — JWT validation details
- Entitlements — Plan-gated route configuration