Deploy to Staging
The staging environment runs on a single EC2 instance with k3s (lightweight Kubernetes), managed data stores (RDS PostgreSQL, ElastiCache Redis), and in-cluster FalkorDB and Typesense. ArgoCD watches the stage branch and auto-syncs manifests. The CD workflow builds and pushes images on every push to stage.
Architecture Overview
| Component | Staging | Notes |
|---|---|---|
| Compute | k3s on EC2 t3.medium | 1 replica per service |
| Database | RDS db.t3.micro (PostgreSQL 16 + pgvector) | Free-tier eligible |
| Cache | ElastiCache cache.t3.micro (Redis 7) | Free-tier eligible |
| Graph DB | FalkorDB StatefulSet (in-cluster) | Redis protocol on port 6379 |
| Search | Typesense StatefulSet (in-cluster) | HTTP on port 8108 |
| Ingress | nginx-ingress controller | Installed via k3s user_data |
| TLS | cert-manager + Let's Encrypt | HTTP-01 solver via nginx |
| GitOps | ArgoCD | Auto-sync with prune + self-heal |
| CI/CD | GitHub Actions (cd-staging.yml) | Triggers on push to stage |
Deployed services: gateway, content, auth, billing, ai, notifications
Services with Dockerfiles but no k8s manifests yet: ingest, plugin-registry
Prerequisites
Complete these one-time setup steps before starting.
1. AWS Account Bootstrapping
The Terraform S3 backend requires a state bucket and DynamoDB lock table. These must exist before terraform init.
aws s3api create-bucket \
--bucket gospelib-terraform-state \
--region us-east-1
aws s3api put-bucket-versioning \
--bucket gospelib-terraform-state \
--versioning-configuration Status=Enabled
aws s3api put-bucket-encryption \
--bucket gospelib-terraform-state \
--server-side-encryption-configuration \
'{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
aws dynamodb create-table \
--table-name gospelib-terraform-lock \
--attribute-definitions AttributeName=LockID,AttributeType=S \
--key-schema AttributeName=LockID,KeyType=HASH \
--billing-mode PAY_PER_REQUEST \
--region us-east-1
2. EC2 SSH Key Pair
Create a key pair in the AWS Console (or CLI) in us-east-1. The name must match the ssh_key_name variable in your terraform.tfvars.
aws ec2 create-key-pair \
--key-name gospelib-staging \
--query 'KeyMaterial' \
--output text > ~/.ssh/gospelib-staging.pem
chmod 600 ~/.ssh/gospelib-staging.pem
3. Route53 Hosted Zone
A hosted zone for gospelib.com must exist. Note the zone ID — you'll need it for terraform.tfvars.
aws route53 list-hosted-zones-by-name \
--dns-name gospelib.com \
--query 'HostedZones[0].Id' \
--output text
4. GitHub OIDC Provider for AWS
The CD workflow authenticates to AWS via OIDC (no long-lived credentials). Create the identity provider once per AWS account:
aws iam create-open-id-connect-provider \
--url https://token.actions.githubusercontent.com \
--client-id-list sts.amazonaws.com \
--thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1
Then create an IAM role that trusts GitHub Actions for this repo. The role needs:
- ecr:GetAuthorizationToken, ecr:BatchGetImage, ecr:GetDownloadUrlForLayer, ecr:PutImage, ecr:InitiateLayerUpload, ecr:UploadLayerPart, ecr:CompleteLayerUpload, ecr:BatchCheckLayerAvailability
- Scoped to the repo via the trust policy's sub condition (e.g., repo:gospelib/main:ref:refs/heads/stage)
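A minimal sketch of a trust policy for that role, written as a shell function that prints the JSON. The function name, the account ID 123456789012, and the role name gospelib-staging-cd are placeholders for illustration; the project's real role and policy may differ.

```shell
# Print an IAM trust policy that lets GitHub Actions assume a role via OIDC.
# $1 = AWS account ID, $2 = expected "sub" claim of the GitHub token.
# (Hypothetical helper, not part of the repo.)
github_oidc_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$1:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "$2"
        }
      }
    }
  ]
}
EOF
}

# Example usage (placeholder account ID and role name):
# github_oidc_trust_policy 123456789012 \
#   "repo:gospelib/main:ref:refs/heads/stage" > trust-policy.json
# aws iam create-role \
#   --role-name gospelib-staging-cd \
#   --assume-role-policy-document file://trust-policy.json
```

If the workflow job declares a GitHub environment (for example environment: staging), the token's sub claim takes the form repo:<owner>/<repo>:environment:<name> instead of the branch ref, so the condition would need to match that value.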
5. GitHub Repository Secrets
Set these in the repo's Settings > Secrets and variables > Actions under the staging environment:
| Secret | Value |
|---|---|
| AWS_ROLE_ARN | ARN of the IAM role from step 4 |
| ECR_REGISTRY | <account-id>.dkr.ecr.us-east-1.amazonaws.com |
6. Local Tooling
Install on your workstation:
- Terraform >= 1.9
- kubectl
- Helm (for ArgoCD installation)
- AWS CLI v2
- pnpm and Node.js 20 (for the CD workflow; already required by the monorepo)
Step 1: Provision Infrastructure with Terraform
cd infra/terraform/environments/staging
# Copy the example and fill in real values
cp terraform.tfvars.example terraform.tfvars
Edit terraform.tfvars with your values:
environment = "staging"
aws_region = "us-east-1"
route53_zone_id = "Z0123456789ABCDEF" # Your actual zone ID
ssh_key_name = "gospelib-staging" # Must match the key pair name
k3s_instance_type = "t3.medium"
# REQUIRED: Your IP ranges for SSH and k3s API access
admin_cidr_blocks = ["203.0.113.0/24"]
# Pass the DB password via env var instead of committing it:
# export TF_VAR_db_password="<strong-random-password>"
Never commit terraform.tfvars. It is gitignored, but double-check before pushing. Pass db_password via the TF_VAR_db_password environment variable rather than writing it to the file.
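One way to produce the password is sketched below. gen_db_password is a hypothetical helper, not something the repo provides. It emits 48 hex characters, which has the side benefit that the value needs no percent-encoding when it is later embedded in a connection URL.

```shell
# Generate a 48-character hex password (24 random bytes).
# Uses openssl when available, otherwise falls back to /dev/urandom.
gen_db_password() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 24
  else
    head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Example usage: export once per shell session before running terraform.
# export TF_VAR_db_password="$(gen_db_password)"
```

Store the generated value somewhere safe (a password manager, for instance), since the same value is needed again when creating the database secret.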
terraform init
terraform plan -out=plan.tfplan
terraform apply plan.tfplan
This creates:
- ECR repositories for each service
- RDS PostgreSQL instance (free-tier db.t3.micro)
- ElastiCache Redis cluster (free-tier cache.t3.micro)
- S3 artifacts bucket
- Secrets Manager entries
- Route53 DNS records (staging.gospelib.com, api-staging.gospelib.com)
- EC2 instance with Elastic IP, pre-configured with:
  - k3s (Traefik disabled)
  - nginx-ingress controller
  - gospelib-staging namespace
Note the outputs — you'll need them for secrets:
terraform output
Step 2: Configure kubectl
The EC2 user_data installs k3s and writes kubeconfig to /home/ubuntu/.kube/config. Copy it locally:
K3S_IP=$(terraform output -raw k3s_public_ip)
scp -i ~/.ssh/gospelib-staging.pem \
ubuntu@${K3S_IP}:/home/ubuntu/.kube/config \
~/.kube/gospelib-staging.yaml
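# Update the server address from localhost to the public IP.
# The empty '' after -i is BSD/macOS sed syntax; with GNU sed on Linux, drop it (sed -i "s|...|...|g").
sed -i '' "s|127.0.0.1|${K3S_IP}|g" ~/.kube/gospelib-staging.yaml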
export KUBECONFIG=~/.kube/gospelib-staging.yaml
kubectl get nodes
You should see a single node in Ready state.
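The in-place sed flag differs between macOS and Linux, so a portable variant of the rewrite step may be useful. The sketch below avoids sed -i by writing to a temporary file. point_kubeconfig_at is a hypothetical helper name, and it assumes the kubeconfig contains the usual k3s entry server: https://127.0.0.1:6443.

```shell
# Rewrite the API server address in a kubeconfig without relying on `sed -i`.
# $1 = kubeconfig path, $2 = public IP of the k3s node.
point_kubeconfig_at() {
  tmp="$1.tmp"
  # Replace only the loopback server URL, then swap the file into place.
  sed "s|https://127\.0\.0\.1:|https://$2:|g" "$1" > "$tmp" \
    && mv "$tmp" "$1" \
    && chmod 600 "$1"   # kubeconfig holds credentials; keep it private
}

# Example usage:
# point_kubeconfig_at ~/.kube/gospelib-staging.yaml "${K3S_IP}"
```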
Step 3: Install ArgoCD
ArgoCD watches the stage branch and auto-deploys when Kustomize manifests change.
kubectl create namespace argocd
kubectl apply -n argocd \
-f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Wait for ArgoCD to be ready
kubectl wait --for=condition=available deployment/argocd-server \
-n argocd --timeout=300s
Get the initial admin password:
kubectl -n argocd get secret argocd-initial-admin-secret \
-o jsonpath="{.data.password}" | base64 -d
Register the staging ArgoCD Application. The manifest at infra/k8s/argocd/application.yaml points to the stage branch and the infra/k8s/overlays/staging path:
kubectl apply -f infra/k8s/argocd/application.yaml
To access the ArgoCD UI, port-forward: kubectl port-forward svc/argocd-server -n argocd 8080:443. Then visit https://localhost:8080 and log in with admin and the password from above.
Step 4: Install cert-manager
cert-manager provisions TLS certificates from Let's Encrypt automatically.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml
# Wait for cert-manager to be ready
kubectl wait --for=condition=available deployment/cert-manager \
-n cert-manager --timeout=120s
kubectl wait --for=condition=available deployment/cert-manager-webhook \
-n cert-manager --timeout=120s
The ClusterIssuer (letsencrypt-prod) is included in the base Kustomize resources and will be applied by ArgoCD automatically. It uses HTTP-01 challenges via the nginx ingress class.
Step 5: Create Kubernetes Secrets
The Kustomize overlay injects secrets per-service via envFrom. Each service references specific secrets by name. Create them in the gospelib-staging namespace.
Database credentials
RDS_ENDPOINT=$(terraform output -raw rds_endpoint)
REDIS_ENDPOINT=$(terraform output -raw redis_endpoint)
kubectl create secret generic gospelib-database \
-n gospelib-staging \
--from-literal=DATABASE_URL="postgresql://gospelib:${TF_VAR_db_password}@${RDS_ENDPOINT}:5432/gospelib?sslmode=require" \
--from-literal=REDIS_URL="redis://${REDIS_ENDPOINT}:6379"
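If the database password contains characters that are reserved in URLs (such as @, /, : or #), embedding it verbatim in DATABASE_URL will produce a connection string that parses incorrectly. A minimal sketch of percent-encoding it first is shown below. urlencode is a hypothetical helper and assumes an ASCII password.

```shell
# Percent-encode a string for use inside a URL (ASCII input assumed).
# Unreserved characters (letters, digits, - . _ ~) pass through unchanged.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

# Example usage:
# DB_PASSWORD_ENC=$(urlencode "$TF_VAR_db_password")
# ...then use ${DB_PASSWORD_ENC} in place of ${TF_VAR_db_password} in DATABASE_URL.
```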
Auth service secrets
kubectl create secret generic gospelib-auth \
-n gospelib-staging \
--from-literal=CLERK_SECRET_KEY="sk_test_xxx"
Billing service secrets
kubectl create secret generic gospelib-billing \
-n gospelib-staging \
--from-literal=STRIPE_SECRET_KEY="sk_test_xxx" \
--from-literal=STRIPE_WEBHOOK_SECRET="whsec_xxx"
AI service secrets
kubectl create secret generic gospelib-ai \
-n gospelib-staging \
--from-literal=ANTHROPIC_API_KEY="sk-ant-xxx" \
--from-literal=OPENAI_API_KEY="sk-xxx"
Search service secrets
kubectl create secret generic gospelib-search \
-n gospelib-staging \
--from-literal=TYPESENSE_API_KEY="$(openssl rand -hex 32)"
Notifications service secrets
kubectl create secret generic gospelib-notifications \
-n gospelib-staging \
--from-literal=RESEND_API_KEY="re_xxx"
Observability (optional)
kubectl create secret generic gospelib-observability \
-n gospelib-staging \
--from-literal=SENTRY_DSN="https://xxx@sentry.io/xxx"
The gospelib-observability secret is referenced with optional: true in the Kustomize patches. Services will start without it — add it when you're ready to enable Sentry.
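Before moving on, it can help to confirm that every required secret exists, since a missing one typically shows up later as a crash-looping pod. The sketch below compares the output of kubectl get secrets against the names used in this step. missing_secrets is a hypothetical helper, and the name list is copied from the commands above (the optional observability secret is left out).

```shell
# Read `kubectl get secrets -o name` output on stdin and print any
# required secret names that are absent.
missing_secrets() {
  existing=$(sed 's|^secret/||')
  for name in gospelib-database gospelib-auth gospelib-billing \
              gospelib-ai gospelib-search gospelib-notifications; do
    printf '%s\n' "$existing" | grep -Fxq -- "$name" || printf '%s\n' "$name"
  done
}

# Example usage (prints nothing when all secrets are present):
# kubectl get secrets -n gospelib-staging -o name | missing_secrets
```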
Step 6: Initial Deployment
Build and push images manually (first time)
Authenticate with ECR and push all service images:
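# -json is required here: the jq filter expects a map, and -raw only supports strings, numbers, and booleans
ECR_REGISTRY=$(terraform output -json ecr_repository_urls | jq -r 'to_entries[0].value' | cut -d/ -f1)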
aws ecr get-login-password --region us-east-1 \
| docker login --username AWS --password-stdin ${ECR_REGISTRY}
for svc in gateway content auth billing ai notifications; do
docker build -t ${ECR_REGISTRY}/gospelib-${svc}:latest services/${svc}/
docker push ${ECR_REGISTRY}/gospelib-${svc}:latest
echo "Pushed ${svc}"
done
Trigger ArgoCD sync
ArgoCD should auto-sync within 3 minutes. To force an immediate sync:
kubectl -n argocd exec deploy/argocd-server -- \
argocd app sync gospelib-staging --force
Or if you have the argocd CLI installed:
argocd app sync gospelib-staging
Step 7: Run Initial Data Ingest
kubectl apply -f infra/k8s/jobs/ingest-full.yaml -n gospelib-staging
# Follow the logs
kubectl logs -f job/ingest-full -n gospelib-staging
Step 8: Verify
# All pods running
kubectl get pods -n gospelib-staging
# Health endpoints
curl https://api-staging.gospelib.com/health
curl https://api-staging.gospelib.com/ready
# Test a passage query
curl https://api-staging.gospelib.com/api/v1/passages/gen.1.1
# Web app
curl -I https://staging.gospelib.com
Continuous Deployment (Automatic)
After the initial setup, deployments are fully automatic:
- Code is pushed/merged to the stage branch
- GitHub Actions (cd-staging.yml) detects affected services via pnpm nx show projects --affected
- Only changed services are built and pushed to ECR (tagged with the commit SHA)
- The workflow updates image tags in infra/k8s/overlays/staging/kustomization.yaml via kustomize edit set image and commits the change back to stage
- ArgoCD detects the manifest change and syncs the cluster
- The workflow polls https://api-staging.gospelib.com/health for up to 5 minutes to confirm the deploy succeeded
No manual intervention is needed after the initial setup.
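The final health-poll step can also be reproduced by hand when checking a deploy. The following is a rough sketch of such a loop, not the actual contents of cd-staging.yml. poll_until is a hypothetical helper, and the 10-second interval is an assumption (only the 5-minute limit comes from the description above).

```shell
# Run a command repeatedly until it succeeds or a timeout elapses.
# $1 = timeout in seconds, $2 = interval in seconds, rest = command to run.
# Returns 0 on the first success, 1 if the timeout is reached.
poll_until() {
  poll_timeout=$1
  poll_interval=$2
  shift 2
  poll_elapsed=0
  while :; do
    if "$@"; then
      return 0
    fi
    poll_elapsed=$((poll_elapsed + poll_interval))
    if [ "$poll_elapsed" -ge "$poll_timeout" ]; then
      return 1
    fi
    sleep "$poll_interval"
  done
}

# Example usage: wait up to 5 minutes for the health endpoint.
# poll_until 300 10 curl -fsS -o /dev/null https://api-staging.gospelib.com/health \
#   && echo "deploy healthy" || echo "deploy did not become healthy in time"
```

Taking the probe as a command rather than hard-coding curl keeps the loop reusable for other checks, such as a kubectl rollout status call.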
Troubleshooting
Pods stuck in CrashLoopBackOff
kubectl logs <pod-name> -n gospelib-staging --previous
kubectl describe pod <pod-name> -n gospelib-staging
Common causes:
- Missing secrets: A secret referenced in envFrom doesn't exist. Check that the secret names exactly match those in the Kustomize overlay.
- Incorrect DATABASE_URL: Verify the RDS endpoint and password are correct.
- Port conflicts: FalkorDB and Redis both use port 6379; they're disambiguated by service DNS names.
ArgoCD not syncing
# Check application status
kubectl -n argocd get applications
# Check sync status and any errors
kubectl -n argocd describe application gospelib-staging
Common causes:
- ArgoCD can't reach the GitHub repo (check repo credentials)
- targetRevision is set to the wrong branch (should be stage)
- Kustomize build errors in the overlay
Cannot reach FalkorDB
kubectl get svc -n gospelib-staging | grep falkordb
kubectl exec -it <falkordb-pod> -n gospelib-staging -- redis-cli PING
ECR pull failures
Ensure the EC2 instance profile has ECR read permissions:
- ecr:GetAuthorizationToken
- ecr:BatchGetImage
- ecr:GetDownloadUrlForLayer
- ecr:BatchCheckLayerAvailability
DNS not resolving
Check that Route53 records point to the Elastic IP:
dig staging.gospelib.com
dig api-staging.gospelib.com
# Compare with
terraform output k3s_public_ip
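The comparison can be automated with a small helper that reads dig +short output and checks it against the expected address. check_dns_answer is a hypothetical name used here for illustration only.

```shell
# Read resolved addresses on stdin (one per line, as printed by `dig +short`)
# and check that the expected IP is among them.
# $1 = expected IP address.
check_dns_answer() {
  expected=$1
  answers=$(cat)
  if printf '%s\n' "$answers" | grep -Fxq -- "$expected"; then
    echo "ok: resolves to $expected"
  else
    echo "mismatch: expected $expected, got: $(printf '%s' "$answers" | tr '\n' ' ')"
    return 1
  fi
}

# Example usage (run from the Terraform staging directory):
# EXPECTED=$(terraform output -raw k3s_public_ip)
# dig +short staging.gospelib.com | check_dns_answer "$EXPECTED"
# dig +short api-staging.gospelib.com | check_dns_answer "$EXPECTED"
```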
TLS certificate not issuing
kubectl get certificates -n gospelib-staging
kubectl describe certificate <name> -n gospelib-staging
kubectl get challenges -n gospelib-staging
Common causes:
- cert-manager not installed or not ready
- Ingress class mismatch (ClusterIssuer expects nginx)
- Port 80 not reachable from the internet (check security group)
Cost Estimate
| Resource | Monthly Cost |
|---|---|
| EC2 t3.medium (on-demand) | ~$30 |
| RDS db.t3.micro | Free tier (first 12 months), then ~$15 |
| ElastiCache cache.t3.micro | Free tier (first 12 months), then ~$13 |
| Elastic IP | Free (while attached to running instance) |
| Route53 hosted zone | $0.50 |
| ECR storage | ~$1 (varies with image count) |
| S3 state bucket | < $0.10 |
| Total | ~$32/mo within the free tier (~$60/mo after it expires) |
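As a quick check, the rows in the table can be summed directly. The sketch below uses the table's approximate figures, treats the S3 line as its $0.10 upper bound, and uses a hypothetical monthly_total helper. Note that the EC2 t3.medium charge applies in both cases, since only the RDS and ElastiCache rows are free-tier items.

```shell
# Sum a list of monthly dollar amounts and print the total with two decimals.
monthly_total() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum }'
}

# Order: EC2, RDS, ElastiCache, Elastic IP, Route53, ECR, S3
# Within the free tier (RDS and ElastiCache at $0):
# monthly_total 30 0 0 0 0.50 1 0.10     # prints 31.60
# After the free tier expires:
# monthly_total 30 15 13 0 0.50 1 0.10   # prints 59.60
```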