RFC-079: Full-Stack Docker Compose Topology¶
- Status: Implemented
- Authors: Marko
- Created: 2026-04-22
- Domain: Infrastructure / DevOps
- Related RFCs:
  - `docs/rfc/RFC-077-viewer-feeds-and-serve-pipeline-jobs.md` (jobs API, operator config)
  - `docs/rfc/RFC-078-ephemeral-acceptance-smoke-test.md` (consumes this topology for CI smoke tests)
- Tracking:
  - Stack implementation (Phase 1): GitHub #659
  - Jobs API ↔ Docker pipeline execution (Phase 2): GitHub #660
  - #659 and #660 both implement this RFC (RFC-079). RFC-078 documents smoke acceptance on top of this stack; GitHub issues for that smoke work are not opened yet (RFCs are not the task tracker).
- Implementation (Phase 1 / #659): `compose/docker-compose.stack.yml`, `docker/viewer/` (Nginx + Vue build), `docker/api/` (FastAPI serve), Makefile `stack-*` targets, `config/examples/docker-stack.example.yaml`, and Full stack (RFC-079) in `docs/guides/DOCKER_SERVICE_GUIDE.md`. Acceptance: build, `up -d`, `curl …/api/health` via Nginx, `compose run pipeline` with `CONFIG_FILE` + `output_dir: /app/output`.
Abstract¶
Define a production-grade Docker Compose topology that packages the full podcast_scraper
stack into three containers: Nginx (Vue static build + reverse proxy), API (FastAPI
backend), and Pipeline (one-shot ML pipeline runner). All three share a single data volume
for corpus output. This gives us a single docker compose up that runs the viewer and API,
and docker compose run pipeline to execute ingestion — matching the production deployment
model from day one.
This RFC was delivered in two phases. Phase 1 (#659) shipped the compose topology,
images, Makefile targets, and documentation. Phase 2 (#660) implemented Docker job
execution: when PODCAST_PIPELINE_EXEC_MODE=docker, POST /api/jobs delegates to
docker compose run into the pipeline or pipeline-llm service via a factory
and host Docker socket (Option B from the design evaluation below). Native laptop / venv
subprocess jobs remain the default and are unchanged (see §Native vs Docker).
Problem Statement¶
Today the stack runs as three loosely coupled processes on the developer's machine:
- `make serve-api`: FastAPI on port 8000, reads corpus output from a local directory
- `make serve-ui`: Vite dev server on port 5173, proxies `/api/*` to FastAPI
- Pipeline CLI: `python -m podcast_scraper.service` or via Makefile, writes to output dir
This works for local dev but has no path to deployment:
- The viewer has no production build-and-serve story (only Vite dev server)
- The existing `docker/pipeline/Dockerfile` packages only the pipeline runner; the API server and viewer are not containerized
- The existing `compose/docker-compose.yml` defines only a `podcast_scraper` service (pipeline); there is no compose service for the API or viewer
- Without a compose topology, RFC-078 (ephemeral smoke test) cannot spin up the full stack in CI
Use Cases:
- Local production-like environment: developer runs `docker compose up` and gets the full viewer + API on the Nginx port (`localhost:${VIEWER_PORT:-8080}` by default), backed by real corpus data
- CI smoke test (RFC-078): GitHub Actions builds the compose images, runs the pipeline against fixtures, starts the server, and runs Playwright assertions
- Future prod deployment: the same compose file (with prod overrides) deploys to a VPS via `docker compose pull && docker compose up -d`
Goals¶
- Three-container topology: Nginx, API, Pipeline — each with a clear single responsibility
- Shared data volume: pipeline writes, API reads, Nginx never touches data
- One `docker compose up` starts the viewer (Nginx) + API; pipeline is invoked separately
- Production-grade Nginx: serves pre-built Vue SPA, reverse-proxies `/api/*` to FastAPI, handles static caching headers, gzip, and SPA fallback routing
- Reuse existing pipeline Dockerfile (`docker/pipeline/Dockerfile`) with minimal changes
- New Dockerfiles under `docker/api/` and `docker/viewer/`: the API image carries the runtime HTTP stack plus semantic-search dependencies (see `docker/api/Dockerfile`; not the full `.[dev]` tooling and not the full `.[ml]` extras), and the viewer image is a multi-stage Vue build served by Nginx
- Health checks on all long-running containers
- Compose profiles for CI-override and prod-override use cases. Resolved: no `compose/docker-compose.dev.yml`; host dev uses `make serve-*`; CI/prod uses the stack-test overlay + optional `compose/docker-compose.prod.yml` (see OQ2).
- Document and track the gap between viewer-triggered jobs and the `pipeline` service, closed by GitHub #660 once the stack exists
Constraints and Assumptions¶
Constraints:
- No orchestration beyond Docker Compose (no Kubernetes, no Swarm)
- No external services (no DB, no Redis, no message queue) — all state is file-based
- Pipeline is one-shot (runs and exits); it is NOT a long-running service
- Single-host deployment (one machine, one person)
- ML model loading makes the pipeline image large (3-4 GB); the API image must stay slim
Assumptions:
- The API server imports search/FAISS routes at startup, so `pip install -e '.[dev]'` alone is not sufficient for `create_app`. The stack API image installs the runtime HTTP stack (see `docker/api/Dockerfile`; not the full `.[dev]` tooling) plus NumPy, `faiss-cpu`, and `sentence-transformers` (CPU torch). Full Whisper/spaCy/llama-cpp remains on the **pipeline** image. See `docker/api/Dockerfile` and `docs/guides/DOCKER_SERVICE_GUIDE.md`.
- The Vue viewer builds successfully with `npm run build` (produces `dist/`)
- The FastAPI server already mounts `StaticFiles` from `web/gi-kg-viewer/dist` when available, but in this topology Nginx handles static serving and the API runs with `--no-static`
- A future phase will add a database; the volume-based design accommodates that migration
Design and Implementation¶
Container Architecture¶
┌─────────────────────────────────────────────────────────────────┐
│ docker compose up │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ nginx │ :80 │ api │ :8000 (internal) │
│ │ │───────>│ │ │
│ │ Vue dist/ │ /api/* │ FastAPI │ │
│ │ SPA fallback│ │ --no-static │ │
│ └──────────────┘ └──────┬───────┘ │
│ │ reads │
│ ┌──────┴───────┐ │
│ │ corpus_data │ (named volume) │
│ └──────┬───────┘ │
│ │ writes │
│ ┌──────────────┐ │ │
│ │ pipeline │───────────────┘ │
│ │ (one-shot) │ │
│ │ docker │ │
│ │ compose run │ │
│ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
1. Nginx Container (Viewer)¶
Multi-stage build: Node stage builds the Vue app, Nginx stage serves it.
docker/viewer/Dockerfile:
# ── Build stage ──────────────────────────────────────────────
FROM node:22-alpine AS builder
WORKDIR /build
COPY web/gi-kg-viewer/package.json web/gi-kg-viewer/package-lock.json ./
RUN npm ci
COPY web/gi-kg-viewer/ ./
RUN npm run build
# ── Serve stage ──────────────────────────────────────────────
FROM nginx:1.27-alpine
COPY docker/viewer/nginx.conf /etc/nginx/conf.d/default.conf
COPY --from=builder /build/dist /usr/share/nginx/html
HEALTHCHECK --interval=15s --timeout=3s --start-period=5s --retries=3 \
CMD wget -qO- http://localhost:80/ || exit 1
EXPOSE 80
docker/viewer/nginx.conf:
upstream api {
    server api:8000;
}
server {
    listen 80;
    server_name _;
    root /usr/share/nginx/html;
    index index.html;
    # ── API reverse proxy ────────────────────────────────────
    location /api/ {
        proxy_pass http://api;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 120s;
    }
    # ── Static assets (cache-busted by Vite hash) ───────────
    location /assets/ {
        expires 1y;
        add_header Cache-Control "public, immutable";
    }
    # ── SPA fallback ────────────────────────────────────────
    location / {
        try_files $uri $uri/ /index.html;
    }
    # ── Gzip ────────────────────────────────────────────────
    gzip on;
    gzip_types text/plain text/css application/json application/javascript text/xml;
    gzip_min_length 256;
}
2. API Container¶
Slim image: HTTP and semantic-search dependencies only (no Whisper/spaCy/llama-cpp stack, no model preloading). Runs the FastAPI server with
`--no-static` (Nginx handles static files).
docker/api/Dockerfile:
FROM python:3.12-slim
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1
WORKDIR /app
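# Copy project metadata and sources so the local install below can resolve "."
# (paths are assumed for this schematic; the live docker/api/Dockerfile is authoritative)
COPY pyproject.toml ./
COPY src/ ./src/
# Install HTTP + search stack (see live docker/api/Dockerfile; not full .[dev])
RUN pip install --no-cache-dir '.[search]' \
    && pip install --no-cache-dir \
       "fastapi>=0.115.0,<1.0.0" \
       "uvicorn[standard]>=0.32.0,<1.0.0"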
# Non-root user
RUN useradd -m -u 1000 -s /bin/bash appuser && \
chown -R appuser:appuser /app
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=15s --timeout=5s --start-period=10s --retries=3 \
CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/api/health')" || exit 1
ENTRYPOINT ["python", "-m", "podcast_scraper.cli", "serve"]
# Matches the corpus_data mount point (/app/output) used by the compose file
CMD ["--host", "0.0.0.0", "--port", "8000", "--no-static", "--output-dir", "/app/output"]
The --output-dir points at the shared volume mount. Additional flags
(--enable-feeds-api, --enable-jobs-api, etc.) are passed via compose
command: or environment variables.
3. Pipeline Container¶
Reuses docker/pipeline/Dockerfile. The compose service definition
sets the one-shot behavior.
4. Compose File¶
compose/docker-compose.stack.yml (shipped; does not replace the existing
compose/docker-compose.yml which remains for standalone pipeline use). Paths below match the repo: build context: .. is the repository root (compose file lives under compose/).
# Schematic; see the repo file for provider env blocks and exact flags.
services:
  viewer:
    build:
      context: ..
      dockerfile: docker/viewer/Dockerfile
    ports:
      - "${VIEWER_PORT:-8080}:80"
    depends_on:
      api:
        condition: service_healthy
  api:
    build:
      context: ..
      dockerfile: docker/api/Dockerfile
    expose:
      - "8000"
    volumes:
      - corpus_data:/app/output
      - ${CONFIG_FILE:-../config.yaml}:/app/config.yaml:ro
    environment:
      PODCAST_SCRAPER_CONFIG: /app/config.yaml
  pipeline:
    profiles: [pipeline]
    build:
      context: ..
      dockerfile: docker/pipeline/Dockerfile
    volumes:
      - corpus_data:/app/output
      - ${CONFIG_FILE:-../config.yaml}:/app/config.yaml:ro
volumes:
  corpus_data: {}
Docker job mode (#660): optional second file compose/docker-compose.jobs-docker.yml merges extra api volumes + env (docker.sock, PODCAST_PIPELINE_EXEC_MODE, PODCAST_DOCKER_PROJECT_DIR). See docs/guides/DOCKER_SERVICE_GUIDE.md § Full stack.
Usage patterns:
# Start viewer + API (long-running)
docker compose -f compose/docker-compose.stack.yml up -d
# Run pipeline (one-shot, writes to shared volume)
docker compose -f compose/docker-compose.stack.yml run --rm pipeline
# View logs
docker compose -f compose/docker-compose.stack.yml logs -f api
# Tear down
docker compose -f compose/docker-compose.stack.yml down
The pipeline service is under a compose profile so docker compose up does not
start it. It is only invoked explicitly with docker compose run.
5. Compose Override for CI/Smoke (RFC-078)¶
compose/docker-compose.stack-test.yml (override layered on top of compose/docker-compose.stack.yml):
# Sketch; see the repo file compose/docker-compose.stack-test.yml for the canonical overlay.
# Relative host paths resolve from the directory of the first compose file (compose/),
# so repo-root paths are written with ../ here, as in the base file.
services:
  api:
    volumes:
      - corpus_data:/app/output
  pipeline:
    volumes:
      - corpus_data:/app/output
      - ../tests/fixtures/rss:/app/fixtures/rss:ro
      - ../config/ci/stack-test-config.yaml:/app/config.yaml:ro
    profiles: []
  viewer:
    ports:
      - "8090:80"
volumes:
  corpus_data:
6. File Layout¶
docker/
  viewer/
    Dockerfile                    # Multi-stage: Node build + Nginx serve (under docker/viewer/)
    nginx.conf                    # Reverse proxy + SPA fallback
  api/
    Dockerfile                    # Slim FastAPI server (no ML), under docker/api/
compose/
  docker-compose.stack.yml        # Full-stack topology (viewer + api + pipeline)
  docker-compose.stack-test.yml   # RFC-078 stack-test overlay (local + CI)
  docker-compose.yml              # Standalone pipeline compose
  docker-compose.llm-only.yml     # LLM-only variant
docker/pipeline/Dockerfile        # Pipeline / ML / LLM runner
7. Makefile Targets¶
# Full-stack compose (recipe lines are tab-indented, as make requires)
stack-up:
	docker compose -f compose/docker-compose.stack.yml up -d
stack-down:
	docker compose -f compose/docker-compose.stack.yml down
stack-build:
	docker compose -f compose/docker-compose.stack.yml build
stack-run-pipeline:
	docker compose -f compose/docker-compose.stack.yml run --rm pipeline
stack-logs:
	docker compose -f compose/docker-compose.stack.yml logs -f
Key Decisions¶
- Nginx in a separate container (not FastAPI serving static files)
  - Decision: Nginx serves the Vue SPA and proxies `/api/*` to FastAPI
  - Rationale: Production-grade from day one. Exposes configuration complexity early (reverse proxy, caching, SPA fallback) rather than discovering it during a future migration. Negligible ongoing maintenance after initial setup.
- Pipeline as one-shot, not long-running
  - Decision: Pipeline runs via `docker compose run --rm`, exits when done
  - Rationale: ML models consume 2-4 GB RAM. Keeping them loaded 24/7 wastes resources on a single-host deployment. Cold start (model loading ~30-60s) is acceptable for a batch process that runs at most a few times per day. Scheduled execution is handled by host cron or a lightweight cron container, not an internal scheduler.
- Separate compose file (`compose/docker-compose.stack.yml`), not modifying existing
  - Decision: New file alongside existing `compose/docker-compose.yml`
  - Rationale: The existing `compose/docker-compose.yml` and `compose/docker-compose.llm-only.yml` are used for standalone pipeline runs and Docker CI tests. Modifying them would break those workflows. The stack compose is a superset that adds viewer + API.
- API container: runtime HTTP stack (see `docker/api/Dockerfile`; not the full `.[dev]` tooling) + semantic stack, not full `.[ml]` (implemented in #659)
  - Decision: Install the runtime HTTP stack (see `docker/api/Dockerfile`; not the full `.[dev]` tooling), CPU `torch`, `numpy`, `faiss-cpu`, and `sentence-transformers` so `create_app` and `/api/search` work against an on-disk index. Omit Whisper/spaCy/llama-cpp from `api` to avoid duplicating the pipeline image.
  - Rationale: Smaller than fat-ML `api`, while satisfying import-time router dependencies.
  - Caveat: In native mode, in-process `POST /api/jobs` still uses `sys.executable` inside `api` (RFC-077), so jobs that need the full ML CLI may fail unless Docker job mode (GitHub #660) delegates execution to the `pipeline` container. Use `docker compose … run pipeline` for ops runs.
- Shared named volume, not bind mount
  - Decision: `corpus_data` named volume shared between pipeline and API (both read-write).
  - Rationale: Named volumes are managed by Docker, survive container restarts, and avoid host path permissions issues. Read-write on `api` avoids breaking index rebuild / lock files; for dev override, a bind mount can be layered via a `docker-compose.override.yml`.
Jobs API and pipeline execution (total solution, phased)¶
RFC-077 defines POST /api/jobs: enqueue a pipeline job, persist registry rows under the
corpus (e.g. .viewer/jobs/), stream logs, cancel, reconcile. The spawn path today is
implemented in src/podcast_scraper/server/pipeline_jobs.py:
- `build_pipeline_argv` builds argv: `[sys.executable, "-m", "podcast_scraper.cli", "--output-dir", <corpus>, … "--config", <operator.yaml>, …]`.
- `spawn_pipeline_subprocess` (unless tests set `app.state.jobs_subprocess_factory`) runs `asyncio.create_subprocess_exec(*argv, …, cwd=<corpus>)`.
So when an operator clicks Run job in the viewer, the pipeline runs as a child process
of the uvicorn process, using the same Python interpreter as the API server. On a
developer machine (make serve-api with a full venv), that matches expectations.
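The spawn path described above can be sketched roughly as follows. This is a minimal illustration, not the code in `pipeline_jobs.py`: `build_native_argv`, `spawn_job`, and the `SpawnFactory` alias are hypothetical names, and the argv contains only the elements the RFC spells out (the elided flags are deliberately left out).

```python
import asyncio
import sys
from typing import Any, Awaitable, Callable, List, Optional, Sequence

# Hypothetical alias: a factory receives (argv, cwd) and returns a process handle.
SpawnFactory = Callable[[Sequence[str], str], Awaitable[Any]]


def build_native_argv(corpus_dir: str, operator_yaml: str) -> List[str]:
    """Return only the argv elements named in the RFC; elided flags are omitted."""
    return [
        sys.executable, "-m", "podcast_scraper.cli",
        "--output-dir", corpus_dir,
        "--config", operator_yaml,
    ]


async def spawn_job(
    argv: Sequence[str], cwd: str, factory: Optional[SpawnFactory] = None
) -> Any:
    """Use the injected factory when present, else spawn a child of this interpreter."""
    if factory is not None:
        return await factory(argv, cwd)
    return await asyncio.create_subprocess_exec(*argv, cwd=cwd)
```

Because the factory is just an injected callable, tests can substitute a fake and a Docker-backed implementation can replace the native spawn without touching the enqueue logic.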
Native vs Docker — two supported workflows¶
The product keeps both execution stories:
| Workflow | Typical entry | Job spawn | Operator YAML |
|---|---|---|---|
| Native (default today) | `make serve-api`, laptop venv, tests | Subprocess of `serve`: RFC-077 argv + `create_subprocess_exec` | Unchanged from shipped RFC-077: no Docker-only keys required for PUT or for jobs. |
| Docker stack | `docker compose -f compose/docker-compose.stack.yml up` + viewer | #660 (shipped): delegate to `pipeline` / `pipeline-llm` via factory + host Docker socket (RFC Option B) | Docker path only: `pipeline_install_extras: ml \| llm` in `viewer_operator.yaml` (aligns with `docker/pipeline/Dockerfile` `INSTALL_EXTRAS`) is required when the server is configured to spawn jobs into containers; omission returns 400 with a clear message. Gated on `PODCAST_PIPELINE_EXEC_MODE=docker`. Optional `PODCAST_DOCKER_COMPOSE_FILES` lists compose files passed to `docker compose -f` (default `compose/docker-compose.stack.yml` only); merge `compose/docker-compose.jobs-docker.yml` at `up` time so `api` gets the socket and env. |
Principle: never require Docker-only metadata for operators who only ever run native subprocess jobs; validate pipeline_install_extras (and profile↔tier checks) only on the Docker enqueue/spawn path.
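As a rough illustration of that principle, the check below inspects `pipeline_install_extras` only when the exec mode is `docker`. The function and exception names are hypothetical; the real validation lives in the server code and may differ in shape.

```python
from typing import Mapping, Optional

ALLOWED_EXTRAS = {"ml", "llm"}


class JobValidationError(ValueError):
    """Hypothetical error type; the API layer would translate it to HTTP 400."""


def validate_enqueue(
    exec_mode: Optional[str], operator_cfg: Mapping[str, object]
) -> Optional[str]:
    """Return the image tier for Docker jobs, or None for native jobs."""
    if exec_mode != "docker":
        # Native path: Docker-only keys are never inspected or required.
        return None
    extras = operator_cfg.get("pipeline_install_extras")
    if not isinstance(extras, str) or extras not in ALLOWED_EXTRAS:
        # No silent default to "ml": omission is an explicit error.
        raise JobValidationError(
            "pipeline_install_extras must be 'ml' or 'llm' when "
            "PODCAST_PIPELINE_EXEC_MODE=docker"
        )
    return extras
```

Keeping the native branch as an early return means an operator YAML that never mentions Docker keeps working unchanged.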
Phase 1 (this RFC, #659): compose capability without changing job spawn semantics¶
Delivering compose/docker-compose.stack.yml adds a parallel, ops-first execution path:
- `docker compose run --rm pipeline` runs the existing pipeline image against the shared volume (cron, manual ops, RFC-078 smoke, deploy scripts).
Phase 1 does not by itself:
- Mount the Docker socket into `api`
- Set `app.state.jobs_subprocess_factory` for compose
- Replace subprocess spawn with HTTP to a worker
Therefore after Phase 1, behavior is:
| Where the API runs | What "Run job" does |
|---|---|
| Laptop (`make serve-api`, full venv) | Same as today: subprocess `cli` on the host |
| Docker Compose (`api` container) | Subprocess `cli` inside `api`; requires that image to carry everything `build_pipeline_argv` needs, or jobs will fail until Phase 2 |
The dedicated pipeline service image is the right place for ML; api should stay
thin once Phase 2 delegates execution.
Phase 2 (#660): Docker job execution (implemented — Option B)¶
Implemented: app.state.jobs_subprocess_factory is set by attach_docker_jobs_factory
when PODCAST_PIPELINE_EXEC_MODE=docker. The factory runs docker compose run into
the pipeline or pipeline-llm service, wiring stdout to the job log path. Requires
host Docker socket mounted into api (see compose/docker-compose.jobs-docker.yml) and
PODCAST_DOCKER_PROJECT_DIR pointing to the repo root visible to the Docker daemon.
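A minimal sketch of how such a factory might assemble its command, assuming the tier-to-service mapping described in this RFC (`ml` to `pipeline`, `llm` to `pipeline-llm`). The function name, the use of `--rm`, and running with the project directory as the working directory are assumptions for illustration; `pipeline_docker_factory.py` is the authority.

```python
from typing import List, Sequence, Tuple

# Default mirrors the documented PODCAST_DOCKER_COMPOSE_FILES default.
DEFAULT_COMPOSE_FILES = ("compose/docker-compose.stack.yml",)
SERVICE_BY_EXTRAS = {"ml": "pipeline", "llm": "pipeline-llm"}


def docker_job_command(
    extras: str,
    project_dir: str,
    compose_files: Sequence[str] = DEFAULT_COMPOSE_FILES,
    pipeline_args: Sequence[str] = (),
) -> Tuple[List[str], str]:
    """Return (argv, cwd) for a one-shot pipeline run in the matching service."""
    service = SERVICE_BY_EXTRAS[extras]  # KeyError for unknown tiers
    argv = ["docker", "compose"]
    for compose_file in compose_files:
        argv += ["-f", compose_file]
    argv += ["run", "--rm", service, *pipeline_args]
    # cwd stands in for PODCAST_DOCKER_PROJECT_DIR (repo root seen by the daemon).
    return argv, project_dir
```

For example, `docker_job_command("llm", "/repo")` yields a `docker compose -f compose/docker-compose.stack.yml run --rm pipeline-llm` invocation to be executed from `/repo`.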
Design alternatives evaluated during planning (historical context only):
| Option | Description | Outcome |
|---|---|---|
| A: Fat `api` image | Install full `cli` + ML extras in `api` so subprocess works unchanged. | Not chosen: duplicates pipeline image, large `api`. |
| B: `jobs_subprocess_factory` | Factory runs `docker compose run` for the pipeline image via host socket. | Implemented (`pipeline_docker_factory.py`). |
| C: Worker service | Separate worker claims jobs from disk/queue. | Not chosen: more moving parts than needed for single-host. |
Tracking: GitHub #660.
#660 implementation checklist (paste / track in the issue)¶
- Exec mode: `PODCAST_PIPELINE_EXEC_MODE` selects native (unset / not `docker`: current subprocess) vs `docker` (factory / `docker compose run`). Native path: no new mandatory operator keys.
- Image selection: when mode is docker, operator YAML used for the job must include `pipeline_install_extras` ∈ `{ ml, llm }`, which maps to `INSTALL_EXTRAS` / the correct compose service (e.g. `pipeline` vs `pipeline-llm` once the LLM tier exists). No silent default to `ml` on omission.
- Profile↔tier: optional script or test gate (see gap matrix / `make verify`) so packaged profiles do not declare capabilities the chosen image lacks.
- Secrets: keys only via `.env` / CI secrets, never committed; compose uses `${VAR:-}` pass-through (see `DOCKER_SERVICE_GUIDE` + §Secrets in internal plan).
- Docs: update this RFC's rollout row, DOCKER_SERVICE_GUIDE.md, and RFC-077 compose extension after merge.
- Optional follow-ups: skim §Optional follow-ups; close or defer items explicitly in the PR (e.g. new `viewer_operator.docker.example.yaml`).
Alternatives Considered¶
- FastAPI serves static files (single container)
  - Description: Use the existing `StaticFiles` mount in `create_app()` to serve the Vue build directly from the API container
  - Pros: Simpler (one fewer container, no Nginx config)
  - Cons: No caching headers, no gzip, no SPA fallback without custom middleware, mixes concerns, harder to scale later
  - Why Rejected: User decision to prefer production-grade from day one
- Pipeline as long-running service with internal scheduler
  - Description: Pipeline container stays up, runs ingestion on a cron schedule
  - Pros: Models stay warm, no cold start
  - Cons: 2-4 GB idle RAM, memory leak risk, more complex process management
  - Why Rejected: Resource waste on single-host; one-shot with host cron is simpler
Testing Strategy¶
Validation of compose topology:
- `docker compose -f compose/docker-compose.stack.yml build` succeeds (CI)
- `docker compose -f compose/docker-compose.stack.yml up -d` starts viewer + API
- `/api/health` returns 200 from the Nginx port (proves proxy works)
- `http://localhost:${VIEWER_PORT:-8080}/` serves the Vue SPA (proves static serving works)
- `docker compose run --rm pipeline --help` prints service help (entrypoint forwards `--help`)
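The two HTTP checks in the list above could be scripted along these lines. This is an illustrative sketch only: `check_stack` and `http_status` are hypothetical helpers, not part of the repo, and the base URL is whatever host port the viewer is published on.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code; connection errors propagate to the caller."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urlopen raises for 4xx/5xx; report the code instead of failing.
        return exc.code


def check_stack(
    base_url: str, fetch: Callable[[str], int] = http_status
) -> Dict[str, bool]:
    """Check static serving and the API proxy through the Nginx port."""
    base = base_url.rstrip("/")
    return {
        "spa": fetch(base + "/") == 200,
        "api_health_via_nginx": fetch(base + "/api/health") == 200,
    }
```

Passing `fetch` as a parameter keeps the check testable without a running stack; against a live stack, `check_stack("http://localhost:8080")` would exercise both Nginx paths.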
Integration with RFC-078:
- The smoke test workflow layers `compose/docker-compose.stack-test.yml` on top
- Pipeline runs against fixture feeds, writes to `corpus_data` volume
- API reads the output, Playwright tests the viewer through Nginx
Where it is tracked: Stack contracts and compose/docker-compose.stack.yml are #659 (handoff checklist there). Smoke workflow + Playwright are RFC-078 and must be opened as their own GitHub issues when you start tracking that work — not as orphan bullets only in this RFC.
Rollout¶
- Phase 1a (#659): Create `docker/viewer/Dockerfile`, `docker/viewer/nginx.conf`, `docker/api/Dockerfile`, `compose/docker-compose.stack.yml`. Validate locally with `stack-up` and manual browser test; validate `docker compose run` for `pipeline`.
- Jobs / Docker (#660): Implemented Option B: `pipeline_docker_factory` + host socket + `docker compose run`; native subprocess unchanged (see §Native vs Docker).
- RFC-078 smoke: Add `compose/docker-compose.stack-test.yml` and wire into CI smoke workflow (orthogonal to #660; can land in parallel). Not tracked in this RFC as orphan work; open GitHub issues for RFC-078 execution. #659 only carries the stack handoff checklist.
- Phase 3 (#659): Shipped starter `compose/docker-compose.prod.yml` (restart policies + commented VPS / external volume hints). Operators fork or extend for real prod.
Success Criteria:
- `docker compose -f compose/docker-compose.stack.yml up -d` starts viewer + API in under 60s (pre-built images)
- `http://localhost:${VIEWER_PORT:-8080}/` serves the Vue viewer; `/api/health` returns 200 via Nginx
- `docker compose run --rm pipeline` completes ingestion and API serves the new data without restart
Pipeline image tiers and profile compatibility¶
The pipeline service in compose/docker-compose.stack.yml reuses docker/pipeline/Dockerfile with two
build args: STACK_PIPELINE_INSTALL_EXTRAS and STACK_PIPELINE_PRELOAD_ML. These
select among the image tiers below (same binary, different dependency surface):
| Tier | `INSTALL_EXTRAS` | `PRELOAD_ML_MODELS` | Approx size | What it can run |
|---|---|---|---|---|
| ML (default) | `ml` | `true` (or `false` for faster dev builds) | 3-4 GB | Any profile: local Whisper, spaCy NER, transformers summarization, FAISS index build, SummLlama, plus cloud LLM calls |
| LLM (pyproject `.[llm]`) | `llm` | N/A (skipped) | ~1–1.5 GB (target) | Cloud/API-heavy profiles (e.g. `cloud_thin`) with no local torch/spaCy/FAISS stack; pairs with `pipeline_install_extras: llm` on the Docker job path |
| Core / minimal | `""` | N/A (skipped) | smallest | Bare pipeline without optional groups; dev-only or legacy. Prefer the `llm` tier once implemented for API-only profiles |
Profile → minimum image tier¶
| Profile | Transcription | NER | Summary | GI/KG source | Vector search | Minimum tier |
|---|---|---|---|---|---|---|
| `config/profiles/airgapped.yaml` | Whisper (local) | spaCy trf (local) | SummLlama (local) | summary_bullets | FAISS (local) | ML |
| `config/profiles/local.yaml` | Whisper (local) | spaCy trf (local) | Ollama (local daemon) | provider | FAISS (local) | ML |
| `config/profiles/cloud_balanced.yaml` | OpenAI whisper-1 (API) | spaCy trf (local) | Gemini (API) | provider | FAISS (local) | ML |
| `config/profiles/cloud_quality.yaml` | OpenAI whisper-1 (API) | spaCy trf (local) | Anthropic (API) | provider | FAISS (local) | ML |
| `config/profiles/cloud_thin.yaml` | Cloud API only | Cloud API only | Cloud API only | provider | false | LLM (`INSTALL_EXTRAS=llm`) |
Key insight: today's "cloud" profiles (cloud_balanced, cloud_quality) still require
spaCy for NER and FAISS for vector indexing, so they need the ML tier. Work on the
llm install tier (and cloud_thin) is split across two issues: stack compose, images, and the validator under #659 (for what is not already merged), and Docker job validation for the pipeline_install_extras: llm path under #660.
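The table's logic can be expressed as a small check: a profile needs the ML tier as soon as any stage depends on in-image local ML (Whisper, spaCy, a local summarizer, or FAISS). The sketch below is an assumption-laden illustration with hypothetical names, not the contents of `scripts/tools/validate_profile_docker_tier.py`.

```python
from dataclasses import dataclass

# Higher rank means a larger dependency surface.
TIER_RANK = {"llm": 0, "ml": 1}


@dataclass(frozen=True)
class ProfileNeeds:
    """Hypothetical summary of which in-image local components a profile uses."""
    local_whisper: bool
    local_spacy: bool
    local_summarizer: bool  # e.g. SummLlama running inside the image
    local_faiss: bool


def minimum_tier(needs: ProfileNeeds) -> str:
    """Return 'ml' if any local ML component is required, else 'llm'."""
    uses_local_ml = (
        needs.local_whisper
        or needs.local_spacy
        or needs.local_summarizer
        or needs.local_faiss
    )
    return "ml" if uses_local_ml else "llm"


def image_satisfies(image_extras: str, needs: ProfileNeeds) -> bool:
    """True when the image tier is at least the profile's minimum tier."""
    return TIER_RANK[image_extras] >= TIER_RANK[minimum_tier(needs)]
```

Under these assumptions, `cloud_balanced` (spaCy and FAISS local) maps to `ml`, while `cloud_thin` (nothing local) maps to `llm` and also runs on the `ml` image.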
Recommended compose usage¶
# Default: ML tier (works with any profile)
make stack-build
CONFIG_FILE=$PWD/config/profiles/cloud_balanced.yaml make stack-run-pipeline
# Dev: faster build for API-only profiles — today often `STACK_PIPELINE_INSTALL_EXTRAS=""`
# Once the `llm` tier lands, prefer: STACK_PIPELINE_INSTALL_EXTRAS=llm
STACK_PIPELINE_INSTALL_EXTRAS="" STACK_PIPELINE_PRELOAD_ML=false make stack-build
CONFIG_FILE=$PWD/my-llm-only-profile.yaml make stack-run-pipeline
Optional follow-ups (deferred)¶
Each item below is also listed on #659 or #660 so nothing lives only here.
- RFC index — index.md Open RFCs (detail) table now includes one-line rows for RFC-078 and RFC-079 (smoke vs stack, issue pointers).
- Example operator YAML: Done. `config/examples/viewer_operator.example.yaml` stays native-default (no `pipeline_install_extras`); Docker path documented in `config/examples/viewer_operator.docker.example.yaml`. #660 closure tracks guide/RFC cross-checks only.
- Automated coverage: #660 includes unit tests for the Docker factory helpers, integration tests for Docker-mode validation (with a fake subprocess factory), `make verify-stack-profiles` in CI when `config/profiles/**`, `compose/**`, the tier validator script, or `python-app.yml` changes, and manual stack + `jobs-docker` acceptance recorded on #660. Merge-blocking real `docker compose run` inside GitHub-hosted runners is out of scope for #660 (open a new issue if required).
Operational contracts (quick reference)¶
| Topic | Source of truth |
|---|---|
| Stack compose + images | compose/docker-compose.stack.yml, docker/pipeline/Dockerfile, docker/api/Dockerfile, docker/viewer/Dockerfile |
| Makefile targets | Makefile (stack-*, smoke-*, verify-stack-profiles) |
| Secrets / `.env` | `DOCKER_SERVICE_GUIDE.md` § Full stack → Secrets (stack) |
| Native vs Docker jobs | §Native vs Docker; #660 |
| Profile ↔ image tier | scripts/tools/validate_profile_docker_tier.py; RFC-079 §Pipeline image tiers |
| Ephemeral CI smoke | RFC-078, compose/docker-compose.stack-test.yml, make stack-test-* |
| Prod-style merge (restart, VPS notes) | compose/docker-compose.prod.yml + DOCKER_SERVICE_GUIDE § RFC-079 backlog |
Open Questions¶
- VITE_* build-time env vars: Resolved. The SPA uses relative `/api/` paths; Nginx proxies to `api:8000`. No `VITE_API_BASE_URL` needed. The only reference to `127.0.0.1:8000` is in `vite.config.ts` (dev proxy), which is not used at build time.
- Hot reload for dev: Resolved (won't ship). No `compose/docker-compose.dev.yml`. Use `make serve-api` / `make serve-ui` / `make serve` for hot reload; Compose targets CI/prod-like runs. See `docs/guides/DOCKER_SERVICE_GUIDE.md` § RFC-079 backlog → Compose "dev override".
- Config file mounting: Resolved. `CONFIG_FILE` env var (default `./config.yaml`) is bind-mounted at `/app/config.yaml` in both `api` and `pipeline`. The example config `config/examples/docker-stack.example.yaml` ships with `output_dir: /app/output`.
- API read-only volume race: Documented. Many small writes use atomic temp+replace; full corpus runs remain multi-file. Operational rule: finish `pipeline` before serving new data from `api` on the same volume (matches RFC-078 ordering). See `DOCKER_SERVICE_GUIDE.md` § RFC-079 backlog → Concurrent pipeline writes and API reads.
- FAISS index reload: Resolved. `/api/search` loads `FaissVectorStore` from disk per request, so a completed pipeline write set is visible on the next search without restarting `api`. Use `POST /api/index/rebuild` for an explicit rebuild. See `DOCKER_SERVICE_GUIDE.md` § RFC-079 backlog → FAISS / vector index.
- Jobs in compose: Default remains subprocess in `api`. `PODCAST_PIPELINE_EXEC_MODE=docker` (with Docker socket + `PODCAST_DOCKER_PROJECT_DIR` + operator `pipeline_install_extras`) runs jobs via `docker compose run` into `pipeline` / `pipeline-llm`; see `podcast_scraper.server.pipeline_docker_factory` and DOCKER_SERVICE_GUIDE § Full stack. Shipped under #660.
- LLM pipeline tier vs cloud profiles: Shipped `cloud_thin.yaml` + `INSTALL_EXTRAS=llm` (`pipeline-llm` service) pairs thin cloud-only runs with the LLM image tier. `cloud_balanced` / `cloud_quality` remain ML tier (spaCy + FAISS).
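The "atomic temp+replace" pattern mentioned under the volume race question can be sketched as below. This is a generic illustration of the technique using the standard library, with a hypothetical helper name; it is not taken from the pipeline code. A reader on the same volume sees either the old file or the new one, never a partial write, although a multi-file corpus run is still not atomic as a whole.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text to a temp file in the target directory, then rename over path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

This protects individual files only, which is why the operational rule above (finish `pipeline` before serving new data from `api`) still applies to full runs.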