Architecture Decision Records (ADRs)

Purpose

Architecture Decision Records (ADRs) capture the what and why of decisions taken after technical discussion (often in an RFC or issue). RFCs hold the full design, alternatives, and implementation journey; Accepted ADRs are the short, immutable record of what we chose.

How ADRs Work

Immutable Records: Once an ADR is accepted, it remains unchanged unless superseded by a new ADR.
Context Driven: They explain the trade-offs and rationale behind a decision (not the whole solution document).
Reference for Developers: They provide onboarding context for why certain patterns (like the Provider Protocol) were chosen.
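
The immutability rule above can be pictured with a small model. This is a minimal sketch, not the repository's tooling: the `Adr` class, the `supersede` helper, the `Superseded` status label, and the ADR numbers 900 and 901 are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Adr:
    number: int
    title: str
    status: str = "Proposed"  # hypothetical labels: Proposed, Accepted, Superseded
    superseded_by: int | None = None


def supersede(old: Adr, replacement: Adr) -> tuple[Adr, Adr]:
    """Return (retired old record, newly accepted record); nothing is edited in place."""
    if old.status != "Accepted":
        raise ValueError("only an Accepted ADR can be superseded")
    retired = replace(old, status="Superseded", superseded_by=replacement.number)
    accepted = replace(replacement, status="Accepted")
    return retired, accepted


old = Adr(900, "Example decision", status="Accepted")

try:
    old.title = "Edited in place"  # rejected by the frozen dataclass
except FrozenInstanceError:
    print("accepted ADRs are not edited in place")

retired, current = supersede(old, Adr(901, "Replacement decision"))
print(retired.status, retired.superseded_by, current.status)
```

The original `old` object is never mutated; a change of direction always produces a new record that points back to the one it replaces.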

ADR Index

Code (last column): Yes = reflected in the codebase; Partial = incomplete or still rolling out; No = not started (including accepted ADRs waiting on implementation).

ADR	Title	Status	Related RFC	Description	Code
ADR-001	Hybrid Concurrency Strategy	Accepted	RFC-001	IO-bound threading, sequential CPU/GPU tasks	Yes
ADR-002	Security-First XML Processing	Accepted	RFC-002	Mandated use of defusedxml for RSS parsing	Yes
ADR-003	Deterministic Feed Storage	Accepted	RFC-004	Hash-based output directory derivation	Yes
ADR-004	Flat Filesystem Archive Layout	Accepted	RFC-004	Flat directory structure per feed run	Yes
ADR-005	Lazy ML Dependency Loading	Accepted	RFC-005	Function-level imports for heavy ML libraries	Yes
ADR-006	Context-Aware Model Selection	Accepted	RFC-010	Automatic English model promotion (.en)	Yes
ADR-007	Universal Episode Identity	Accepted	RFC-011	GUID-first deterministic episode ID generation	Yes
ADR-008	Database-Agnostic Metadata Schema	Accepted	RFC-011	Unified JSON format for SQL/NoSQL	Yes
ADR-009	Privacy-First Local Summarization	Accepted	RFC-012	Local Transformers over Cloud APIs	Yes
ADR-010	Hierarchical Summarization Pattern	Accepted	RFC-012	Map-reduce chunking for long transcripts	Yes
ADR-011	Secure Credential Injection	Accepted	RFC-013	Environment-based secret management	Yes
ADR-012	Provider-Agnostic Preprocessing	Accepted	RFC-013	Shared pre-inference cleaning pipeline	Yes
ADR-013	Standalone Experiment Configuration	Accepted	RFC-015	Separation of research params from code	Yes
ADR-014	Codified Comparison Baselines	Accepted	RFC-015, RFC-041	Objective delta measurement vs baseline artifacts	Yes
ADR-015	Deep Provider Fingerprinting	Accepted	RFC-016	Hardware and environment tracking for reproducibility	Yes
ADR-016	Typed Provider Parameter Models	Accepted	RFC-016	Pydantic validation for backend parameters	Yes
ADR-017	Registered Preprocessing Profiles	Accepted	RFC-016	Versioned cleaning logic tracking	Yes
ADR-018	Externalized Prompt Management	Accepted	RFC-017	Versioned Jinja2 templates in prompts/	Yes
ADR-019	Standardized Test Pyramid	Accepted	RFC-018, RFC-024	Strict unit/integration/e2e tiering	Yes
ADR-020	Protocol-Based Provider Discovery	Accepted	RFC-021	Decoupling via PEP 544 Protocols	Yes
ADR-021	Acceptance Test Tier as Final CI Gate	Accepted	RFC-023	Fourth test tier for README/documentation accuracy; runs last in CI	Yes
ADR-022	Flaky Test Defense	Accepted	RFC-025	Automated retries and health reporting	Yes
ADR-023	Public Operational Metrics	Accepted	RFC-026	Transparency via GitHub Pages dashboards	Yes
ADR-024	Unified Provider Pattern	Accepted	RFC-029	Type-based unified provider classes	Yes
ADR-025	Technology-Based Provider Naming	Accepted	RFC-029	Clear library-based option naming	Yes
ADR-026	Per-Capability Provider Selection	Accepted	RFC-032, RFC-033, RFC-034, RFC-035, RFC-036, RFC-037	Independent provider choice per capability; partial-protocol providers allowed	Yes
ADR-027	Unified Provider Metrics Contract	Accepted	-	Standardized `ProviderCallMetrics` pattern for all providers	Yes
ADR-028	Unified Retry Policy with Metrics	Accepted	-	Centralized retry for LLM/API providers with backoff and metrics (not RSS/media HTTP; see CONFIGURATION — Download resilience)	Yes
ADR-029	Grouped Dependency Automation	Accepted	RFC-038	Balanced Dependabot updates via grouping	Yes
ADR-030	Periodic Module Coupling Analysis	Accepted	RFC-038	Nightly visualization of architecture health	Yes
ADR-031	Mandatory Pre-Release Validation	Accepted	RFC-038	Standardized checklist script for releases	Partial
ADR-032	Git Worktree-Based Development	Accepted	RFC-039	Parallel stable dev environments	Yes
ADR-033	Stratified CI Execution	Accepted	RFC-039	Fast push checks vs. full PR validation	Yes
ADR-034	Isolated Runtime Environments	Accepted	RFC-039	Independent venv per worktree	Yes
ADR-035	Linear History via Squash-Merge	Accepted	RFC-039	Clean, revertible main branch history	Yes
ADR-036	Standardized Pre-Provider Audio Stage	Accepted	RFC-040	Mandatory optimization before any transcription	Yes
ADR-037	Content-Hash Based Audio Caching	Accepted	RFC-040	Shared optimized artifacts in .cache/	Yes
ADR-038	FFmpeg-First Audio Manipulation	Accepted	RFC-040	System-level performance for audio pipelines	Yes
ADR-039	Speech-Optimized Codec (Opus)	Accepted	RFC-040	MP3 (`libmp3lame` @ 64 kbps) for preprocessed audio; Opus rejected (see ADR)	Yes
ADR-040	Explicit Golden Dataset Versioning	Accepted	RFC-041	Approved, frozen ground truth data versions	Yes
ADR-041	Multi-Tiered Benchmarking Strategy	Accepted	RFC-041	Fast PR smoke tests vs nightly full benchmarks	Yes
ADR-042	Heuristic-Based Quality Gates	Accepted	RFC-041	Regex-based detection of common AI failure modes	Yes
ADR-043	Hybrid MAP-REDUCE Summarization	Accepted	RFC-042	Compression (Classic) + Abstraction (Instruct LLM)	Yes
ADR-044	Local LLM Backend Abstraction	Accepted	RFC-042	Support for llama.cpp, ollama, and transformers	Yes
ADR-045	Strict REDUCE Prompt Contract	Accepted	RFC-042	Mandatory markdown structure for LLM outputs	Yes
ADR-046	MPS Exclusive Mode for Apple Silicon	Accepted	RFC-042	Serialize GPU work on MPS to prevent memory contention; default on	Yes
ADR-047	Proactive Metric Regression Alerting	Accepted	RFC-043	Automated PR comments and webhook notifications	Partial
ADR-048	Centralized Model Registry	Accepted	RFC-044, RFC-029	Single source of truth for model architecture limits	Yes
ADR-049	Materialization Boundary for Evaluation Inputs	Accepted	RFC-046	Preprocessing becomes dataset definition via materialization_id; chunking stays in run config	Yes
ADR-050	Single Code Path for Evaluation and Application	Accepted	RFC-048	Eval and app share identical execution path; scorers are read-only observers	Yes
ADR-051	Per-Episode JSON Artifacts with Logical Union	Accepted	RFC-049, RFC-055, RFC-061	Shard by episode (gi.json, kg.json); union at query time; optional materialization	Yes
ADR-052	Separate GIL and KG Artifact Layers	Accepted	RFC-049, RFC-055	Independent schemas, feature flags, CLI namespaces, and evolution paths	Yes
ADR-053	Grounding Contract for Evidence-Backed Insights	Accepted	RFC-049, RFC-050	Explicit grounded boolean, verbatim quotes with spans, evidence chain	Yes
ADR-054	Relational Postgres Projection for GIL and KG	Accepted	RFC-051	Files canonical, Postgres is derived; separate GIL/KG tables; provenance on every row	No
ADR-055	Adaptive Summarization Routing	Proposed	RFC-053	Rule-based routing with episode profiling for summarization strategies	No
ADR-056	Composable E2E Mock Response Strategy	Proposed	RFC-054	Separation of functional responses from non-functional behavior in tests	No
ADR-057	AutoResearch Thin Harness with Credential Isolation	Accepted	RFC-057	Thin control layer reusing existing eval; immutable score.py; AUTORESEARCH_* credential vars	Yes
ADR-058	Additive pyannote Diarization with Separate `[diarize]` Extra	Accepted	RFC-058	pyannote as additive second pass; segment-level; separate [diarize] dependency group	No
ADR-059	Confidence-Scored Multi-Signal Commercial Detection	Accepted	RFC-060	Confidence-scored candidates replace binary detection; pattern primary, diarization adjusts	No
ADR-060	VectorStore Protocol with Backend Abstraction	Accepted	RFC-061	PEP 544 protocol decoupling FAISS (Phase 1) from Qdrant (Phase 2)	Yes
ADR-061	FAISS Phase 1 with Post-Filter Metadata Strategy	Accepted	RFC-061	Over-fetch + post-filter for CLI-scale; auto index type selection	Yes
ADR-062	Sentence-Boundary Transcript Chunking	Accepted	RFC-061	Regex sentence split, configurable target/overlap tokens, timestamp interpolation	Yes
ADR-063	Transparent Semantic Upgrade for gi explore	Accepted	RFC-061, RFC-050	Auto-detect vector index; semantic if available, substring fallback if not	Yes
ADR-064	Canonical Server Layer with Feature-Flagged Route Groups	Accepted	RFC-062	`server/` module with `podcast serve` CLI; viewer routes v2.6, platform routes v2.7	Yes
ADR-065	Vue 3 + Vite + Cytoscape.js Frontend Stack	Accepted	RFC-062	Unified frontend stack for viewer and future platform UI	Yes
ADR-066	Playwright for UI End-to-End Testing	Accepted	RFC-062	Browser regression testing; extends ADR-019 test pyramid with UI layer	Yes
ADR-067	Pegasus/LED Retirement for Podcast Content	Accepted	RFC-057	GSG pretraining mismatch → near-duplicate chunks → LED ngram exhaustion; reserved for news content type	Yes
ADR-068	BART+LED as Local ML Production Baseline	Accepted	RFC-057	Autoresearch sweep: +4.26% ROUGE-L over dev baseline (18.82%); 2 params accepted (max_new_tokens=550, num_beams=6)	Yes
ADR-069	Hybrid ML Pipeline as Primary Production Direction	Accepted	RFC-057, RFC-042	BART MAP + Llama 3.2:3b REDUCE at 23.1% ROUGE-L; closes 70% of cloud quality gap; temp=0.5, top_p=1.0	Yes
ADR-070	BART-base as Hybrid MAP Stage	Accepted	RFC-057	BART beats LongT5 as MAP (21.2% vs 20.8%); pretraining alignment > context window size	Yes
ADR-071	Four-Tier Summarization Strategy	Accepted	RFC-057	ML Dev / ML Prod / LLM Local / LLM Cloud — direct Llama 3.2:3b beats hybrid (24.3% vs 23.7%, 2x faster)	Yes
ADR-072	Llama 3.2:3b as Tier 3 Local LLM	Accepted	RFC-057	3B beats 7-12B models — instruction-following > size; temp=0.3 direct, temp=0.5 hybrid; 26.4% ROUGE-L @ 7.5s/ep	Yes
ADR-073	RFC-057 Autoresearch Loop — Closure and Final State	Accepted	RFC-057	Closes RFC-057; documents Track A/B outcomes, silver refs, 72-config matrix, production defaults	Yes
ADR-074	Multi-Feed Corpus Parent Layout and Machine-Readable Manifest	Accepted	RFC-063	Layout A corpus parent; unified discovery; `corpus_manifest.json` / optional summaries as operational artifacts	Yes
ADR-075	Frozen YAML Performance Profiles for Release Resource Baselines	Accepted	RFC-064	`data/profiles/*.yaml` + freeze/diff scripts; resource cost sibling to quality baselines	Yes
ADR-076	Streamlit for Operator Run Comparison and Performance Views	Accepted	RFC-047, RFC-066	Eval / Performance UI stays in `tools/run_compare/`; Vue viewer stays corpus-first	Yes
ADR-077	Local Ollama Model Selection	Accepted	-	Default Ollama models per profile and tier	Yes
ADR-078	GIL Evidence Stack Bundling — Per-Provider Champion Modes	Accepted	-	`bundled_ab` default; Mistral=`bundled_b_only`; Ollama bundled-only (staged unviable on local)	Yes
ADR-079	OpenTofu for Always-On Hosting IaC	Accepted	RFC-082	OpenTofu + `hcloud` + `tailscale` providers; `infra/tofu` entry	Yes
ADR-080	OpenTofu State Encrypted In-Repo (sops + age)	Accepted	RFC-082	Committed `.enc` state; `TFSTATE_AGE_KEY` in CI	Yes
ADR-081	Drill OpenTofu Workspace and Tailscale ACL Ownership	Accepted	RFC-082	Workspace `drill`, `HCLOUD_TOKEN_DRILL`, prod-only `tailscale_acl`	Yes
ADR-082	GitOps App Deploy via stack-test and GitHub Actions	Accepted	RFC-082	Stack-test gate + publish + `deploy-prod`; `workflow_run` target; infra apply manual	Yes
ADR-083	Tailscale as Private Ingress for Always-On VPS	Accepted	RFC-082	App on tailnet; `tag:gha-deployer` SSH path	Yes
ADR-084	Full-Stack Docker Compose Topology (API, Viewer, Pipeline)	Accepted	RFC-079	`compose/docker-compose.stack.yml`; shared volume; optional Docker job exec	Yes
ADR-085	Ephemeral Stack-Test Integration Gate on Main	Accepted	RFC-078, RFC-079	Compose overlay + Playwright `tests/stack-test/`; distinct from ADR-021	Yes
ADR-086	Canonical Identity Layer and Per-Episode bridge.json Cross-Layer Join	Accepted	RFC-072	CIL ids + `bridge.json` seam; GIL and KG stay separate (ADR-052)	Yes
ADR-087	Autoresearch Track A v2 — Dev or Held-Out Split and Judging	Accepted	RFC-073, RFC-057	Disjoint held-out; fraction contestation; Efficiency rubric; seed wiring	Yes
ADR-088	macOS Local CI Process Safety for ML Workloads	Accepted	RFC-074	No parse-time ML probes; cleanup or zombie checks; agent no-pileup rules	Yes
ADR-089	Prod Failover Orchestrator Separate from DR Drill	Accepted	RFC-083	Own workflow family; reuse drill workspace/secrets; no auto-destroy; GitHub #764	No
ADR-090	Prod Failover — DNS-First Cutover on Tailnet	Accepted	RFC-083	Canonical hostname DNS flip primary; floating IP optional	No
ADR-091	Prod Failover — GHA Triggers and Gates	Accepted	RFC-083	Manual cutover/failback/teardown; freeze prod schedules; spare schedules off after restore	No
ADR-092	Corpus Snapshot Backup Manifest and Newest-Compatible Restore Default	Accepted	RFC-084	`snapshot.manifest.json`; dual placement; newest-compatible default; fail closed; GitHub #763	Yes
ADR-093	Canonical Stack Contract Versus Environment Adapters	Accepted	RFC-082	One topology/health/`stack-test` discipline; adapters for transport/secrets only; steady vs restore playbooks separate; GitHub #762	Yes

Gap analysis

Counts (reconcile when adding ADRs): 93 files under docs/adr/ADR-*.md (ADR-001–ADR-093; numbering has historical gaps). From the index table: 2 Proposed (ADR-055, ADR-056), 6 Accepted with Code = No (ADR-054, ADR-058, ADR-059, ADR-089, ADR-090, ADR-091), 2 Accepted with Code = Partial (ADR-031, ADR-047). Accepted means ratified, not necessarily shipped.
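
Reconciling these counts by hand is error-prone, so a small script can group index rows by Status and Code. The following is a sketch under stated assumptions: it expects the index as tab-separated text with the six columns shown in the table, and the `tally` function and `SAMPLE` input are hypothetical names for this example, not existing project scripts. The sample holds only four rows copied from the index.

```python
from collections import defaultdict


def tally(index_text: str) -> dict[tuple[str, str], list[str]]:
    """Group ADR ids by (Status, Code) from tab-separated index rows."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for line in index_text.splitlines():
        cells = [cell.strip() for cell in line.split("\t")]
        # Skip the header row and anything that is not a six-column ADR row.
        if len(cells) != 6 or not cells[0].startswith("ADR-"):
            continue
        adr_id, _title, status, _rfc, _description, code = cells
        groups[(status, code)].append(adr_id)
    return dict(groups)


# Header plus four rows from the index; real input would be the full table.
SAMPLE = "\n".join(
    "\t".join(row)
    for row in [
        ("ADR", "Title", "Status", "Related RFC", "Description", "Code"),
        ("ADR-031", "Mandatory Pre-Release Validation", "Accepted", "RFC-038",
         "Standardized checklist script for releases", "Partial"),
        ("ADR-054", "Relational Postgres Projection for GIL and KG", "Accepted", "RFC-051",
         "Files canonical, Postgres is derived; separate GIL/KG tables; provenance on every row", "No"),
        ("ADR-055", "Adaptive Summarization Routing", "Proposed", "RFC-053",
         "Rule-based routing with episode profiling for summarization strategies", "No"),
        ("ADR-056", "Composable E2E Mock Response Strategy", "Proposed", "RFC-054",
         "Separation of functional responses from non-functional behavior in tests", "No"),
    ]
)

counts = tally(SAMPLE)
for (status, code), ids in sorted(counts.items()):
    print(f"{status}, Code = {code}: {len(ids)} ({', '.join(ids)})")
```

Keying on the pair keeps Proposed rows with Code = No separate from Accepted rows with Code = No, which is the distinction the counts above depend on.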

When to extract a new ADR

Use an ADR when one or more of these hold; otherwise an RFC + normative doc (API guide, docs/api/*.md, UXS) is usually enough.

ADR type	When to extract	Recent examples
Closure / program outcome	A large RFC program ends; you need an immutable summary.	ADR-073 closes RFC-057
Empirical production defaults	Benchmarks change default models/tiers you must freeze for onboarding.	ADR-067–ADR-072
Stack & ownership boundary	Who owns HTTP, which frontend stack, which UI E2E runner.	ADR-064–ADR-066
Heavy optional dependencies	An extra bloats install or splits CUDA/CPU paths; defaults must not pay the cost.	ADR-058 (accepted; `[diarize]` not landed)
Cross-cutting protocol / contract	Multiple subsystems share the same interface.	ADR-060, ADR-053, ADR-051
Process / CI philosophy	A policy decision that outlives one RFC.	ADR-021

When not to add an ADR

Viewer milestones that do not change stack (e.g. RFC-069) — RFC + feature UXS (e.g. UXS-004) + UXS-001 hub + E2E map suffice.
Single-route APIs for the viewer with schema in code + tests (e.g. RFC-068) — Server Guide + tests suffice.
Operational tooling without architectural boundary moves (e.g. RFC-065) — RFC-first.
Frozen artifact workflows (e.g. RFC-064); profile YAML baselines are covered by ADR-075.

ADRs by implementation state

Proposed

ADR	Primary RFC	Note
ADR-055	RFC-053	No episode profiling / routing in pipeline yet
ADR-056	RFC-054	Composable ResponseProfile / Router not implemented

Accepted, code not landed (expected)

ADR	Primary RFC	Note
ADR-054	RFC-051	Postgres projection future
ADR-058	RFC-058	No `[diarize]` extra in `pyproject.toml` yet
ADR-059	RFC-060	Commercial detector as designed not landed
ADR-089	RFC-083	Prod-failover workflows not landed
ADR-090	RFC-083	Runbook or DNS automation follow-up
ADR-091	RFC-083	`repository_dispatch` + cutover gates not landed

Accepted, partial

ADR	Gap
ADR-031	`make pre-release` / checklist not fully aligned with RFC-038
ADR-047	Alerts exist; automated PR comments not complete

Stale-audit corrections (reference)

Trust the Code column in the table above: ADR-048 is implemented; ADR-062 / ADR-063 are Yes; ADR-064–ADR-066 are implemented; ADR-021 is reflected in script-based make test-acceptance.

Situation cheat sheet

Situation	Guidance
Prefer a new ADR	Irreversible stack boundary, cross-cutting protocol, frozen empirical default, heavy optional extra, or closure of a large program (e.g. ADR-073).
Often RFC-only	Bounded HTTP routes or viewer tabs where ADR-064–ADR-066 + UXS already fix the stack (e.g. RFC-067, RFC-068, RFC-069, RFC-071). Topics that already have an ADR: corpus layout + manifest (ADR-074); frozen resource baselines (ADR-075); Streamlit vs Vue for eval tools (ADR-076); full-stack Compose + stack-test gate (ADR-084, ADR-085); CIL + `bridge.json` (ADR-086); autoresearch Track A v2 (ADR-087); macOS ML `make` safety (ADR-088); prod failover design (RFC-083; decisions ADR-089–ADR-091); corpus snapshot backup manifest + restore defaults (RFC-084; ADR-092); cross-surface stack contract vs adapters (ADR-093, #762).
Proposed ADRs	Promote ADR-055 / ADR-056 to Accepted (or supersede) when RFC-053 / RFC-054 ship end-to-end.

Future triggers

Multi-feed manifest as an immutable external contract beyond CORPUS_MULTI_FEED_ARTIFACTS.md — partially addressed by ADR-074.
.pipeline_status.json schema if external monitors depend on it and breaking changes need versioning.
Profile YAML for non-Python consumers beyond tools/run_compare / make profile-diff — partially addressed by ADR-075.
RFC-070 + ADR-060 when platform vector backends land materially.
Full-stack Compose, stack-test, CIL or bridge, autoresearch v2, macOS ML process safety — see ADR-084–ADR-088 (normative detail remains in RFC-072, RFC-073 v2 file, RFC-074, RFC-078, RFC-079).
Prod failover (stand up spare, validate, gated cutover) — RFC-083 (Draft); decisions ADR-089–ADR-091; GitHub #764.
Corpus snapshot tarball metadata + version-aware restore — RFC-084 (Completed); ADR-092; GitHub #763.

Open decisions without ADRs: see Architecture Decision Candidates below.

Related: PRD gap analysis, RFC gap analysis.

Architecture Decision Candidates

These items have been identified as potential architectural decisions but are currently under review.

Creating New ADRs

Use the ADR Template to document new architectural decisions. Decisions typically originate from an RFC that has been reviewed and, in many cases, marked Completed once implementation lands. RFCs use Completed rather than Accepted; Accepted is an ADR status.