Architecture Decision Records (ADRs)
Purpose
Architecture Decision Records (ADRs) capture the what and why of decisions taken after technical discussion (often in an RFC or issue). RFCs hold the full design, alternatives, and implementation journey; Accepted ADRs are the short, immutable record of what we chose.
How ADRs Work
- Immutable Records: Once an ADR is accepted, it remains unchanged unless superseded by a new ADR.
- Context Driven: They explain the trade-offs and rationale behind a decision (not the whole solution document).
- Reference for Developers: They provide onboarding context for why certain patterns (like the Provider Protocol) were chosen.
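For readers new to the Provider Protocol pattern mentioned above, the following is a minimal sketch of PEP 544 structural typing, the mechanism named in ADR-020. The names `SummarizationProvider`, `TruncatingSummarizer`, and `run` are hypothetical and chosen only for illustration; they are not taken from the codebase.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SummarizationProvider(Protocol):
    """Hypothetical capability protocol; name and method are illustrative."""

    def summarize(self, text: str) -> str: ...


class TruncatingSummarizer:
    """Satisfies the protocol structurally: no inheritance is required."""

    def summarize(self, text: str) -> str:
        # Toy behaviour: keep only the first 20 characters.
        return text[:20]


def run(provider: SummarizationProvider, text: str) -> str:
    # Callers depend on the protocol, not on a concrete provider class.
    return provider.summarize(text)


result = run(TruncatingSummarizer(), "A long transcript that needs shortening")
print(result)
```

Because conformance is structural, a provider only needs the methods for the capability it offers, which is one way to read the "partial-protocol providers allowed" note on ADR-026.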
ADR Index
Code (last column): Yes = reflected in the codebase; Partial = incomplete or still rolling out; No = not started (including accepted ADRs waiting on implementation).
| ADR | Title | Status | Related RFC | Description | Code |
|---|---|---|---|---|---|
| ADR-001 | Hybrid Concurrency Strategy | Accepted | RFC-001 | IO-bound threading, sequential CPU/GPU tasks | Yes |
| ADR-002 | Security-First XML Processing | Accepted | RFC-002 | Mandated use of defusedxml for RSS parsing | Yes |
| ADR-003 | Deterministic Feed Storage | Accepted | RFC-004 | Hash-based output directory derivation | Yes |
| ADR-004 | Flat Filesystem Archive Layout | Accepted | RFC-004 | Flat directory structure per feed run | Yes |
| ADR-005 | Lazy ML Dependency Loading | Accepted | RFC-005 | Function-level imports for heavy ML libraries | Yes |
| ADR-006 | Context-Aware Model Selection | Accepted | RFC-010 | Automatic English model promotion (.en) | Yes |
| ADR-007 | Universal Episode Identity | Accepted | RFC-011 | GUID-first deterministic episode ID generation | Yes |
| ADR-008 | Database-Agnostic Metadata Schema | Accepted | RFC-011 | Unified JSON format for SQL/NoSQL | Yes |
| ADR-009 | Privacy-First Local Summarization | Accepted | RFC-012 | Local Transformers over Cloud APIs | Yes |
| ADR-010 | Hierarchical Summarization Pattern | Accepted | RFC-012 | Map-reduce chunking for long transcripts | Yes |
| ADR-011 | Secure Credential Injection | Accepted | RFC-013 | Environment-based secret management | Yes |
| ADR-012 | Provider-Agnostic Preprocessing | Accepted | RFC-013 | Shared pre-inference cleaning pipeline | Yes |
| ADR-013 | Standalone Experiment Configuration | Accepted | RFC-015 | Separation of research params from code | Yes |
| ADR-014 | Codified Comparison Baselines | Accepted | RFC-015, RFC-041 | Objective delta measurement vs baseline artifacts | Yes |
| ADR-015 | Deep Provider Fingerprinting | Accepted | RFC-016 | Hardware and environment tracking for reproducibility | Yes |
| ADR-016 | Typed Provider Parameter Models | Accepted | RFC-016 | Pydantic validation for backend parameters | Yes |
| ADR-017 | Registered Preprocessing Profiles | Accepted | RFC-016 | Versioned cleaning logic tracking | Yes |
| ADR-018 | Externalized Prompt Management | Accepted | RFC-017 | Versioned Jinja2 templates in prompts/ | Yes |
| ADR-019 | Standardized Test Pyramid | Accepted | RFC-018, RFC-024 | Strict unit/integration/e2e tiering | Yes |
| ADR-020 | Protocol-Based Provider Discovery | Accepted | RFC-021 | Decoupling via PEP 544 Protocols | Yes |
| ADR-021 | Acceptance Test Tier as Final CI Gate | Accepted | RFC-023 | Fourth test tier for README/documentation accuracy; runs last in CI | Yes |
| ADR-022 | Flaky Test Defense | Accepted | RFC-025 | Automated retries and health reporting | Yes |
| ADR-023 | Public Operational Metrics | Accepted | RFC-026 | Transparency via GitHub Pages dashboards | Yes |
| ADR-024 | Unified Provider Pattern | Accepted | RFC-029 | Type-based unified provider classes | Yes |
| ADR-025 | Technology-Based Provider Naming | Accepted | RFC-029 | Clear library-based option naming | Yes |
| ADR-026 | Per-Capability Provider Selection | Accepted | RFC-032, RFC-033, RFC-034, RFC-035, RFC-036, RFC-037 | Independent provider choice per capability; partial-protocol providers allowed | Yes |
| ADR-027 | Unified Provider Metrics Contract | Accepted | - | Standardized ProviderCallMetrics pattern for all providers | Yes |
| ADR-028 | Unified Retry Policy with Metrics | Accepted | - | Centralized retry for LLM/API providers with backoff and metrics (not RSS/media HTTP; see CONFIGURATION — Download resilience) | Yes |
| ADR-029 | Grouped Dependency Automation | Accepted | RFC-038 | Balanced Dependabot updates via grouping | Yes |
| ADR-030 | Periodic Module Coupling Analysis | Accepted | RFC-038 | Nightly visualization of architecture health | Yes |
| ADR-031 | Mandatory Pre-Release Validation | Accepted | RFC-038 | Standardized checklist script for releases | Partial |
| ADR-032 | Git Worktree-Based Development | Accepted | RFC-039 | Parallel stable dev environments | Yes |
| ADR-033 | Stratified CI Execution | Accepted | RFC-039 | Fast push checks vs. full PR validation | Yes |
| ADR-034 | Isolated Runtime Environments | Accepted | RFC-039 | Independent venv per worktree | Yes |
| ADR-035 | Linear History via Squash-Merge | Accepted | RFC-039 | Clean, revertible main branch history | Yes |
| ADR-036 | Standardized Pre-Provider Audio Stage | Accepted | RFC-040 | Mandatory optimization before any transcription | Yes |
| ADR-037 | Content-Hash Based Audio Caching | Accepted | RFC-040 | Shared optimized artifacts in .cache/ | Yes |
| ADR-038 | FFmpeg-First Audio Manipulation | Accepted | RFC-040 | System-level performance for audio pipelines | Yes |
| ADR-039 | Speech-Optimized Codec (Opus) | Accepted | RFC-040 | MP3 (libmp3lame @ 64 kbps) for preprocessed audio; Opus rejected (see ADR) | Yes |
| ADR-040 | Explicit Golden Dataset Versioning | Accepted | RFC-041 | Approved, frozen ground truth data versions | Yes |
| ADR-041 | Multi-Tiered Benchmarking Strategy | Accepted | RFC-041 | Fast PR smoke tests vs nightly full benchmarks | Yes |
| ADR-042 | Heuristic-Based Quality Gates | Accepted | RFC-041 | Regex-based detection of common AI failure modes | Yes |
| ADR-043 | Hybrid MAP-REDUCE Summarization | Accepted | RFC-042 | Compression (Classic) + Abstraction (Instruct LLM) | Yes |
| ADR-044 | Local LLM Backend Abstraction | Accepted | RFC-042 | Support for llama.cpp, ollama, and transformers | Yes |
| ADR-045 | Strict REDUCE Prompt Contract | Accepted | RFC-042 | Mandatory markdown structure for LLM outputs | Yes |
| ADR-046 | MPS Exclusive Mode for Apple Silicon | Accepted | RFC-042 | Serialize GPU work on MPS to prevent memory contention; default on | Yes |
| ADR-047 | Proactive Metric Regression Alerting | Accepted | RFC-043 | Automated PR comments and webhook notifications | Partial |
| ADR-048 | Centralized Model Registry | Accepted | RFC-044, RFC-029 | Single source of truth for model architecture limits | Yes |
| ADR-049 | Materialization Boundary for Evaluation Inputs | Accepted | RFC-046 | Preprocessing becomes dataset definition via materialization_id; chunking stays in run config | Yes |
| ADR-050 | Single Code Path for Evaluation and Application | Accepted | RFC-048 | Eval and app share identical execution path; scorers are read-only observers | Yes |
| ADR-051 | Per-Episode JSON Artifacts with Logical Union | Accepted | RFC-049, RFC-055, RFC-061 | Shard by episode (gi.json, kg.json); union at query time; optional materialization | Yes |
| ADR-052 | Separate GIL and KG Artifact Layers | Accepted | RFC-049, RFC-055 | Independent schemas, feature flags, CLI namespaces, and evolution paths | Yes |
| ADR-053 | Grounding Contract for Evidence-Backed Insights | Accepted | RFC-049, RFC-050 | Explicit grounded boolean, verbatim quotes with spans, evidence chain | Yes |
| ADR-054 | Relational Postgres Projection for GIL and KG | Accepted | RFC-051 | Files canonical, Postgres is derived; separate GIL/KG tables; provenance on every row | No |
| ADR-055 | Adaptive Summarization Routing | Proposed | RFC-053 | Rule-based routing with episode profiling for summarization strategies | No |
| ADR-056 | Composable E2E Mock Response Strategy | Proposed | RFC-054 | Separation of functional responses from non-functional behavior in tests | No |
| ADR-057 | AutoResearch Thin Harness with Credential Isolation | Accepted | RFC-057 | Thin control layer reusing existing eval; immutable score.py; AUTORESEARCH_* credential vars | Yes |
| ADR-058 | Additive pyannote Diarization with Separate [diarize] Extra | Accepted | RFC-058 | pyannote as additive second pass; segment-level; separate [diarize] dependency group | No |
| ADR-059 | Confidence-Scored Multi-Signal Commercial Detection | Accepted | RFC-060 | Confidence-scored candidates replace binary detection; pattern primary, diarization adjusts | No |
| ADR-060 | VectorStore Protocol with Backend Abstraction | Accepted | RFC-061 | PEP 544 protocol decoupling FAISS (Phase 1) from Qdrant (Phase 2) | Yes |
| ADR-061 | FAISS Phase 1 with Post-Filter Metadata Strategy | Accepted | RFC-061 | Over-fetch + post-filter for CLI-scale; auto index type selection | Yes |
| ADR-062 | Sentence-Boundary Transcript Chunking | Accepted | RFC-061 | Regex sentence split, configurable target/overlap tokens, timestamp interpolation | Yes |
| ADR-063 | Transparent Semantic Upgrade for gi explore | Accepted | RFC-061, RFC-050 | Auto-detect vector index; semantic if available, substring fallback if not | Yes |
| ADR-064 | Canonical Server Layer with Feature-Flagged Route Groups | Accepted | RFC-062 | server/ module with podcast serve CLI; viewer routes v2.6, platform routes v2.7 | Yes |
| ADR-065 | Vue 3 + Vite + Cytoscape.js Frontend Stack | Accepted | RFC-062 | Unified frontend stack for viewer and future platform UI | Yes |
| ADR-066 | Playwright for UI End-to-End Testing | Accepted | RFC-062 | Browser regression testing; extends ADR-019 test pyramid with UI layer | Yes |
| ADR-067 | Pegasus/LED Retirement for Podcast Content | Accepted | RFC-057 | GSG pretraining mismatch → near-duplicate chunks → LED ngram exhaustion; reserved for news content type | Yes |
| ADR-068 | BART+LED as Local ML Production Baseline | Accepted | RFC-057 | Autoresearch sweep: +4.26% ROUGE-L over dev baseline (18.82%); 2 params accepted (max_new_tokens=550, num_beams=6) | Yes |
| ADR-069 | Hybrid ML Pipeline as Primary Production Direction | Accepted | RFC-057, RFC-042 | BART MAP + Llama 3.2:3b REDUCE at 23.1% ROUGE-L; closes 70% of cloud quality gap; temp=0.5, top_p=1.0 | Yes |
| ADR-070 | BART-base as Hybrid MAP Stage | Accepted | RFC-057 | BART beats LongT5 as MAP (21.2% vs 20.8%); pretraining alignment > context window size | Yes |
| ADR-071 | Four-Tier Summarization Strategy | Accepted | RFC-057 | ML Dev / ML Prod / LLM Local / LLM Cloud — direct Llama 3.2:3b beats hybrid (24.3% vs 23.7%, 2x faster) | Yes |
| ADR-072 | Llama 3.2:3b as Tier 3 Local LLM | Accepted | RFC-057 | 3B beats 7-12B models — instruction-following > size; temp=0.3 direct, temp=0.5 hybrid; 26.4% ROUGE-L @ 7.5s/ep | Yes |
| ADR-073 | RFC-057 Autoresearch Loop — Closure and Final State | Accepted | RFC-057 | Closes RFC-057; documents Track A/B outcomes, silver refs, 72-config matrix, production defaults | Yes |
| ADR-074 | Multi-Feed Corpus Parent Layout and Machine-Readable Manifest | Accepted | RFC-063 | Layout A corpus parent; unified discovery; corpus_manifest.json / optional summaries as operational artifacts | Yes |
| ADR-075 | Frozen YAML Performance Profiles for Release Resource Baselines | Accepted | RFC-064 | data/profiles/*.yaml + freeze/diff scripts; resource cost sibling to quality baselines | Yes |
| ADR-076 | Streamlit for Operator Run Comparison and Performance Views | Accepted | RFC-047, RFC-066 | Eval / Performance UI stays in tools/run_compare/; Vue viewer stays corpus-first | Yes |
| ADR-077 | Local Ollama Model Selection | Accepted | - | Default Ollama models per profile and tier | Yes |
| ADR-078 | GIL Evidence Stack Bundling — Per-Provider Champion Modes | Accepted | - | bundled_ab default; Mistral=bundled_b_only; Ollama bundled-only (staged unviable on local) | Yes |
| ADR-079 | OpenTofu for Always-On Hosting IaC | Accepted | RFC-082 | OpenTofu + hcloud + tailscale providers; infra/tofu entry | Yes |
| ADR-080 | OpenTofu State Encrypted In-Repo (sops + age) | Accepted | RFC-082 | Committed .enc state; TFSTATE_AGE_KEY in CI | Yes |
| ADR-081 | Drill OpenTofu Workspace and Tailscale ACL Ownership | Accepted | RFC-082 | Workspace drill, HCLOUD_TOKEN_DRILL, prod-only tailscale_acl | Yes |
| ADR-082 | GitOps App Deploy via stack-test and GitHub Actions | Accepted | RFC-082 | Stack-test gate + publish + deploy-prod; workflow_run target; infra apply manual | Yes |
| ADR-083 | Tailscale as Private Ingress for Always-On VPS | Accepted | RFC-082 | App on tailnet; tag:gha-deployer SSH path | Yes |
| ADR-084 | Full-Stack Docker Compose Topology (API, Viewer, Pipeline) | Accepted | RFC-079 | compose/docker-compose.stack.yml; shared volume; optional Docker job exec | Yes |
| ADR-085 | Ephemeral Stack-Test Integration Gate on Main | Accepted | RFC-078, RFC-079 | Compose overlay + Playwright tests/stack-test/; distinct from ADR-021 | Yes |
| ADR-086 | Canonical Identity Layer and Per-Episode bridge.json Cross-Layer Join | Accepted | RFC-072 | CIL ids + bridge.json seam; GIL and KG stay separate (ADR-052) | Yes |
| ADR-087 | Autoresearch Track A v2 — Dev or Held-Out Split and Judging | Accepted | RFC-073, RFC-057 | Disjoint held-out; fraction contestation; Efficiency rubric; seed wiring | Yes |
| ADR-088 | macOS Local CI Process Safety for ML Workloads | Accepted | RFC-074 | No parse-time ML probes; cleanup or zombie checks; agent no-pileup rules | Yes |
| ADR-089 | Prod Failover Orchestrator Separate from DR Drill | Accepted | RFC-083 | Own workflow family; reuse drill workspace/secrets; no auto-destroy; GitHub #764 | No |
| ADR-090 | Prod Failover — DNS-First Cutover on Tailnet | Accepted | RFC-083 | Canonical hostname DNS flip primary; floating IP optional | No |
| ADR-091 | Prod Failover — GHA Triggers and Gates | Accepted | RFC-083 | Manual cutover/failback/teardown; freeze prod schedules; spare schedules off after restore | No |
| ADR-092 | Corpus Snapshot Backup Manifest and Newest-Compatible Restore Default | Accepted | RFC-084 | snapshot.manifest.json; dual placement; newest-compatible default; fail closed; GitHub #763 | Yes |
| ADR-093 | Canonical Stack Contract Versus Environment Adapters | Accepted | RFC-082 | One topology/health/stack-test discipline; adapters for transport/secrets only; steady vs restore playbooks separate; GitHub #762 | Yes |
Gap analysis
Counts (reconcile when adding ADRs): 93 files under docs/adr/ADR-*.md (ADR-001–ADR-093;
numbering has historical gaps). From the index table: 2 Proposed (ADR-055, ADR-056),
6 Accepted with Code = No (ADR-054, ADR-058, ADR-059, ADR-089, ADR-090, ADR-091), 2 Accepted with
Code = Partial (ADR-031, ADR-047). Accepted means ratified, not necessarily shipped.
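One way to reconcile these counts is to tally the Status and Code columns straight from the index table. The sketch below is illustrative only: `tally_index` and `INDEX_EXCERPT` are hypothetical names, the excerpt is a shortened copy of four rows from the index, and the parser assumes no cell contains a literal `|`.

```python
from collections import Counter

# Shortened excerpt in the same six-column layout as the ADR index table.
INDEX_EXCERPT = """\
| ADR | Title | Status | Related RFC | Description | Code |
|---|---|---|---|---|---|
| ADR-031 | Mandatory Pre-Release Validation | Accepted | RFC-038 | Checklist script | Partial |
| ADR-054 | Relational Postgres Projection for GIL and KG | Accepted | RFC-051 | Derived tables | No |
| ADR-055 | Adaptive Summarization Routing | Proposed | RFC-053 | Rule-based routing | No |
| ADR-060 | VectorStore Protocol with Backend Abstraction | Accepted | RFC-061 | PEP 544 protocol | Yes |
"""


def tally_index(markdown: str) -> Counter:
    """Count (status, code) pairs for every ADR row of a pipe table."""
    counts: Counter = Counter()
    for line in markdown.splitlines():
        # Drop the outer pipes, then split into trimmed cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the header, the separator, and any malformed row.
        if len(cells) != 6 or not cells[0].startswith("ADR-"):
            continue
        counts[(cells[2], cells[5])] += 1
    return counts


counts = tally_index(INDEX_EXCERPT)
for (status, code), n in sorted(counts.items()):
    print(f"{status:9} Code={code:8} {n}")
```

Rows that are split across lines fail the six-cell check and are silently skipped, so a total lower than the number of ADR files is a hint that the table itself needs repair.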
When to extract a new ADR
Use an ADR when one or more of these hold; otherwise an RFC + normative doc (API guide,
docs/api/*.md, UXS) is usually enough.
| ADR type | When to extract | Recent examples |
|---|---|---|
| Closure / program outcome | A large RFC program ends; you need an immutable summary. | ADR-073 closes RFC-057 |
| Empirical production defaults | Benchmarks change default models/tiers you must freeze for onboarding. | ADR-067–ADR-072 |
| Stack & ownership boundary | Who owns HTTP, which frontend stack, which UI E2E runner. | ADR-064–ADR-066 |
| Heavy optional dependencies | An extra bloats install or splits CUDA/CPU paths; defaults must not pay the cost. | ADR-058 (accepted; [diarize] not landed) |
| Cross-cutting protocol / contract | Multiple subsystems share the same interface. | ADR-060, ADR-053, ADR-051 |
| Process / CI philosophy | A policy decision that outlives one RFC. | ADR-021 |
When not to add an ADR
- Viewer milestones that do not change stack (e.g. RFC-069) — RFC + feature UXS (e.g. UXS-004) + UXS-001 hub + E2E map suffice.
- Single-route APIs for the viewer with schema in code + tests (e.g. RFC-068) — Server Guide + tests suffice.
- Operational tooling without architectural boundary moves (e.g. RFC-065) — RFC-first.
- Frozen artifact workflows (e.g. RFC-064); profile YAML baselines are covered by ADR-075.
ADRs by implementation state
Proposed
| ADR | Primary RFC | Note |
|---|---|---|
| ADR-055 | RFC-053 | No episode profiling / routing in pipeline yet |
| ADR-056 | RFC-054 | Composable ResponseProfile / Router not implemented |
Accepted, code not landed (expected)
| ADR | Primary RFC | Note |
|---|---|---|
| ADR-054 | RFC-051 | Postgres projection future |
| ADR-058 | RFC-058 | No [diarize] extra in pyproject.toml yet |
| ADR-059 | RFC-060 | Commercial detector as designed not landed |
| ADR-089 | RFC-083 | Prod-failover workflows not landed |
| ADR-090 | RFC-083 | Runbook or DNS automation follow-up |
| ADR-091 | RFC-083 | repository_dispatch + cutover gates not landed |
Accepted, partial
| ADR | Gap |
|---|---|
| ADR-031 | make pre-release / checklist not fully aligned with RFC-038 |
| ADR-047 | Alerts exist; automated PR comments not complete |
Stale-audit corrections (reference)
Trust the Code column in the table above: ADR-048 is implemented; ADR-062 / ADR-063
are Yes; ADR-064–ADR-066 are implemented; ADR-021 is reflected in script-based
make test-acceptance.
Situation cheat sheet
| Situation | Guidance |
|---|---|
| Prefer a new ADR | Irreversible stack boundary, cross-cutting protocol, frozen empirical default, heavy optional extra, or closure of a large program (e.g. ADR-073). |
| Often RFC-only | Bounded HTTP routes or viewer tabs where ADR-064–ADR-066 + UXS already fix the stack (e.g. RFC-067, RFC-068, RFC-069, RFC-071). Corpus layout + manifest: ADR-074. Frozen resource baselines: ADR-075. Streamlit vs Vue for eval tools: ADR-076. Full-stack Compose + stack-test gate: ADR-084, ADR-085. CIL + bridge.json: ADR-086. Autoresearch Track A v2: ADR-087. macOS ML make safety: ADR-088. Prod failover design: RFC-083; decisions ADR-089–ADR-091. Corpus snapshot backup manifest + restore defaults: RFC-084; ADR-092. Cross-surface stack contract vs adapters: ADR-093 (#762). |
| Proposed ADRs | Promote ADR-055 / ADR-056 to Accepted (or supersede) when RFC-053 / RFC-054 ship end-to-end. |
Future triggers
- Multi-feed manifest as an immutable external contract beyond CORPUS_MULTI_FEED_ARTIFACTS.md — partially addressed by ADR-074.
- .pipeline_status.json schema if external monitors depend on it and breaking changes need versioning.
- Profile YAML for non-Python consumers beyond tools/run_compare/ and make profile-diff (partially addressed by ADR-075).
- RFC-070 + ADR-060 when platform vector backends land materially.
- Full-stack Compose, stack-test, CIL or bridge, autoresearch v2, macOS ML process safety — see ADR-084–ADR-088 (normative detail remains in RFC-072, RFC-073 v2 file, RFC-074, RFC-078, RFC-079).
- Prod failover (stand up spare, validate, gated cutover) — RFC-083 (Draft); decisions ADR-089–ADR-091; GitHub #764.
- Corpus snapshot tarball metadata + version-aware restore — RFC-084 (Completed); ADR-092; GitHub #763.
Open decisions without ADRs: see Architecture Decision Candidates below.
Related: PRD gap analysis, RFC gap analysis.
Architecture Decision Candidates
These items have been identified as potential architectural decisions but are currently under review.
| Candidate Decision | Origin | Status | Description |
| :--- | :--- | :--- | :--- |
| Informational-Only Metric Gates | RFC-043 | Open | Should regressions (runtime, coverage) block PRs or just notify? |
| Excel-Based Result Aggregation | RFC-015 | Open | Should we maintain experiment_results.xlsx or move fully to web? |
| Manual vs. Automated Golden Creation | RFC-041 | Open | Should golden data creation always require manual approval? |
| ~~Diarization-Free Dialogue Formatting~~ | RFC-006 | Resolved → ADR-058 | Additive pyannote diarization accepted; gap-based rotation preserved as default fallback |
| Minimalist Parser Dependency Strategy | RFC-002 | Open | Raw ElementTree vs. external RSS libraries |
| Two-Phase Configuration Validation | RFC-007 | Open | argparse syntax + Pydantic semantic validation |
Creating New ADRs
Use the ADR Template to document new architectural decisions. Decisions typically originate from an RFC that has been reviewed; the RFC is often marked Completed once implementation lands. RFCs use Completed, not Accepted; Accepted is an ADR status.