ADR-076: Streamlit for Operator Run Comparison and Performance Views

Status: Accepted
Date: 2026-04-11
Authors: Podcast Scraper Team
Related RFCs: RFC-047, RFC-066
Related PRDs: PRD-007, PRD-016

Context & Problem Statement

ADR-065 and RFC-062 standardize Vue 3 + Vite for the GI/KG viewer served by FastAPI. Separately, RFC-047 introduced a Streamlit app over data/eval/ artifacts for ML run comparison. RFC-066 extended that app with a Performance page joining eval runs and frozen YAML profiles (ADR-075).

Without an explicit decision, contributors might duplicate run-compare or performance charts inside the Vue app (splitting maintenance, auth, and data loading) or deprecate Streamlit prematurely.

Decision

Streamlit remains the home for operator-facing eval tooling: the quality comparisons, diagnostics, and Performance page in tools/run_compare/ stay on Streamlit + Plotly (optional [compare] extra), not in web/gi-kg-viewer/.
Vue viewer scope: The SPA focuses on corpus exploration (graph, search, library, digest, dashboard) against a resolved corpus root and /api/* — not on batch eval directory workflows.
Join semantics: When a UI needs both eval metrics and frozen profiles, the release tag is the primary join key (RFC-066); the implementation stays in tools/run_compare/.
Optional extra: Keeping Streamlit behind [compare] preserves lean installs for users who never open eval tools (RFC-047).
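
The join semantics above can be sketched as a left join keyed on the release tag. This is a minimal illustration, not the code in tools/run_compare/: the function name, the `release_tag` field name, the `profile_` prefix, and the `has_profile` flag are all assumptions made for this example, and the real schema is defined by RFC-066.

```python
from typing import Any


def join_by_release_tag(
    eval_runs: list[dict[str, Any]],
    profiles: list[dict[str, Any]],
    key: str = "release_tag",  # hypothetical field name, not taken from RFC-066
) -> list[dict[str, Any]]:
    """Left-join eval runs to frozen profiles on the release tag."""
    # Index profiles by tag; if a tag repeats, the later profile wins.
    profiles_by_tag = {p[key]: p for p in profiles if p.get(key) is not None}

    joined = []
    for run in eval_runs:
        tag = run.get(key)
        profile = profiles_by_tag.get(tag) if tag is not None else None
        row = dict(run)  # copy so the caller's records are not mutated
        if profile is not None:
            # Prefix profile fields so they cannot shadow eval metrics.
            for name, value in profile.items():
                if name != key:
                    row[f"profile_{name}"] = value
        # Keep unmatched runs visible instead of silently dropping them.
        row["has_profile"] = profile is not None
        joined.append(row)
    return joined
```

A left join is used here so that an eval run without a frozen profile still appears in the result; whether the Performance page behaves this way is not stated in this ADR.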

Rationale

Different data roots: Eval runs live under data/eval/; the viewer consumes live corpus roots — merging them in one SPA would couple unrelated release cycles.
Velocity: Streamlit is fast for internal Plotly dashboards; the viewer stack optimizes for Cytoscape, Pinia, and Playwright E2E.
Clear ownership: ML operators use make run-compare; corpus operators use podcast serve + viewer.

Alternatives Considered

Rebuild run compare in Vue + FastAPI: Rejected; it would duplicate a large set of charts, file scanners, and session state, and slow iteration on eval workflows.
Single “mega” Streamlit for viewer + eval: Rejected; loses Cytoscape-first UX, typed API contracts, and ADR-064 server architecture.
Jupyter-only notebooks for comparison: Rejected for onboarding; Streamlit gives one command and a shared README entrypoint.

Consequences

Positive: Stable split of stacks; RFC-047/066 remain authoritative for Streamlit behavior.
Negative: Two UI stacks to maintain (Python extras vs Node); acceptable given distinct users.
Neutral: Links from docs may point operators to both make run-compare and make serve.

Implementation Notes

Module: tools/run_compare/ (app.py, data.py, README).
Install: pip install -e ".[compare]"; run make run-compare or streamlit run tools/run_compare/app.py.
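
Because Streamlit sits behind an optional extra, a launcher can check for it before starting the app and print the install command instead of a raw ImportError. The sketch below shows one way to do that with the standard library only. The helper names and the assumption that the [compare] extra provides exactly `streamlit` and `plotly` are illustrative and not taken from the repository.

```python
import importlib.util

# Assumed contents of the [compare] extra; this ADR names only Streamlit and Plotly.
COMPARE_MODULES = ("streamlit", "plotly")


def missing_modules(names=COMPARE_MODULES):
    """Return the top-level modules in `names` that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


def compare_install_hint(names=COMPARE_MODULES):
    """Return an install hint if any module is missing, otherwise None."""
    missing = missing_modules(names)
    if not missing:
        return None
    return (
        "Missing optional dependencies: "
        + ", ".join(missing)
        + '. Install them with: pip install -e ".[compare]"'
    )
```

A wrapper could call `compare_install_hint()` and exit with the message when it returns a string, then hand off to `streamlit run tools/run_compare/app.py` otherwise.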