ADR-076: Streamlit for Operator Run Comparison and Performance Views
- Status: Accepted
- Date: 2026-04-11
- Authors: Podcast Scraper Team
- Related RFCs: RFC-047, RFC-066
- Related PRDs: PRD-007, PRD-016
Context & Problem Statement
ADR-065 and RFC-062 standardize Vue 3 + Vite for the GI/KG viewer served by FastAPI. Separately, RFC-047 introduced a Streamlit app over data/eval/ artifacts for ML run comparison. RFC-066 extended that app with a Performance page joining eval runs and frozen YAML profiles (ADR-075).
Without an explicit decision, contributors might duplicate run-compare or performance charts inside the Vue app (splitting maintenance, auth, and data loading) or deprecate Streamlit prematurely.
Decision
- Streamlit remains the home for operator-facing eval tooling: tools/run_compare/ (quality comparisons, diagnostics, and the Performance page) stays on Streamlit + Plotly (optional [compare] extra), not in web/gi-kg-viewer/.
- Vue viewer scope: The SPA focuses on corpus exploration (graph, search, library, digest, dashboard) against a resolved corpus root and /api/*, not on batch eval directory workflows.
- Join semantics: When the UI needs both eval metrics and frozen profiles, release tag is the primary join key (RFC-066); implementation stays in tools/run_compare/.
- Optional extra: Keeping Streamlit behind [compare] preserves lean installs for users who never open eval tools (RFC-047).
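The join semantics above can be illustrated with a minimal sketch. It assumes eval runs and frozen profiles have already been loaded as plain dictionaries; the function name and the field names (release_tag, run_id, peak_rss_mb) are hypothetical and are not taken from tools/run_compare/ or RFC-066.

```python
def join_by_release_tag(eval_runs, profiles):
    """Pair each eval run with the frozen profile that shares its release tag.

    Both arguments are lists of dicts. The "release_tag" key is a
    hypothetical field name chosen for this sketch.
    """
    # Index profiles by release tag; this sketch assumes one profile per tag.
    profile_by_tag = {p["release_tag"]: p for p in profiles}

    joined, unmatched = [], []
    for run in eval_runs:
        profile = profile_by_tag.get(run.get("release_tag"))
        if profile is None:
            # Keep runs without a matching profile visible instead of dropping them.
            unmatched.append(run)
        else:
            joined.append({"run": run, "profile": profile})
    return joined, unmatched


if __name__ == "__main__":
    runs = [
        {"run_id": "a", "release_tag": "v1"},
        {"run_id": "b", "release_tag": "v2"},
    ]
    frozen = [{"release_tag": "v1", "peak_rss_mb": 100}]
    pairs, leftovers = join_by_release_tag(runs, frozen)
    print(len(pairs), "joined;", len(leftovers), "without a profile")
```

Returning the unmatched runs separately is one possible design choice: a page built on this could then list runs that have no frozen profile rather than silently omitting them.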
Rationale
- Different data roots: Eval runs live under data/eval/; the viewer consumes live corpus roots. Merging them in one SPA would couple unrelated release cycles.
- Velocity: Streamlit is fast for internal Plotly dashboards; the viewer stack optimizes for Cytoscape, Pinia, and Playwright E2E.
- Clear ownership: ML operators use make run-compare; corpus operators use podcast serve + viewer.
Alternatives Considered
- Rebuild run compare in Vue + FastAPI: Rejected; it would duplicate a large set of charts, file scanners, and session-state handling, and would slow iteration on eval workflows.
- Single “mega” Streamlit for viewer + eval: Rejected; loses Cytoscape-first UX, typed API contracts, and ADR-064 server architecture.
- Jupyter-only notebooks for comparison: Rejected for onboarding; Streamlit gives one command and shared README entrypoint.
Consequences
- Positive: Stable split of stacks; RFC-047/066 remain authoritative for Streamlit behavior.
- Negative: Two UI stacks to maintain (Python extras vs Node); acceptable given distinct users.
- Neutral: Links from docs may point operators to both make run-compare and make serve.
Implementation Notes
- Module: tools/run_compare/ (app.py, data.py, README).
- Install: pip install -e ".[compare]"; run make run-compare or streamlit run tools/run_compare/app.py.
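Because Streamlit sits behind the optional [compare] extra, a launcher may want to fail with a clear message when the extra is not installed. The following is a minimal sketch of such a check using only the standard library; the helper name missing_compare_deps is hypothetical, and the assumption that the extra provides the streamlit and plotly modules is inferred from the Decision section rather than from the project's packaging metadata.

```python
import importlib.util


def missing_compare_deps(modules=("streamlit", "plotly")):
    """Return the names of top-level modules that cannot be found.

    The default module list is an assumption about what the [compare]
    extra installs; adjust it to match the real packaging metadata.
    """
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_compare_deps()
    if missing:
        raise SystemExit(
            "Missing optional dependencies: "
            + ", ".join(missing)
            + '. Install them with: pip install -e ".[compare]"'
        )
    print("Compare dependencies are available.")
```

A check like this keeps the base install lean while pointing operators at the documented install command instead of a bare ImportError.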