ADR-063: Transparent Semantic Upgrade for gi explore
- Status: Accepted
- Date: 2026-04-03
- Authors: Podcast Scraper Team
- Related RFCs: RFC-061, RFC-050
- Related PRDs: PRD-021
Context & Problem Statement
gi explore --topic and gi query currently use substring matching to find relevant
insights. Semantic search (RFC-061) adds a vector index that enables meaning-based
matching. The system needs to upgrade these commands to use semantic matching when
an index is available, without changing behavior for users who have not built one.
Decision
We adopt a transparent semantic upgrade pattern:
- Auto-detection: When gi explore or gi query starts, it checks whether a vector index exists at the default path (<output_dir>/search/).
- Semantic path: If an index exists, topic/question matching uses VectorStore.search() to find semantically similar insights.
- Substring fallback: If no index exists, the existing substring matching logic runs unchanged. Same ExploreOutput contract, same CLI flags, same output format.
- Zero configuration: No flag is required to activate semantic matching. The presence of the index is the trigger.
- Same output contract: Both paths produce ExploreOutput (RFC-050). Consumers see the same shape, just better results when semantic matching is active.
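The "use the index if present, otherwise fall back" dispatch can be sketched roughly as follows. This is a minimal illustration, not the project's code: find_index_dir, match_insights, and the SupportsSearch protocol are hypothetical names, and the assumption that search() takes a query string and returns insight IDs goes beyond what this ADR states about VectorStore.

```python
from pathlib import Path
from typing import Dict, List, Optional, Protocol, Sequence


class SupportsSearch(Protocol):
    """Hypothetical stand-in for VectorStore; only search() is named in the ADR."""

    def search(self, query: str) -> Sequence[str]:
        """Assumed to return the IDs of semantically similar insights."""
        ...


def find_index_dir(output_dir: Path) -> Optional[Path]:
    """Auto-detection: return <output_dir>/search/ if it exists, else None."""
    index_dir = output_dir / "search"
    return index_dir if index_dir.is_dir() else None


def match_insights(
    topic: str,
    insights: Dict[str, str],
    store: Optional[SupportsSearch] = None,
) -> List[str]:
    """Return matching insight IDs; both paths return the same shape."""
    if store is not None:
        # Semantic path: keep only insights the store considers similar.
        wanted = set(store.search(topic))
        return [insight_id for insight_id in insights if insight_id in wanted]
    # Substring fallback (case-insensitive here, which is an assumption).
    needle = topic.lower()
    return [
        insight_id
        for insight_id, text in insights.items()
        if needle in text.lower()
    ]
```

Because both branches return the same list of IDs, whatever builds the ExploreOutput downstream does not need to know which path ran.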
Rationale
- Zero regression: Users who never enable vector_search see identical behavior. No flag to learn, no config to change.
- Gradual adoption: Build the index when ready; gi explore automatically improves.
- Single command surface: Users don't need to learn podcast search to benefit. Their existing gi explore --topic workflow gets better transparently.
- Reusable pattern: The "detect capability, use if available, fallback if not" pattern can apply to future KG queries or cross-layer search.
Alternatives Considered
- Require explicit --semantic flag: Rejected; adds cognitive load, fragments the command surface, and forces users to remember which mode they're in.
- Separate gi search command: Rejected; duplicates the explore surface. The existing podcast search CLI handles dedicated search. gi explore should just get better.
- Always require vector index: Rejected; forces all users to build an index even for simple substring queries on small corpora.
Consequences
- Positive: Seamless upgrade path. No breaking changes. No new commands to learn. Better results with zero effort once the index exists.
- Negative: Users may not realize that their results changed (for better or worse) because an index is present. Mitigated by logging which matching path was used.
- Neutral: The explore function gains an optional vector_store parameter. Internal change only; external contract unchanged.
Implementation Notes
- Module: src/podcast_scraper/gi/explore.py
- Function: _insight_matches_topic(artifact, insight_id, insight_text, topic, vector_store=None). If vector_store is provided, use semantic matching.
- Detection: Check for <output_dir>/search/index_meta.json at explore startup.
- Logging: "INFO: Using semantic matching (vector index found)" or "INFO: Using substring matching (no vector index)".