Product Requirements Documents (PRDs)

Purpose

Product Requirements Documents (PRDs) define the what and why behind each major feature or capability in podcast_scraper. They capture:

  • User needs and use cases
  • Functional requirements and success criteria
  • Design considerations and constraints
  • Integration points with existing features

PRDs serve as the foundation for technical design (RFCs) and help ensure features align with user needs and project goals.

Features with a meaningful UI may also link to UX specifications (UXS) that cover tokens, layout, and accessibility; RFCs then reference that UX contract alongside the PRD.

How PRDs Work

  1. Define Intent: PRDs describe the problem to solve and desired outcomes
  2. Guide Design: RFCs reference PRDs (and UXSs when UI is in scope) so technical solutions meet requirements and experience constraints
  3. Track Implementation: Release notes reference PRDs to show what was delivered
  4. Document Evolution: PRDs capture design decisions and rationale

Open PRDs

| PRD | Title | Related RFCs | Description |
|---|---|---|---|
| PRD-007 | AI Quality & Experimentation Platform | RFC-015, 016, 041 | Integrated platform for experimentation and benchmarking |
| PRD-015 | Engineering Governance & Productivity Platform | RFC-018-024, 030, 031, 038, 039 | Integrated system for developer velocity and quality |
| PRD-016 | Operational Observability & Pipeline Intelligence | RFC-025, 026, 027 | System for managing operational health and visibility |
| PRD-017 | Grounded Insight Layer (GIL) | RFC-049, 050, 051 | Evidence-backed insights and quotes with grounding relationships |
| PRD-018 | Database Projection for GIL and KG | RFC-049, 050, 051, 055, 056 | Postgres projection of gi.json and KG artifacts (separate tables) |
| PRD-019 | Knowledge Graph Layer (KG) | RFC-055, 056 | Entities, topics, and relationships; separate from GIL / GI |
| PRD-020 | Audio-Based Speaker Diarization & Commercial Cleaning | RFC-058, 059, 060 | True speaker diarization via pyannote.audio; downstream commercial content cleaning |
| PRD-021 | Semantic Corpus Search | RFC-061 | Meaning-based retrieval over insights, quotes, summaries, and transcripts via embeddings + vector index |

Completed PRDs

| PRD | Title | Version | Related RFCs | Description |
|---|---|---|---|---|
| PRD-001 | Transcript Acquisition Pipeline | v2.0.0 | RFC-001, 002, 003, 004, 008, 009 | Core pipeline for downloading transcripts |
| PRD-002 | Whisper Fallback Transcription | v2.0.0 | RFC-004, 005, 006, 008, 010 | Automatic transcription fallback |
| PRD-003 | User Interfaces & Configuration | v2.0.0 | RFC-007, 008, 009 | CLI interface and configuration |
| PRD-004 | Per-Episode Metadata Generation | v2.2.0 | RFC-011, 012 | Structured metadata documents |
| PRD-005 | Episode Summarization | v2.3.0 | RFC-012 | Automatic summary generation |
| PRD-006 | OpenAI Provider Integration | v2.4.0 | RFC-013, 017, 021, 022, 029 | OpenAI API as optional provider |
| PRD-008 | Automatic Speaker Name Detection | v2.1.0 | RFC-010 | Auto-detect host/guest names via NER |
| PRD-009 | Anthropic Provider Integration | v2.4.0 | RFC-032 | Anthropic Claude API as optional provider |
| PRD-010 | Mistral Provider Integration | v2.5.0 | RFC-033 | Mistral AI as complete OpenAI alternative |
| PRD-011 | DeepSeek Provider Integration | v2.5.0 | RFC-034 | DeepSeek AI - ultra low-cost provider |
| PRD-012 | Google Gemini Provider Integration | v2.5.0 | RFC-035 | Google Gemini - 2M context, native audio |
| PRD-013 | Grok Provider Integration (xAI) | v2.5.0 | RFC-036 | Grok - xAI's AI model with real-time information access |
| PRD-014 | Ollama Provider Integration | v2.5.0 | RFC-037 | Ollama - fully local/offline, zero cost |

  • Architecture - System design and module responsibilities
  • RFCs - Technical design documents
  • Releases - Release notes and version history

Creating New PRDs

Use the PRD Template as a starting point for new product requirements documents.