Guides
Practical guides for using and developing Podcast Scraper.
Quick Start
| Guide | Description |
|---|---|
| Quick Reference | Common commands cheat sheet |
| Troubleshooting | Common issues and solutions |
| Glossary | Key terms and concepts |
Development
| Guide | Description |
|---|---|
| Development Guide | Development environment setup, workflow, and GI/KG browser viewer (`make serve-gi-kg-viz`) |
| Pipeline and Workflow Guide | Pipeline flow, module roles, quirks, run tracking |
| Git Worktree Guide | Git worktree-based development workflow |
| Dependencies Guide | Third-party dependencies and rationale |
| Markdown Linting | Markdown style and linting practices |
Testing
| Guide | Description |
|---|---|
| Testing Guide | Test execution and overview |
| Unit Testing Guide | Unit test patterns and mocking |
| Integration Testing Guide | Integration test guidelines |
| E2E Testing Guide | End-to-end test infrastructure |
| Critical Path Testing Guide | Test prioritization |
Provider System
| Guide | Description |
|---|---|
| AI Provider Comparison | Compare all 9 providers: cost, quality, speed, privacy |
| ML Model Comparison | Compare ML models: Whisper, spaCy, Transformers (BART/LED) |
| Provider Configuration | Quick provider configuration reference |
| Ollama Provider Guide | Ollama installation, setup, troubleshooting, and testing |
| Provider Implementation | Implementing new providers |
| ML Provider Reference | Technical reference for local ML models |
| Protocol Extension | Extending protocols |
Features
| Guide | Description |
|---|---|
| Semantic Search | RFC-061 corpus vector index: config (`vector_search`), `search` / `index` CLIs, semantic `gi explore --topic` |
| Grounded Insights | Insights backed by evidence quotes: enabling GIL, `gi.json`, CLI, schema; optional browser viewer |
| Knowledge Graph | KG (entities, topics, relationships): PRD-019 / RFC-055–056, artifacts, `kg` CLI; same browser viewer for `kg.json` |
| Preprocessing Profiles | Understanding and using preprocessing profiles for transcript cleaning |
| Docker Service Guide | Running `podcast_scraper` as a service-oriented Docker container |
| Docker Variants Guide | LLM-only vs ML-enabled Docker image variants |
AI Coding
| Guide | Description |
|---|---|
| Cursor AI Best Practices | AI-assisted development |
| Documentation Agent Guide | Documentation workflows |