# ADR-043: Hybrid MAP-REDUCE Summarization
- Status: Accepted
- Date: 2026-01-11
- Authors: Podcast Scraper Team
- Related RFCs: RFC-042
## Context & Problem Statement
Classic summarization models (BART, LED) compress text efficiently but struggle with instruction-following, adhering to an output structure, and filtering conversational noise. They often produce "extractive" summaries that leak scaffolding text or repeat the same idea when it appears in multiple chunks.
## Decision
We adopt a Hybrid MAP-REDUCE Summarization Strategy:
- MAP Phase: Uses Classic Summarizers (LED, LongT5) to compress transcript chunks into raw factual notes.
- REDUCE Phase: Uses an Instruction-Tuned LLM (Qwen, LLaMA, Mistral) to synthesize those notes into a final, structured summary (see the sketch after this list).
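A minimal sketch of the two phases, assuming Hugging Face `transformers` pipelines; the model names, chunking, and prompt below are illustrative placeholders, not this project's actual configuration:

```python
# Hybrid MAP-REDUCE summarization sketch.
# Model choices and the prompt are assumptions for illustration only.
from transformers import pipeline

# MAP: a classic long-input summarizer compresses each transcript chunk.
map_summarizer = pipeline("summarization", model="allenai/led-base-16384")

# REDUCE: an instruction-tuned LLM synthesizes the notes into a structured summary.
reduce_llm = pipeline("text-generation", model="Qwen/Qwen2.5-7B-Instruct")

def summarize_transcript(chunks: list[str]) -> str:
    # MAP phase: compress each chunk into raw factual notes.
    notes = [
        map_summarizer(chunk, max_length=256, truncation=True)[0]["summary_text"]
        for chunk in chunks
    ]
    # REDUCE phase: only the small, compressed notes reach the expensive LLM.
    prompt = (
        "You are summarizing a podcast episode from the notes below.\n"
        "Ignore ads and sponsor reads, merge duplicate ideas, and return\n"
        "markdown with the sections: Overview, Key Points, Takeaways.\n\n"
        + "\n\n".join(f"- {note}" for note in notes)
    )
    result = reduce_llm(prompt, max_new_tokens=512, return_full_text=False)
    return result[0]["generated_text"]
```

The point of the sketch is the hand-off, not the specific models: the classic summarizer does the compression, and only its output ever reaches the instruction-tuned model.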
## Rationale
- Separation of Concerns: Classic models handle the "heavy lifting" of compression efficiently, while LLMs handle the "reasoning" of abstraction and structuring.
- Quality: Instruction-tuned models are far better at ignoring ads, deduplicating ideas, and following output schemas.
- Efficiency: Only the small, compressed notes are passed to the expensive LLM, keeping latency and memory usage manageable on local hardware (rough arithmetic below).
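As a rough illustration (the numbers are assumptions, not measurements): a 90-minute episode at about 150 spoken words per minute is roughly 13,500 words, on the order of 18,000 tokens. Split into a dozen chunks and compressed to ~150-token notes in the MAP phase, the REDUCE prompt carries only about 1,800 tokens, which fits comfortably in a locally hosted instruction-tuned model's context window.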
## Alternatives Considered
- Pure Classic: Rejected due to poor structure and extraction bias.
- Pure LLM: Rejected for long podcasts due to massive context window requirements and high local compute cost.
## Consequences
- Positive: Dramatically higher summary quality; structured output guarantee; better ad/noise filtering.
- Negative: Requires loading two different classes of models.