ADR-038: FFmpeg-First Audio Manipulation
- Status: Accepted
- Date: 2026-01-11
- Authors: Podcast Scraper Team
- Related RFCs: RFC-040
Context & Problem Statement
The pipeline needs to perform several audio operations: format conversion, resampling, voice activity detection (VAD, used here for silence removal), and loudness normalization. Pure Python libraries are often slow or depend on complex chains of C extensions.
Decision
We standardize on FFmpeg-First Audio Manipulation.
- The pipeline calls the system `ffmpeg` binary directly via `subprocess`.
- We use complex filter chains (e.g., `silenceremove`, `loudnorm`) to perform multiple operations in a single pass.
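As an illustration of this decision, the following is a minimal sketch of a single-pass invocation, not the project's actual implementation. The function names `build_ffmpeg_command` and `preprocess_audio`, the filter thresholds, the loudness targets, and the 16 kHz mono output are all assumptions chosen for the example.

```python
import subprocess

# Illustrative filter settings (assumed values, not taken from the project):
# strip silences longer than 0.5 s below -40 dB, then normalize loudness.
FILTER_CHAIN = ",".join([
    "silenceremove=stop_periods=-1:stop_duration=0.5:stop_threshold=-40dB",
    "loudnorm=I=-16:TP=-1.5:LRA=11",
])


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Build the argument list for one ffmpeg pass (hypothetical helper)."""
    return [
        "ffmpeg",
        "-hide_banner",           # suppress the startup banner
        "-nostdin",               # never wait for interactive input
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),
        "-af", FILTER_CHAIN,      # silence removal + loudness normalization
        "-ar", str(sample_rate),  # resample the output
        "-ac", "1",               # downmix to mono
        str(dst),                 # container/codec inferred from the extension
    ]


def preprocess_audio(src, dst, sample_rate=16000):
    """Run the pass; raises CalledProcessError on a non-zero exit code."""
    cmd = build_ffmpeg_command(src, dst, sample_rate)
    subprocess.run(cmd, check=True, capture_output=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names, and `check=True` turns an FFmpeg failure into a Python exception instead of a silently missing output file.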
Rationale
- Performance: `ffmpeg` is highly optimized and typically much faster than doing the same processing in Python with libraries like `pydub` or `librosa`.
- Completeness: Covers the codecs and filters we expect to need without adding dozens of Python dependencies.
- Reliability: Industry-standard tool with predictable behavior across platforms.
Alternatives Considered
- Python Libraries (pydub/webrtcvad): Rejected due to performance bottlenecks and fragmented feature sets.
Consequences
- Positive: Extremely fast audio pipelines; simplified Python dependency list.
- Negative: Requires the user to have `ffmpeg` installed on their system.
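One way to soften the negative consequence is to fail early with a clear message when the binary is missing. The sketch below assumes a hypothetical `require_ffmpeg` helper called at pipeline startup; the name and the error wording are illustrative only.

```python
import shutil


def require_ffmpeg(binary="ffmpeg"):
    """Return the resolved path to the binary, or raise if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        raise RuntimeError(
            f"'{binary}' was not found on PATH. Install FFmpeg and try again."
        )
    return path
```

Running this check once before any audio work means a missing install surfaces as a single readable error rather than a `FileNotFoundError` from deep inside a `subprocess` call.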