ADR-004: Flat Filesystem Archive Layout¶
- Status: Accepted
- Date: 2026-01-11
- Authors: Podcast Scraper Team
- Related RFCs: RFC-004
- Related PRDs: PRD-001
Context & Problem Statement¶
Podcast episodes generate multiple artifacts (transcripts, metadata, summaries). We need a structure that is easy for humans to browse and for machines to parse.
Decision¶
We adopt a Flat Layout within the feed run directory. All episode-specific files are stored in the same folder, using a consistent prefix to group related items:
<idx:04d> - <title_safe>.<ext>
<idx:04d> - <title_safe>.metadata.json
Rationale¶
- Simplicity: No deep nesting makes it easy to
grepor browse in a file explorer. - Sortability: The numeric prefix (
04d) ensures episodes are listed in their chronological/feed order. - Resilience: Flat structures are easier to diff and backup than complex nested trees.
Alternatives Considered¶
- Per-Episode Subdirectories: Rejected as it adds navigation friction for small numbers of files.
- Content-Type Directories: Rejected as it separates a transcript from its own metadata.
Consequences¶
- Positive: High visibility of results; simple implementation of "skip existing" logic.
- Negative: Can become cluttered if a single episode generates 10+ different artifact types.