ADR-020: Acceptance Test Tier as Final CI Gate

Status: Accepted
Date: 2026-04-03
Authors: Podcast Scraper Team
Related RFCs: RFC-023

Context & Problem Statement

The project's test pyramid (ADR-020) defines three tiers: unit, integration, and E2E. E2E tests verify feature completeness — "does the feature work?" — but nothing validates whether the README's installation commands, CLI examples, and feature claims are actually accurate. Documentation can drift silently from reality, and first-time users encounter broken examples with no automated detection.

A new tier is needed that runs after all other tests and verifies documentation accuracy rather than feature completeness.

Decision

We introduce a fourth test tier: acceptance tests.

Location: tests/acceptance/ — a new top-level directory, distinct from tests/e2e/.
Marker: @pytest.mark.acceptance — separate from e2e, integration, and slow.
Purpose: Verify that every executable example in the README works as documented. Tests are derived directly from README content and run the exact commands a new user would run.
CI role: Acceptance tests are the final CI gate. They run only after unit, integration, and E2E tests all pass. They are allowed to be slow (10–20 minutes).
Execution: Sequential (not parallelized). By default, runs only on merges to the main branch and on workflow_dispatch.
Makefile target: make test-acceptance.
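
One way to keep the tests derived from README content is to read the commands out of the README itself. The sketch below assumes the README marks runnable examples with shell-tagged fenced blocks. The function name `extract_readme_commands`, the accepted fence tags, and the handling of a leading `$ ` prompt are illustrative assumptions, not the project's actual helper.

```python
import re

FENCE = "`" * 3  # literal Markdown code fence

# Assumption: runnable README examples live in fences tagged bash/sh/shell/console.
SHELL_BLOCK_RE = re.compile(
    FENCE + r"(?:bash|sh|shell|console)[ \t]*\n(.*?)" + FENCE,
    re.DOTALL,
)


def extract_readme_commands(readme_text: str) -> list:
    """Return every command line found in shell-tagged fenced blocks."""
    commands = []
    for block in SHELL_BLOCK_RE.findall(readme_text):
        for line in block.splitlines():
            line = line.strip()
            # Skip blank lines and shell comments.
            if not line or line.startswith("#"):
                continue
            # Drop a leading "$ " prompt so only the command itself remains.
            commands.append(line.removeprefix("$ "))
    return commands
```

Each extracted command could then become one parametrized test case, so a README edit changes the test inputs without a separate list to maintain.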

Rationale

ADR-020 covers the classic unit/integration/E2E pyramid; none of those tiers test "does the README example actually work?" — that is a different question.
Documentation accuracy is a first-class quality property. Broken README examples erode user trust more than a failing internal test.
Running acceptance tests last avoids wasting time on slow doc-verification when fast tests already fail.
The acceptance marker lets developers skip these locally while CI enforces them on merge.
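
For the marker to be usable for selection, pytest needs to know about it. A minimal sketch of registering it from a `conftest.py` follows; the file placement and the marker description text are assumptions, and the project may register the marker in its pytest configuration file instead.

```python
# conftest.py (sketch): declare the custom marker so pytest recognizes it.

# Assumption: the description wording is illustrative only.
ACCEPTANCE_MARKER = "acceptance: README-derived tests that run as the final CI gate"


def pytest_configure(config):
    """pytest hook: register the `acceptance` marker."""
    config.addinivalue_line("markers", ACCEPTANCE_MARKER)
```

With the marker registered, `pytest -m "not acceptance"` deselects these tests for a local run, and `pytest -m acceptance` selects only them. How `make test-acceptance` actually invokes pytest is not stated here, so treat the second command as one plausible form.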

Alternatives Considered

Fold into E2E tests: Rejected; E2E tests verify features, not documentation. The purposes are different, and they need different CI triggers and tolerances.
Manual README verification before release: Rejected; error-prone and doesn't scale. Manual checks are forgotten under time pressure.
Documentation linting only (markdownlint, link checkers): Rejected; catches formatting issues but not "does the code example actually run?"

Consequences

Positive: README examples are continuously verified. Documentation drift is caught automatically. Users can trust that README commands work.
Negative: Adds a slow CI job (10–20 min). README changes require corresponding acceptance test updates. Total CI time increases.
Neutral: A new tests/acceptance/ directory and acceptance pytest marker are added to the project structure.

Implementation Notes

Module: tests/acceptance/
Pattern: Tests use subprocess to run exact README commands against E2E server fixtures (no external network).
CI: test-acceptance job depends on all other test jobs passing.
Relationship to ADR-020: Extends the three-tier pyramid with a fourth documentation-accuracy tier; does not replace any existing tier.
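
The subprocess pattern above might look roughly like the helper below. It is a sketch under stated assumptions: the name `run_readme_command`, the `env_overrides` parameter (one possible way to point a command at a local fixture server), and the 600-second default timeout are all hypothetical. It also runs the command without a shell, so README examples that rely on pipes or redirects would need a different approach.

```python
import os
import shlex
import subprocess
from typing import Optional


def run_readme_command(
    command: str,
    env_overrides: Optional[dict] = None,
    timeout: float = 600.0,  # assumption: per-command limit, not from the ADR
) -> subprocess.CompletedProcess:
    """Run one documented command as written and capture its output."""
    # Start from the current environment and layer any overrides on top.
    env = {**os.environ, **(env_overrides or {})}
    return subprocess.run(
        shlex.split(command),  # split into argv; no shell is involved
        capture_output=True,
        text=True,
        env=env,
        timeout=timeout,
        check=False,  # callers assert on returncode themselves
    )
```

A test in tests/acceptance/ would wrap this in a function decorated with `@pytest.mark.acceptance` and assert that `returncode` is 0, including `stderr` in the failure message so a broken README example is easy to diagnose.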