ADR-020: Standardized Test Pyramid
Context & Problem Statement
As the project grew, individual test files came to mix unit, integration, and E2E logic. This made it impossible to run only the fast tests, and hard to isolate the cause of a failure.
Decision
We enforce a Standardized Test Pyramid:
- Unit Tests (tests/unit/): Pure logic, zero IO, sub-second execution.
- Integration Tests (tests/integration/): Module interactions, filesystem tests, mock-API calls.
- End-to-End Tests (tests/e2e/): Full CLI runs, real ML model loading, local server mocks.
Tests are further categorized using Pytest Markers (@pytest.mark.slow, @pytest.mark.ml_models).
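As a rough illustration of how the directory tiers and markers fit together, the sketch below shows one unmarked unit-style test and one test carrying both markers named above. The function slugify and both test names are hypothetical and do not come from the project; the marked test uses a placeholder body so the sketch runs without any model files.

```python
import pytest


def slugify(title: str) -> str:
    """Hypothetical pure function: no IO, so its tests belong under tests/unit/."""
    return "-".join(title.lower().split())


def test_slugify_collapses_whitespace():
    # Unit tier: unmarked, intended for the fast inner loop.
    assert slugify("  Test   Pyramid ") == "test-pyramid"


@pytest.mark.slow
@pytest.mark.ml_models
def test_model_pipeline_placeholder():
    # Stand-in for an e2e test that would load a real ML model.
    # The body is a placeholder (assumption) so the sketch stays runnable.
    assert slugify("Real Model") == "real-model"
```

Under these assumptions, pytest -m "not slow" would deselect the second test and pytest -m ml_models would select only it. Custom markers usually also need to be registered in the pytest configuration, otherwise pytest reports them as unknown marks.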
Rationale
- Feedback Speed: Developers can run make test-unit in seconds during the inner loop.
- Reliability: Isolated tests make it clear whether a bug is in a specific function or an integration point.
- CI Control: Enables the "Stratified CI" (ADR-033) by providing clear targets for fast/full checks.
Alternatives Considered
- Feature-based Organization: Rejected because it makes it harder to separate fast tests from slow ones.
Consequences
- Positive: Highly predictable test suite; clear contribution guidelines for new tests.
- Negative: Requires strict discipline to keep IO out of unit tests.
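The discipline noted above can be partly backed by tooling. The following is a minimal sketch, not the project's actual mechanism: a context manager that makes any attempt to open a network socket fail loudly. The names forbid_network and NetworkAccessError are hypothetical, and the guard covers only sockets, not filesystem access.

```python
import socket
from contextlib import contextmanager
from unittest import mock


class NetworkAccessError(RuntimeError):
    """Raised when code under a unit test tries to open a socket."""


def _blocked_socket(*args, **kwargs):
    # Replacement for socket.socket while the guard is active.
    raise NetworkAccessError("unit tests must not open network sockets")


@contextmanager
def forbid_network():
    """Swap out socket.socket for the duration of the block, then restore it."""
    with mock.patch.object(socket, "socket", _blocked_socket):
        yield


if __name__ == "__main__":
    with forbid_network():
        try:
            socket.socket()
        except NetworkAccessError as exc:
            print(f"blocked: {exc}")
```

One possible wiring, again an assumption, is to wrap this in an autouse fixture in a conftest.py under tests/unit/, so the guard applies to unit tests only and leaves the integration and e2e tiers free to use local server mocks.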