ADR-056 — Deterministic e2e readiness signals: data-* attributes + window.__pickAt test hooks
Status · Accepted
Date · 2026-05-09 (renumbered from 053 → 056 on 2026-05-10 to clear the v0.6 Fleet ADR slots planned by RFC-016 / PRD-012)
Closes · v0.5.0 e2e flake-fix wave (46 stacked failures → 0)
TA anchor · §components/testing
Related ADRs · ADR-015 (Vitest + Playwright), ADR-018 (mobile-first design)
Context
The first run of the full e2e suite after the v0.5.0 UX wave landed exposed 46 stacked failures. Categorisation showed three distinct patterns:
- Hardcoded waitForTimeout(<magic>) in tests: "wait 800ms for satellites to load" was tuned for local dev and raced slow GH-Actions CI. 10+ instances across earth/mars/moon/fly tests.
- Reading state before the rAF loop populated it: tests fired clicks immediately after goto(), before the page's render-state hook had a chance to write any attribute.
- Spiral-click pickability tests on /iss + /tiangong: the test fired click after click hoping to hit a pickable mesh, with software-rasterizer WebGL on CI not rendering reliably enough to make the spiral converge before the 25-min job timeout.
Three fixes were needed, each a pattern that should be reused for future routes.
Decision
Pattern 1 — data-*-count and data-<state> readiness attributes
The route's primary canvas (or render-state hook) exposes one or more data-* attributes that flip from a sentinel value (e.g. "0", missing) to a real value once the page has reached a known interactive state:
| Route | Attribute | Flips when |
|---|---|---|
| /earth (canvas2d) | data-objects-count={objects.length} | earth-objects.json fetched + Svelte re-rendered |
| /mars (canvas2d) | data-sites-count={sites.length} | mars-sites JSON fetched |
| /moon (canvas2d) | data-sites-count={sites.length} | moon-sites JSON fetched |
| /fly ([data-testid="fly-render-state"]) | data-view, data-sim-day, data-sc-phase | rAF loop has committed the new sim time + view |
Tests wait via Playwright's first-class assertion API:
// Replace `await page.waitForTimeout(800)`:
await expect(canvas).not.toHaveAttribute('data-objects-count', '0', { timeout: 10_000 });

The page is now telling the test exactly when it's ready: the test polls a real condition, not a guessed time.
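One caveat with the negated assertion: as far as Playwright's matcher semantics go, `not.toHaveAttribute(name, '0')` should also pass when the attribute is missing entirely, which is one of the sentinel states named above. A stricter variant is sketched below. `READY_COUNT` and `isReadyCount` are hypothetical names introduced for illustration, not part of the codebase.

```typescript
// Hypothetical helper (not from the ADR): treats only a positive integer
// string as "ready". "0", "", non-numeric text and a missing attribute
// (null) all count as "not ready yet".
const READY_COUNT = /^[1-9]\d*$/;

function isReadyCount(value: string | null): boolean {
  return value !== null && READY_COUNT.test(value);
}

// Possible use inside a Playwright test, where `canvas` is a Locator.
// toHaveAttribute accepts a RegExp as the expected value:
//
//   await expect(canvas).toHaveAttribute('data-objects-count', READY_COUNT, {
//     timeout: 10_000,
//   });
```

With the positive form, a page that stops publishing the attribute makes the test time out rather than pass, which surfaces attribute drift instead of hiding it.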
Pattern 2 — window.__pickAt(moduleId?) test hook for canvas pickability
3D scenes with raycast-driven module pickability (/iss, /tiangong) expose a small unconditional test hook, named per route (window.__issPickAt, window.__tiangongPickAt), that projects a known pickable mesh's world position to client-space pixels:
interface OrreryTestApi {
  __issPickAt(moduleId?: string): { x: number; y: number; moduleId: string } | null;
}
(window as unknown as OrreryTestApi).__issPickAt = (moduleId?: string) => {
  camera.updateMatrixWorld(true);
  // iterate pickable meshes, return first one whose centre projects
  // inside (-1, 1) NDC and z < 1; force matrixWorld update so the
  // hook works before the first render.
  …
};

Tests use the hook + canvas.click({ position }) (canvas-relative, bypasses overlays):
const pos = await page.evaluate(() => window.__issPickAt());
const box = await canvas.boundingBox();
// Both can be null: no mesh projects on screen, or the canvas is not visible.
if (!pos || !box) throw new Error('no pickable mesh on screen, or canvas not visible');
await canvas.click({ position: { x: pos.x - box.x, y: pos.y - box.y } });

This replaces the previous spiral-click pattern (40 radii × 32 angles = 1280 clicks per test in the worst case) with a deterministic single click that exercises the real pointer pipeline.
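The hook body is elided above, so the following is only a sketch of the step its comment describes: rejecting a projected centre outside the (-1, 1) NDC range or with z ≥ 1, then mapping it to client-space pixels. `ndcToClient`, `Ndc` and `Rect` are hypothetical names, and the rectangle is assumed to come from something like `canvas.getBoundingClientRect()`.

```typescript
// Hypothetical types for illustration; not taken from the route source.
interface Ndc { x: number; y: number; z: number }
interface Rect { left: number; top: number; width: number; height: number }

// Maps a normalized-device-coordinate point to client-space pixels.
// Returns null when the point is off screen or at/behind the far plane.
function ndcToClient(ndc: Ndc, rect: Rect): { x: number; y: number } | null {
  const onScreen =
    ndc.x > -1 && ndc.x < 1 &&
    ndc.y > -1 && ndc.y < 1 &&
    ndc.z < 1;
  if (!onScreen) return null;
  return {
    // NDC x runs left to right from -1 to 1.
    x: rect.left + ((ndc.x + 1) / 2) * rect.width,
    // NDC y points up, client y points down, so the axis is flipped.
    y: rect.top + ((1 - ndc.y) / 2) * rect.height,
  };
}

// If the scene uses three.js (an assumption; the ADR does not name the
// library), the NDC point could come from:
//   const ndc = mesh.getWorldPosition(new Vector3()).project(camera);
```

Because the result is in client space, the test subtracts the canvas bounding-box origin before passing it to `canvas.click({ position })`, as in the snippet above.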
Pattern 3 — Skip CI-only auto-fallbacks under navigator.webdriver
Routes with FPS-based fallbacks (/iss and /tiangong switch to list mode after 2s if measured FPS < 20) trip on GH-Actions software-rasterizer WebGL. Skip the gate when running under WebDriver:
const underTest = typeof navigator !== 'undefined' && navigator.webdriver === true;
if (fps < 20 && viewBag.mode === '3d' && !underTest) {
  /* fall back to list mode */
}

Real users on real hardware still get the auto-fallback. Test runs always see the canvas.
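If the guard ever needs a unit test of its own, the condition can be pulled into a pure function that takes the navigator-like object as a parameter. This is a sketch under that assumption: `shouldFallBackToList` and `ViewMode` are hypothetical names, and the 20 FPS threshold is the one quoted above.

```typescript
// Hypothetical extraction of the guard; names are illustrative only.
type ViewMode = '3d' | 'list';

function shouldFallBackToList(
  fps: number,
  mode: ViewMode,
  nav: { webdriver?: boolean } | undefined,
): boolean {
  // navigator.webdriver is true when the browser is driven by automation.
  const underTest = nav !== undefined && nav.webdriver === true;
  return fps < 20 && mode === '3d' && !underTest;
}

// In the route page this might be called as:
//   shouldFallBackToList(fps, viewBag.mode,
//     typeof navigator !== 'undefined' ? navigator : undefined);
```

Passing the navigator in keeps the production behaviour identical while letting a Vitest case cover both the real-user and the under-test branch without a browser.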
Rationale
- Pattern 1: readiness attributes beat magic timeouts because the test asserts a real condition (data loaded) instead of a wall-clock guess. Slow CI makes the test wait longer; fast local runs finish almost immediately. Same code, no flakes.
- Pattern 2: a test hook with a deterministic position beats spiral-click because (a) it uses the exact pickability code path the user does (raycast on the pointer event), and (b) it lands the click at a position guaranteed to hit a mesh. Spiral was a probabilistic "does at least one of 1280 clicks hit something?" check whose failure is silent if rendering is off. The deterministic click fails fast and obviously.
- Pattern 3: the navigator.webdriver skip is the smallest possible app change. Real users never trigger the test path. It's a single-line guard.
These patterns together took the full suite from a 25-minute timeout (cancelled mid-run) to a 16-minute clean pass.
Alternatives considered
- waitForLoadState('networkidle') alone: fires before the page parses + draws data; not a fix.
- Add fixed-but-bigger waitForTimeout: works on most CI runs but still fails under load; doesn't address the root cause.
- Disable Playwright retries: nope, retries are great for true flakes; the goal here is to not have flakes in the first place.
- Mock requestAnimationFrame in tests: fragile; tests would diverge from production behaviour.
- Use Playwright's expect(...).toHaveScreenshot() for visual regressions: separate concern; still useful, but doesn't replace pickability tests.
Consequences
Positive:
- E2e flakes from time-based bets are gone. Tests pass deterministically on fast local hardware AND slow CI runners.
- Route-page state is now test-observable via data-* attributes; the same attributes are useful for future debug overlays.
- Test hooks (window.__pickAt) document the canvas-pick interaction surface in a single place.
- New routes inherit a clear pattern: expose a readiness signal, expose a pick hook for canvas tests.
Negative:
- ~10 extra lines of "test-mode" code per 3D route page (the __pickAt hook + the navigator.webdriver guard). Harmless in production but visible in source.
- Adding a new data-* attribute requires both the page to publish it AND the test to wait on it; there is drift risk if a future change removes the attribute.
Implementation notes
- src/routes/earth/+page.svelte: data-objects-count={objects.length} on <canvas class="layer">.
- src/routes/mars/+page.svelte: data-sites-count={sites.length} on the 2D canvas.
- src/routes/moon/+page.svelte: data-sites-count={sites.length} on the 2D canvas.
- src/routes/fly/+page.svelte: [data-testid="fly-render-state"] element with data-view, data-sim-day, data-sc-phase attributes.
- src/routes/iss/+page.svelte, src/routes/tiangong/+page.svelte: window.__issPickAt / window.__tiangongPickAt hooks + navigator.webdriver perf-fallback guard.
- Tests: tests/e2e/earth.spec.ts, mars.spec.ts, moon.spec.ts, fly.spec.ts, fly-render-validation.spec.ts, iss.spec.ts, tiangong.spec.ts, plan-porkchop-refresh.spec.ts, explore.spec.ts.
The pre-push hook (.husky/pre-push) runs npm run preflight which mirrors CI step-for-step. Trust the exit code; the readiness attributes mean preflight failures reflect real bugs, not timing.
When a new e2e test is written, the rule (per CLAUDE.md "What not to do") is: never use waitForTimeout(<magic>). Always wait on a readiness signal: an existing data-* attribute, a waitForFunction polling a real condition, or an expect(...).toHave... assertion.