
ADR-056 — Deterministic e2e readiness signals: data-* attributes + window.__pickAt test hooks

Status · Accepted
Date · 2026-05-09 (renumbered from 053 → 056 on 2026-05-10 to clear the v0.6 Fleet ADR slots planned by RFC-016 / PRD-012)
Closes · v0.5.0 e2e flake-fix wave (46 stacked failures → 0)
TA anchor · §components/testing
Related ADRs · ADR-015 (Vitest + Playwright), ADR-018 (mobile-first design)

Context

The first run of the full e2e suite after the v0.5.0 UX wave landed exposed 46 stacked failures. Categorisation showed three distinct patterns:

  1. Hardcoded waitForTimeout(<magic>) in tests: "wait 800ms for satellites to load" was tuned for local dev and lost the race on slow GitHub Actions runners. 10+ instances across earth/mars/moon/fly tests.
  2. Reading state before the rAF loop populated it — tests fired clicks immediately after goto(), before the page's render-state hook had a chance to write any attribute.
  3. Spiral-click pickability tests on /iss + /tiangong — the test fired click after click hoping to hit a pickable mesh, with software-rasterizer WebGL on CI not rendering reliably enough to make the spiral converge before the 25-min job timeout.

Three fixes were needed, each a pattern that should be reused for future routes.

Decision

Pattern 1 — data-*-count and data-<state> readiness attributes

The route's primary canvas (or render-state hook) exposes one or more data-* attributes that flip from a sentinel value (e.g. "0", missing) to a real value once the page has reached a known interactive state:

Route | Attribute | Flips when
/earth (canvas2d) | data-objects-count={objects.length} | earth-objects.json fetched + Svelte re-rendered
/mars (canvas2d) | data-sites-count={sites.length} | mars-sites JSON fetched
/moon (canvas2d) | data-sites-count={sites.length} | moon-sites JSON fetched
/fly ([data-testid="fly-render-state"]) | data-view, data-sim-day, data-sc-phase | rAF loop has committed the new sim time + view
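
On the page side, publishing the /fly attributes can be as small as one call per committed frame. The sketch below is hypothetical: `RenderState`, `AttrTarget` and `publishRenderState` are illustrative names, not taken from the Orrery source, and only the three attribute names come from this ADR.

```typescript
// Hypothetical shape of the state the rAF loop commits each frame.
interface RenderState {
  view: string;    // view identifier
  simDay: number;  // simulation time, in days
  scPhase: string; // spacecraft phase label
}

// Minimal structural type so the sketch runs without a DOM;
// a real HTMLElement satisfies it.
interface AttrTarget {
  setAttribute(name: string, value: string): void;
}

// Write the attributes only after the frame's state is final, so a test
// that observes a value knows the matching frame has been committed.
function publishRenderState(el: AttrTarget, state: RenderState): void {
  el.setAttribute('data-view', state.view);
  el.setAttribute('data-sim-day', String(state.simDay));
  el.setAttribute('data-sc-phase', state.scPhase);
}
```

Calling this as the last step of the frame callback, rather than when the input changes, is what lets a test treat the attribute as proof that the frame has been drawn.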

Tests wait via Playwright's first-class assertion API:

ts
// Replace `await page.waitForTimeout(800)`:
await expect(canvas).not.toHaveAttribute('data-objects-count', '0', { timeout: 10_000 });

The page is now telling the test exactly when it's ready — the test polls a real condition, not a guessed time.
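
To make the "polls a real condition" idea concrete, here is a minimal sketch of a retry-until-ready loop. It is not Playwright's implementation; `pollUntil` and its option names are hypothetical, and it only illustrates why a slow runner waits longer while a fast one returns at once.

```typescript
// Poll `read` until `isReady` accepts its value or the timeout elapses.
// Hypothetical helper for illustration only.
async function pollUntil<T>(
  read: () => T | Promise<T>,
  isReady: (value: T) => boolean,
  { timeoutMs = 10_000, intervalMs = 100 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await read();
    if (isReady(value)) return value; // ready: return without further waiting
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Not ready yet: sleep briefly, then re-read the real condition.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With a Playwright locator, `read` could be `() => canvas.getAttribute('data-objects-count')` and `isReady` could be `(v) => v !== null && v !== '0'`. In real tests the built-in `expect` assertion is preferable, since it already retries and reports failures well.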

Pattern 2 — window.__pickAt(moduleId?) test hook for canvas pickability

3D scenes with raycast-driven module pickability (/iss, /tiangong) expose a small unconditional test hook that projects a known pickable mesh's world position to client-space pixels:

ts
interface OrreryTestApi {
  __issPickAt(moduleId?: string): { x: number; y: number; moduleId: string } | null;
}
(window as unknown as OrreryTestApi).__issPickAt = (moduleId?: string) => {
  // Force a matrixWorld update so the hook works before the first render.
  camera.updateMatrixWorld(true);
  // Iterate pickable meshes and return the first one whose centre projects
  // inside (-1, 1) NDC with z < 1.
};
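
The projection step described in those comments can be sketched without a 3D library. This assumes each mesh centre has already been projected to normalised device coordinates (in three.js, `Vector3.project(camera)` produces such values) and that `rect` comes from the canvas's `getBoundingClientRect()`. The names `Ndc`, `Rect`, `isOnScreen`, `ndcToClient` and `pickFirstVisible` are hypothetical, and whether the real hook is structured this way is an assumption.

```typescript
interface Ndc { x: number; y: number; z: number }
interface Rect { left: number; top: number; width: number; height: number }

// On screen: x and y strictly inside (-1, 1), and z < 1
// (z >= 1 means at or beyond the far plane).
function isOnScreen(p: Ndc): boolean {
  return Math.abs(p.x) < 1 && Math.abs(p.y) < 1 && p.z < 1;
}

// Map NDC to client-space pixels. NDC y points up, client y points down.
function ndcToClient(p: Ndc, rect: Rect): { x: number; y: number } {
  return {
    x: rect.left + ((p.x + 1) / 2) * rect.width,
    y: rect.top + ((1 - p.y) / 2) * rect.height,
  };
}

// Return the first on-screen candidate (optionally restricted to one
// moduleId), or null when nothing qualifies.
function pickFirstVisible(
  candidates: Array<{ moduleId: string; ndc: Ndc }>,
  rect: Rect,
  moduleId?: string,
): { x: number; y: number; moduleId: string } | null {
  for (const c of candidates) {
    if (moduleId !== undefined && c.moduleId !== moduleId) continue;
    if (isOnScreen(c.ndc)) {
      return { ...ndcToClient(c.ndc, rect), moduleId: c.moduleId };
    }
  }
  return null;
}
```

Returning `null` instead of throwing lets the test decide how to report "no pickable mesh in view".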

Tests use the hook + canvas.click({ position }) (canvas-relative, bypasses overlays):

ts
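const pos = await page.evaluate(() => window.__issPickAt());
const box = await canvas.boundingBox();
// Both can be null: no pickable mesh on screen, or the canvas is not visible.
if (!pos || !box) throw new Error('no pick position or canvas bounding box');
await canvas.click({ position: { x: pos.x - box.x, y: pos.y - box.y } });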

Replaces the previous spiral-click pattern (40 radii × 32 angles = 1280 clicks per test in the worst case) with a deterministic single click that exercises the real pointer pipeline.

Pattern 3 — Skip CI-only auto-fallbacks under navigator.webdriver

Routes with FPS-based fallbacks (/iss and /tiangong switch to list mode after 2s if measured FPS < 20) trip on GH-Actions software-rasterizer WebGL. Skip the gate when running under WebDriver:

ts
const underTest = typeof navigator !== 'undefined' && navigator.webdriver === true;
if (fps < 20 && viewBag.mode === '3d' && !underTest) {
  /* fall back to list mode */
}

Real users on real hardware still get the auto-fallback. Test runs always see the canvas.
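
One way to keep this guard easy to unit-test is to pull the condition into a pure function. The sketch below is hypothetical (`FallbackInput`, `MIN_FPS` and `shouldFallBackToList` are illustrative names); only the 20 FPS threshold and the three inputs come from this ADR.

```typescript
interface FallbackInput {
  fps: number;         // measured frames per second
  mode: '3d' | 'list'; // current view mode
  webdriver: boolean;  // navigator.webdriver === true
}

const MIN_FPS = 20; // threshold stated in this ADR

// Fall back only for real users: a low frame rate in 3D mode,
// and not running under WebDriver automation.
function shouldFallBackToList({ fps, mode, webdriver }: FallbackInput): boolean {
  return fps < MIN_FPS && mode === '3d' && !webdriver;
}
```

In the page, the call site would pass `navigator.webdriver === true` for `webdriver`, guarded by the same `typeof navigator !== 'undefined'` check for server-side rendering.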

Rationale

  • Pattern 1: Readiness attributes beat magic timeouts because the test asserts a real condition (data loaded) instead of a wall-clock guess. Slow CI makes the test wait longer; fast local runs finish as soon as the attribute flips. Same code, no flakes.
  • Pattern 2: A test hook with a deterministic position beats spiral-click because (a) it uses the exact pickability code path the user does (raycast on the pointer event) and (b) it lands the click at a position computed from a pickable mesh's projected centre. Spiral-click was a probabilistic "does at least one of 1280 clicks hit something?" check; when rendering was off it gave no clear failure, only a slow run toward the job timeout. The deterministic click fails fast and obviously.
  • Pattern 3: The navigator.webdriver skip is the smallest possible app change: a single-line guard. Real users never trigger the test path.

Together, these patterns took the full suite from a 25-minute timeout (cancelled mid-run) to a 16-minute clean pass.

Alternatives considered

  • waitForLoadState('networkidle') alone — fires before the page parses + draws data; not a fix.
  • Add fixed-but-bigger waitForTimeout — works on most CI runs but still fails under load; doesn't address the root cause.
  • Disable Playwright retries — nope, retries are great for true flakes; the goal here is to not have flakes in the first place.
  • Mock requestAnimationFrame in tests — fragile; tests would diverge from production behaviour.
  • Use Playwright's expect(...).toHaveScreenshot() for visual regressions — separate concern; still useful, but doesn't replace pickability tests.

Consequences

Positive:

  • E2e flakes from time-based bets are gone. Tests pass deterministically on fast local hardware AND slow CI runners.
  • Route-page state is now test-observable via data-* attributes — same attributes are useful for future debug overlays.
  • Test hooks (window.__pickAt) document the canvas-pick interaction surface in a single place.
  • New routes inherit a clear pattern: expose a readiness signal, expose a pick hook for canvas tests.

Negative:

  • ~10 extra lines of "test-mode" code per 3D route page (the __pickAt hook + the navigator.webdriver guard). Harmless in production but visible in source.
  • Adding a new data-* attribute requires both the page to publish it AND the test to wait on it — drift risk if a future change removes the attribute.

Implementation notes

  • src/routes/earth/+page.svelte: data-objects-count={objects.length} on <canvas class="layer">.
  • src/routes/mars/+page.svelte: data-sites-count={sites.length} on the 2D canvas.
  • src/routes/moon/+page.svelte: data-sites-count={sites.length} on the 2D canvas.
  • src/routes/fly/+page.svelte: [data-testid="fly-render-state"] element with data-view, data-sim-day, data-sc-phase attributes.
  • src/routes/iss/+page.svelte, src/routes/tiangong/+page.svelte: window.__issPickAt / window.__tiangongPickAt hooks + navigator.webdriver perf-fallback guard.
  • Tests: tests/e2e/earth.spec.ts, mars.spec.ts, moon.spec.ts, fly.spec.ts, fly-render-validation.spec.ts, iss.spec.ts, tiangong.spec.ts, plan-porkchop-refresh.spec.ts, explore.spec.ts.

The pre-push hook (.husky/pre-push) runs npm run preflight which mirrors CI step-for-step. Trust the exit code; the readiness attributes mean preflight failures reflect real bugs, not timing.

When a new e2e test is written, the rule (per CLAUDE.md "What not to do") is: never use waitForTimeout(<magic>). Always wait on a readiness signal: an existing data-* attribute, a waitForFunction polling a real condition, or an expect(...).toHave... assertion.

Orrery — architecture documentation · MIT · No tracking