RFC-021 · Immersive Mode — WebXR (Android) + ARKit Swift plugin (iPhone wrapped) + Exhibit Mode
Status: Draft v0.4 · 2026-05-16 · Closes: PRD-019
Why this is an RFC. Immersive Mode binds five interlocking architectural commitments that ripple through every layer of the product: (1) a Three.js whole-codebase upgrade (r128 → current) touching all 7 existing 3D scenes, (2) a dual AR code path (WebXR on Android, ARKit via Capacitor Swift plugin on iPhone wrapped) sharing the same Three.js scene code via a thin AR-abstraction interface, (3) the spatial-audio listener swap that hooks PRD-017's existing sonification graphs into AR's XR camera, (4) the narrator auto-play + ducking contract that depends on PRD-016 audio episodes shipping first, (5) an Exhibit Mode that lazy-loads a chrome-less cinematic player as a separate Vite chunk. These are the dependency tip of the entire product stack; one wrong cut early forces ugly retrofits across already-shipped features.
1 · Architecture overview
    ┌──────────────────────────────────────────────────────────────────┐
    │  Flat-screen 3D scenes (existing, 7 routes)                      │
    │  Three.js current (r170+ after v1 prereq upgrade)                │
    │  Camera A (perspective) ← AudioListener attaches here normally   │
    └──────────────────────────────────────────────────────────────────┘
                                     │
                                     │  user taps "Enter AR" on a globe route
                                     ▼
    ┌──────────────────────────────────────────────────────────────────┐
    │  AR session start                                                │
    │  ┌─────────────────────────┬───────────────────────────────┐     │
    │  │ Android (web + wrapped) │ iPhone wrapped (Capacitor)    │     │
    │  │ WebXRManager.startAR()  │ @orrery/ar-bridge plugin      │     │
    │  │ Native browser API      │ Swift wraps ARKit ARSession   │     │
    │  └─────────────┬───────────┴──────────────┬────────────────┘     │
    │                │                          │                      │
    │                ▼                          ▼                      │
    │        ┌────────────────────────────────────────────┐            │
    │        │ AR abstraction layer (lib/ar.ts)           │            │
    │        │ Common interface: hit-test, anchor,        │            │
    │        │ camera-pose, session lifecycle.            │            │
    │        │ Provider-agnostic API for Three.js.        │            │
    │        └─────────────────┬──────────────────────────┘            │
    │                          │                                       │
    │                          ▼                                       │
    │  AR scene builder (lazy chunk, <50 KB)                           │
    │  Camera B (XR) ← AudioListener swaps here                        │
    │  Simplified geometry (100 stars, no particles, no trails)        │
    │  Spatial audio per object (PannerNode follows world position)    │
    │  Narrator auto-plays 2 s after placement (Guide episode)         │
    └──────────────────────────────────────────────────────────────────┘
iPhone Safari: "Enter AR" greyed-out with App Store link in tooltip.
Desktop: "Enter AR" hidden; Exhibit Mode available via ?mode=exhibit.
The AR abstraction layer (src/lib/ar.ts, with its two backends under src/lib/ar/) is the only place that imports either WebXR (Android) or the Capacitor Swift plugin (iPhone wrapped). Three.js scene code is unaware of which AR backend is active.
2 · The six anchor decisions (locked from v0.4 walkthrough)
| ID | Choice | Reason |
|---|---|---|
| X-C | XR strategy = hybrid (lightweight AR scene variants sharing flat-screen data + physics) | Same scene data, simplified rendering for AR perf (72 fps target). Not full XR fork; not pure JS replay. |
| A-A | Audio graph reuse = same Web Audio graph; XR camera as AudioListener | PRD-017 sonification carries over verbatim; only the listener attachment point changes. Zero new audio code. |
| N-B | Narrator position = omniscient (30 cm above + behind listener); ducking via RMS amplitude follower | Voice stays in the centre of the head; sonification ducks during voiced segments. Per PRD-017 audio-bus contract. |
| S-E | AR scope = 4 globe scenes (/explore, /earth, /moon, /mars) | Tabletop placement works naturally for globes. /fly AU-scale is hard; /iss + /tiangong are station-models (different UX). Both deferred. |
| NE-B | Narrator auto-play = 2 s after placement | User has placed the scene; they want it explained. 2 s lets them look first. |
| M-B | Exhibit Mode = chrome-less + QR | Museum / classroom use case. No nav, no HUD; auto-narrated cinematic orbits; QR for AR continuation. |
M-C synced exhibit (multi-projector wall): deferred to v2 (requires WebSocket infra; out of v1 scope).
3 · AR backend abstraction
3.1 · The interface
    // src/lib/ar.ts
    export interface ArBackend {
      readonly name: 'webxr' | 'arkit-capacitor';
      readonly platform: 'android-web' | 'android-wrapped' | 'iphone-wrapped';

      // Lifecycle
      isSupported(): Promise<boolean>;
      startSession(): Promise<void>;
      endSession(): Promise<void>;

      // Per-frame (called from RAF)
      getCameraPose(): {
        position: [number, number, number];
        rotation: [number, number, number, number]; // quaternion
      };

      // Hit-testing (tap on screen → real-world point)
      hitTest(screenX: number, screenY: number): Promise<{
        worldPosition: [number, number, number];
        worldNormal: [number, number, number];
      } | null>;

      // Anchors (lock a scene origin to a real-world point)
      addAnchor(worldPosition: [number, number, number]): Promise<string>; // anchor id
      removeAnchor(anchorId: string): Promise<void>;

      // Events
      on(event: 'session-started' | 'session-ended' | 'frame', handler: (...args: any[]) => void): () => void;
    }

Two implementations live behind this interface:
- src/lib/ar/webxr.ts: uses Three.js's WebXRManager directly (Android web + Android wrapped both run this path).
- src/lib/ar/arkit-capacitor.ts: imports @orrery/ar-bridge (Capacitor Swift plugin) and exposes the same interface. Only loaded on iPhone wrapped.
Backend selection happens once at module load via Capacitor.getPlatform() + WebXR feature detection. Three.js scene code never knows which backend is active.
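The selection rule can be sketched as a pure function. This is a minimal sketch, not the project's implementation: pickBackend and the isAndroidBrowser flag are hypothetical names, the first argument is assumed to be the value of Capacitor.getPlatform(), and the second the resolved result of navigator.xr.isSessionSupported('immersive-ar').

```typescript
type BackendName = 'webxr' | 'arkit-capacitor';
type Platform = 'android-web' | 'android-wrapped' | 'iphone-wrapped';

interface BackendChoice {
  name: BackendName;
  platform: Platform;
}

// capacitorPlatform: value of Capacitor.getPlatform().
// webxrArSupported: resolved result of the WebXR 'immersive-ar' support check.
// isAndroidBrowser: hypothetical user-agent check, only consulted on 'web'.
function pickBackend(
  capacitorPlatform: 'ios' | 'android' | 'web',
  webxrArSupported: boolean,
  isAndroidBrowser: boolean,
): BackendChoice | null {
  if (capacitorPlatform === 'ios') {
    // Wrapped iPhone app: native ARKit through the Swift plugin.
    return { name: 'arkit-capacitor', platform: 'iphone-wrapped' };
  }
  if (!webxrArSupported) return null; // no AR backend: hide or grey out "Enter AR"
  if (capacitorPlatform === 'android') {
    return { name: 'webxr', platform: 'android-wrapped' };
  }
  if (isAndroidBrowser) {
    return { name: 'webxr', platform: 'android-web' };
  }
  return null; // desktop browsers and iPhone Safari
}
```

Returning null rather than throwing keeps the "no AR here" case an ordinary value that the UI can branch on.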
3.2 · Why this shape
Same pattern as PRD-016 TtsProvider + PRD-018 VisionProvider. The product has settled into "abstract over provider, implement once per platform, swap with config" as the architectural template. Familiar to maintain.
4 · ARKit Capacitor plugin (@orrery/ar-bridge)
4.1 · Why this exists
Apple has not shipped WebXR in Safari/WebKit on iPhone. iPhone browsers in practice all run on WebKit (long an App Store requirement; the alternative-engine allowance that followed the EU DMA has not changed what ships), so none of them exposes an AR API. The wrapped Capacitor app can use native ARKit; the Swift plugin is the bridge.
4.2 · Scope (Swift code)
~600 lines Swift in ios/App/App/Plugins/ar-bridge/:
- ArBridgePlugin.swift: Capacitor plugin entry; bridges JS calls to the ARKit Swift API.
- ArSessionManager.swift: wraps ARSession, ARWorldTrackingConfiguration, frame callback loop.
- ArHitTester.swift: wraps ARHitTestResult + ARRaycastQuery.
- ArAnchorTracker.swift: wraps ARAnchor add/remove + ID lifecycle.
- ArCameraPoseEmitter.swift: converts ARKit's simd_float4x4 transform to Three.js-friendly position + quaternion via a small SIMD helper.
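The pose conversion in ArCameraPoseEmitter.swift is Swift in the plugin; the same arithmetic is sketched here in TypeScript purely to illustrate the math. Assumptions: the 4x4 transform arrives as 16 numbers in column-major order, the quaternion is ordered x, y, z, w, and no device-orientation correction is applied. The name poseFromColumnMajor is hypothetical.

```typescript
type Pose = {
  position: [number, number, number];
  rotation: [number, number, number, number]; // quaternion as x, y, z, w (assumed order)
};

// m holds 16 numbers in column-major order:
// m[0..3] is the first column, m[12..14] is the translation.
function poseFromColumnMajor(m: readonly number[]): Pose {
  if (m.length !== 16) throw new Error('expected 16 matrix elements');
  // rRC = row R, column C of the upper-left 3x3 rotation block.
  const r00 = m[0], r10 = m[1], r20 = m[2];
  const r01 = m[4], r11 = m[5], r21 = m[6];
  const r02 = m[8], r12 = m[9], r22 = m[10];
  const trace = r00 + r11 + r22;
  let x: number, y: number, z: number, w: number;
  if (trace > 0) {
    const s = 0.5 / Math.sqrt(trace + 1);
    w = 0.25 / s; x = (r21 - r12) * s; y = (r02 - r20) * s; z = (r10 - r01) * s;
  } else if (r00 > r11 && r00 > r22) {
    const s = 2 * Math.sqrt(1 + r00 - r11 - r22);
    w = (r21 - r12) / s; x = 0.25 * s; y = (r01 + r10) / s; z = (r02 + r20) / s;
  } else if (r11 > r22) {
    const s = 2 * Math.sqrt(1 + r11 - r00 - r22);
    w = (r02 - r20) / s; x = (r01 + r10) / s; y = 0.25 * s; z = (r12 + r21) / s;
  } else {
    const s = 2 * Math.sqrt(1 + r22 - r00 - r11);
    w = (r10 - r01) / s; x = (r02 + r20) / s; y = (r12 + r21) / s; z = 0.25 * s;
  }
  return { position: [m[12], m[13], m[14]], rotation: [x, y, z, w] };
}
```

The four branches pick the largest diagonal term so the square root never operates on a value near zero, which keeps the result numerically stable for rotations close to 180 degrees.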
The plugin emits JS events:
- session-started: after ARSession.run() succeeds.
- frame: per AR frame (30-60 Hz), carrying the camera pose.
- session-ended: on ARSession.pause() or user-exit.
Plus methods JS calls:
- requestSession() → starts AR session, returns when world tracking is initialised.
- endSession() → pauses and tears down.
- hitTest(x, y) → returns world point under the tap, or null if no surface.
- addAnchor(position) / removeAnchor(id) → manage anchors.
4.3 · Build + distribution
The plugin lives in the ios/ directory committed to the Orrery repo (per Capacitor convention from PRD-015 / RFC-018 §3). Native bundle goes through npx cap sync ios then Xcode archive → App Store Connect upload (per RFC-018 §9 Android-first → iOS-second build pipeline).
No separate plugin repo, no npm-published Capacitor plugin. Internal to the project.
4.4 · Maintenance burden
The ARKit API has been stable since iOS 13 (ARKit 3); Apple makes minor revisions with each iOS release. Estimated annual maintenance: 1-2 days of Swift work plus Xcode-version updates.
Marko (per PRD-015 §iOS code-signing) is the owner.
5 · Three.js whole-codebase upgrade (v1 prerequisite)
5.1 · Scope of the upgrade
All 7 existing 3D scenes:
- /explore (solar system)
- /fly heliocentric (mission arc)
- /fly cislunar (Earth-Moon system)
- /earth (orbit regimes)
- /moon (lunar surface + landing sites)
- /mars (surface + rover sites)
- /iss (station model + module pickability)
- /tiangong (station model)
All migrate from r128 to current (likely r170+ as of 2026-05).
5.2 · Major breaking changes r128 → r170
- Materials API: MeshPhongMaterial / MeshStandardMaterial parameter rename or removal in some cases. Texture binding model adjusted.
- Lighting: Direct light intensity values changed (physically-based defaults). All scene lighting needs re-tuning.
- BufferGeometry mandatory: Geometry class removed in r125 (already gone); should be fine.
- WebGLRenderer.outputEncoding → outputColorSpace: affects gamma + sRGB pipeline.
- Color.toJSON() shape change.
- WebXRManager evolution: hit-test was experimental in r128; mature in r140+.
- Shader chunks (onBeforeCompile): small naming changes.
5.3 · Migration plan
| Step | Work | Estimate |
|---|---|---|
| 1. Audit r128 → current breaking changes against our actual usage | Read changelogs r128–r170; grep our codebase for affected APIs | 1-2 days |
| 2. Bump package.json, fix top-level build errors | npm install three@latest, fix compile errors | 1 day |
| 3. Per-scene migration | 7 scenes; ~2-3 days each w/ visual diff against baselines | 2-3 weeks |
| 4. Visual regression sweep | tests/e2e/visual.spec.ts baselines + manual review of each scene | 3-5 days |
| 5. Lighting + colour-space re-tuning | Material params may need adjustment for visual parity | 3-5 days |
| 6. Performance retest | Frame-rate budget on Pixel 6a + iPhone 12 | 1 day |
Total estimate: 4-6 weeks, single-developer serial. Can parallelise by route. The 7-scene scope is bounded; not open-ended.
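The headline figure can be checked by summing the table rows. A small sketch, assuming 5 working days per week (the RFC does not state this) and reading step 3's "2-3 weeks" as 10-15 working days; the names are illustrative.

```typescript
type DayRange = [number, number]; // [optimistic, pessimistic] in working days

const WORKING_DAYS_PER_WEEK = 5; // assumption, not stated in the RFC

const MIGRATION_STEPS: Record<string, DayRange> = {
  audit: [1, 2],
  bump: [1, 1],
  perScene: [2 * WORKING_DAYS_PER_WEEK, 3 * WORKING_DAYS_PER_WEEK],
  visualSweep: [3, 5],
  lightingRetune: [3, 5],
  perfRetest: [1, 1],
};

// Sum the optimistic and pessimistic ends separately (serial, single developer).
function totalDays(steps: Record<string, DayRange>): DayRange {
  let min = 0;
  let max = 0;
  for (const key of Object.keys(steps)) {
    min += steps[key][0];
    max += steps[key][1];
  }
  return [min, max];
}

const [minDays, maxDays] = totalDays(MIGRATION_STEPS);
const minWeeks = minDays / WORKING_DAYS_PER_WEEK; // 19 days = 3.8 weeks
const maxWeeks = maxDays / WORKING_DAYS_PER_WEEK; // 29 days = 5.8 weeks
```

Under those assumptions the rows sum to 19-29 working days, about 3.8-5.8 weeks, which rounds to the stated 4-6 weeks.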
5.4 · Risk + mitigation
| Risk | Mitigation |
|---|---|
| Visual regressions on flat-screen scenes | tests/e2e/visual.spec.ts baselines + manual reviewer pass per scene |
| Lighting / colour-space changes break the editorial look | Per-scene tuning pass; commit visual baseline at each step |
| WebXR feature stability | The upgrade is what enables reliable WebXR hit-test, so this is the WIN not the risk |
| Two Three.js versions co-existing during migration | Avoided — single big upgrade in one branch, not incremental |
6 · Audio graph in AR (PRD-017 sonification reuse)
The PRD-017 per-route sonification graphs are reused verbatim. The only change: in AR mode, the AudioListener is attached to the XR camera instead of the flat-screen perspective camera.
    // Flat-screen mode
    flatCamera.add(audioListener);

    // AR session start
    flatCamera.remove(audioListener);
    xrCamera.add(audioListener);

    // AR session end
    xrCamera.remove(audioListener);
    flatCamera.add(audioListener);

Each oscillator's PannerNode already tracks its 3D world position. As the user walks around the scene, the listener (XR camera) moves through the audio graph, and each PannerNode pans according to the relative geometry. Saturn at [2, 0, -3] in world space sounds like it is coming from 2 metres to the right and 3 metres ahead when the listener is at the origin facing -Z (the Three.js camera default).
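The Saturn example can be worked through numerically. A minimal sketch, assuming the listener has identity rotation (facing -Z with +X to its right, the Three.js camera convention); relativeToListener is a hypothetical helper, not part of the PRD-017 graph.

```typescript
type Vec3 = [number, number, number];

interface RelativePosition {
  right: number;      // metres to the listener's right (+X)
  up: number;         // metres above the listener (+Y)
  ahead: number;      // metres in front of the listener (-Z)
  distance: number;   // straight-line distance in metres
  azimuthDeg: number; // 0 = straight ahead, positive = towards the right
}

// Assumes an unrotated listener; a real listener would first transform the
// source position into its own local frame.
function relativeToListener(listenerPos: Vec3, sourcePos: Vec3): RelativePosition {
  const dx = sourcePos[0] - listenerPos[0];
  const dy = sourcePos[1] - listenerPos[1];
  const dz = sourcePos[2] - listenerPos[2];
  const ahead = -dz; // the listener looks down -Z
  return {
    right: dx,
    up: dy,
    ahead,
    distance: Math.sqrt(dx * dx + dy * dy + dz * dz),
    azimuthDeg: (Math.atan2(dx, ahead) * 180) / Math.PI,
  };
}

const saturn = relativeToListener([0, 0, 0], [2, 0, -3]);
```

For Saturn this gives 2 m right, 3 m ahead, a distance of about 3.61 m and an azimuth of about 33.7 degrees to the right; walking the listener to [2, 0, 0] would put Saturn dead ahead.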
The narrator audio (PRD-016 episode MP3 playback) attaches to the listener directly (not as a PannerNode child) so it stays centred in the user's head regardless of XR camera position. This matches the omniscient narrator-position choice (N-B).
6.1 · Narrator-sonification ducking in AR
Same audio-bus contract as PRD-017 RFC-020 §4 — no new mechanism. When narrator plays, sonification ducks to ~0.02 gain (−34 dB); restored 200 ms after narration ends.
In AR specifically, ducking matters more: the visuals are the planets themselves and offer nothing to distract from the audio, so the mix must be clean. v1 uses an RMS amplitude follower for ducking; semantic ducking ("duck during equation explanation but not transition") is deferred to v1.x.
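The ducking rule can be sketched as follows. The ducked gain (0.02) and the 200 ms release come from the text above; VOICE_THRESHOLD and the Ducker class are hypothetical and stand in for whatever the RFC-020 audio-bus contract actually specifies.

```typescript
const DUCKED_GAIN = 0.02;     // from the RFC
const RELEASE_MS = 200;       // from the RFC
const VOICE_THRESHOLD = 0.01; // hypothetical RMS level that counts as "voiced"

// Convert a linear gain to decibels: 20 * log10(gain).
function gainToDb(gain: number): number {
  return 20 * Math.log10(gain);
}

// Root-mean-square level of one block of narrator samples.
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Tracks when the narrator was last voiced and returns the sonification
// target gain: ducked while voiced and for RELEASE_MS afterwards, else 1.
class Ducker {
  private lastVoicedMs = -Infinity;

  update(narratorRms: number, nowMs: number): number {
    if (narratorRms > VOICE_THRESHOLD) this.lastVoicedMs = nowMs;
    return nowMs - this.lastVoicedMs < RELEASE_MS ? DUCKED_GAIN : 1;
  }
}
```

gainToDb(0.02) evaluates to about -33.98 dB, matching the −34 dB quoted above. In a real graph the returned target would typically be applied through AudioParam.setTargetAtTime on the sonification bus gain so the change is smoothed rather than stepped.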
7 · Headphone-aware audio rendering
    async function detectAudioOutput(): Promise<'headphones' | 'speakers'> {
      if (!navigator.mediaDevices?.enumerateDevices) return 'speakers';
      const devices = await navigator.mediaDevices.enumerateDevices();
      // Skip the 'default' alias and the empty deviceId that browsers report
      // before device permission is granted; neither identifies real hardware.
      const audioOut = devices.find(
        d => d.kind === 'audiooutput' && d.deviceId !== '' && d.deviceId !== 'default',
      );
      // Heuristic: presence of a non-default audio output suggests headphones or external speaker
      return audioOut ? 'headphones' : 'speakers';
    }

    // pannerNodes: the PannerNode instances of the PRD-017 sonification graph.
    function configureSpatialAudio(mode: 'headphones' | 'speakers') {
      // On headphones, HRTF gives accurate 3D positioning
      // On speakers, HRTF can sound bizarre, so use equal-power stereo
      pannerNodes.forEach(p => {
        p.panningModel = mode === 'headphones' ? 'HRTF' : 'equalpower';
      });
    }

The module listens for the devicechange event on navigator.mediaDevices to swap mode on plug/unplug. The heuristic is imperfect (Bluetooth speakers register as audiooutput too), but it is better than always-HRTF or always-stereo.
8 · AR-specific haptics (Capacitor Haptics reuse)
PRD-017 RFC-020 §5.2 already specs the haptic patterns. AR adds two AR-specific events:
| Event | Pattern (web) | Capacitor style |
|---|---|---|
| Anchor placed (after user tap) | 15 | ImpactStyle.Light |
| Narrator episode start in AR | (no pulse — voice is the cue) | (n/a) |
| Narrator section transition | 8 | ImpactStyle.Light |
| Narrator episode end | [5, 30, 5, 30, 5] | Haptics.notification({ type: NotificationType.Success }) |
iOS wrapped: uses @capacitor/haptics (Taptic Engine). Android web: navigator.vibrate. Android wrapped: also @capacitor/haptics, which on Android drives the device vibrator through the native plugin.
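The table above maps naturally onto a lookup keyed by event. A minimal sketch: the event names and the AR_HAPTICS and webPatternFor identifiers are hypothetical, while the patterns and Capacitor styles are copied from the table.

```typescript
type ArHapticEvent =
  | 'anchor-placed'
  | 'narrator-start'
  | 'narrator-section'
  | 'narrator-end';

interface HapticCue {
  // Argument for navigator.vibrate on the web path; null means no pulse.
  webPattern: number | number[] | null;
  // Which @capacitor/haptics call the wrapped app would make; null means none.
  capacitor: 'impact-light' | 'notification-success' | null;
}

const AR_HAPTICS: Record<ArHapticEvent, HapticCue> = {
  'anchor-placed': { webPattern: 15, capacitor: 'impact-light' },
  'narrator-start': { webPattern: null, capacitor: null }, // voice is the cue
  'narrator-section': { webPattern: 8, capacitor: 'impact-light' },
  'narrator-end': { webPattern: [5, 30, 5, 30, 5], capacitor: 'notification-success' },
};

// Returns the vibration pattern for the web path, or null when the event is silent.
function webPatternFor(event: ArHapticEvent): number | number[] | null {
  return AR_HAPTICS[event].webPattern;
}
```

Keeping both columns in one record means a new event cannot be added for one platform and forgotten on the other; the compiler flags a missing key.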
9 · Exhibit Mode (M-B)
9.1 · Trigger
?mode=exhibit URL parameter on any flat-screen route. Loads a separate Vite lazy chunk (src/lib/exhibit.ts, <20 KB).
9.2 · Behaviour
- All chrome hidden via body.exhibit-mode { /* hide nav, footer, HUDs */ }.
- Auto-starts a playlist:
  - 22 min /explore: Curator open + Kepler chord ambient + planet-by-planet tour
  - 18 min /earth: orbit regime tour (LEO → MEO → GEO → HEO → L-points)
  - 22 min /moon: terminator crossing + far-side reveal + Apollo sites
  - 18 min /mars: landing sites + signal-delay moment + 14.5-second narration
  - 10 min: Curator close from PRD-016's Full Tour
- Cinematic camera paths per scene (pre-authored splines; the camera moves itself, no user input).
- Auto-narrated throughout (PRD-016 Curator + Guide voices).
- Sonification at full level (no ducking competition: narrator is centre-mixed, sonification spatial).
- QR code in bottom-right corner (44 px) pointing to the same Orrery URL minus ?mode=exhibit; visitor scans → opens the AR experience on their phone (deep-linked to current scene if possible).
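The playlist above can be held as plain data. A minimal sketch with hypothetical names (ExhibitSegment, EXHIBIT_PLAYLIST, segmentAt); the looping behaviour is an assumption, since the RFC does not say what happens after the Curator close.

```typescript
interface ExhibitSegment {
  route: string | null; // null for the closing segment, which names no route
  minutes: number;
  summary: string;
}

const EXHIBIT_PLAYLIST: readonly ExhibitSegment[] = [
  { route: '/explore', minutes: 22, summary: 'Curator open + planet-by-planet tour' },
  { route: '/earth', minutes: 18, summary: 'orbit regime tour' },
  { route: '/moon', minutes: 22, summary: 'terminator crossing + far side + Apollo sites' },
  { route: '/mars', minutes: 18, summary: 'landing sites + signal-delay moment' },
  { route: null, minutes: 10, summary: 'Curator close' },
];

function totalMinutes(playlist: readonly ExhibitSegment[]): number {
  return playlist.reduce((sum, seg) => sum + seg.minutes, 0);
}

// Which segment is playing after elapsedMinutes, looping back to the start
// once the playlist ends (looping is an assumption).
function segmentAt(elapsedMinutes: number): ExhibitSegment {
  let t = elapsedMinutes % totalMinutes(EXHIBIT_PLAYLIST);
  for (const seg of EXHIBIT_PLAYLIST) {
    if (t < seg.minutes) return seg;
    t -= seg.minutes;
  }
  return EXHIBIT_PLAYLIST[0]; // not reached for non-negative input
}
```

The five segments add up to 90 minutes per cycle, which is the number a museum operator would need for scheduling.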
9.3 · Performance + bundle
- Cinematic playlist scripts are tiny (~5 KB of JSON path data).
- Chunk total <20 KB minified gzipped.
- Lazy-loaded; zero impact on flat-screen / AR bundle.
9.4 · QR code link configurability
?qr=<base64-shortlink> URL param overrides the default QR target. Lets museum operators point the QR at their own landing page (e.g., "Visit chipi.github.io/orrery — Orrery on display at MoMA Aug-Sep 2026").
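Resolving the QR target might look like the sketch below. It assumes the qr parameter is standard base64 of an absolute http(s) URL and that an invalid override silently falls back to the default; both are assumptions, and qrTarget is a hypothetical name. It uses only the standard URL and atob globals.

```typescript
// Returns the URL the QR code should encode for the page at pageHref.
function qrTarget(pageHref: string): string {
  const url = new URL(pageHref);
  const override = url.searchParams.get('qr');
  if (override !== null) {
    try {
      // URLSearchParams decodes '+' as a space; restore it before base64-decoding.
      const decoded = atob(override.replace(/ /g, '+'));
      const target = new URL(decoded); // throws unless decoded is an absolute URL
      if (target.protocol === 'https:' || target.protocol === 'http:') {
        return target.href;
      }
    } catch {
      // Malformed base64 or URL: fall through to the default target.
    }
  }
  // Default: the same Orrery URL without the exhibit-only parameters.
  url.searchParams.delete('mode');
  url.searchParams.delete('qr');
  return url.href;
}
```

Restricting the override to http(s) is a defensive choice so that a crafted link cannot put a javascript: or data: URL into a QR code shown on a public display.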
10 · Bundle layout
    build/
    ├── _app/immutable/chunks/
    │   ├── flat-3d.<hash>.js    ← existing 3D scenes (unchanged in size after Three.js upgrade)
    │   ├── ar-webxr.<hash>.js   ← NEW lazy chunk for Android AR (~30 KB)
    │   ├── ar-arkit.<hash>.js   ← NEW lazy chunk for iPhone wrapped (~15 KB; imports Capacitor plugin)
    │   ├── ar-scene.<hash>.js   ← NEW lazy chunk for AR scene builders (~25 KB)
    │   ├── exhibit.<hash>.js    ← NEW lazy chunk for Exhibit Mode (~18 KB)
    │   └── (existing chunks)
    ├── ios/                     ← Capacitor Xcode project (extended with @orrery/ar-bridge plugin)
    │   └── App/App/Plugins/ar-bridge/   ← Swift plugin (NEW)
    └── android/                 ← Capacitor Android project (no native plugin needed; WebXR via Chrome)

Total v1 ship bundle impact: ~88 KB gzipped of new JS, all lazy-loaded only when AR or Exhibit Mode is entered. Plus ~600 lines Swift (iOS bundle only).
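The 88 KB figure is the sum of the four new chunks, but no single entry path loads all of them. A small sketch of the per-path cost, assuming each AR path loads its backend chunk plus ar-scene and Exhibit Mode loads only its own chunk; that grouping is inferred from the chunk descriptions rather than stated in the RFC.

```typescript
// Approximate gzipped sizes in KB, taken from the layout above.
const NEW_CHUNK_KB = {
  'ar-webxr': 30,
  'ar-arkit': 15,
  'ar-scene': 25,
  'exhibit': 18,
} as const;

type Chunk = keyof typeof NEW_CHUNK_KB;
type EntryPath = 'android-ar' | 'iphone-ar' | 'exhibit';

// Assumed grouping of lazy chunks per entry path.
const ENTRY_CHUNKS: Record<EntryPath, Chunk[]> = {
  'android-ar': ['ar-webxr', 'ar-scene'],
  'iphone-ar': ['ar-arkit', 'ar-scene'],
  'exhibit': ['exhibit'],
};

function sumKb(chunks: readonly Chunk[]): number {
  return chunks.reduce((kb, c) => kb + NEW_CHUNK_KB[c], 0);
}

const totalNewKb = sumKb(Object.keys(NEW_CHUNK_KB) as Chunk[]); // 88
const androidArKb = sumKb(ENTRY_CHUNKS['android-ar']);          // 55
const iphoneArKb = sumKb(ENTRY_CHUNKS['iphone-ar']);            // 40
```

Under that assumption the heaviest path (Android AR) downloads about 55 KB of new JS, and a flat-screen visitor downloads none of it.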
11 · Failure modes
| Failure | Detection | Handling |
|---|---|---|
| WebXR not supported on Android browser | navigator.xr is undefined, or navigator.xr.isSessionSupported('immersive-ar') resolves to false | Hide "Enter AR" button + show fallback message |
| ARKit Swift plugin fails to start session | Plugin emits error in requestSession() | Show fallback message; log to telemetry; offer "Continue in flat-screen mode" CTA |
| Hit-test returns null (no surface detected) | User taps but no surface found | Display non-modal hint: "Point at a flat surface (floor, table)"; retry on next tap |
| Camera permission denied | WebXR / ARKit returns permission error | "Camera access is needed for AR. Continue without?" → falls back to flat-screen |
| Narrator audio fails to load mid-AR session | PRD-016 audio fetch error | AR session continues; sonification stays at full level; banner: "Narration unavailable; sound continues" |
| Performance drops below 30 fps in AR | Frame budget telemetry | Reduce particle count / shadow detail (already minimal in AR); log to telemetry |
| Three.js upgrade introduces visual regression on a flat-screen scene | Visual baseline diff in CI | Block merge; per-scene tuning pass required before AR work begins |
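The runtime rows of this table can be expressed as a single lookup so every failure has exactly one handling. A minimal sketch: the failure identifiers, the Handling shape and the FAILURE_HANDLING name are hypothetical; the messages and the continue-or-fall-back decisions follow the table.

```typescript
type ArFailure =
  | 'webxr-unsupported'
  | 'arkit-session-failed'
  | 'hit-test-null'
  | 'camera-denied'
  | 'narrator-load-failed'
  | 'low-fps';

interface Handling {
  fallBackToFlatScreen: boolean; // leave (or never enter) the AR session
  message: string | null;        // user-facing text, if any
  logTelemetry: boolean;
}

const FAILURE_HANDLING: Record<ArFailure, Handling> = {
  'webxr-unsupported': { fallBackToFlatScreen: true, message: 'AR is not available in this browser.', logTelemetry: false },
  'arkit-session-failed': { fallBackToFlatScreen: true, message: 'Continue in flat-screen mode', logTelemetry: true },
  'hit-test-null': { fallBackToFlatScreen: false, message: 'Point at a flat surface (floor, table)', logTelemetry: false },
  'camera-denied': { fallBackToFlatScreen: true, message: 'Camera access is needed for AR. Continue without?', logTelemetry: false },
  'narrator-load-failed': { fallBackToFlatScreen: false, message: 'Narration unavailable; sound continues', logTelemetry: false },
  'low-fps': { fallBackToFlatScreen: false, message: null, logTelemetry: true },
};

function handlingFor(failure: ArFailure): Handling {
  return FAILURE_HANDLING[failure];
}
```

The message for the unsupported-browser case is placeholder wording, since the table only says "fallback message". Typing the table as a Record over the union means adding a new failure mode without a handling entry fails to compile.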
12 · Privacy
Per PRD-019 M16:
- Camera frames are NEVER stored, transmitted, or logged.
- Only spatial-tracking metadata (camera pose + anchor positions) is used by JS.
- Permission grant UI is explicit about this ("camera frames are processed on your device and never sent anywhere").
- iOS Info.plist adds NSCameraUsageDescription: "Orrery uses your camera to place the solar system in your room. The camera feed stays on your device."
- Android AndroidManifest.xml adds <uses-permission android:name="android.permission.CAMERA" /> (no extra description needed).
No camera-frame analytics. No frame-based telemetry. The wrapper does not expose camera-frame buffers to native plugins beyond ARKit/ARCore.
13 · Testing
13.1 · Manual test matrix
| Device | Browser / wrapper | Tests |
|---|---|---|
| Pixel 6a | Chrome | WebXR hit-test, anchor placement, spatial audio, narrator auto-play, 72 fps target |
| Pixel 6a | Capacitor wrapper | Same as Chrome (WebXR path) |
| Samsung Galaxy mid-range | Chrome | Performance on lower-tier Android (60 fps acceptable) |
| iPhone 12 | Safari | "Enter AR" greyed-out + App Store fallback message |
| iPhone 12 | Capacitor wrapper | ARKit session start, hit-test, anchor placement, parity with Android |
| iPhone 14 Pro | Capacitor wrapper | ARKit on 3× pixel ratio device; perf check |
| MacBook + Chrome (desktop) | — | "Enter AR" hidden; ?mode=exhibit loads Exhibit Mode |
13.2 · Cross-platform parity test
Marko + 2 reviewers (one Android, one iPhone), same scene (/explore), same hour. Compare editorial experience subjectively. v1 ship-gate: ≥ 90 % parity score (Marko-judged).
13.3 · Visual regression for the Three.js upgrade
tests/e2e/visual.spec.ts baselines stay relevant for the non-3D surfaces (/credits, /library, /science strip). For the 7 3D scenes, manual reviewer pass per scene. No automated visual diff for canvas — too flaky.
14 · Resolved decisions + open questions
Resolved 2026-05-16:
- XR strategy. RESOLVED: X-C hybrid (simplified AR scene variants sharing data).
- Spatial audio. RESOLVED: A-A (same PRD-017 audio graph, XR camera as listener).
- Narrator position. RESOLVED: N-B (omniscient, 30 cm above + behind; centre-panned; RMS amplitude ducking).
- AR scope. RESOLVED: S-E (4 globe scenes: /explore, /earth, /moon, /mars).
- AR entry. RESOLVED: NE-B (auto-play Guide episode 2 s after placement).
- Exhibit Mode. RESOLVED: M-B (chrome-less + QR; M-C synced exhibit deferred to v2).
- iPhone AR. RESOLVED: Capacitor + ARKit Swift plugin (@orrery/ar-bridge). iPhone Safari has no AR; gets fallback messaging.
- Three.js version. RESOLVED: Whole-codebase upgrade (r128 → current) as v1 prerequisite. Touches all 7 existing 3D scenes.
- Web vs app split. RESOLVED: Web stays web-only; mobile app may use native (ARKit example). Evolves PRD-015 framing.
- Vision Pro. RESOLVED: Dropped from v1 entirely. Focus: mobile (Android + iPhone).
- AR session storage. RESOLVED: In-memory only per ADR-057. No localStorage. No sessionStorage. Original draft's sessionStorage references corrected.
- Headphone detection. RESOLVED: navigator.mediaDevices.enumerateDevices(), HRTF on headphones, equal-power on speakers.
- AR-specific haptics. RESOLVED: Reuse @capacitor/haptics from RFC-020. Two new AR events (anchor placed, narrator section transition).
Operational follow-ups:
- ARKit Swift plugin owner. Marko per PRD-015 iOS signing. Confirm long-term commitment is OK.
- Three.js upgrade risk envelope. 4-6 weeks single-dev. Operational.
- AR onboarding UX iteration. Likely 2-3 design rounds. Implementation-time.
- Exhibit Mode QR → AR deep-link handshake. Implementation-time refinement.
- iPhone TestFlight before App Store? Recommend yes — 2 weeks TestFlight.
Deferred to v2:
- AR anchor persistence (put phone down + pick up). ARCore/ARKit support varies; needs design + impl effort.
- AR occlusion (hands blocking planets). Depth API on both platforms.
- Multi-user shared AR. Cloud Anchors / ARKit collaborative sessions.
- Script-aware semantic narrator ducking. v1 uses RMS amplitude only.
- Synced multi-projector Exhibit (M-C). WebSocket-coordinated multi-screen.
- /fly mission arc in AR + /iss + /tiangong station-model AR.
RFC-021 · Orrery · Immersive Mode · Drafted 2026-05-16 · Closes-into-PRD-019