Skip to content

RFC-021 · Immersive Mode — WebXR (Android) + ARKit Swift plugin (iPhone wrapped) + Exhibit Mode

Status: Draft v0.4 · 2026-05-16 · Closes: PRD-019

Why this is an RFC. Immersive Mode binds five interlocking architectural commitments that ripple through every layer of the product: (1) a Three.js whole-codebase upgrade (r128 → current) touching all 7 existing 3D scenes, (2) a dual AR code path (WebXR on Android, ARKit via Capacitor Swift plugin on iPhone wrapped) sharing the same Three.js scene code via a thin AR-abstraction interface, (3) the spatial-audio listener swap that hooks PRD-017's existing sonification graphs into AR's XR camera, (4) the narrator auto-play + ducking contract that depends on PRD-016 audio episodes shipping first, (5) an Exhibit Mode that lazy-loads a chrome-less cinematic player as a separate Vite chunk. These are the dependency tip of the entire product stack; one wrong cut early forces ugly retrofits across already-shipped features.


1 · Architecture overview

┌──────────────────────────────────────────────────────────────────┐
│  Flat-screen 3D scenes (existing, 7 routes)                      │
│   Three.js current (r170+ after v1 prereq upgrade)               │
│   Camera A (perspective) ← AudioListener attaches here normally  │
└──────────────────────────────────────────────────────────────────┘

                          │ user taps "Enter AR" on a globe route

┌──────────────────────────────────────────────────────────────────┐
│  AR session start                                                │
│   ┌─────────────────────────┬───────────────────────────────┐   │
│   │ Android (web + wrapped) │ iPhone wrapped (Capacitor)    │   │
│   │ WebXRManager.startAR()  │ @orrery/ar-bridge plugin      │   │
│   │ Native browser API      │ Swift wraps ARKit ARSession   │   │
│   └─────────────┬───────────┴──────────────┬────────────────┘   │
│                 │                          │                     │
│                 ▼                          ▼                     │
│       ┌────────────────────────────────────────────┐            │
│       │ AR abstraction layer (lib/ar.ts)           │            │
│       │  Common interface: hit-test, anchor,       │            │
│       │  camera-pose, session lifecycle.           │            │
│       │  Provider-agnostic API for Three.js.       │            │
│       └─────────────────┬──────────────────────────┘            │
│                         │                                        │
│                         ▼                                        │
│   AR scene builder (lazy chunk, <50 KB)                          │
│    Camera B (XR) ← AudioListener swaps here                      │
│    Simplified geometry (100 stars, no particles, no trails)      │
│    Spatial audio per object (PannerNode follows world position) │
│    Narrator auto-plays 2 s after placement (Guide episode)       │
└──────────────────────────────────────────────────────────────────┘

iPhone Safari: "Enter AR" greyed-out with App Store link in tooltip.
Desktop: "Enter AR" hidden; Exhibit Mode available via ?mode=exhibit.

The AR abstraction layer (src/lib/ar.ts) is the only file that imports either WebXR (Android) or the Capacitor Swift plugin (iPhone wrapped). Three.js scene code is unaware of which AR backend is active.


2 · The five anchor decisions (locked from v0.4 walkthrough)

IDChoiceReason
X-CXR strategy = hybrid (lightweight AR scene variants sharing flat-screen data + physics)Same scene data, simplified rendering for AR perf (72 fps target). Not full XR fork; not pure JS replay.
A-AAudio graph reuse = same Web Audio graph; XR camera as AudioListenerPRD-017 sonification carries over verbatim; only the listener attachment point changes. Zero new audio code.
N-BNarrator position = omniscient (30 cm above + behind listener); ducking via RMS amplitude followerVoice stays in the centre of the head; sonification ducks during voiced segments. Per PRD-017 audio-bus contract.
S-EAR scope = 4 globe scenes (/explore, /earth, /moon, /mars)Tabletop placement works naturally for globes. /fly AU-scale is hard; /iss + /tiangong are station-models (different UX). Both deferred.
NE-BNarrator auto-play = 2 s after placementUser has placed the scene; they want it explained. 2 s lets them look first.
M-BExhibit Mode = chrome-less + QRMuseum / classroom use case. No nav, no HUD; auto-narrated cinematic orbits; QR for AR continuation.

M-C synced exhibit (multi-projector wall): deferred to v2 (requires WebSocket infra; out of v1 scope).


3 · AR backend abstraction

3.1 · The interface

typescript
// src/lib/ar.ts
export interface ArBackend {
  readonly name: 'webxr' | 'arkit-capacitor';
  readonly platform: 'android-web' | 'android-wrapped' | 'iphone-wrapped';

  // Lifecycle
  isSupported(): Promise<boolean>;
  startSession(): Promise<void>;
  endSession(): Promise<void>;

  // Per-frame (called from RAF)
  getCameraPose(): {
    position: [number, number, number];
    rotation: [number, number, number, number]; // quaternion
  };

  // Hit-testing (tap on screen → real-world point)
  hitTest(screenX: number, screenY: number): Promise<{
    worldPosition: [number, number, number];
    worldNormal: [number, number, number];
  } | null>;

  // Anchors (lock a scene origin to a real-world point)
  addAnchor(worldPosition: [number, number, number]): Promise<string>; // anchor id
  removeAnchor(anchorId: string): Promise<void>;

  // Events
  on(event: 'session-started' | 'session-ended' | 'frame', handler: (...args: any[]) => void): () => void;
}

Two implementations live behind this interface:

  • src/lib/ar/webxr.ts — uses Three.js's WebXRManager directly (Android web + Android wrapped both run this path).
  • src/lib/ar/arkit-capacitor.ts — imports @orrery/ar-bridge (Capacitor Swift plugin) and exposes the same interface. Only loaded on iPhone wrapped.

Backend selection happens once at module load via Capacitor.getPlatform() + WebXR feature detection. Three.js scene code never knows which backend is active.

3.2 · Why this shape

Same pattern as PRD-016 TtsProvider + PRD-018 VisionProvider. The product has settled into "abstract over provider, implement once per platform, swap with config" as the architectural template. Familiar to maintain.


4 · ARKit Capacitor plugin (@orrery/ar-bridge)

4.1 · Why this exists

Apple has not shipped WebXR on Safari/WebKit. iPhone browsers (all of which use WebKit under the hood — Apple's App Store policy enforced this even after the EU DMA) have no AR API. The wrapped Capacitor app can use native ARKit; the Swift plugin is the bridge.

4.2 · Scope (Swift code)

~600 lines Swift in ios/App/App/Plugins/ar-bridge/:

  • ArBridgePlugin.swift — Capacitor plugin entry; bridges JS calls to ARKit Swift API.
  • ArSessionManager.swift — wraps ARSession, ARWorldTrackingConfiguration, frame callback loop.
  • ArHitTester.swift — wraps ARHitTestResult + ARRaycastQuery.
  • ArAnchorTracker.swift — wraps ARAnchor add/remove + ID lifecycle.
  • ArCameraPoseEmitter.swift — converts ARKit's simd_float4x4 transform to Three.js-friendly position+quaternion via a small SIMD helper.

The plugin emits JS events:

  • session-started — after ARSession.run() succeeds.
  • frame — per AR frame (30-60 Hz), carrying the camera pose.
  • session-ended — on ARSession.pause() or user-exit.

Plus methods JS calls:

  • requestSession() → starts AR session, returns when world tracking is initialised.
  • endSession() → pauses and tears down.
  • hitTest(x, y) → returns world point under the tap, or null if no surface.
  • addAnchor(position) / removeAnchor(id) → manage anchors.

4.3 · Build + distribution

The plugin lives in the ios/ directory committed to the Orrery repo (per Capacitor convention from PRD-015 / RFC-018 §3). Native bundle goes through npx cap sync ios then Xcode archive → App Store Connect upload (per RFC-018 §9 Android-first → iOS-second build pipeline).

No separate plugin repo, no npm-published Capacitor plugin. Internal to the project.

4.4 · Maintenance burden

ARKit API has been stable since iOS 13 (ARKit 3). Apple bumps it minorly each iOS release. Estimated annual maintenance: 1-2 days of Swift work + Xcode-version updates.

Marko (per PRD-015 §iOS code-signing) is the owner.


5 · Three.js whole-codebase upgrade (v1 prerequisite)

5.1 · Scope of the upgrade

All 7 existing 3D scenes:

  • /explore (solar system)
  • /fly heliocentric (mission arc)
  • /fly cislunar (Earth-Moon system)
  • /earth (orbit regimes)
  • /moon (lunar surface + landing sites)
  • /mars (surface + rover sites)
  • /iss (station model + module pickability)
  • /tiangong (station model)

All migrate from r128 to current (likely r170+ as of 2026-05).

5.2 · Major breaking changes r128 → r170

  • Materials API: MeshPhongMaterial / MeshStandardMaterial parameter rename or removal in some cases. Texture binding model adjusted.
  • Lighting: Direct light intensity values changed (physically-based defaults). All scene lighting needs re-tuning.
  • BufferGeometry mandatory: Geometry class removed in r125 (already gone); should be fine.
  • WebGLRenderer.outputEncodingoutputColorSpace: affects gamma + sRGB pipeline.
  • Color.toJSON() shape change.
  • WebXRManager evolution: hit-test was experimental in r128; mature in r140+.
  • Shader chunks (onBeforeCompile): small naming changes.

5.3 · Migration plan

StepWorkEstimate
1. Audit r128 → current breaking changes against our actual usageRead changelogs r128–r170; grep our codebase for affected APIs1-2 days
2. Bump package.json, fix top-level build errorsnpm install three@latest, fix compile errors1 day
3. Per-scene migration7 scenes; ~2-3 days each w/ visual diff against baselines2-3 weeks
4. Visual regression sweeptests/e2e/visual.spec.ts baselines + manual review of each scene3-5 days
5. Lighting + colour-space re-tuningMaterial params may need adjustment for visual parity3-5 days
6. Performance retestFrame-rate budget on Pixel 6a + iPhone 121 day

Total estimate: 4-6 weeks, single-developer serial. Can parallelise by route. The 7-scene scope is bounded; not open-ended.

5.4 · Risk + mitigation

RiskMitigation
Visual regressions on flat-screen scenestests/e2e/visual.spec.ts baselines + manual reviewer pass per scene
Lighting / colour-space changes break the editorial lookPer-scene tuning pass; commit visual baseline at each step
WebXR feature stabilityThe upgrade is what enables reliable WebXR hit-test, so this is the WIN not the risk
Two Three.js versions co-existing during migrationAvoided — single big upgrade in one branch, not incremental

6 · Audio graph in AR (PRD-017 sonification reuse)

The PRD-017 per-route sonification graphs are reused verbatim. The only change: in AR mode, the AudioListener is attached to the XR camera instead of the flat-screen perspective camera.

typescript
// Flat-screen mode
flatCamera.add(audioListener);

// AR session start
flatCamera.remove(audioListener);
xrCamera.add(audioListener);

// AR session end
xrCamera.remove(audioListener);
flatCamera.add(audioListener);

Each oscillator's PannerNode already tracks its 3D world position. As the user walks around the scene, the listener (XR camera) moves through the audio graph; PannerNodes spatial-pan based on the relative geometry. Saturn at [2, 0, -3] in world space sounds like it's coming from 2 metres right + 3 metres ahead when the listener is at the origin.

The narrator audio (PRD-016 episode MP3 playback) attaches to the listener directly (not as a PannerNode child) so it stays centred in the user's head regardless of XR camera position. This matches the omniscient narrator-position choice (N-B).

6.1 · Narrator-sonification ducking in AR

Same audio-bus contract as PRD-017 RFC-020 §4 — no new mechanism. When narrator plays, sonification ducks to ~0.02 gain (−34 dB); restored 200 ms after narration ends.

In AR specifically, ducking matters more — the user can't visually distract from the audio because the visual is the planets themselves; the audio mix must be clean. v1 uses RMS amplitude follower for ducking; semantic ducking ("duck during equation explanation but not transition") deferred to v1.x.


7 · Headphone-aware audio rendering

typescript
async function detectAudioOutput(): Promise<'headphones' | 'speakers'> {
  if (!navigator.mediaDevices?.enumerateDevices) return 'speakers';
  const devices = await navigator.mediaDevices.enumerateDevices();
  const audioOut = devices.find(d => d.kind === 'audiooutput' && d.deviceId !== 'default');
  // Heuristic: presence of a non-default audio output suggests headphones or external speaker
  return audioOut ? 'headphones' : 'speakers';
}

function configureSpatialAudio(mode: 'headphones' | 'speakers') {
  // On headphones, HRTF gives accurate 3D positioning
  // On speakers, HRTF can sound bizarre — use equal-power stereo
  pannerNodes.forEach(p => {
    p.panningModel = mode === 'headphones' ? 'HRTF' : 'equalpower';
  });
}

Listens to devicechange event to swap mode on plug/unplug. Heuristic is imperfect (Bluetooth speakers register as audiooutput too), but better than always-HRTF-or-always-stereo.


8 · AR-specific haptics (Capacitor Haptics reuse)

PRD-017 RFC-020 §5.2 already specs the haptic patterns. AR adds two AR-specific events:

EventPattern (web)Capacitor style
Anchor placed (after user tap)15ImpactStyle.Light
Narrator episode start in AR(no pulse — voice is the cue)(n/a)
Narrator section transition8ImpactStyle.Light
Narrator episode end[5, 30, 5, 30, 5]Haptics.notification({ type: NotificationType.Success })

iOS wrapped: uses @capacitor/haptics (Taptic Engine). Android web: navigator.vibrate. Android wrapped: also @capacitor/haptics (which falls through to navigator.vibrate on Android).


9 · Exhibit Mode (M-B)

9.1 · Trigger

?mode=exhibit URL parameter on any flat-screen route. Loads a separate Vite lazy chunk (src/lib/exhibit.ts, <20 KB).

9.2 · Behaviour

  • All chrome hidden via body.exhibit-mode { /* hide nav, footer, HUDs */ }.
  • Auto-starts a playlist:
    • 22 min /explore — Curator open + Kepler chord ambient + planet-by-planet tour
    • 18 min /earth — orbit regime tour (LEO → MEO → GEO → HEO → L-points)
    • 22 min /moon — terminator crossing + far-side reveal + Apollo sites
    • 18 min /mars — landing sites + signal-delay moment + 14.5-second narration
    • 10 min — Curator close from PRD-016's Full Tour
  • Cinematic camera paths per scene (pre-authored splines; the camera moves itself, no user input).
  • Auto-narrated throughout (PRD-016 Curator + Guide voices).
  • Sonification at full level (no ducking competition — narrator is centre-mixed, sonification spatial).
  • QR code in bottom-right corner (44 px) pointing to the same Orrery URL minus ?mode=exhibit; visitor scans → opens the AR experience on their phone (deep-linked to current scene if possible).

9.3 · Performance + bundle

  • Cinematic playlist scripts are tiny (~5 KB of JSON path data).
  • Chunk total <20 KB minified gzipped.
  • Lazy-loaded; zero impact on flat-screen / AR bundle.

?qr=<base64-shortlink> URL param overrides the default QR target. Lets museum operators point the QR at their own landing page (e.g., "Visit chipi.github.io/orrery — Orrery on display at MoMA Aug-Sep 2026").


10 · Bundle layout

build/
├── _app/immutable/chunks/
│   ├── flat-3d.<hash>.js        ← existing 3D scenes (unchanged in size after Three.js upgrade)
│   ├── ar-webxr.<hash>.js       ← NEW lazy chunk for Android AR (~30 KB)
│   ├── ar-arkit.<hash>.js       ← NEW lazy chunk for iPhone wrapped (~15 KB; imports Capacitor plugin)
│   ├── ar-scene.<hash>.js       ← NEW lazy chunk for AR scene builders (~25 KB)
│   ├── exhibit.<hash>.js        ← NEW lazy chunk for Exhibit Mode (~18 KB)
│   └── (existing chunks)
├── ios/                         ← Capacitor Xcode project (extended with @orrery/ar-bridge plugin)
│   └── App/App/Plugins/ar-bridge/  ← Swift plugin (NEW)
└── android/                     ← Capacitor Android project (no native plugin needed; WebXR via Chrome)

Total v1 ship bundle impact: ~88 KB gzipped of new JS, all lazy-loaded only when AR or Exhibit Mode is entered. Plus ~600 lines Swift (iOS bundle only).


11 · Failure modes

FailureDetectionHandling
WebXR not supported on Android browsernavigator.xr?.isSessionSupported('immersive-ar') returns falseHide "Enter AR" button + show fallback message
ARKit Swift plugin fails to start sessionPlugin emits error in requestSession()Show fallback message; log to telemetry; offer "Continue in flat-screen mode" CTA
Hit-test returns null (no surface detected)User taps but no surface foundDisplay non-modal hint: "Point at a flat surface (floor, table)"; retry on next tap
Camera permission deniedWebXR / ARKit returns permission error"Camera access is needed for AR. Continue without?" → falls back to flat-screen
Narrator audio fails to load mid-AR sessionPRD-016 audio fetch errorAR session continues; sonification stays at full level; banner: "Narration unavailable; sound continues"
Performance drops below 30 fps in ARFrame budget telemetryReduce particle count / shadow detail (already minimal in AR); log to telemetry
Three.js upgrade introduces visual regression on a flat-screen sceneVisual baseline diff in CIBlock merge; per-scene tuning pass required before AR work begins

12 · Privacy

Per PRD-019 M16:

  • Camera frames are NEVER stored, transmitted, or logged.
  • Only spatial-tracking metadata (camera pose + anchor positions) is used by JS.
  • Permission grant UI is explicit about this ("camera frames are processed on your device and never sent anywhere").
  • iOS Info.plist adds NSCameraUsageDescription: "Orrery uses your camera to place the solar system in your room. The camera feed stays on your device."
  • Android AndroidManifest.xml adds <uses-permission android:name="android.permission.CAMERA" /> (no extra description needed).

No camera-frame analytics. No frame-based telemetry. The wrapper does not expose camera-frame buffers to native plugins beyond ARKit/ARCore.


13 · Testing

13.1 · Manual test matrix

DeviceBrowser / wrapperTests
Pixel 6aChromeWebXR hit-test, anchor placement, spatial audio, narrator auto-play, 72 fps target
Pixel 6aCapacitor wrapperSame as Chrome (WebXR path)
Samsung Galaxy mid-rangeChromePerformance on lower-tier Android (60 fps acceptable)
iPhone 12Safari"Enter AR" greyed-out + App Store fallback message
iPhone 12Capacitor wrapperARKit session start, hit-test, anchor placement, parity with Android
iPhone 14 ProCapacitor wrapperARKit on 3× pixel ratio device; perf check
MacBook + Chrome (desktop)"Enter AR" hidden; ?mode=exhibit loads Exhibit Mode

13.2 · Cross-platform parity test

Marko + 2 reviewers (one Android, one iPhone), same scene (/explore), same hour. Compare editorial experience subjectively. v1 ship-gate: ≥ 90 % parity score (Marko-judged).

13.3 · Visual regression for the Three.js upgrade

tests/e2e/visual.spec.ts baselines stay relevant for the non-3D surfaces (/credits, /library, /science strip). For the 7 3D scenes, manual reviewer pass per scene. No automated visual diff for canvas — too flaky.


14 · Resolved decisions + open questions

Resolved 2026-05-16:

  1. XR strategy — RESOLVED: X-C hybrid (simplified AR scene variants sharing data).
  2. Spatial audio — RESOLVED: A-A (same PRD-017 audio graph, XR camera as listener).
  3. Narrator position — RESOLVED: N-B (omniscient, 30 cm above + behind; centre-panned; RMS amplitude ducking).
  4. AR scope — RESOLVED: S-E (4 globe scenes: /explore /earth /moon /mars).
  5. AR entry — RESOLVED: NE-B (auto-play Guide episode 2 s after placement).
  6. Exhibit Mode — RESOLVED: M-B (chrome-less + QR; M-C synced exhibit deferred to v2).
  7. iPhone AR — RESOLVED: Capacitor + ARKit Swift plugin (@orrery/ar-bridge). iPhone Safari has no AR; gets fallback messaging.
  8. Three.js version — RESOLVED: Whole-codebase upgrade (r128 → current) as v1 prerequisite. Touches all 7 existing 3D scenes.
  9. Web vs app split — RESOLVED: Web stays web-only; mobile app may use native (ARKit example). Evolves PRD-015 framing.
  10. Vision Pro — RESOLVED: Dropped from v1 entirely. Focus: mobile (Android + iPhone).
  11. AR session storage — RESOLVED: In-memory only per ADR-057. No localStorage. No sessionStorage. Original draft's sessionStorage references corrected.
  12. Headphone detection — RESOLVED: navigator.mediaDevices.enumerateDevices(), HRTF on headphones, equal-power on speakers.
  13. AR-specific haptics — RESOLVED: Reuse @capacitor/haptics from RFC-020. Two new AR events (anchor placed, narrator section transition).

Operational follow-ups:

  1. ARKit Swift plugin owner. Marko per PRD-015 iOS signing. Confirm long-term commitment is OK.
  2. Three.js upgrade risk envelope. 4-6 weeks single-dev. Operational.
  3. AR onboarding UX iteration. Likely 2-3 design rounds. Implementation-time.
  4. Exhibit Mode QR → AR deep-link handshake. Implementation-time refinement.
  5. iPhone TestFlight before App Store? Recommend yes — 2 weeks TestFlight.

Deferred to v2:

  1. AR anchor persistence (put phone down + pick up). ARCore/ARKit support varies; needs design + impl effort.
  2. AR occlusion (hands blocking planets). Depth API on both platforms.
  3. Multi-user shared AR. Cloud Anchors / ARKit collaborative sessions.
  4. Script-aware semantic narrator ducking. v1 uses RMS amplitude only.
  5. Synced multi-projector Exhibit (M-C). WebSocket-coordinated multi-screen.
  6. /fly mission arc in AR + /iss + /tiangong station-model AR.

RFC-021 · Orrery · Immersive Mode · Drafted 2026-05-16 · Closes-into-PRD-019

Orrery — architecture documentation · MIT · No tracking