ADR-054 — Fleet i18n strategy: locale overlay parity at 137 × 14

Status · Accepted (retrospective; shipped v0.6.0)
Date · 2026-05-15
Closes · RFC-016 OQ-5
TA anchor · §components/fleet/i18n · §contracts/fleet-overlay
Related ADRs · ADR-017 (Paraglide-js i18n + locale overlay architecture), ADR-031 (i18n language list + rollout waves), ADR-032 (font + script strategy Wave 1), ADR-033 (translation workflow: LLM-only first-pass), ADR-043 (sr-Cyrl font gate), ADR-044 (CJK fonts Wave 2), ADR-045 (RTL Arabic), ADR-057 (locale override cookie)

Context

Fleet ships 137 entries × 14 locales = 1,918 overlay files at full parity. That is roughly 3.7 times the locale-overlay surface of the entire pre-v0.6 missions corpus (37 missions × 14 = 518 files). The naive plan, authoring every overlay by hand, would have blocked the v0.6 ship by months. The alternative, shipping en-US only and letting the UI fall back, would have violated the "i18n from the start" constraint (TA §constraints) and made fleet a second-class citizen on the 13 non-English locales.

Three sub-questions had to be resolved:

What gets translated: every editorial field, or just the headline name plus a short blurb?
Which pipeline: the same one as the Tiangong rollout (ADR-033 LLM first pass + argos-translate fallback + manual review), or something fleet-specific?
Locale fallback: when a locale overlay is missing, what does the client fall back to?

Decision

Overlay scope (closes OQ-5)

The fleet overlay carries exactly five editorial fields per entry, matching the user-visible surface of the panel:

```json
{
  "id": "saturn-v",
  "name": "Saturn V",
  "best_known_for": "Carried every crewed Apollo lunar mission.",
  "specs_labels": { "height_m": "Height (m)", "payload_lEO_kg": "LEO payload (kg)", "stages": "Stages" },
  "flights": [
    { "mission_id": "apollo11", "flight_designation_display": "AS-506 · Apollo 11", "notes": "First crewed lunar landing." }
  ]
}
```

Locked by static/data/schemas/fleet-overlay.schema.json. The base file (static/data/fleet/<category>/<id>.json per ADR-052) holds everything language-neutral: id, category, agency, country, manufacturer, dates, status, era, epoch, specs values, linked_missions, linked_sites, flights structure (mission_ids + patch paths + crew names + crew roles + crew countries), credit, links.
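As an illustration of the split between overlay and base file, the overlay shape can be mirrored in TypeScript. This is a minimal sketch, not the project's code: the interface names, the `unexpectedOverlayKeys` helper, and the assumption that every field is required are all hypothetical. The authoritative contract remains static/data/schemas/fleet-overlay.schema.json.

```typescript
// Hypothetical TypeScript mirror of the overlay example above.
// Field optionality is an assumption; the JSON Schema is the real contract.
interface FleetFlightOverlay {
  mission_id: string;
  flight_designation_display: string;
  notes: string;
}

interface FleetOverlay {
  id: string;
  name: string;
  best_known_for: string;
  specs_labels: Record<string, string>;
  flights: FleetFlightOverlay[];
}

// Top-level keys an overlay is allowed to carry (taken from the example above).
const OVERLAY_KEYS: readonly string[] = ["id", "name", "best_known_for", "specs_labels", "flights"];

// Returns top-level keys that do not belong in an overlay, for example
// language-neutral fields such as "agency" that live in the base file.
function unexpectedOverlayKeys(candidate: object): string[] {
  return Object.keys(candidate).filter((key) => OVERLAY_KEYS.indexOf(key) === -1);
}

const saturnV: FleetOverlay = {
  id: "saturn-v",
  name: "Saturn V",
  best_known_for: "Carried every crewed Apollo lunar mission.",
  specs_labels: { height_m: "Height (m)", stages: "Stages" },
  flights: [
    { mission_id: "apollo11", flight_designation_display: "AS-506 · Apollo 11", notes: "First crewed lunar landing." },
  ],
};
```

A check like this would flag a base-file field that drifted into an overlay, which is the kind of mistake the schema lock is meant to prevent.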

What is NOT translated:

specs values (units are universal; "110.6 m" reads identically in every locale).
agency / country / manufacturer — these surface as flag chips + agency badges, not free-text fields.
crew[].name — proper nouns; we do not transliterate names (Neil Armstrong is "Neil Armstrong" on /fleet?id=apollo-csm-block-ii&locale=ja).
crew[].role — partially translated: roles like "Commander" / "Pilot" / "Mission Specialist" come from the Paraglide-js UI strings catalogue (per ADR-017), not from per-entry overlays.
links[] URLs — link label strings translate via ADR-051's locale fallback chain, not via the fleet overlay.

Pipeline (closes OQ-5)

Same as Tiangong rollout per ADR-033 — three phases:

en-US authored by hand. The 137 en-US overlays ship as the canonical editorial truth. Every other locale derives from this.
Wave 1 + 2 locales (es, fr, de, pt-BR, it, nl, zh-CN, ja, ko, hi, ar, ru) translated by argos-translate offline NMT in batch per ADR-033. argos-translate is the explicit fallback when an LLM round-trip is unavailable or undesirable — it ships free, runs locally, and produces consistent technical translations for the specs-heavy fleet vocabulary. Output written by scripts/wave23/apply-translations.ts.
sr-Cyrl authored manually per ADR-043 — argos-translate does not ship a Cyrillic Serbian model, and the Latin → Cyrillic transliteration is not mechanical for the technical vocabulary. The 137 sr-Cyrl overlays are hand-authored against the en-US source. Same pattern as Tiangong sr-Cyrl rollout.

scripts/wave23/ toolchain (catalog → maps → apply-translations) is reused unchanged from Tiangong; fleet just passed a different content surface through the same pipe.
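The three phases amount to a routing rule from locale to authoring method. The sketch below restates that rule under the locale lists given above; the `Pipeline` type and `pipelineFor` function are hypothetical names for illustration and are not part of scripts/wave23/.

```typescript
// Hypothetical routing of each locale to its authoring method, per the three phases above.
type Pipeline = "hand-authored" | "argos-translate";

const WAVE_1: readonly string[] = ["es", "fr", "de", "pt-BR", "it", "nl"];
const WAVE_2: readonly string[] = ["zh-CN", "ja", "ko", "hi", "ar", "ru"];
// en-US is the canonical source; sr-Cyrl is hand-authored per ADR-043.
const HAND_AUTHORED: readonly string[] = ["en-US", "sr-Cyrl"];

function pipelineFor(locale: string): Pipeline {
  if (HAND_AUTHORED.indexOf(locale) !== -1) return "hand-authored";
  if (WAVE_1.indexOf(locale) !== -1 || WAVE_2.indexOf(locale) !== -1) return "argos-translate";
  throw new Error(`locale ${locale} is not in the 14-locale list`);
}

const ALL_LOCALES: string[] = HAND_AUTHORED.concat(WAVE_1, WAVE_2);
```

Under these lists, 12 of the 14 locales go through the batch tool and 2 are written by hand.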

Locale fallback (closes OQ-5)

The data client (src/lib/data.ts) applies the standard ADR-017 shallow-merge:

Fetch base file static/data/fleet/<category>/<id>.json (always present).
Fetch overlay static/data/i18n/<locale>/fleet/<category>/<id>.json if present.
Shallow-merge overlay over base — overlay wins for every field it carries.
If a non-en-US overlay is missing, merge the en-US overlay instead (rather than rendering the base file alone), so the editorial fields are still populated inside the translated UI strings around the entry.

This chain means the corpus degrades gracefully: a missing Korean overlay shows English text on a Korean-UI page, which is strictly better than rendering nothing or rendering a Romanised stub.
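The fallback chain can be sketched as a small function. This is an illustration under stated assumptions, not the code in src/lib/data.ts: the `ReadJson` reader, `overlayPath`, and `loadFleetEntry` are hypothetical names, the reader is synchronous to keep the merge order easy to follow (a real client would presumably fetch asynchronously), and the `launchers` category in the usage example is made up.

```typescript
type Json = Record<string, unknown>;
// Hypothetical reader: returns parsed JSON for a path, or undefined when the file is absent.
type ReadJson = (path: string) => Json | undefined;

const DEFAULT_LOCALE = "en-US";

function overlayPath(locale: string, category: string, id: string): string {
  return `static/data/i18n/${locale}/fleet/${category}/${id}.json`;
}

function loadFleetEntry(read: ReadJson, locale: string, category: string, id: string): Json {
  const base = read(`static/data/fleet/${category}/${id}.json`);
  if (base === undefined) throw new Error(`missing base file for ${category}/${id}`);
  // Prefer the requested locale, then en-US. The empty object is a last resort
  // for the case where even the en-US overlay is absent.
  const overlay =
    read(overlayPath(locale, category, id)) ??
    read(overlayPath(DEFAULT_LOCALE, category, id)) ??
    {};
  // Shallow merge: the overlay wins for every top-level field it carries.
  return { ...base, ...overlay };
}

// Usage with an in-memory file table: no Korean overlay exists for this entry.
const files: Record<string, Json> = {
  "static/data/fleet/launchers/saturn-v.json": { id: "saturn-v", agency: "NASA" },
  "static/data/i18n/en-US/fleet/launchers/saturn-v.json": { id: "saturn-v", name: "Saturn V" },
  "static/data/i18n/ja/fleet/launchers/saturn-v.json": { id: "saturn-v", name: "サターンV" },
};
const readFromTable: ReadJson = (path) => files[path];
const koreanEntry = loadFleetEntry(readFromTable, "ko", "launchers", "saturn-v");
```

One thing the sketch makes visible: a strictly shallow merge replaces a top-level array such as `flights` wholesale. How the real client reconciles overlay `flights` with the base `flights` structure is not stated in this ADR.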

What ships at v0.6.0

en-US: 137/137 (100 %) — full parity, hand-authored.
Wave 1 (es, fr, de, pt-BR, it, nl): 137/137 × 6 = 822 files — argos-translate batch, no manual review pass yet.
Wave 2 (zh-CN, ja, ko, hi, ar, ru): 137/137 × 6 = 822 files — argos-translate batch.
sr-Cyrl: 137/137 — manual authoring (per ADR-043).

Total: 1,918 overlay files committed to source. Quality varies by locale: en-US is the editorial truth; Wave 1 + 2 are post-edit pending; sr-Cyrl is hand-checked.
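The file counts above can be re-derived with a few lines of arithmetic. The sketch below only restates the numbers in this section; `overlayFileCounts` and the group labels are hypothetical names.

```typescript
// Re-derives the per-group and total overlay file counts listed above.
const ENTRIES = 137;
const LOCALES_PER_GROUP: Record<string, number> = {
  "en-US": 1,
  "Wave 1": 6,
  "Wave 2": 6,
  "sr-Cyrl": 1,
};

function overlayFileCounts(
  entries: number,
  localesPerGroup: Record<string, number>,
): { perGroup: Record<string, number>; total: number } {
  const perGroup: Record<string, number> = {};
  let total = 0;
  for (const group of Object.keys(localesPerGroup)) {
    perGroup[group] = entries * localesPerGroup[group];
    total += perGroup[group];
  }
  return { perGroup, total };
}

const counts = overlayFileCounts(ENTRIES, LOCALES_PER_GROUP);
// counts.perGroup["Wave 1"] is 822 and counts.total is 1918.
```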

Locale switching honours the orrery_locale cookie per ADR-057.

Rationale

Five-field overlay surface keeps the per-entry translation cost minimal (~150 words per entry) while still translating everything the user actually reads. Names, dates, specs, and IDs need no translation.
argos-translate over LLM for batch translation: free, deterministic, offline, and ADR-033 already committed to this fallback. LLMs for spot-fix only.
Hand-authored sr-Cyrl is the only path; pretending argos handles Cyrillic Serbian would produce mis-script output (Latin transliteration on a Cyrillic page).
en-US fallback (not base file) keeps the panel populated: the base file holds only language-neutral fields, so a Korean user with a missing Korean overlay sees the en-US editorial text inside Korean UI strings, rather than a panel whose name and description fields are empty.
No crew-name translation is a deliberate honesty rule: Wernher von Braun, Yuri Gagarin, Liu Yang are not transliteration targets; the user sees the name the historical record uses. Roles are translated because they are role labels, not personal identifiers.

Alternatives considered

One overlay file for all entries per locale (rejected) — would have made PRs ugly and made per-entry translation review impossible to scope. Per-entry files match the base-file shape (ADR-017 standard).
Crowd-sourced translations (rejected for v0.6) — quality control overhead exceeds the budget for a one-person curator; reconsider post-1.0 if community contribution lands.
Ship en-US only and rely on browser auto-translate (rejected) — violates the "i18n from the start" constraint; browser MT is worse than argos for technical vocabulary; defeats the purpose of having a translated UI shell.
LLM round-trip for every overlay (rejected for batch) — non-deterministic, costly at 1,918 × ~150 words, and ADR-033 already locks argos as the batch tool.

Consequences

Positive

100 % overlay parity across 14 locales at v0.6.0 ship — fleet does not lag any other route on i18n coverage.
argos-translate pipeline is reusable: any future content surface of comparable size (fleet expansion, science encyclopedia, surface hotspot LOD content per RFC-017) routes through the same toolchain.
Locale fallback is graceful: a missing overlay never shows as a broken or empty panel.

Negative

Wave 1 + 2 argos output has not had a manual review pass; translation quality is "MT-first-pass good" — adequate for technical specs, occasionally awkward for editorial prose like best_known_for. Post-edit is tracked in docs/wip/fleet-translation-review.md.
sr-Cyrl is the bottleneck for any future fleet expansion — every new entry needs hand-authored Cyrillic, no shortcut.
1,918 files in git are visible in git status after every locale-overlay rebuild; cleanup tooling lives in scripts/wave23/ but the noise is real.
The "names not translated" rule occasionally produces a mixed-script line in CJK locales when a Russian name is romanised in en-US but appears next to Hiragana/Hangul body text. Acceptable per the editorial-honesty rationale, but visually unusual.

Implementation notes

Per-entry overlay path: static/data/i18n/<locale>/fleet/<category>/<id>.json. Schema: static/data/schemas/fleet-overlay.schema.json.
Batch pipeline: scripts/wave23/catalog.ts → scripts/wave23/maps.ts → scripts/wave23/apply-translations.ts (catalog → per-locale maps → apply translations to the JSON overlays). The invocation sequence is tracked locally in the maintainer's Claude Code memory so it can be repeated across releases.
Locale-override cookie: orrery_locale (ADR-057). UI exposes locale switcher in the top nav.
Fallback chain validation: handled by scripts/validate-data.ts — every base entry must have an en-US overlay; non-English overlays are optional but warned-on-missing.
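The coverage rule in the last note (a missing en-US overlay is an error, a missing non-English overlay is a warning) can be sketched as follows. This is an illustration only and does not reproduce scripts/validate-data.ts: `Finding`, `checkOverlayCoverage`, the `hasOverlay` callback, and the `soyuz` id in the usage example are all hypothetical.

```typescript
interface Finding {
  level: "error" | "warning";
  locale: string;
  id: string;
}

// Reports every (entry, locale) pair with no overlay. A missing en-US overlay
// is an error; a missing overlay in any other locale is only a warning.
function checkOverlayCoverage(
  ids: string[],
  locales: string[],
  hasOverlay: (locale: string, id: string) => boolean,
): Finding[] {
  const findings: Finding[] = [];
  for (const id of ids) {
    for (const locale of locales) {
      if (hasOverlay(locale, id)) continue;
      findings.push({ level: locale === "en-US" ? "error" : "warning", locale, id });
    }
  }
  return findings;
}

// Usage: "saturn-v" lacks a Korean overlay, "soyuz" lacks the en-US one.
const present: Record<string, boolean> = { "en-US/saturn-v": true, "ko/soyuz": true };
const findings = checkOverlayCoverage(
  ["saturn-v", "soyuz"],
  ["en-US", "ko"],
  (locale, id) => present[`${locale}/${id}`] === true,
);
```

A validator built this way could fail the build on any finding with level "error" while letting warnings through, which matches the "optional but warned-on-missing" wording.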

Related

ADR-017 — Paraglide-js i18n + locale overlay architecture (the parent pattern).
ADR-031 — i18n language list and rollout waves.
ADR-033 — Translation workflow: LLM-only first-pass (argos-translate is the explicit batch fallback this ADR exercises).
ADR-043 — Serbian Cyrillic font gate for sr-Cyrl.
ADR-044 — CJK font strategy for Wave 2 locales.
ADR-045 — RTL strategy for Arabic locale.
ADR-052 — Fleet schema + bidirectional cross-reference contract.
ADR-053 — Fleet imagery sourcing.
ADR-057 — Narrow exception to "no client storage": one functional cookie for locale override.
RFC-016 — Spaceflight Fleet · architecture, schema, and dataset boundaries.
PRD-012 — Spaceflight Fleet product spec.
