RFC-010 — Translation & Internationalisation Strategy

Orrery · Closed RFC · v0.1 · May 2026

Status: Closed (2026-05-04)
Author: product
Closed by: ADR-031 (language list + waves), ADR-032 (font & script strategy, Wave 1; CJK + RTL deferred), ADR-033 (LLM-only translation workflow)
Slice gate: v2.0. The i18n architecture is already in place per ADR-017; this RFC governed language selection and rollout.

Why this is an RFC · ADR-017 locked the i18n architecture. Which languages to ship and in what order, how to handle non-Latin scripts (CJK fonts, Arabic RTL, Devanagari), and what translation workflow to adopt are three independent questions with real trade-offs (reach vs implementation cost; bundle size vs typographic identity; quality vs scalability). Each has multiple defensible answers, hence the RFC.

Context

The i18n architecture was established in Slice 6 (v1.0) with the following constraint locked in:

"Languages other than English (i18n architecture is in place — add translations without code changes)"

This means translation strings are already externalised, locale switching is implemented, and no code changes are required to add a new language — only translation files.
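As a rough illustration of what "only translation files" implies, the sketch below resolves a string from per-language tables. The table shape, the `t` helper, the keys, and the fallback order are all assumptions for illustration; ADR-017 defines the actual mechanism.

```typescript
// Hypothetical in-memory stand-in for the per-language translation files.
type Messages = Record<string, string>;

const messages: Record<string, Messages> = {
  en: { "nav.missions": "Missions", "nav.settings": "Settings" },
  es: { "nav.missions": "Misiones" }, // "nav.settings" not yet translated
};

// Resolve a key for a locale. Falls back to English, then to the key itself.
// This fallback order is an assumption, not something the RFC specifies.
function t(locale: string, key: string): string {
  return messages[locale]?.[key] ?? messages["en"]?.[key] ?? key;
}
```

Under this shape, adding a language means adding one more entry to the table; the lookup code does not change.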

This RFC governs three open questions that must be resolved before v2.0 work begins:

Which languages do we ship, in which order?
How do we handle non-Latin scripts (CJK, Arabic RTL, Devanagari)?
What is the translation workflow — who translates, how do we QA?

Why these languages

The candidate list was derived by unioning two sets:

Set A — World top 5 languages by total speakers

Language	Speakers
English	1.5B (shipped)
Mandarin Chinese	1.1B
Hindi	600M
Spanish	560M
French	310M

Set B — Languages of active space-faring nations

Agencies with human spaceflight, deep-space missions, or active launch programmes, grouped by tier:

Tier	Agency	Country	Language
1 — Human spaceflight + deep space	NASA	USA	English ✅
1	Roscosmos	Russia	Russian
1	CNSA / CMSA	China	Mandarin
2 — Active launches + deep space	ISRO	India	Hindi
2	JAXA	Japan	Japanese
2	ESA (23 nations)	Europe	French, German, Italian, Spanish, Dutch, Portuguese
2	CNES	France	French
2	DLR	Germany	German
2	ASI	Italy	Italian
3 — Active satellite ops + growing	UAESA	UAE	Arabic
3	KARI	South Korea	Korean
3	AEB	Brazil	Portuguese
3	CSA	Canada	English / French
3	UKSA	UK	English
3	ISA	Israel	Hebrew
3	ASA	Australia	English

The union of both sets yields the candidate list in §Open Questions below.
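The union can be reproduced directly from the two tables above. This is a sketch only: "Mandarin Chinese" and "Mandarin" are treated as the same entry, and the variable names are hypothetical. As tabulated, the union also contains Dutch and Hebrew, which do not appear in the OQ-1 priority table.

```typescript
// Languages as listed in Set A and Set B above.
const setA = ["English", "Mandarin", "Hindi", "Spanish", "French"];
const setB = [
  "English", "Russian", "Mandarin", "Hindi", "Japanese",
  "French", "German", "Italian", "Spanish", "Dutch", "Portuguese",
  "Arabic", "Korean", "Hebrew",
];

// Union of both sets, minus English (already shipped).
const candidates = Array.from(new Set(setA.concat(setB))).filter(
  (language) => language !== "English",
);

// Languages in the OQ-1 priority table, in priority order.
const oq1Table = [
  "Spanish", "Mandarin", "French", "Japanese", "German", "Hindi",
  "Portuguese", "Arabic", "Korean", "Russian", "Italian",
];

// Union members that the priority table does not carry forward.
const notPrioritised = candidates.filter(
  (language) => oq1Table.indexOf(language) === -1,
);
```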

Open Questions

OQ-1 — Which languages ship in v2.0, and in what order?

Proposed priority order (optimises for: reach × STEM audience × ease of implementation):

Priority	Language	Script	Direction	Why
1	Spanish	Latin	LTR	Largest new reach; Latin America + Spain; easy implementation; strong STEM community
2	Mandarin (Simplified)	CJK	LTR (vertical optional)	China is a Tier-1 Mars player; 1.1B speakers; CNSA audience
3	French	Latin	LTR	ESA working language; covers France + 29 Francophone countries
4	Japanese	CJK + Hiragana/Katakana	LTR	JAXA; high engagement; audience attentive to scientific accuracy
5	German	Latin	LTR	DLR + strong STEM culture; ESA contributor
6	Hindi	Devanagari	LTR	ISRO; fastest-growing tech audience; 600M speakers
7	Portuguese (BR)	Latin	LTR	Brazil AEB; huge web gaming market
8	Arabic	Arabic	RTL	UAE space programme; growing STEM investment
9	Korean	Hangul	LTR	KARI; highly engaged tech audience
10	Russian	Cyrillic	LTR	Roscosmos heritage; large space-history community
11	Italian	Latin	LTR	ASI; ESA contributor

Alternatives to consider:

Ship all 11 at once — maximises reach; risk is uneven quality across languages
Two waves: Latin-script first, then non-Latin — reduces technical risk; delays CJK + RTL
Community-translation model — open all slots, let contributors fill them; risk is quality control

Question for this RFC: Do we ship a defined list per wave, or do we open a translation contribution model?

OQ-2 — How do we handle non-Latin scripts?

Three distinct technical problems:

2a. CJK (Mandarin, Japanese, Korean)

Requires a CJK-capable font. The current stack uses 'Space Mono' for data and 'Bebas Neue' for titles; neither covers CJK.
Options:
- Load Noto Sans SC / Noto Sans JP / Noto Sans KR from Google Fonts per locale (adds ~500KB per font)
- Use system fonts with fallback stack — font-family: 'Space Mono', 'Noto Sans SC', system-ui — zero bundle cost, inconsistent rendering
- Single Noto Sans CJK variable font — covers all three CJK locales; ~2MB; load on demand
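The per-locale and fallback-stack options both reduce to choosing a font-family value by locale. A minimal sketch, assuming hypothetical locale codes (`zh-CN`, `ja`, `ko`) and a hypothetical `fontStack` helper; the family names are the ones listed in the options above.

```typescript
// Hypothetical locale codes mapped to the Noto families named above.
const cjkFontByLocale: Record<string, string> = {
  "zh-CN": "Noto Sans SC",
  ja: "Noto Sans JP",
  ko: "Noto Sans KR",
};

// Build a CSS font-family value: brand font first, then the script-specific
// font if the locale needs one, then the generic system fallback.
function fontStack(locale: string, brandFont = "Space Mono"): string {
  const families = [brandFont];
  const cjk = cjkFontByLocale[locale];
  if (cjk !== undefined) families.push(cjk);
  return families.map((family) => `'${family}'`).concat("system-ui").join(", ");
}

const simplifiedChinese = fontStack("zh-CN"); // 'Space Mono', 'Noto Sans SC', system-ui
const spanish = fontStack("es");              // 'Space Mono', system-ui
```

Keeping the brand font first preserves the typographic identity for Latin glyphs and digits, while CJK glyphs fall through to the next family in the stack.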

2b. RTL (Arabic)

Requires dir="rtl" on <html> or per section
All flex/grid layouts need RTL mirroring (e.g. margin-inline-start instead of margin-left)
The canvas (Three.js, 2D porkchop plot) is direction-neutral — only the UI chrome changes
Options:
- CSS logical properties throughout — margin-inline, padding-block, etc. — correct approach but requires audit of all existing CSS
- RTL override stylesheet — a separate rtl.css that overrides directional rules — faster to ship, harder to maintain
- Defer Arabic until CSS logical properties audit is complete
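The CSS logical properties audit could start with a rough scan for physical declarations. The sketch below is one way to do that, not the project's tooling: `auditPhysicalProperties` and `Finding` are hypothetical names, only four properties are covered, and a regex pass is a first filter rather than a CSS parser.

```typescript
// Physical properties and their logical equivalents (standard CSS names).
const logicalEquivalent: Record<string, string> = {
  "margin-left": "margin-inline-start",
  "margin-right": "margin-inline-end",
  "padding-left": "padding-inline-start",
  "padding-right": "padding-inline-end",
};

interface Finding {
  line: number; // 1-based line number in the CSS source
  property: string;
  suggestion: string;
}

// Scan CSS text line by line and report physical declarations.
function auditPhysicalProperties(css: string): Finding[] {
  const findings: Finding[] = [];
  // The leading group avoids matching longer names such as scroll-margin-left.
  const pattern = /(?:^|[\s;{])(margin-left|margin-right|padding-left|padding-right)\s*:/g;
  css.split("\n").forEach((text, index) => {
    pattern.lastIndex = 0;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      const property = match[1];
      const suggestion = property ? logicalEquivalent[property] : undefined;
      if (property && suggestion) {
        findings.push({ line: index + 1, property, suggestion });
      }
    }
  });
  return findings;
}
```

The number of findings gives a rough sense of how large the audit is, which bears on the choice between converting everything and shipping an override stylesheet.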

2c. Devanagari (Hindi)

Standard Latin fonts do not cover Devanagari.
Noto Sans Devanagari covers it; ~400KB.
Rendering complexity is lower than CJK; no direction issue.

Question for this RFC: Do we require CSS logical properties as a prerequisite for Arabic, or do we ship an RTL override stylesheet?

OQ-3 — What is the translation workflow?

The i18n architecture uses JSON locale files (one per language). The question is how those files get populated and QA'd.
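Whichever workflow populates the files, part of the QA can be mechanical. The sketch below compares a translated file against the English source for missing keys, extra keys, and mismatched placeholders. The flat key/value shape, the `{name}` placeholder syntax, and the function names are assumptions, not taken from ADR-017.

```typescript
type LocaleFile = Record<string, string>;

interface ParityReport {
  missing: string[];             // keys in the source but not in the translation
  extra: string[];               // keys in the translation but not in the source
  placeholderMismatch: string[]; // keys whose placeholders differ
}

const hasKey = (file: LocaleFile, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(file, key);

// Extract {name}-style placeholders, sorted so order does not matter.
function placeholders(text: string): string[] {
  return (text.match(/\{[^{}]+\}/g) ?? []).slice().sort();
}

function checkParity(source: LocaleFile, target: LocaleFile): ParityReport {
  const sourceKeys = Object.keys(source);
  const missing = sourceKeys.filter((key) => !hasKey(target, key));
  const extra = Object.keys(target).filter((key) => !hasKey(source, key));
  const placeholderMismatch = sourceKeys.filter((key) => {
    if (!hasKey(target, key)) return false; // already reported as missing
    const expected = placeholders(source[key] ?? "").join(",");
    const actual = placeholders(target[key] ?? "").join(",");
    return expected !== actual;
  });
  return { missing, extra, placeholderMismatch };
}
```

A check like this catches structural problems only; it says nothing about physics terminology or naturalness, which still depend on a human or a separate review step.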

Option A — Professional translation agency

Highest quality; most expensive
Suitable for v2.0 launch languages (Spanish, French at minimum)
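Estimated cost: $0.10–0.20/word × ~2,000 words = $200–$400 per language (this treats the ~2,000 strings as roughly one word each; multi-word strings raise the cost proportionally)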
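The estimate multiplies a per-word rate by a string count, so the result depends on how many words an average string contains. A small sketch of that arithmetic, where `wordsPerString` is a hypothetical parameter and the value 5 is purely illustrative:

```typescript
// Estimate a cost range in dollars. Rates are in cents per word so the
// arithmetic stays in integers. `wordsPerString` is a hypothetical parameter.
function translationCostUsd(
  strings: number,
  wordsPerString: number,
  lowCentsPerWord = 10,
  highCentsPerWord = 20,
): { low: number; high: number } {
  const words = strings * wordsPerString;
  return {
    low: (words * lowCentsPerWord) / 100,
    high: (words * highCentsPerWord) / 100,
  };
}

const oneWordPerString = translationCostUsd(2000, 1);   // { low: 200, high: 400 }
const fiveWordsPerString = translationCostUsd(2000, 5); // { low: 1000, high: 2000 }
```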

Option B — Community / contributor model

Open a translations/ folder in the repo
Contributors submit PRs for their language
Maintainers review for technical accuracy (especially physics terminology)
Risk: inconsistent quality; physics errors in translation can mislead learners

Option C — AI-assisted + native speaker review

Use an LLM to produce a first-pass translation
Native speaker (contributor or paid reviewer) reviews for naturalness and physics accuracy
Most scalable; quality depends on reviewer

Option D — Hybrid

Professional translation for Tier-1 languages (Spanish, French, Mandarin)
Community model for remaining languages
AI-assisted first pass to reduce translator effort

Question for this RFC: Which workflow do we adopt, and does it differ by language tier?

Maintainer decisions — May 2026

Recorded ahead of formal ADR closure (see "Evidence needed to close" below).

OQ-1 — Language selection: Adopt the priority order proposed in this RFC verbatim (Spanish 1, Mandarin 2, French 3, Japanese 4, German 5, Hindi 6, Portuguese-BR 7, Arabic 8, Korean 9, Russian 10, Italian 11). v0.3.x ships Spanish only; remaining Wave 1 languages move to PLANNED but unscheduled. Wave 2 (CJK) and Wave 3 (RTL) gated on the OQ-2 follow-ups.

OQ-2 — Script handling: Wave 1 needs no new font work — Bebas Neue + Space Mono + Crimson Pro already cover the Latin-script extended characters needed for es/fr/de/pt/it. CJK font choice and RTL CSS-logical-properties audit are explicitly deferred to follow-up ADRs scheduled with their respective waves; not blocking v0.3.x.

OQ-3 — Translation workflow: LLM-only first-pass translation, no native-speaker review for v0.3.x. Quality risk on physics terminology accuracy and idiomatic naturalness is accepted. Re-evaluation triggers: first user-reported translation issue, or first contributor offering native-speaker review for a given language. Formalised in ADR-033 once written.

These decisions close the RFC into ADR-031 (language list + waves), ADR-032 (font/script strategy — Wave 1 scope), and ADR-033 (workflow). RFC moves to Closed when those three ADRs land.

What is NOT in scope for this RFC

Changes to the i18n architecture (already locked in ADR-017)
Translation of the codebase itself (comments, variable names remain in English)
Localisation of units (km vs miles — a separate concern, may warrant its own ADR)
RTL support for the canvas / Three.js scenes (canvas is direction-neutral)
Locale-specific mission data (all missions use the same data regardless of locale)

Evidence needed to close

This RFC closes when the following are answered and documented in ADRs:

Question	Evidence needed	Closes into
OQ-1: Language list + wave structure	Decision on list and phasing	ADR-0xx
OQ-2: Font strategy for CJK + RTL	Technical spike on font loading; CSS logical properties audit	ADR-0xx
OQ-3: Translation workflow	Decision on workflow per tier	ADR-0xx

References

What	Where
i18n architecture decision	`docs/adr/ADR-017.md`
Accessibility RFC (a11y patterns relevant to i18n)	`docs/rfc/RFC-005.md`
v1.0 scope (i18n deferred)	`docs/prd/PRD-006.md §deferred`
ROADMAP	`ROADMAP.md`
