RFC-015 — LEARN-link rollout
Status · Closed (planning carried into ADR-051 + Milestones L-A through L-E)
Author · product
Closes into · ADR-051 (outbound learn-link stewardship)
Slice gate · v0.5.0
Why this is an RFC · The audit in docs/research/learn-link-audit.md shows that 76 % of outbound LEARN-tab links go to NASA + Wikipedia, and only 4 % to non-US agency sites. Fixing that is more than a data change: it is a contract decision (which sources are allowed, which language wins for a given user, what gets shown on the public page) and a pipeline decision (when to fail-close, where to integrate the link-checker). RFC-015 is the planning record; ADR-051 is the lock.
Context
ADR-046 (May 2026) corrected the imagery side: agency-first build-time fetching with NASA Images API + curated Wikimedia as fallbacks. ADR-047 (also May 2026) added per-image and per-text-fragment provenance with a fail-closed pipeline and a public /credits page.
Outbound LEARN-tab links — visible on every primary detail panel — were not covered. The audit shows the same drift pattern that motivated ADR-046:
- 340 outbound link references inventoried (296 unique (entity_id, url) pairs).
- ~76 % to NASA-domain + Wikipedia.
- ~4 % to non-US agency sites.
- Zero native-language links to operator portals despite their existence.
- No per-link provenance, no hreflang, no rel="external", no link-checker.
The user-visible cost is real: a researcher landing on a Chang'e or Tianwen-1 detail panel today is sent to Wikipedia first, when the operating agency publishes its own English and Chinese pages. The project is downstream of the world's space agencies; we should send users to them, not past them.
Open Questions (resolved)
OQ-1 — One contract or two?
Question. Should outbound link stewardship live as an extension to ADR-047, or as its own ADR?
Decision. Own ADR (ADR-051). ADR-047's manifest model is the right pattern, but the data shape, the fair-player rendering rules (rel attributes, hreflang), the locale fallback chain, and the public surface (/library) are distinct enough from imagery that a self-contained ADR is clearer. ADR-051 cites ADR-047 as the parent contract.
OQ-2 — Allow Wikipedia-only entities?
Question. Some entities have no operator page (genuinely or because the operator is defunct). Allow Wikipedia-only?
Decision. No. Every entity must have at least two distinct sources. When the operator is defunct (e.g. some Soviet missions), use the most authoritative secondary source (NSSDC, Lavochkin Association, scientific publisher). Wikipedia may stay as one of the links — never as the only one.
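The "never as the only one" rule lends itself to a mechanical check. The following is a minimal sketch, not the project's validator: the LearnLink shape, the function names, and the host-matching heuristic are all assumptions made for illustration, and it covers only the Wikipedia-only part of the decision.

```typescript
// Hypothetical link shape; the RFC does not specify the manifest fields.
interface LearnLink {
  entityId: string;
  url: string;
}

// Assumption: a link counts as Wikipedia when its host is wikipedia.org
// or any subdomain of it (en.wikipedia.org, es.wikipedia.org, ...).
function isWikipedia(url: string): boolean {
  const match = /^https?:\/\/([^\/?#]+)/i.exec(url);
  if (!match) return false;
  const host = (match[1] ?? "").toLowerCase();
  return /(^|\.)wikipedia\.org$/.test(host);
}

// Returns the sorted ids of entities whose links are all Wikipedia links.
function findWikipediaOnlyEntities(links: readonly LearnLink[]): string[] {
  // entityId -> true once a non-Wikipedia link has been seen for it.
  const hasOtherSource: Record<string, boolean> = {};
  for (const link of links) {
    const seen = hasOtherSource[link.entityId] === true;
    hasOtherSource[link.entityId] = seen || !isWikipedia(link.url);
  }
  return Object.keys(hasOtherSource)
    .filter((id) => !hasOtherSource[id])
    .sort();
}
```

A validator built along these lines would fail when the returned list is non-empty, which matches the L-C gate of zero Wikipedia-only entities.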
OQ-3 — Native-language priority versus UI-locale priority?
Question. When a user is browsing in es and the entity is a CNSA mission, should the first link be Spanish Wikipedia (matches UI locale) or the Chinese CNSA page (matches operator)?
Decision. UI-locale wins, then operator-native, then English, then multi-lingual. The reasoning: a Spanish-speaking user landing on Tianwen-1 still benefits more from the Chinese CNSA page than from English Wikipedia, but if a Spanish-language source from a credible publisher exists, it wins. The fallback is additive: native-language operator pages remain visible to every user, only their order changes.
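The ordering can be sketched as a stable sort over a four-step rank. This is an illustration under assumptions: the LocalizedLink shape, the function names, the "mul" tag for multilingual pages, and exact matching on pre-normalised language tags are not taken from ADR-051.

```typescript
// Hypothetical shape; `lang` is assumed to be a normalised primary language
// tag such as "es", "zh", "en", or "mul" for multilingual pages.
interface LocalizedLink {
  url: string;
  lang: string;
}

// Lower rank sorts first: UI locale, then operator-native, then English,
// then everything else (including multilingual pages).
function rank(link: LocalizedLink, uiLocale: string, operatorLang: string): number {
  if (link.lang === uiLocale) return 0;
  if (link.lang === operatorLang) return 1;
  if (link.lang === "en") return 2;
  return 3;
}

// Additive fallback: no link is dropped, only reordered. The original index
// breaks ties so that editorial order is preserved within each rank.
function orderLinks(
  links: readonly LocalizedLink[],
  uiLocale: string,
  operatorLang: string
): LocalizedLink[] {
  return links
    .map((link, index) => ({ link, index }))
    .sort(
      (a, b) =>
        rank(a.link, uiLocale, operatorLang) -
          rank(b.link, uiLocale, operatorLang) || a.index - b.index
    )
    .map((entry) => entry.link);
}
```

For a user browsing in es on a CNSA mission, orderLinks(links, "es", "zh") would place any Spanish source first, the Chinese operator page second, and English sources after that.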
OQ-4 — rel="external" opt-in?
Question. Add rel="external" to every outbound link, or only some?
Decision. Every outbound link: rel="noopener noreferrer external". The semantic value (search-engine signal, screen-reader cue) outweighs the mild bloat. We are not adding rel="dofollow" or rel="nofollow": they are ambiguous and we don't want to make an SEO statement either way.
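A minimal sketch of how a single helper could keep the rel value uniform across every panel link. The helper name, the attribute object shape, and the optional hreflang parameter are assumptions for illustration; the real wiring lives in whatever component renders the link.

```typescript
// Attribute bag for an outbound anchor (hypothetical shape).
interface OutboundLinkAttrs {
  href: string;
  rel: string;
  hreflang?: string;
}

// The rel value fixed by this decision, kept in one place.
const OUTBOUND_REL = "noopener noreferrer external";

// Builds the attributes for one outbound link. `lang` is the language of the
// target page; hreflang is only emitted when it is known.
function outboundLinkAttrs(href: string, lang?: string): OutboundLinkAttrs {
  const attrs: OutboundLinkAttrs = { href, rel: OUTBOUND_REL };
  if (lang) attrs.hreflang = lang;
  return attrs;
}
```

In a Svelte template the returned object could be spread onto the anchor element, so no call site spells out the rel string by hand.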
OQ-5 — Link-check failure policy?
Question. Should a single 404 on a deep link fail the build?
Decision. No. A 4xx/5xx on an intro or core link fails validate-data, because those are the links a user sees on a quick visit. A broken deep link only warns in the diff report: those links are third-tier reference reading and the most likely to move (academic papers being archived, agency CMSes redirecting to new portals). Redirects always warn but never fail; build-link-provenance picks up the canonical URL on the next refetch.
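The policy reduces to a small decision function over tier and HTTP status. The sketch below is illustrative only: the type names, the CheckResult shape, and the assumption that `status` is the final status after following redirects are not taken from check-learn-links.ts.

```typescript
type Tier = "intro" | "core" | "deep";
type Verdict = "ok" | "warn" | "fail";

// Hypothetical result of checking one link. `status` is assumed to be the
// final HTTP status; `redirected` is true if any redirect was followed.
interface CheckResult {
  tier: Tier;
  status: number;
  redirected: boolean;
}

function verdictFor(result: CheckResult): Verdict {
  // 4xx/5xx: fail for intro and core, warn only for deep.
  if (result.status >= 400) {
    return result.tier === "deep" ? "warn" : "fail";
  }
  // Redirects always warn but never fail, whatever the tier.
  if (result.redirected || (result.status >= 300 && result.status < 400)) {
    return "warn";
  }
  return "ok";
}

// validate-data would fail closed when any single link yields "fail".
function shouldFailValidation(results: readonly CheckResult[]): boolean {
  return results.some((r) => verdictFor(r) === "fail");
}
```

Keeping the verdict separate from the aggregate makes it straightforward to print warnings in the diff report while still failing only on intro and core breakage.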
OQ-6 — Where does the link-checker live in the build pipeline?
Question. On every PR, on every refetch, both?
Decision. Refetch only. CI does not have reliable outbound network for agency CMSes (rate limits, geo-fences). npm run fetch runs locally or on a maintainer-triggered job; npm run build and PR CI rely on validate-data reading the most recent last-link-check.md from the repo. The check is versioned alongside the manifest.
OQ-7 — How big can /library get?
Question. With ~300 links, will a single page render acceptably?
Decision. Yes for v0.5.0. The /credits page already renders ~622 image rows and is fast; ~300 link rows is comfortably under that. If the page crosses ~250 ms render time on a low-end device once L-C grows the manifest, docs/wip/learn-link-backlog.md already carries a virtualisation backlog item.
OQ-8 — Page name?
Question. What is the public page called?
Decision. /library. Owner-chosen. The existing "Mission Library" UI label on /missions is renamed to "Mission Catalog" to free up the word "library".
Rollout
Five milestones, mirroring the L-A through L-E layout used for the imagery rollout (ADR-046 / ADR-047):
- L-A: planning gate (this RFC + ADR-051 + audit doc + backlog + index updates). Blocks everything else.
- L-B: provenance manifest infrastructure (link-provenance.json + AJV schema + LinkCredit.svelte + build-link-provenance.ts + validate-data integration + i18n keys + panel wiring). Blocks L-C, L-D, and L-E.
- L-C: editorial enrichment of all non-US entities; native-language priority.
- L-D: /library route + Mission Catalog rename + footer link + library-grouping helper.
- L-E: check-learn-links.ts, chained into npm run fetch, with a fail-closed intro/core policy.
Each milestone validates green (typecheck, lint, test, validate-data) and is committed and pushed independently before the next starts.
Acceptance gates per milestone
- L-A: ADR-051 committed; this RFC committed and closed; audit doc shipped.
- L-B: every existing LEARN link has a provenance entry; validate-data green; LinkCredit.svelte renders under every panel link.
- L-C: zero entities link to Wikipedia as their only source; non-US share rises from ~4 % to ≥ 15 %.
- L-D: /library route renders all sources, grouped by source, newest first; Mission Catalog label landed in all 14 locales.
- L-E: npm run fetch chains the link-checker; planted broken intro link fails validate-data; planted broken deep link warns.
Out of scope
- Direct-permission outreach to agencies for content beyond their public sites → #46.
- NASA-partnership-credit imagery enrichment → #45.
- Tiangong Explorer (RFC-014) is unrelated despite the adjacent number.
Related
- ADR-046 — Agency-first build-time imagery sourcing.
- ADR-047 — Provenance manifests + license stewardship.
- ADR-051 — Outbound learn-link stewardship.
- Epic #51 — LEARN-link stewardship rollout.