00 — Introduction

The entry point for the Chemigram concept package. Read this first. Then read the rest in the order it suggests.

This is the introduction to the Chemigram concept package: the set of documents that describe what Chemigram is at the idea level, before formal product or technical specification work begins.


What this package is

A concept package is a set of documents that describes a project at the idea level — what it is, what it does, what it uses, how it works, and how it looks — before any formal requirements are written.

This package contains six numbered documents (00 through 05), each with a specific purpose:

| # | Name | Purpose |
| --- | --- | --- |
| 00 | Introduction (this document) | What the package is, reading order, glossary |
| 01 | Vision | The soul of the project. The why. |
| 02 | Project Concept | What we're building at the idea level: the loop, the sessions, the modes |
| 03 | Data and Content Catalog | What feeds the system: sources, characteristics, access |
| 04 | Technical Architecture | How to build it. All tech decisions with rationale, plus open questions for Phase 2 |
| 05 | Design System | How it looks and feels. Intentionally minimal because Chemigram v1 is an MCP server, not a UI app. |

The package is the input to formal definition work (PRDs, RFCs, ADRs, UXSes), not the output. It exists so that requirements writing has a shared concept to build on.


What this package is not

| Not a... | Because... |
| --- | --- |
| PRD | Product Requirements Documents come after. The package is the input to PRD writing, not the output. |
| RFC | Open questions are flagged in 04, but the RFCs themselves come later. |
| ADR | Decisions are described with rationale in 04, but the formal ADR documents come later. |
| Build plan | The package does not define sprints, milestones, or implementation order. It defines what is being built. |
| Lightroom replacement | See 01/"What this is not" for the full list of what the project itself is not, which is distinct from what the package is not. |

How to read this package

Documents are numbered for reading order, not writing order. Read in this order:

| Step | Document | Time | Why |
| --- | --- | --- | --- |
| 1 | 00 (this document) | 5 min | Orient yourself; learn the vocabulary |
| 2 | 01 Vision | 10 min | Understand why the project exists |
| 3 | 02 Project Concept | 30-40 min | Understand the full project at idea level |
| 4 | 03 Data Catalog | 15 min | Understand what feeds the system |
| 5 | 04 Technical Architecture | 45-60 min | Understand how it's built and what's still open |
| 6 | 05 Design System | 5 min | Understand the (intentionally minimal) design surface |

Total reading time: roughly two hours. The bulk of the package is 02 (project concept) and 04 (architecture).

If you have less time, the minimum is 00 + 01 + the first three sections of 02. That gives you the soul of the project plus the structuring metaphor (Chemigram is to photos what Claude Code is to code) and is enough to engage with anyone working on the project.

If you're returning to the package after time away: read 00 again — the glossary section is the fastest way to refresh on terminology.


Where to go after the concept package

Beyond the concept package, the project's other content includes:

| Document | Purpose |
| --- | --- |
| docs/LICENSING.md | What's MIT, what's separate (engine vs. personal vocabularies) |
| docs/CONTRIBUTING.md | Code and vocabulary contribution flows (different review processes) |
| docs/TODO.md | Research backlog, deferred items, "watch for" items |
| examples/iguana-galapagos.md | A worked Mode A session demonstrating the full loop |
| examples/phase-0-notebook.md | Hands-on validation lab notebook for Phase 0 |
| docs/briefs/ | Historical design-conversation artifacts that predate this package |

The definition documents (PRDs, RFCs, ADRs) in docs/prd/, docs/rfc/, and docs/adr/ complete the doc system. They were produced after Phase 0 closed, anchoring into the concept package and into the per-plane reference docs (PA for product, TA for tech). The full project phase plan lives in docs/IMPLEMENTATION.md.


Document naming convention

Documents in this package are numbered 00 through 05. Numbers reflect reading order; their topics remain stable across projects following this concept-package process.

When other artifacts reference these documents — RFCs, PRDs, future technical work — references in headers and links use the number-and-section path style (e.g., 04/5 for section 5 of the architecture doc), while references in body prose use topic names (e.g., "see the architecture doc").


Glossary

The vocabulary used across the package. When in doubt about what a term means, find it here.

Core concepts

Chemigram — the project itself. Named after the cameraless photographic process where an image emerges from chemical reaction on light-sensitive paper, guided but not fully controlled. The name fits because each edit emerges from a loop between photographer's intent, agent's moves, and tool's response.

Engine — the Python code that does the orchestration. Includes XMP composition, vocabulary loading, render pipeline, versioning, drawn-mask serialization, and MCP server. See 04/2.

Agent — the AI capability that drives Mode A or Mode B sessions. Per the BYOA principle, the agent is photographer-configured (Claude, GPT, etc.), not bundled with Chemigram.

Photographer — the human user of Chemigram. Always in control; always the source of intent and judgment.

Apprentice model — the framing for the photographer/agent relationship. The agent is a patient, capable apprentice who reads context, executes vocabulary, surfaces uncertainty. The photographer is the master who provides briefs, judges results, and curates accumulated context.

Modes

Mode A — the journey. Collaborative editing where photographer and agent work through one photo together, conversationally. 5-30 turns per session. The primary mode.

Mode B — the autonomous fine-tuner. Agent runs alone with a brief and evaluation criteria, branching to explore variants, self-evaluating, converging to a winner or running out of budget. Future mode, deferred to Phase 4+.

Session — one conversation between photographer and agent on one image, from start to end-of-session synthesis. Captured as a JSONL transcript in sessions/.

Vocabulary

Vocabulary — the agent's action space. A finite set of named, single-module darktable styles (.dtstyle files) that the agent composes to make edits. The bulk of the project's character lives in vocabulary.

Vocabulary primitive (or just primitive) — one entry in the vocabulary. A single-module darktable style with a name like expo_+0.5 or colorcal_underwater_recover_blue_subject.

.dtstyle — the file format of a vocabulary primitive. XML, captures one module's parameters and blend operation. Authored by photographer (or community) in darktable's GUI; loaded by Chemigram at session start.

Manifest — the JSON metadata accompanying vocabulary entries. Contains layer assignment, modules touched, tags, description, optional mask_spec (drawn-form geometry for mask-bound entries), and other engine-relevant metadata.

Vocabulary gap — when the agent needs a primitive that doesn't exist. Worked around by composing existing primitives, then logged to vocabulary_gaps.jsonl for later authoring. Gaps are content, not failure.
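
The logging step described above could look roughly like the sketch below. It is illustrative only: the function names `gap_record_line` and `append_gap_record`, the example primitive name, and every field in the record (`logged_at`, `wanted`, `workaround`) are assumptions, because the actual schema of vocabulary_gaps.jsonl is not given in this document. The only grounded detail is the format: one JSON object per line, appended over time.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def gap_record_line(wanted: str, workaround: list[str], when: datetime) -> str:
    """Serialize one gap as a single JSON line (hypothetical field names)."""
    record = {
        "logged_at": when.astimezone(timezone.utc).isoformat(),
        "wanted": wanted,          # the primitive the agent wished it had
        "workaround": workaround,  # existing primitives composed instead
    }
    # JSONL means exactly one object per line, so no pretty-printing.
    return json.dumps(record, ensure_ascii=False)


def append_gap_record(log_path: Path, line: str) -> None:
    """Append a serialized record; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")


line = gap_record_line(
    wanted="warm_highlights_soft",  # hypothetical primitive name
    workaround=["expo_+0.5"],
    when=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
print(line)
```

An append-only log fits the "gaps are content" framing: each line is a standalone note that can be reviewed later when authoring new primitives.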

Starter vocabulary — the minimal OSS vocabulary bundled with Chemigram. Generic, conservative, intended to bootstrap new users. Lives in vocabulary/starter/.

Community packs — vocabulary collections borrowed from existing community projects (Fuji sims, etc.) and redistributed with attribution. Live in vocabulary/packs/.

Personal vocabulary — a photographer's private taste, encoded as their own vocabulary entries. Not part of the OSS distribution. Loaded from a separate private repo.

Layers

L0 — darktable internals (rawprepare, demosaic, color profiles). Always-on. Not authored by anyone in the Chemigram sense.

L1 — Technical correction (lens, profiled denoise). Empty by default; opt-in per camera+lens via config.toml bindings. Pre-baked into baseline before the agent starts.

L2 — Look establishment. Either neutralizing (e.g. underwater_pelagic_blue) or look-committed (e.g. fuji_acros). Photographer-chosen, per-image, pre-baked into baseline.

L3 — Taste. The agent's vocabulary, mutable in the loop. The agent's playground.

Layer model — see 04/5 for the full model. Layers separate authorship moments, not editing moves.

Versioning

Snapshot — one content-addressed XMP state. SHA-256 hash over canonical XMP serialization. Lives in objects/.
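
As a minimal sketch of content addressing, the snippet below hashes an XMP byte string with SHA-256 and stores it under its own digest. It assumes the XMP has already been canonicalized upstream (the canonicalization rules are not described here), and the names `snapshot_id` and `store_snapshot`, along with the flat one-file-per-hash layout inside objects/, are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def snapshot_id(canonical_xmp: bytes) -> str:
    """Return the SHA-256 hex digest that names this XMP state."""
    return hashlib.sha256(canonical_xmp).hexdigest()


def store_snapshot(objects_dir: Path, canonical_xmp: bytes) -> str:
    """Write the snapshot under its own hash; identical states share a file."""
    digest = snapshot_id(canonical_xmp)
    target = objects_dir / digest  # assumed layout: objects/<sha256>
    if not target.exists():
        objects_dir.mkdir(parents=True, exist_ok=True)
        target.write_bytes(canonical_xmp)
    return digest


# Identical bytes always map to the same identifier.
state = b"<x:xmpmeta>...</x:xmpmeta>"  # placeholder bytes, not real XMP
print(snapshot_id(state) == snapshot_id(state))
```

Because the name is derived from the content, re-saving an unchanged state costs nothing and two branches that reach the same edit share one object.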

Branch — a movable ref pointing at a snapshot. Like git branches.

Tag — an immutable ref pointing at a snapshot. Used for marking final states (v1_export, instagram_crop).

HEAD — the current ref or hash the working state points at.

DAG — directed acyclic graph of snapshots, formed by the parent relationships. The full version history of an image.

Mode B exploration tree — the branching tree of variants Mode B produces during autonomous exploration. Inspectable via the versioning DAG.
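
The relationship between branches, tags, and the snapshot DAG can be sketched in memory as follows. This is a toy model under stated assumptions: the class `RefStore` and its methods are hypothetical, snapshot identifiers are shortened placeholders rather than real SHA-256 digests, and nothing here reflects how the engine persists refs on disk.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RefStore:
    parents: dict[str, tuple[str, ...]] = field(default_factory=dict)
    branches: dict[str, str] = field(default_factory=dict)  # movable refs
    tags: dict[str, str] = field(default_factory=dict)      # immutable refs

    def commit(self, branch: str, snapshot: str) -> None:
        """Record a snapshot whose parent is the branch tip, then move the tip."""
        tip = self.branches.get(branch)
        if snapshot not in self.parents:
            self.parents[snapshot] = () if tip is None else (tip,)
        self.branches[branch] = snapshot

    def fork(self, new_branch: str, from_branch: str) -> None:
        """Start a new branch at the same snapshot as an existing one."""
        self.branches[new_branch] = self.branches[from_branch]

    def tag(self, name: str, snapshot: str) -> None:
        """Tags never move: re-tagging an existing name is an error."""
        if name in self.tags:
            raise ValueError(f"tag {name!r} already exists")
        self.tags[name] = snapshot

    def ancestry(self, snapshot: str) -> list[str]:
        """Walk parent links breadth-first, newest snapshot first."""
        seen, order, queue = {snapshot}, [], deque([snapshot])
        while queue:
            current = queue.popleft()
            order.append(current)
            for parent in self.parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return order


store = RefStore()
store.commit("main", "a1")
store.commit("main", "b2")
store.fork("explore", "main")
store.commit("explore", "c3")
store.tag("v1_export", "b2")
print(store.ancestry("c3"))
```

An exploration tree like the one Mode B produces is then just several branches forked from a shared ancestor, each readable through the same ancestry walk.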

Masks

Mask — a spatial selection that restricts an effect to part of the frame. As of v1.9.0 Chemigram supports four mask sources, all serializing to bytes darktable's mask system consumes (per ADR-076): drawn, parametric, retouch, and LLM-vision-derived. (The earlier raster-PNG path was retired in v1.5.0 — darktable doesn't read external PNGs for raster masks.)

Parametric mask (RFC-024 / ADR-085, v1.9.0) — a mask defined by pixel-value conditions in blendop_params. The agent-facing surface is mask_spec.range_filter with kind ∈ {luminance, color_h, color_s, color_l}. Content-agnostic at the geometry level; intersects with drawn masks for "drawn AND parametric" composition.

Drawn mask (RFC-029 / ADR-084) — a mask defined by geometric primitives (gradient, ellipse, rectangle, path) encoded into <darktable:masks_history> and bound to plugins via blendop_params.mask_id. Constructed inline from mask_spec.dt_form + dt_params at apply time — see mask-shapes-from-words.md for the spatial-English-to-parameter mapping.

mask_spec — a vocabulary-entry field (or apply-time argument) carrying mask geometry. Shape: optional dt_form + dt_params (drawn) + optional range_filter (parametric). Three valid combinations: drawn only, parametric only, both AND-composed. See 04/6.2.
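
The three valid combinations can be checked with a small classifier like the one below. It is a sketch, not the engine's validator: the function name `classify_mask_spec` is hypothetical, the rule that dt_form and dt_params must appear together is an assumption, and the spellings of the drawn form names are taken from the glossary wording rather than from a schema. The contents of dt_params are left empty because their structure is not described here.

```python
RANGE_KINDS = {"luminance", "color_h", "color_s", "color_l"}
DRAWN_FORMS = {"gradient", "ellipse", "rectangle", "path"}  # assumed spellings


def classify_mask_spec(spec: dict) -> str:
    """Return which of the three valid combinations a mask_spec represents."""
    has_form = "dt_form" in spec
    has_params = "dt_params" in spec
    has_range = "range_filter" in spec

    # Assumption: a drawn mask needs both its form and its parameters.
    if has_form != has_params:
        raise ValueError("dt_form and dt_params must be given together")
    if has_form and spec["dt_form"] not in DRAWN_FORMS:
        raise ValueError(f"unknown dt_form: {spec['dt_form']!r}")
    if has_range and spec["range_filter"].get("kind") not in RANGE_KINDS:
        raise ValueError("range_filter.kind must be one of " + ", ".join(sorted(RANGE_KINDS)))

    if has_form and has_range:
        return "drawn AND parametric"
    if has_form:
        return "drawn only"
    if has_range:
        return "parametric only"
    raise ValueError("mask_spec needs dt_form/dt_params, range_filter, or both")


print(classify_mask_spec({"dt_form": "ellipse", "dt_params": {}}))
print(classify_mask_spec({"range_filter": {"kind": "luminance"}}))
print(classify_mask_spec({
    "dt_form": "gradient",
    "dt_params": {},
    "range_filter": {"kind": "color_h"},
}))
```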

Retouch / spot heal/clone (RFC-025 / ADR-087, v1.9.0) — pixel-replacement primitive class via the apply_spot MCP tool, sister to apply_primitive. HEAL + CLONE algorithms on circle geometry; single form per call. AI auto-spot detection deferred to RFC-030.

Content-aware masking (RFC-026 / ADR-086, v1.9.0) — coarse subject masks via the chat-client's vision-capable LLM (Claude.ai / ChatGPT / Claude Code). Conversation-native, zero deployment. Pixel-precise silhouettes and depth masks deferred to RFC-030's deployed sibling-provider scaffolding.

Context files

taste.md — the photographer's externalized taste. Lives at ~/.chemigram/taste.md. Read by the agent at every session start. Curated through propose-and-confirm over months.

brief.md — what a specific image is for. Lives at <image_id>/brief.md. Written at session start, sometimes updated mid-session.

notes.md — what we've learned about a specific image. Lives at <image_id>/notes.md. Accumulates across sessions on the same image.

config.toml — user configuration. Vocabulary sources, L1/L2 binding rules, storage paths.

Disciplines

Agent is the only writer — the photographer reads previews and gives feedback; the agent is the sole mutator of edit state. See 04/1.1.

darktable does the photography, Chemigram does the loop — every image-processing capability comes from darktable. Chemigram contributes orchestration, vocabulary, agent loop, versioning, session capture. See 04/1.2.

BYOA (Bring Your Own AI) — Chemigram doesn't ship AI capabilities; it integrates them via MCP. The photo agent itself is photographer-configured (Claude, GPT, etc.); future evaluators and content-aware maskers are sibling projects, not bundled in core. See 04/1.3.

Engineering terms

MCP — Model Context Protocol. Anthropic's protocol for agent tool-calling. Chemigram exposes its capabilities as an MCP server.

darktable-cli — darktable's headless command-line interface. Runs without GUI. Chemigram's render pipeline invokes it as a subprocess.
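
A subprocess call of this kind might be assembled as in the sketch below. The positional order (input file, XMP sidecar, output file) and the `--width` / `--height` options follow darktable-cli's documented usage; the function names `build_render_command` and `render`, the file names, and the choice of options are assumptions, and the engine's real invocation is not shown in this document.

```python
import subprocess
from pathlib import Path
from typing import Optional


def build_render_command(
    raw: Path,
    xmp: Path,
    output: Path,
    width: Optional[int] = None,
    height: Optional[int] = None,
    executable: str = "darktable-cli",
) -> list[str]:
    """Build the argument list without running anything."""
    cmd = [executable, str(raw), str(xmp), str(output)]
    if width is not None:
        cmd += ["--width", str(width)]
    if height is not None:
        cmd += ["--height", str(height)]
    return cmd


def render(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run the render; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# Only the command is built here; calling render() requires darktable installed.
preview_cmd = build_render_command(
    Path("photo.raf"), Path("photo.xmp"), Path("preview.jpg"), width=1024
)
print(preview_cmd)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with paths that contain spaces.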

XMP — Extensible Metadata Platform. The RDF/XML sidecar format darktable uses to store edit state. Each <rdf:li> in <darktable:history> is one module application.
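
To make the "one <rdf:li> per module application" idea concrete, the sketch below lists the history entries of a tiny hand-written sidecar using the standard library. The sample document is heavily simplified, and the darktable namespace URI and the attribute names `operation` and `multi_priority` are stated as assumptions about the sidecar layout; real files carry more attributes, including the opaque parameter blobs. The function name `history_keys` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "darktable": "http://darktable.sf.net/",  # assumed namespace URI
}

# Simplified sample; not a complete darktable sidecar.
SAMPLE_XMP = """\
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:darktable="http://darktable.sf.net/">
      <darktable:history>
        <rdf:Seq>
          <rdf:li darktable:operation="exposure" darktable:multi_priority="0"/>
          <rdf:li darktable:operation="colorequal" darktable:multi_priority="0"/>
        </rdf:Seq>
      </darktable:history>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
"""


def history_keys(xmp_text: str) -> list[tuple[str, int]]:
    """Return (operation, multi_priority) for each history item, in order."""
    root = ET.fromstring(xmp_text)
    attr = "{%s}" % NS["darktable"]  # namespaced attributes use Clark notation
    keys = []
    for item in root.iterfind(".//darktable:history//rdf:li", NS):
        operation = item.get(attr + "operation")
        priority = int(item.get(attr + "multi_priority", "0"))
        keys.append((operation, priority))
    return keys


print(history_keys(SAMPLE_XMP))
```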

op_params / blendop_params — hex-encoded C structs in XMP that hold module parameters and blend operation parameters. Treated as opaque blobs by Chemigram; copied verbatim from .dtstyle files.

SET semantics — when the agent applies a vocabulary primitive, it replaces any existing entry with matching (operation, multi_priority) rather than accumulating. Idempotent action space.
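
The replace-not-accumulate rule can be sketched as a pure function over a list of history entries. This is a minimal illustration, not the engine's code: the name `set_entry`, the dict shape, and the placeholder `params` values are hypothetical, and keeping the replaced entry at its original position (rather than moving it to the end) is an assumption.

```python
def set_entry(history: list[dict], entry: dict) -> list[dict]:
    """Return a new history where `entry` replaces any entry with the same key."""
    key = (entry["operation"], entry["multi_priority"])
    result = []
    replaced = False
    for existing in history:
        if (existing["operation"], existing["multi_priority"]) == key:
            if not replaced:
                result.append(entry)  # swap in place, keep position
                replaced = True
            # any further duplicates of the key are dropped
        else:
            result.append(existing)
    if not replaced:
        result.append(entry)  # no match: behaves like a plain add
    return result


history = [{"operation": "exposure", "multi_priority": 0, "params": "old"}]
brighter = {"operation": "exposure", "multi_priority": 0, "params": "new"}

once = set_entry(history, brighter)
twice = set_entry(once, brighter)
print(once == twice)  # applying the same primitive again changes nothing
```

Idempotence is what makes the action space safe to retry: the agent can reapply a primitive without worrying that the effect stacks.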

Pipeline stage — one step in the render pipeline, conforming to the PipelineStage protocol. v1 has one stage (darktable-cli); the abstraction admits future stages (external CLIs, GenAI tools, custom processors).
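
A structural protocol of this kind could be sketched with `typing.Protocol`, as below. Only the name PipelineStage comes from this document; the `name` attribute, the `run` signature, the bytes-in/bytes-out shape, and both toy stages are assumptions made for illustration. A real stage would more plausibly operate on file paths and invoke an external tool.

```python
from typing import Iterable, Protocol


class PipelineStage(Protocol):
    """Hypothetical shape: anything with a name and a run() method qualifies."""

    name: str

    def run(self, image: bytes) -> bytes: ...


class StubRenderStage:
    """Stand-in for the darktable-cli stage; it only marks the data."""

    name = "darktable-cli"

    def run(self, image: bytes) -> bytes:
        return image + b"|rendered"


class StubWatermarkStage:
    """Stand-in for a future external stage."""

    name = "watermark"

    def run(self, image: bytes) -> bytes:
        return image + b"|watermarked"


def run_pipeline(stages: Iterable[PipelineStage], image: bytes) -> bytes:
    """Feed each stage's output into the next, in order."""
    for stage in stages:
        image = stage.run(image)
    return image


result = run_pipeline([StubRenderStage(), StubWatermarkStage()], b"raw")
print(result)
```

Because the protocol is structural, new stages do not need to inherit from anything; matching the expected attributes is enough for a type checker to accept them.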

EXIF auto-binding — Chemigram's automatic resolution of L1 and L2 bindings from a raw's EXIF data. See 04/9.

modversion — darktable's per-module version number. op_params encoding is modversion-specific. Vocabulary needs re-capture when darktable bumps a module's modversion.

multi_priority — darktable's mechanism for having multiple instances of the same module in the history. Chemigram uses (operation, multi_priority) as the SET key.

Project structure

Photo project — one image, structured the way a code project is. Per-image directory with raw, briefs, notes, snapshots, sessions, masks. See 02/4.

Per-image repo — synonym for photo project. The structure is content-addressed and ref-based, mirroring git.

Concept package — this set of six documents (00 through 05). The Phase 1 deliverable.

Brief (in process-guide sense) — the photographer's intent statement for an image. Distinct from "concept package" or the historical design-conversation artifacts in briefs/.

Briefs folder — the docs/briefs/ directory, holding the original design-conversation documents from before the concept package was formalized. Historical artifacts.


How this package was produced

The Chemigram concept package was produced through a multi-session design conversation between Marko (the photographer who initiated the project) and an AI assistant. The original conversation artifacts are preserved in docs/briefs/ as historical record.

The transition to the formal numbered structure (this package) happened after the briefs accumulated enough thinking to justify formal organization. The structure follows the Concept Package Process Guide v2 (an external methodology document).

The two main deliverables, 02 and 04, draw heavily on the briefs. The briefs remain available because the formal package abstracts away the conversational reasoning that produced the architecture; future readers who want to understand why a decision was made may find the briefs more illuminating than the package's distilled statements.

If you find a contradiction between the package and a brief, the package is correct (and the brief reflects an earlier moment of thinking).


Status

| Aspect | Status |
| --- | --- |
| Concept package complete | Yes (v1.0) |
| Phase 0 validation done | ✅ Closed green (8 findings logged) |
| Doc system populated | ✅ Complete (PRDs, RFCs, ADRs in docs/prd, docs/rfc, docs/adr) |
| Phase 1 complete | ✅ Yes: Slices 1–6 shipped (v0.1.0 through v1.0.0). Issues #1–#29 closed; thirteen RFCs closed (ADR-050..061). |
| Phase 1.1–1.5 complete | ✅ Yes: comprehensive validation, engine unblock, CLI, expressive-baseline authoring, mask architecture cleanup (ADR-062..076). |
| Phase 1.6 complete | ✅ Yes (v1.6.0): parameterized vocabulary (RFC-021 → ADR-077..080). 18 parameterized entries across 11 modules. |
| Phase 1.7 complete | ✅ Yes (v1.7.0): Tier 2 expansion + Lightroom-parity Bucket A (RFC-022 → ADR-081). |
| Phase 1.8 complete | ✅ Yes (v1.8.0): HSL via colorequal (RFC-023 → ADR-083); Lightroom daily-use parity 51/52 (98%). |
| Phase 1.9 complete | ✅ Yes (v1.9.0): mask + retouch architecture trilogy (RFC-024/025/026/029 → ADR-084..087). 83 vocabulary entries, 1811 tests, apply_spot MCP tool, compositional masks (drawn + parametric range_filter + LLM-vision provider + retouch). |
| Phase 2 (vocab maturation) | In progress; use-driven, not slice-and-gate. |

For the canonical phase plan and history, see docs/IMPLEMENTATION.md.

Phase 1 closed at v1.0.0: the engine ships a working agent loop end-to-end. v1.1–v1.5 hardened the engine, added the CLI, authored the expressive-baseline pack, and cleaned up the mask architecture (drawn-only, ADR-076). v1.6–v1.8 closed Lightroom daily-use parity (51/52 controls). v1.9.0 closed the mask + retouch architecture trilogy: spatial masks (RFC-029 / ADR-084), parametric range filters (RFC-024 / ADR-085), LLM-vision-as-provider for content-derived masks (RFC-026 / ADR-086), and spot heal/clone (RFC-025 / ADR-087). The deployed sibling-provider precision tier (RFC-030) is drafted and deferred.

Phase 2 (vocabulary maturation) is in progress. It's a use-phase, not a build-phase: the photographer runs real Mode A sessions, the agent flags gaps via log_vocabulary_gap, and a vocabulary-authoring evening per month grows the personal vocabulary pack. See docs/IMPLEMENTATION.md Phase 2 section for the work shape and vocabulary/starter/README.md for the personal-vocabulary growth pattern.


00 · Introduction · v1.0 · Written last after 01 through 05