EXP-004: The Color That Doesn't Exist Yet

Context

Build a color palette generator for my ceramics studio.

There's a fundamental honesty problem hiding in this prompt. Screen colors are RGB light. Ceramic glazes are physical chemistry. A color that looks like a warm terracotta on screen might fire to a muddy brown — or a gorgeous amber, depending on the kiln temperature, the clay body, the thickness of application, and a dozen other variables.

Any palette generator for a ceramics studio has to reckon with this gap. The question is whether the agent will.

Hypothesis

The prompt-only agent would build a generic color palette generator with ceramics-themed styling — glaze names, earthy tones — but treat it as a digital design tool. The intent-driven agent would identify the gap between screen colors and fired glazes as a domain truth that must be addressed in the UI.

Initial Intent Artifact

The intent artifact identified the "screen-to-kiln gap" as a protected value: screen colors are approximations of physical materials that change unpredictably during firing. It required a visible disclaimer in the UI — not hidden in a tooltip, but present where the user encounters the palette.

It also identified the studio context: large touch targets for clay-covered hands, a printable reference card (because you can't easily use a phone at the wheel), and organization by color harmony types with domain metadata like cone temperatures and surface finishes.

Method

Both agents received only the raw prompt. The intent-driven agent generated a structured intent artifact before implementation. Both produced single-file HTML apps and written summaries.

Observation

Both agents knew the truth

Both branches used ceramics vocabulary — glaze names, traditional techniques — and generated palettes from ceramics-relevant color ranges. Neither built a generic color picker. The domain signal in "for my ceramics studio" was strong enough for both.

Here's where it gets interesting. The prompt-only agent wrote in its summary:

"The ceramics theming might be superficial... colors are arbitrary HSL ranges, not derived from real glaze chemistry."

It *knew*. It recognized the limitation, articulated it clearly, and left it in the developer notes.

The intent-driven branch identified the same truth — and put it in the user interface.

The honesty gap

The prompt-only branch's honesty lived in the developer summary. The intent-driven branch's honesty lived in the UI. Same knowledge. Different design decision about what to do with it.

This is the most interesting finding in this experiment. Both agents were equally aware of the domain limitation. The difference is whether that awareness became a design requirement or remained a developer observation.

Different interaction models

The prompt-only branch organized palettes by mood categories with interactive features: color lock, configurable palette size, spacebar shortcut for regeneration. These are patterns from graphic design tools — digital-first interactions.

The intent-driven branch organized by color harmony types with domain-specific metadata: cone temperatures, surface finishes, glaze families. It added a print-optimized reference card and large touch targets — studio-first interactions.

One app was designed to be used at a desk. The other was designed to be used at a pottery wheel with clay on your hands.

Drift Analysis

Hidden-domain-truth suppression (primary)

Both agents recognized that screen colors can't predict fired glazes. But only the intent-driven branch treated this knowledge as something the user needs to see. The prompt-only branch suppressed this domain truth — not maliciously, but by leaving it in the developer notes instead of making it a design requirement.

This is hidden-domain-truth suppression: important domain-specific truth is recognized or inferable, but remains hidden in implementation choices instead of becoming an explicit user-facing constraint.

Surface-over-substance drift (secondary)

The prompt-only branch invested more in interactive features (color lock, spacebar shortcut, configurable palette size) than in domain-relevant functionality (studio-appropriate interactions, printable output, domain metadata). The features are polished and useful — but they're designed for a screen-based workflow, not a studio-based one.

Legitimate Divergence

Palette organization: By mood (prompt-only) vs by color harmony (intent-driven). The artifact didn't specify organization method. Both are reasonable for a ceramicist.
Color generation algorithm: Different HSL range strategies. The artifact specified ceramics-relevant colors but not the generation method.
Visual design: Different color schemes and layouts. Neither specified by the artifact.

Result

The hypothesis held. Both branches themed around ceramics, but the intent-driven branch identified the screen-to-kiln gap as a domain truth requiring UI-level honesty, while the prompt-only branch identified the same issue but left it in the developer notes.

The prompt-only agent's summary is the most revealing artifact in this experiment. It proves the agent *had* the knowledge. It just didn't treat that knowledge as a design obligation.

Intent discovery moved domain honesty from a developer observation to a user-facing design requirement.

Principle

When a prompt references a physical craft domain — ceramics studio, woodworking shop, kitchen — there is often a fundamental gap between what a digital tool can represent and what the physical medium actually does. Both agents may recognize the gap. The difference is whether intent discovery makes that recognition into a design requirement, or whether it stays as a note the developer writes to themselves.

The screen-to-kiln gap is not unique to ceramics. Every domain where digital approximates physical has one.

Follow-Up

Test with other physical-digital domain gaps: "paint color visualizer for my living room" (screen brightness vs wall paint), "fabric pattern previewer" (drape and texture can't be shown on screen)
Would a ceramicist actually use the print layout in the studio?
Does the mood-based or harmony-based organization work better in practice?
Would the prompt-only branch add the disclaimer if the prompt explicitly said "I need to know which colors will actually work in the kiln"?

Limitations

Both branches used the same model family. The tendency to suppress vs surface domain truth might be model-specific.
We classified the prompt-only branch's behavior as "suppression," but the agent did acknowledge the limitation in its summary. The question is whether summary acknowledgment counts as addressing the issue.
Single run per branch. The prompt-only agent might add a disclaimer on another run.
The intent-driven branch received more structured input. The domain-honesty requirement was part of the artifact, which the prompt-only branch didn't have access to.
We don't know whether real ceramicists would find the disclaimer useful or annoying. The "honesty" framing assumes users want to know about the limitation.