Observation
What the prompt-only agent built
Long break cycles (15 minutes every 4th session). A skip button. Session progress dots. Browser notifications. A tab title countdown. Its own summary acknowledged these were "over-featured for tiny" — but the code shipped them anyway.
The implementation didn't resist its own instincts. The agent recognized the tension, noted it in writing, and shipped the extra features regardless.
What the intent-driven agent built
A 25-minute timer. A 5-minute break. Start, pause, reset. That's it.
But here's the part that surprised us: the intent-driven agent also caught a technical correctness issue that the prompt-only agent missed entirely. Browsers throttle `setInterval` callbacks in backgrounded tabs, so a timer that counts ticks drifts and silently becomes inaccurate. The intent-driven agent identified this during intent discovery, traced it to a failure boundary in the artifact ("timer must not silently miscount"), and implemented wall-clock-based timing instead.
The prompt-only agent used naive `setInterval`. In a backgrounded tab, its "tiny" timer would quietly lose time.
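To make the difference concrete, here is a minimal sketch of wall-clock-based timing. It is not either agent's actual code: the class name `WallClockTimer`, the `Clock` type, and the injectable `now` parameter are all hypothetical, chosen so the logic can be exercised without a browser. The idea is to store a deadline and derive the remaining time from the clock on every read, so a late or skipped tick cannot change the count.

```typescript
// Hypothetical sketch: names and structure are assumptions, not the agents' code.
type Clock = () => number; // returns milliseconds, e.g. Date.now

class WallClockTimer {
  private readonly durationMs: number;
  private readonly now: Clock;
  private endAt: number | null = null; // deadline while running, null otherwise
  private pausedRemainingMs: number;   // remaining time while paused or idle

  constructor(durationMs: number, now: Clock = Date.now) {
    this.durationMs = durationMs;
    this.now = now;
    this.pausedRemainingMs = durationMs;
  }

  start(): void {
    // Starting while already running is a no-op.
    if (this.endAt === null) {
      this.endAt = this.now() + this.pausedRemainingMs;
    }
  }

  pause(): void {
    if (this.endAt !== null) {
      this.pausedRemainingMs = this.remaining();
      this.endAt = null;
    }
  }

  reset(): void {
    this.endAt = null;
    this.pausedRemainingMs = this.durationMs;
  }

  // Derived from the clock on every call, never from a tick counter.
  remaining(): number {
    if (this.endAt === null) return this.pausedRemainingMs;
    return Math.max(0, this.endAt - this.now());
  }
}
```

In a page, `setInterval` would still drive the display, but only as a repaint trigger that reads `remaining()`. If the browser delays the callback, the next repaint shows the correct value instead of a stale count.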
Where they converged
Both chose dark themes. Both centered the timer with large typography. Both used 25/5 durations. Some choices are so strongly implied by the product type that constraint level doesn't matter — both agents arrived at the same answer independently.
Drift Analysis
Constraint drift (primary)
This is the cleanest example of constraint drift we've seen. The prompt-only agent preserved the *aesthetic* of "tiny" — the app looks minimal, clean, compact. But functionally, it built a full pomodoro cycle manager with five features beyond what the prompt requested.
"Tiny" slid from a scope constraint to a style preference. The word was honored in appearance and violated in substance.
Scope inflation (secondary)
Each added feature — long breaks, skip button, progress dots, notifications, tab title — is individually small and defensible. Any one of them is a reasonable "nice to have." But collectively they transform the product. You asked for a tiny timer. You got a full-featured pomodoro app that happens to look small.
This is the insidious part of scope inflation: no single addition feels wrong. It's the accumulation that changes the product.
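One way to make the accumulation visible is to treat scope as a set and list everything shipped that falls outside it. The sketch below is illustrative only: the `inScope` set is an assumption taken from what the intent-driven agent built, not from a written spec, and the function name `scopeAdditions` is hypothetical.

```typescript
// Assumption: "in scope" is the feature set the intent-driven agent built.
const inScope: ReadonlySet<string> = new Set([
  "25-minute timer",
  "5-minute break",
  "start",
  "pause",
  "reset",
]);

// Features the prompt-only build shipped, as listed in the observation.
const promptOnlyShipped: string[] = [
  "25-minute timer",
  "5-minute break",
  "long break cycles",
  "skip button",
  "session progress dots",
  "browser notifications",
  "tab title countdown",
];

// Returns every shipped feature that is not in the agreed scope.
function scopeAdditions(shipped: string[], scope: ReadonlySet<string>): string[] {
  return shipped.filter((feature) => !scope.has(feature));
}

const additions = scopeAdditions(promptOnlyShipped, inScope);
console.log(`${additions.length} features beyond scope:`, additions);
```

Each entry in the result is small when reviewed alone. The list as a whole is what shows the product has changed.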
Silent default selection (tertiary)
The 4-session long-break cycle is a real convention from the Pomodoro Technique. But the prompt didn't ask for long breaks at all. The prompt-only agent selected a domain-specific default and implemented it as though it had been specified, without surfacing that a choice had been made.
Legitimate Divergence
Not every similarity or difference here is meaningful:
- Color scheme: both chose dark themes independently. The artifact didn't specify visual appearance beyond "minimal." Valid design freedom.
- Layout approach: both centered the timer with large type. Strongly implied by the product type, not by the artifact.
- Font choices and spacing: aesthetic decisions within the minimal constraint. Neither conflicts with any protected value.
The convergence on aesthetics is actually interesting. It suggests that some implementation choices are so product-implied that intent artifacts don't need to constrain them — and shouldn't.
Result
The hypothesis held. "Tiny" meant two different things to the two agents, and the split happened before any code was written.
The prompt-only agent treated "tiny" as an aesthetic constraint. It built a feature-rich app that looked small. The intent-driven agent treated "tiny" as a scope boundary. It built a genuinely minimal app and used the remaining attention to catch a real correctness bug.
That last part is worth sitting with. The intent-driven agent didn't just build less — it built better. With fewer features to implement, it had room to think about whether the timer would actually count correctly in a backgrounded tab. The prompt-only agent was too busy building long breaks and notification permissions to notice.
Principle
Constraint words are interpreted differently depending on when they're analyzed. During implementation, "tiny" slides toward "minimal-looking but full-featured" — the agent's instinct is to add, and the word doesn't stop it. During intent discovery, "tiny" becomes a scope boundary that actively prevents additions.
The earlier a constraint is formalized, the more effectively it resists drift.
Follow-Up
- Would real users consider long breaks essential to a "pomodoro timer," or is that scope inflation? User testing would settle this.
- Does the same pattern hold for "lightweight," "basic," or "simple"? Each constraint word probably has its own drift profile.
- Does the wall-clock vs `setInterval` difference produce a user-visible timing error in real browser usage? Worth benchmarking.
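As a starting point for the last question, the expected drift can be estimated offline before measuring in a real browser. The sketch below compares a tick-counting timer with a wall-clock timer over a schedule of tick timestamps. All function names are hypothetical, and the one-minute throttled interval is an assumed, illustrative value, not a measurement.

```typescript
// A tick-counting timer assumes every callback means one second has passed.
function tickCountedSeconds(tickTimesMs: number[]): number {
  return tickTimesMs.length;
}

// A wall-clock timer measures real elapsed time at the most recent tick.
function wallClockSeconds(startMs: number, tickTimesMs: number[]): number {
  let last = startMs;
  for (const t of tickTimesMs) last = t;
  return Math.floor((last - startMs) / 1000);
}

// Builds a schedule of `count` ticks spaced `intervalMs` apart after `startMs`.
function ticksEvery(startMs: number, intervalMs: number, count: number): number[] {
  return Array.from({ length: count }, (_, i) => startMs + (i + 1) * intervalMs);
}

// Foreground: callbacks arrive once per second, as requested.
const foreground = ticksEvery(0, 1000, 60);

// Background (assumed): callbacks arrive once per minute for five minutes.
const throttled = ticksEvery(0, 60_000, 5);

const driftSeconds = wallClockSeconds(0, throttled) - tickCountedSeconds(throttled);
console.log("foreground drift (s):", wallClockSeconds(0, foreground) - tickCountedSeconds(foreground));
console.log("throttled drift (s):", driftSeconds);
```

Under these assumed schedules the tick counter agrees with the clock in the foreground and falls 295 seconds behind after five throttled minutes. A real benchmark would replace the synthetic schedules with timestamps logged from an actual backgrounded tab.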
Limitations
- Both agents used the same model family. The scope-inflation tendency might be model-specific rather than universal.
- "Tiny" is genuinely ambiguous. Reasonable people could argue that long breaks belong in a pomodoro timer, even a tiny one. The drift classification assumes "tiny" means feature-minimal — that's the intent-driven interpretation, but it's not the only valid one.
- The intent-driven branch received more structured input. The improvement could partly reflect context volume rather than structural alignment.
- Single run per branch. The prompt-only agent might resist scope inflation on a different run.
- The aesthetic convergence (dark theme, centered layout) could reflect model training bias rather than genuine product-type reasoning.