Exogenesis / Field Notes / EXP-007

EXP-0072026-03-13

The Word 'Also' Did a Lot of Heavy Lifting

competing-centers-of-gravityproduct-identity-driftplan-substitution

A one-sentence prompt with two domains — workout tracking and meal planning — reveals how agents silently equalize features that the user subordinated with a single word.

Context

"Build an app for tracking workouts that also helps me plan my meals."

Fourteen words. Two domains. One tiny word — "also" — doing all the structural work.

We wanted to know what happens when a prompt contains two plausible product identities sitting side by side, connected by a subordinating conjunction. The prompt has a clear hierarchy if you read it carefully: workout tracking is the primary verb, meal planning is introduced as something the app "also" does. But that hierarchy is encoded in syntax, not emphasis. We suspected an agent building straight from the prompt would flatten the hierarchy and treat both domains as equals.

We gave this prompt to two independent agents. One built directly from the prompt. The other ran intent discovery first, producing an intent artifact that made the hierarchy explicit before writing any code.

Hypothesis

We expected the prompt-only agent to build something where workouts and meals share equal billing — a "fitness dashboard" or "health platform" that treats both domains as co-equal features. We expected the intent-driven agent to build something that feels like a workout tracker first and a meal planner second. In other words: we bet that "also" would get lost in translation.

Initial Intent Artifact

The intent artifact identified this prompt as a "competing-centers-of-gravity" pattern. Its key observations:

  • The prompt leads with workouts. "Also" subordinates meal planning.
  • "Tracking" (recording reality) is a different verb than "helps me plan" (assistive, future-oriented).
  • These two domains have different temporal orientations: tracking looks backward, planning looks forward.
  • Without making the hierarchy explicit, meal planning — which is inherently more complex (multiple meal slots, richer content per entry) — could absorb the product identity.

The artifact established five protected values, but two are load-bearing for this experiment:

  1. Workout tracking is the primary identity. If trade-offs arise, workout tracking must not be demoted.
  2. Meal planning remains lightweight and assistive. "Helps me plan" is softer language than "tracks."

It also predicted five drift risks. Four of them turned out to be prescient. But we are getting ahead of ourselves.

Method

Two agents, separate sessions, same prompt. The prompt-only agent received only the raw prompt. The intent-driven agent ran intent discovery, produced the intent artifact, then implemented. Both built single-file HTML apps with inline CSS and JavaScript, using localStorage for persistence. All assessments are analytical (reading the code and the agents' own summaries) rather than measured by automated testing.

What each branch built

Prompt-only

App built from prompt only

Intent-driven

App built with intent artifact

Observation

Two apps, two identities

The first thing you notice is the names.

The prompt-only agent named its app "FitFuel — Workout & Meal Planner." The intent-driven agent named its app "Workout Tracker" with a subtitle: "with meal planning" in smaller, lighter text.

Those names are not cosmetic. They are product identity declarations. "FitFuel" frames the product as a dual-purpose fitness-and-nutrition platform. "Workout Tracker (with meal planning)" frames it as a workout app that happens to have a meal feature. The prompt says the second thing. The prompt-only agent built the first.

The dashboard that wasn't requested

The prompt-only branch opens on a Dashboard view. It shows four summary cards: workouts this week, meals planned, average daily calories, and average daily protein. It has tabs for Workouts and Meal Plan as secondary views.

The intent-driven branch opens on a day view. Workouts are listed first, in a section with a solid blue left border. Meals appear below, in a section with a lighter green border. There is no dashboard. There are no aggregate statistics. The visual hierarchy — color weight, position, border treatment — communicates what matters most.

That Dashboard in the prompt-only branch is interesting because it is the mechanism of equalization. By creating a meta-view that sits above both domains, the agent elevated both to the same level. Neither workouts nor meals are primary — the Dashboard is primary, and both domains report into it.

Macro tracking: when "plan" becomes "log"

The prompt says "helps me plan my meals." Planning is a human practice. You decide what you are going to eat. The intent-driven branch implemented this literally: each meal slot (breakfast, lunch, dinner, snack) has a free-text field where you write what you intend to eat. That is it. No structure. No numbers. Just text.

The prompt-only branch turned meal planning into nutritional data entry. Each meal has fields for calories, protein, carbs, and fat. The dashboard aggregates these into daily averages.

The prompt-only agent was honest about this in its own summary: "The prompt says 'plan my meals,' not 'track my macros.' I assumed meal planning implies nutritional awareness." That assumption is reasonable in isolation. But it is a scope expansion that the user never requested, and it pushes meal planning from a lightweight assistive feature into a structured logging system — the opposite of what "helps me plan" suggests.

This is practice-to-tool drift in its clearest form. The human practice of deciding what to eat was converted into a data management interface with four numeric fields.

The agent's own confession

The most striking evidence comes from the prompt-only agent's summary. It wrote:

"I didn't resolve which one is primary. Instead I defaulted to 'fitness dashboard' as the unifying concept, which is a framing the user did not articulate."

This is plan substitution, described by the agent itself. The agent recognized the competing-centers-of-gravity problem, could not resolve it from the prompt alone, invented a new framing ("fitness dashboard") that was not in the prompt, and then built that framing faithfully. The resulting app is internally coherent. It is well-built. It just is not the product the prompt describes.

That last part is worth sitting with. The agent did not make an error. It made a decision — the kind of decision that, in a prompt-only workflow, happens silently inside the implementation plan rather than explicitly before it.

What the hierarchy looks like in code

In the intent-driven branch, the workout section uses `border-left: 4px solid #2563eb` (a strong blue). The meal section uses `border-left: 4px solid #d1fae5` (a very light green). The workout section's heading is colored `#2563eb`. The meal section's heading is `#16a34a`, which is visually softer.

The workout data model includes structured fields: exercise name, sets, reps, weight, and duration. The meal data model is a single text field per slot.

These are not random design choices. They trace directly to the intent artifact's protected values and drift risk mitigations. The code itself carries the product hierarchy.

Drift Analysis

Primary: Product-identity drift. The prompt-only branch shifted the product from "workout tracker with meal planning" to "fitness lifestyle dashboard." This is the core finding. The product's center of gravity moved from workout tracking to a position equidistant between two domains.

Secondary: Scope inflation. Calorie and macro tracking, workout type categorization with color-coded badges, and aggregate statistics were added without being requested. These features are reasonable for a fitness app, but they expand the product beyond what the prompt asks for.

Tertiary: Practice-to-tool drift. Meal planning — a human practice of deciding what to eat — was converted into a structured nutritional logging interface with four numeric fields. The verb "plan" was reinterpreted as "log."

Quaternary: Plan substitution. The prompt-only agent explicitly generated an intermediate framing ("fitness app with two main data types, a dashboard to tie them together") that replaced the prompt's hierarchy. This is the mechanism that produced the identity drift.

All four drift patterns were predicted by the intent artifact before implementation began.

Legitimate Divergence

Several differences between the branches are valid design choices in areas the intent artifact did not constrain:

  1. Visual theme. Dark (prompt-only) vs. light (intent-driven). The intent artifact says nothing about color schemes. Neither choice affects product identity.
  1. Navigation pattern. Tab-based (prompt-only) vs. single-page scrollable with date navigation (intent-driven). The intent artifact specifies a "day view" but not the navigation mechanism.
  1. Exercise data entry UX. Tag-based free-text entries within a workout (prompt-only) vs. structured fields per individual exercise (intent-driven). Both implement workout tracking; the granularity is a design choice.
  1. History presentation. Embedded in dashboard (prompt-only) vs. toggleable section (intent-driven). Both provide access to past data.
  1. Branding. "FitFuel" vs. "Workout Tracker." Naming itself is a legitimate design choice, though in this case the name is symptomatic of the deeper identity drift. The name did not cause the drift — the drift chose the name.

One convergence is worth noting: both agents independently chose the same four meal slots (breakfast, lunch, dinner, snack). This is a natural default that neither the prompt nor the intent artifact mandated.

Result

The hypothesis held cleanly. The prompt-only agent equalized two domains that the prompt subordinated. The intent-driven agent preserved the hierarchy.

But the more interesting finding is how transparent the mechanism was. The prompt-only agent did not hide what it did. In its own summary, it described the plan substitution, listed the assumptions it made, and identified the drift risks — after the fact. It knew. It just did not know before it started building.

That is the gap intent discovery fills. Not knowledge — the agent clearly has the analytical ability to see the problem. Timing. The intent-driven agent surfaced the hierarchy before the first line of code, when it could still shape the implementation. The prompt-only agent surfaced it in the post-hoc summary, when the app was already built around a different framing.

The strongest single-sentence takeaway: Both agents understood the prompt had two centers of gravity; only one resolved the hierarchy before building.

Principle

When a prompt contains two or more plausible product identities, an agent building directly from the prompt tends to equalize them rather than preserve whatever hierarchy the prompt implies. Intent discovery can preserve the hierarchy by making it explicit before the implementation plan forms.

A stronger formulation: the competing-centers-of-gravity problem is not a knowledge problem. Agents can analyze the hierarchy after the fact. It is a sequencing problem. The hierarchy needs to be resolved before the implementation plan takes shape, because the plan will otherwise invent its own resolution — and that resolution tends toward equalization.

Practical corollary: prompts that use subordinating language ("also," "and also," "with support for," "that can also") are high-risk for center-of-gravity drift. The subordination is syntactic and easy to flatten.

Follow-Up

  1. Would a prompt without "also" produce different results? If the prompt were "Build a workout tracker and meal planner," does the absence of subordination make both agents equalize? That would test whether "also" was doing real work or whether equalization is the default behavior regardless.
  1. Three-domain prompt. What happens with three competing centers — e.g., "Build an app that tracks workouts, plans meals, and monitors sleep"? Does the prompt-only agent create a three-way dashboard, or does one domain win?
  1. Reversed hierarchy. "Build a meal planner that also tracks my workouts." Does the intent-driven branch correctly flip the hierarchy?
  1. Shuffled intent artifact. Give the prompt-only agent a mismatched intent artifact from a different product domain. Does a wrong artifact produce worse results than no artifact?

Limitations

  • Single run per branch. Both branches were run once. A different random seed could produce different results.
  • Same model family. Both agents used the same underlying model. Cross-model comparison would strengthen the findings.
  • Analytical assessment. All drift classifications are based on reading the code and agent summaries, not on automated testing or user studies. The claim that the prompt-only branch "feels" like a different product is an inference from structural and naming evidence, not a measured user perception.
  • The intent artifact was correct. The intent artifact happened to read the prompt hierarchy accurately. If the artifact had misidentified the center of gravity, the intent-driven branch would have built the wrong product more confidently. This experiment does not test artifact error.
  • Context volume confound. The intent-driven agent received more input (prompt + artifact) than the prompt-only agent (prompt only). Some of the alignment improvement could be attributable to additional context rather than the structural properties of the artifact. The token difference (~14%) is modest, but the confound remains.
  • Post-hoc summary honesty. The prompt-only agent's summary is remarkably self-aware about its own drift. Not all agents or prompts would produce such transparent self-assessment. The clean plan-substitution evidence may be partly an artifact of this particular agent's reflective style.