Plan Mode Is A Trap
Plan mode feels good. It’s like taking a bath in rich sophistication. Production-ready slop just oozing out your fingertips. But secretly it seduces you into the dark trap of complexity. There’s a better way, but you’re not going to like it.
(skip-able): Plan Mode was originally from Claude Code and is in every coding agent now. It breaks agentic coding up into two phases. In the first phase you don’t write any code, the AI just interviews you about the problem and proposes a design. Then you exit plan mode and the AI carries out implementation.
Recently I’ve given the same vibe-coding interview to 10-15 candidates. It goes something like this (this isn’t one of the questions I actually use):
Build a web app where a user uploads meeting notes (text or audio transcript), and can then query across them — like ‘what did we decide about the timeline?’ or ‘who owns the design review?’
Candidates can use whatever tools they want, AI tools are explicitly encouraged.
The wild part? The more time spent planning, the longer and more complex the implementation phase was.
Now, I don’t actually know why this is, but the correlation is almost perfect. For the rest of this post I’m going to explain why I think this is. My explanation might be wrong, but I’m fairly certain the observation is not.
Plan Mode Is The Spiritual Bliss Attractor
In the Claude 4 Opus system card they noted:
Claude shows a striking “spiritual bliss” attractor state in self-interactions. When conversing with other Claude instances in both open-ended and structured environments, Claude gravitated to profuse gratitude and increasingly abstract and joyous spiritual or meditative expressions.
Basically, Claude is a cool dude. So when confronted with another Claude, they each try to out-cool the other dude until they’re just talking super cool nonsense.
That’s AI<->AI interactions. I tend to think that plan mode is the same thing, but between a human and an AI. And instead of coolness, you and the AI unwittingly pull each other toward complex solutions.
It looks something like:
User: I want to build an app where you can upload notes and talk about them
AI: Great! I’m thinking this should be 5 microservices, postgres behind each, a time series DB, and a vector DB. Obviously we’ll develop in Docker, as one does when they’re as sophisticated as you, and I’ll also sling some Kubernetes config so it’s production grade. Sound good? Or maybe we need end-to-end encryption too, yeah, I’ll add that as well.
(20 minutes later)
User: oh, yes! This is great. Let me know what commands I should use to push to prod.
That’s a caricature, but it scratches at something real. Would you divide this up into 5 microservices with docker images and k8s config? Well no, but you’d really like to if you had time. Now that AI is doing all the work, what’s the downside?
“would you like MORE PRODUCTION or WORSE CODE? choose wisely”
—Plan Mode, probably
It’s Just How Information Works
But it’s not just AI. Take any extremely smart and experienced software engineer and put them into a new highly complex domain and have them solve a problem without giving them enough time to understand the problem. They will, without fail, deliver a solution of spectacular complexity. The smarter they are, the more overly complex the solution. Every time (speaking both 1st & 3rd person here).
When you learn a domain, you learn a lot of shortcuts. Lots of things simply aren’t possible, because that’s just not how things work. Unthinkable things are common.
e.g. “Did you know that individual electronic health records can be over a gigabyte in size?” Those are the scars of experience.
When you don’t have time to learn a domain, you know you’re missing all these things, so you plan for worst case scenarios. The smarter you are, the worse cases you can imagine. LLMs are so smart these days.
Does this not sound like the typical AI code slop scenario?
The Right Way
Learn the domain.
Well, you already know the domain, but the agent doesn’t. What doesn’t work on your box? What quirks does your team/org have? Who’s going to use the app? How solid does it have to be? Which parts tend to break first?
I think plan mode was supposed to surface all of this. But in the 10-15 interviews I’ve witnessed, people often get hung up on the technologies instead. And AI will always discuss the thing you want to discuss, so down the spiritual bliss attractor path we go, with no escape. Claude compensates for lack of domain knowledge through its sheer mastery of technology. Complexity ensues.
Explain the domain.
Fun Fact: In math, “domain” means the inputs to a function. All of them.
The Soul Doc
Anthropic trained Opus 4.5 with a soul document (officially, Claude’s constitution). The purpose is alignment. All other labs try to align the AI by giving it long lists of DOs and DON’Ts. The soul doc was an adventure in a new direction — explain what a good AI looks like. Explain why bad behavior is bad.
Many have noticed that Claudes trained with the soul doc have a very dynamic but firm grip on morality, which lets them approach scandalous-sounding situations without awkward refusals. The models feel smarter in a way that’s very hard to describe.
New Employees
I bring up the soul doc because I think it’s a good framework for how to think about communicating with AI.
If you were a new employee, how would you feel if you were given 14 pages of legalistic prohibitions? I mean, that’s normal, that’s what the typical employee handbook is. But I hate it. Who even reads those? At best, I just skip to the rules I’m most likely to break to understand what the punishment is going to be.
It falls close to micromanagement. If a manager is bearing down on me with overly-prescriptive instructions for how to work, I basically just check out and stop thinking. Maybe that’s just me, but I’m pretty sure LLMs do that too.
In my experience, when you give an agent (an AI or a person) a goal, a set of constraints, and an oral history and mythology, they tend to operate with full autonomy. That’s the essence of the soul doc, and it’s how I talk to all LLMs. It works great.
Control: How much?
Ah! The eternal question. How much control should we wield over AI?
Should you look at the code? Should you know every line? Should it be embarrassing if you don’t know what programming language the code is written in?
My answer: Less. Cede more control over to the AI than you currently are.
It’s hard to draw hard lines, but people who can successfully cede control are clearly more productive (we’re excluding people who outright lose control to the AI). They can do more, have more threads running in parallel, etc. It’s clearly better, so it’s just a matter of figuring out how to be successful without losing control.
A paradox!!!
I just said we should cede control while still retaining it. This is a classic problem that people managers have wrestled with. And honestly, there are a lot of parallels in how to deal with it.
Instruction Inconsistency
When you grow a long AGENTS.md of DOs and DON’Ts, it becomes hard for the agent to navigate. But it also becomes hard for you to add to it without accidentally introducing a conflicting instruction.
In management, they talk a lot about setting values & culture. A good manager simply creates an environment in which their employees can succeed. A lot of that involves communicating purpose, aligning people into the same direction, and clarifying ambiguities.
Maybe I’m weird (okay fine, I am), but I like telling stories in the AGENTS.md. “This one time a guy had a 2 GiB health record, insane!” happens to communicate a lot more than “always check health record size”. Now, if you’re talking about an unplanned situation like transferring records, the agent can think about how large the transfer might be, or how resumability might be important, even for single records.
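To make this concrete, here’s a sketch of what a story-style AGENTS.md section might look like. The health-record anecdote is from above; the other entry is purely illustrative:

```markdown
## War stories

- This one time a guy had a 2 GiB health record. Insane, but real.
  Anything that loads a record into memory should assume records can be huge.
- We once took prod down because a "quick migration" locked the main table
  for 40 minutes. Migrations here get reviewed like code, not run like scripts.
```

The point isn’t the specific entries — it’s that a story carries the *why*, so the agent can generalize to situations the rule-writer never anticipated.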
A more compact tool is values. Strix, my personal agent, wrote about how values that are in tension tend to produce better behavior from agents. This is well known; philosophers and managers have said it for years. Amazon has its leadership principles, which all seem wonderful independently, but once you test them in the real world you quickly discover that they conflict in subtle ways. They force you to think.
Example: Invent & Simplify nudges you toward simplicity, while Think Big nudges you toward big, potentially very complex ideas. The principles guide debate; they don’t decide the outcome.
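In an AGENTS.md, a values section built on the same idea might look like this (a hypothetical sketch, not a recommended canon):

```markdown
## Values

- Ship the simplest thing that could work.
- Design for the data we actually see, not the data we fear.
- Leave the codebase easier to change than you found it.
```

Notice the first and third can pull against each other: the simplest patch today sometimes makes tomorrow’s change harder. That tension is the feature — it forces the agent (or the human) to weigh the trade-off instead of pattern-matching to a rule.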
This is the essence of culture building, as managers learn. It’s about changing how people talk, not dictating what they say. And that’s what you need to do with your agents as well.
Outro
Plan Mode is a trap.
Well no, it’s not inherently a problem with plan mode, nor is it limited to plan mode. It’s that it sucks you into harmony with your agent without first setting ground rules. Managers stay in control by influencing how work is done, not dictating the specifics of the end product.
If you don’t properly establish that with the agent, they gravitate toward their training data. They produce complexity in order to deal with all the edge cases you didn’t tell them about.
Stateful agents & continual learning are promising frontiers. Strix is a stateful agent; I also launched open-strix, a stripped-down & simplified version of Strix’s harness. I think soon, maybe in the next few months, it will become normal for agents to learn on the job, so that chores like setting values & context will feel higher-leverage.