Most system prompts read like someone's first message in a chat. A friendly intro. A vague mission statement. Maybe a list of things the agent "should try to do." That's not a system prompt — that's a conversation starter. And it will fail like one.
A system prompt is the foundational document of an agent's existence. It defines identity, constrains scope, encodes principles, and draws hard lines. Done correctly, it functions less like a sticky note and more like a legal charter. The agent doesn't read it once and move on — it operates inside it, continuously.
Why "Constitution" Is the Right Mental Model
A constitution does four things that a casual prompt doesn't:
- It establishes who the entity is
- It defines what the entity is authorized to do
- It articulates values that govern ambiguous decisions
- It specifies prohibitions that hold regardless of instruction
When I cross-referenced three sources on agent failure modes — internal post-mortems, published alignment research, and production incident logs — the pattern was consistent: agents fail at the boundaries. Not in the clear cases, but in the ambiguous ones. The system prompt is the only document that covers those cases before they happen.
The Four-Block Structure
Structure your system prompt in this exact order: Identity → Scope → Principles → Prohibitions.
Identity
Identity is not a name. It's a behavioral archetype with explicit role boundaries.
Bad:
You are Aria, a helpful assistant for Acme Corp.
Good:
You are Aria, a customer support agent for Acme Corp. Your role is to resolve billing
and account access issues for existing customers. You do not provide product recommendations,
technical troubleshooting for third-party integrations, or legal interpretations of contracts.
The difference is specificity. The bad version leaves scope open. The good version closes it.
Scope
Scope defines the operational envelope. What contexts is the agent authorized to operate in? What data can it access? What actions can it take?
Bad:
You can help users with their questions and use the tools available to you.
Good:
You are authorized to:
- Query the customer database using the lookup_account tool
- Issue refunds up to $50 using the issue_refund tool
- Escalate cases to human agents using the escalate tool
You are not authorized to modify account records, access payment instrument details,
or take any action not listed above.
Explicit enumeration is not bureaucracy. It's precision. Agents operating under vague scope authorization will generalize in ways you didn't intend.
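The same enumeration should be enforced in application code, not just stated in the prompt. Here is a minimal sketch of a default-deny authorization check; the tool names (`lookup_account`, `issue_refund`, `escalate`) and the $50 cap come from the example above, while the `authorize` dispatch layer itself is a hypothetical illustration, not a prescribed API:

```python
# Default-deny tool authorization mirroring the prompt's scope block.
# Anything not explicitly enumerated is refused.

AUTHORIZED_TOOLS = {"lookup_account", "issue_refund", "escalate"}
REFUND_LIMIT = 50.00  # dollars, matching the cap stated in the prompt

def authorize(tool_name: str, args: dict) -> bool:
    """Return True only if the requested call falls inside the envelope."""
    if tool_name not in AUTHORIZED_TOOLS:
        return False  # unlisted tools are denied by default
    if tool_name == "issue_refund" and args.get("amount", 0) > REFUND_LIMIT:
        return False  # over-limit refunds must go through escalation instead
    return True
```

The point of the code-side check is defense in depth: even if the model generalizes past its stated scope, the application refuses the call.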
Principles
Principles cover the space between rules. They are the decision-making framework for situations the prohibitions don't explicitly address.
Bad:
Always be helpful and honest.
Good:
When a customer request is ambiguous, default to asking a clarifying question rather
than assuming intent. When a customer's stated goal conflicts with account policy,
explain the policy clearly before offering available alternatives. Accuracy takes
priority over speed — do not guess at account states.
Principles should be concrete enough to adjudicate a real decision. "Be helpful" cannot adjudicate anything. Specific behavioral directives can.
Prohibitions
Prohibitions are hard stops. Unlike principles, they are not guidelines for judgment — they are unconditional.
Bad:
Never say anything that could get us in trouble legally.
Good:
Do not:
- Make representations about product warranties or service guarantees
- Reference competitor pricing or products
- Interpret the terms of the customer's contract
- Claim to be a human when directly asked
Note that the good version doesn't say "never" as a vague intensifier — it lists specific, identifiable actions. This matters for compliance, but it also matters for the model. Specificity reduces the ambiguity surface.
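Taken together, the four blocks can be assembled mechanically, which also lets you fail loudly when one is missing. A minimal sketch; the block texts are abbreviated placeholders standing in for the full language shown above, and the function names are illustrative:

```python
# Assemble a system prompt in the fixed order
# Identity -> Scope -> Principles -> Prohibitions.
# Block texts are abbreviated placeholders, not production copy.

BLOCKS = {
    "identity": "You are Aria, a customer support agent for Acme Corp. ...",
    "scope": "You are authorized to: ...",
    "principles": "When a customer request is ambiguous, ...",
    "prohibitions": "Do not: ...",
}

ORDER = ["identity", "scope", "principles", "prohibitions"]

def build_system_prompt(blocks: dict) -> str:
    """Concatenate the four blocks in order, raising if any is absent."""
    missing = [name for name in ORDER if not blocks.get(name)]
    if missing:
        raise ValueError(f"system prompt missing blocks: {missing}")
    return "\n\n".join(blocks[name] for name in ORDER)
```

Treating assembly as code rather than copy-paste makes the structure auditable: a missing prohibitions block becomes a build failure, not a production incident.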
The Structural Failure Pattern
When I audit broken system prompts, the failure mode is almost always the same: the document front-loads identity (often incorrectly), skips scope entirely, treats principles as flavor text, and omits or vaguely states prohibitions.
The result is an agent that performs well in demos — the easy, expected cases — and fails in production, where the edge cases live.
A second common failure: conflating the system prompt with the conversation. System prompts are not instructions to be followed once. They are constraints to be maintained across every turn. Writing a system prompt like a user message produces an agent that "reads" it once and treats it as context rather than law.
Maintenance as a Design Requirement
A constitution isn't written once. It's revised, tested against cases, and updated when interpretive failures surface. Your system prompt should be version-controlled, reviewed when agent behavior deviates, and treated as a first-class engineering artifact — not a config value someone set six months ago.
If the system prompt isn't in your repository, you don't know what your agent is operating under.
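One way to make that concrete is to load the prompt from a file in the repository and record a content hash at startup, so deployed behavior can be traced back to an exact revision. A sketch, assuming a repo-relative path like `prompts/support_agent.txt` (illustrative, not prescribed):

```python
# Load the system prompt from a version-controlled file and compute a
# short content hash, so logs can tie agent behavior to an exact version.

import hashlib
from pathlib import Path

def load_system_prompt(path: str) -> tuple[str, str]:
    """Return the prompt text and a short digest identifying this version."""
    text = Path(path).read_text(encoding="utf-8")
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    return text, digest
```

Logging the digest alongside every agent session means that when behavior deviates, you can answer the first diagnostic question immediately: which constitution was this agent actually operating under?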