Why Agentic AI Keeps Failing in Production
Eighteen months of shipping agents to paying users. Here's what actually breaks, why, and the patterns that finally held up.
Failure 1 — the loop that won't stop
Agents enter tool-call loops. Same tool, same args, twelve times. Why: the model believes the tool failed, but its understanding of "failed" doesn't match yours. Fix (sketched after the list):
- Cap tool calls per turn (4-8 is usually right).
- When the cap hits, return a clear "max_tool_calls reached, summarise progress and ask the user" message back into the model's context.
- Log every loop. They're a leading indicator of prompt drift.
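A minimal sketch of that loop guard, assuming a generic call_model / execute_tool pair; the names and message shapes are placeholders, not any particular SDK.

```python
MAX_TOOL_CALLS_PER_TURN = 6  # 4-8 is usually the right range

def run_turn(call_model, execute_tool, messages, logger):
    """Drive one agent turn, capping tool calls and logging repeated calls."""
    seen_calls = set()
    for _ in range(MAX_TOOL_CALLS_PER_TURN):
        response = call_model(messages)
        if response.tool_call is None:
            return response.text  # model answered without another tool call

        call_key = (response.tool_call.name, str(response.tool_call.args))
        if call_key in seen_calls:
            # Same tool, same args: a loop in the making. Log it (leading
            # indicator of prompt drift) and let the cap do its job.
            logger.warning("repeated tool call: %s", call_key)
        seen_calls.add(call_key)

        result = execute_tool(response.tool_call)
        messages.append({"role": "tool", "content": result})

    # Cap hit: hand control back to the model with an explicit instruction.
    messages.append({
        "role": "system",
        "content": "max_tool_calls reached: summarise progress so far and "
                   "ask the user how to proceed.",
    })
    return call_model(messages).text
```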
Failure 2 — drift across versions
Agent 1.0 works. You add a feature. Now one of 1.0's old behaviours is broken. The model is the same, the prompt is one paragraph longer, and somehow the agent stopped doing the thing it used to do. Fix: every prompt change runs the full eval set, including all previously shipped behaviours. Anything that drops gets blocked at merge time.
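One way to wire that merge-time gate: a CI step that compares per-behaviour eval scores against the last released baseline and exits non-zero on any drop. The score-file layout and the surrounding eval plumbing are assumptions, not a real harness.

```python
import json
import sys

REGRESSION_TOLERANCE = 0.0  # any drop on a previously passing behaviour blocks the merge

def gate(baseline_path: str, candidate_scores: dict[str, float]) -> int:
    """Return a non-zero exit code if any behaviour regressed vs. the baseline."""
    with open(baseline_path) as f:
        baseline = json.load(f)  # {"behaviour_name": score, ...}

    regressions = {
        name: (baseline[name], candidate_scores.get(name, 0.0))
        for name in baseline
        if candidate_scores.get(name, 0.0) < baseline[name] - REGRESSION_TOLERANCE
    }
    for name, (old, new) in regressions.items():
        print(f"BLOCKED: {name} dropped {old:.2f} -> {new:.2f}", file=sys.stderr)
    return 1 if regressions else 0

if __name__ == "__main__":
    # candidate scores come from running the full eval set, including every
    # previously shipped behaviour, against the new prompt
    candidate = json.loads(sys.stdin.read())
    sys.exit(gate("eval_baseline.json", candidate))
```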
Failure 3 — the agent assumes the user's intent
Default model behaviour: be helpful, fill in gaps. Production behaviour you actually want: ask for confirmation before destructive actions, ask one clarifying question when intent is ambiguous, refuse out-of-scope requests rather than improvising.
This isn't a bug in the model. It's missing prompt scaffolding. Add an explicit "ask one clarifying question if X" rule. Add an explicit "before any irreversible action, confirm in plain English." Add a refusal list.
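What that scaffolding can look like in practice; the wording and the refusal list are illustrative placeholders, not tuned copy.

```python
INTENT_SCAFFOLDING = """
If the user's request is ambiguous, ask exactly one clarifying question
before doing anything else.

Before any irreversible action (deleting data, sending messages, charging
a card), describe the action in plain English and wait for confirmation.

Out of scope (refuse and explain why, do not improvise):
- legal or medical advice
- changes to another user's account
- anything requiring credentials the user has not provided
"""

# Appended to, not merged into, the base system prompt, so it can be
# versioned and eval-gated on its own.
def build_system_prompt(base_prompt: str) -> str:
    return base_prompt.rstrip() + "\n\n" + INTENT_SCAFFOLDING.strip()
```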
Failure 4 — observability that's a lie
"Conversion rate is up 12%." Cool. Then you sample 50 transcripts and discover the agent is hallucinating order numbers and the customers don't notice for a week. Vanity metrics drift you toward looking-good rather than working-well. Sample transcripts. Have a human review them. Yes, every week.
Failure 5 — provider outage cascade
Provider has a degraded region. Your agent's retries pile up in the queue. The queue back-pressures the rest of your app. The rest of your app tips over.
- Circuit breakers in front of model calls.
- Multi-provider, same interface. Round-robin under load.
- Backoff with jitter, not aggressive retries (sketched with the breaker below).
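A compact sketch of the first and third bullets: a failure-count circuit breaker around the provider call, with full-jitter backoff between retries. The thresholds are illustrative and the provider client is a stand-in.

```python
import random
import time

class CircuitOpen(Exception):
    """Raised when the breaker is open; callers fail fast instead of queueing."""

class ModelCircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None

    def call(self, provider_call, *args, max_retries=3, base_delay=0.5):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown_seconds:
                raise CircuitOpen("provider circuit open, failing fast")
            self.opened_at = None  # cooldown elapsed, allow a trial call

        for attempt in range(max_retries):
            try:
                result = provider_call(*args)
                self.failures = 0
                return result
            except Exception:
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self.opened_at = time.monotonic()
                    raise CircuitOpen("provider circuit tripped")
                # Full jitter: sleep anywhere in [0, base_delay * 2^attempt].
                time.sleep(random.uniform(0, base_delay * (2 ** attempt)))
        raise RuntimeError("retries exhausted")
```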
Failure 6 — prompt injection from user content
Anything the agent can read can attack it. RAG chunks. User profile fields. PDF attachments. The defense isn't "tell the model to ignore instructions in user content"; that's an incantation. The defense is privilege walls: the agent processing user content must never have access to tools that can act on other users' data. Architectural, not prompt-based.
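One way to make the wall architectural: the tool registry handed to the agent depends on whether it is reading untrusted content, and the restricted registry simply does not contain tools that act on other users' data. The tool names here are illustrative.

```python
# Tools that can act on data: only handed to agents that never read
# untrusted content (user uploads, RAG chunks, profile fields).
PRIVILEGED_TOOLS = {"update_account", "issue_refund", "send_email"}

# Tools safe to expose while processing untrusted content.
READ_ONLY_TOOLS = {"search_docs", "summarise_text"}

def tools_for(reads_untrusted_content: bool) -> set[str]:
    """The wall lives in the registry, not the prompt: an injected
    instruction cannot call a tool the agent was never given."""
    if reads_untrusted_content:
        return READ_ONLY_TOOLS
    return PRIVILEGED_TOOLS | READ_ONLY_TOOLS
```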
Failure 7 — silent quality regression
Vendor pushes a model update. Your prompts that were tuned to v1 now misbehave on v2. Defenses:
- Pin the model version explicitly: claude-x.y, not claude-latest (sketch after this list).
- Run the eval on every supported model version weekly.
- When migrating versions, treat it like any other deploy: blue/green, gradual rollout, watch the eval.
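A sketch of the pin plus a gradual-rollout gate for migrations; the version strings, rollout fraction, and eval flag are assumptions.

```python
PINNED_MODEL = "claude-x.y"      # placeholder version string; never "latest"
CANDIDATE_MODEL = "claude-x.z"   # next version under evaluation (placeholder)
ROLLOUT_FRACTION = 0.10          # widened gradually while the eval stays green

def model_for_request(user_bucket: float, candidate_passed_eval: bool) -> str:
    """Blue/green-style migration: the candidate serves a slice of traffic
    only after passing the full eval; everyone else stays on the pin."""
    if candidate_passed_eval and user_bucket < ROLLOUT_FRACTION:
        return CANDIDATE_MODEL
    return PINNED_MODEL
```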
The patterns that finally held up
- Eval-gated everything — prompts, model versions, tool changes.
- Small, composable prompts. Not one giant system prompt.
- Tool calls capped per turn, errors returned as data not exceptions.
- Privilege walls between user-content processing and action-taking.
- Sample-and-review weekly. Not optional.
- Multi-provider on the same interface for resilience.