12-Factor Agents for Production AI Workflows

AI agents are easy to demo and hard to operate. The failure mode is familiar: a framework gets a team to 80 percent quickly, then the last 20 percent requires control over prompts, context, state, tools, retries, and human review. 12-Factor Agents is trending because it gives teams a concrete vocabulary for that control.
What Is 12-Factor Agents?
12-Factor Agents is an open-source guide from HumanLayer inspired by the classic Twelve-Factor App methodology. Instead of presenting one more agent framework, it describes engineering principles for LLM-powered software that needs to survive real users and real production constraints.
The guide argues that useful agents are loops: the model reads context, emits structured intent, deterministic code executes the step, and the result goes back into context. The hard part is deciding which parts the model should control and which parts the application should own.
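To make the loop concrete, here is a minimal sketch in Python. It is an illustration rather than code from the guide: `Step`, `run_agent`, `fake_model`, and `get_recent_deploys` are hypothetical names, and `fake_model` stands in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    """Structured intent from the model: a tool name plus its arguments."""
    tool: str
    args: dict = field(default_factory=dict)


def run_agent(next_step: Callable[[list], Step], tools: dict, context: list,
              max_steps: int = 10) -> list:
    """Run the loop until the model signals "done" or the step budget runs out."""
    for _ in range(max_steps):
        step = next_step(context)  # model reads context, emits intent
        if step.tool == "done":
            break
        handler = tools.get(step.tool)
        if handler is None:
            # Unknown tools are reported back to the model, never executed.
            context.append({"role": "error", "content": f"unknown tool: {step.tool}"})
            continue
        result = handler(**step.args)  # deterministic code executes the step
        context.append({"role": "tool", "tool": step.tool, "content": result})
    return context


def fake_model(context: list) -> Step:
    """Stand-in for a real LLM call: ask for deploys once, then finish."""
    if any(entry["role"] == "tool" for entry in context):
        return Step("done")
    return Step("get_recent_deploys", {"service": "checkout"})


def get_recent_deploys(service: str) -> str:
    return f"{service}: 1 deploy in the last hour"  # canned data for the sketch


history = run_agent(
    fake_model,
    {"get_recent_deploys": get_recent_deploys},
    [{"role": "event", "content": "alert: checkout latency high"}],
)
```

In this sketch the application owns the loop, the step budget, and the tool table; the model only chooses the next step.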
Key Ideas
- Own your prompts: Keep prompts in your codebase so they can be reviewed, versioned, tested, and rolled back.
- Own your context window: Curate what the model sees instead of dumping every log line, ticket, metric, and chat message into one giant prompt.
- Treat tools as structured outputs: Let the model request typed actions, then let deterministic code validate and execute them.
- Unify execution and business state: Store agent progress where the rest of the product can inspect, resume, and audit it.
- Keep agents small: Prefer focused agents with clear responsibilities over one broad agent that can do everything badly.
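As one illustration of treating tools as structured outputs, the sketch below parses model output into a typed action and rejects anything that does not validate. It assumes the model replies with JSON of the form `{"action": ..., "args": {...}}`; that format, the `FetchLogs` action, and the `parse_action` helper are assumptions made for this example, not part of the guide.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FetchLogs:
    """Hypothetical read-only action the model may request."""
    service: str
    minutes: int

    def __post_init__(self):
        if not isinstance(self.service, str) or not self.service:
            raise ValueError("service must be a non-empty string")
        if (isinstance(self.minutes, bool) or not isinstance(self.minutes, int)
                or not 1 <= self.minutes <= 1440):
            raise ValueError("minutes must be an integer between 1 and 1440")


# Registry of actions the application is willing to accept.
ACTIONS = {"fetch_logs": FetchLogs}


def parse_action(raw: str):
    """Turn raw model output into a typed action, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")
    name = data.get("action")
    if not isinstance(name, str) or name not in ACTIONS:
        raise ValueError(f"unknown action: {name!r}")
    args = data.get("args", {})
    if not isinstance(args, dict):
        raise ValueError("args must be a JSON object")
    try:
        return ACTIONS[name](**args)  # missing or extra fields raise TypeError
    except TypeError as exc:
        raise ValueError(f"bad arguments for {name}: {exc}") from exc


action = parse_action(
    '{"action": "fetch_logs", "args": {"service": "checkout", "minutes": 15}}'
)
```

Only a successfully parsed action reaches the code that executes it, so a malformed or unexpected request fails before anything runs.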
Installation
There is no package to install. The project is a design guide and reference set:
git clone https://github.com/humanlayer/12-factor-agents.git
cd 12-factor-agents
ls content
Use it as a review checklist when designing an internal runbook assistant, incident triage agent, or platform automation workflow.
SRE Workflow Example
For an incident triage agent, the principles translate into a safer architecture:
Alert webhook
-> load service metadata, recent deploys, and runbook
-> ask model for the next typed action
-> execute only approved read tools
-> compact errors and evidence into context
-> pause for human approval before write actions
That shape is more reliable than giving a model broad shell access and hoping the framework hides the complexity. It also creates audit points that SRE teams can inspect after an incident.
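The approval step in that flow could look roughly like the following sketch. The tool names, the `Run` record, and the `dispatch` and `approve` helpers are all hypothetical; a real system would persist the run in a database rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Optional

READ_TOOLS = {"get_recent_deploys", "fetch_logs"}  # safe to run immediately
WRITE_TOOLS = {"rollback_deploy"}                  # require a human decision


@dataclass
class Run:
    """Agent run state that the rest of the product can inspect and resume."""
    status: str = "running"  # "running" or "awaiting_approval"
    pending: Optional[dict] = None
    audit: list = field(default_factory=list)


def dispatch(run: Run, tool: str, args: dict, handlers: dict):
    """Execute read tools, park write tools for approval, reject the rest."""
    if tool in READ_TOOLS:
        result = handlers[tool](**args)
        run.audit.append(("executed", tool, args))
        return result
    if tool in WRITE_TOOLS:
        run.status = "awaiting_approval"
        run.pending = {"tool": tool, "args": args}
        run.audit.append(("paused", tool, args))
        return None
    run.audit.append(("rejected", tool, args))
    raise PermissionError(f"tool not allowed: {tool}")


def approve(run: Run, handlers: dict):
    """Called after a human approves the pending write action."""
    if run.status != "awaiting_approval" or run.pending is None:
        raise RuntimeError("no action is awaiting approval")
    tool, args = run.pending["tool"], run.pending["args"]
    result = handlers[tool](**args)
    run.audit.append(("approved", tool, args))
    run.status, run.pending = "running", None
    return result


# Canned handlers for the sketch; real ones would call deploy and log systems.
handlers = {
    "get_recent_deploys": lambda service: [f"{service}@v42"],
    "rollback_deploy": lambda service: f"rolled back {service}",
}
run = Run()
deploys = dispatch(run, "get_recent_deploys", {"service": "checkout"}, handlers)
paused = dispatch(run, "rollback_deploy", {"service": "checkout"}, handlers)
```

Every branch appends to `run.audit`, which is one way to get the audit points mentioned above.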
Operational Tips
Start with read-only workflows. Summarizing alerts, collecting logs, finding recent deploys, and drafting postmortems are good first targets. Add write actions only after tool schemas, permissions, logs, and human approval paths are boringly clear.
Do not let context become a junk drawer. Pre-fetch the facts the agent is likely to need, but keep raw evidence linkable outside the prompt. That keeps token use predictable and makes failures easier to debug.
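One way to apply this is to build the context from short summaries plus reference IDs that point at the raw evidence stored elsewhere. The sketch below assumes a character budget as a rough proxy for tokens; `build_context`, the field names, and the `ev-1` style references are hypothetical.

```python
def build_context(alert: str, facts: dict, evidence: list,
                  max_chars: int = 1200) -> str:
    """Assemble a compact prompt context: facts first, then trimmed evidence."""
    lines = [f"alert: {alert}"]
    lines += [f"{key}: {value}" for key, value in facts.items()]
    used = sum(len(line) + 1 for line in lines)  # +1 for each newline
    omitted = 0
    for item in evidence:
        # Keep a short summary in the prompt; the raw data stays behind the ref.
        line = f"evidence: {item['summary'][:160]} [ref: {item['ref']}]"
        if used + len(line) + 1 > max_chars:
            omitted += 1
            continue
        lines.append(line)
        used += len(line) + 1
    if omitted:
        # This note is not counted against the budget, so the result can
        # exceed max_chars by the length of this one line.
        lines.append(f"({omitted} evidence items omitted; follow refs for details)")
    return "\n".join(lines)


example = build_context(
    "checkout p99 latency high",
    {"service": "checkout", "last_deploy": "12 minutes ago"},
    [{"summary": "timeout errors from payments client", "ref": "ev-1"}],
)
```

Alert and fact lines are always included, so the budget only limits how much evidence is added.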
Finally, test prompts like production code. A small regression suite with real alert examples will catch prompt drift before it reaches on-call engineers.
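A regression suite for prompts can be quite small. In the sketch below the prompt template, the two cases, and `stub_model` are placeholders: a real suite would use recorded alerts and call the actual model instead of the keyword-based stub.

```python
PROMPT_TEMPLATE = (
    "You are an incident triage assistant.\n"
    "Alert: {alert}\n"
    "Reply with exactly one action name from: {actions}"
)

ALLOWED_ACTIONS = ["fetch_logs", "get_recent_deploys"]

# Hypothetical cases; replace with real alert examples and reviewed answers.
CASES = [
    {"alert": "checkout p99 latency above 2s", "expected": "get_recent_deploys"},
    {"alert": "disk usage at 95% on db-1", "expected": "fetch_logs"},
]


def render_prompt(alert: str, actions: list) -> str:
    """Render the versioned template so tests exercise the same prompt as prod."""
    return PROMPT_TEMPLATE.format(alert=alert, actions=", ".join(actions))


def run_regression(call_model, cases: list, actions: list) -> list:
    """Return (alert, expected, got) for every case the model gets wrong."""
    failures = []
    for case in cases:
        got = call_model(render_prompt(case["alert"], actions)).strip()
        if got != case["expected"]:
            failures.append((case["alert"], case["expected"], got))
    return failures


def stub_model(prompt: str) -> str:
    """Keyword-based stand-in for an LLM call, used only to run the sketch."""
    alert_line = next(
        line for line in prompt.splitlines() if line.startswith("Alert: ")
    )
    return "get_recent_deploys" if "latency" in alert_line else "fetch_logs"


failures = run_regression(stub_model, CASES, ALLOWED_ACTIONS)
```

Running this in CI on every prompt change gives a list of failing cases to review before the change ships.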
Conclusion
12-Factor Agents is valuable because it frames AI agents as software systems, not magic workers. The project gives platform teams a practical checklist for building agents that are observable, controllable, resumable, and narrow enough to trust.
At Akmatori, we help SRE teams build intelligent automation that responds to incidents and manages infrastructure.
