Tool & Skill Architecture
Akmatori documents agent behavior through incident-level AGENTS.md guidance and skill-level SKILL.md files that include an auto-generated Assigned Tools section. Instead of Python import snippets, modern Akmatori routes tool use through the MCP Gateway, with generated gateway_call(...) and execute_script examples tied to the skill's actual assigned tools.
Overview
The goal is still to reduce exploration, but the execution model changed. Agents should read the relevant skill doc first, follow the generated tool examples, and use the gateway's namespaced tools directly. This keeps authorization, configuration, and routing centralized instead of asking agents to discover import paths or wire credentials manually. Connector and MCP registration changes can be reloaded at runtime, and proxied MCP tool schemas can be refreshed so newly discovered external tools show up without rewriting the skill prompt model.
Some older references, such as docs/TOOL_ARCHITECTURE.md, describe the old Quick Start import-snippet model. The current runtime contract is the gateway-first Assigned Tools flow documented here.
File Responsibilities
AGENTS.md
Incident-level instructions loaded at the incident workspace root.
- Defines the incident manager role and investigation workflow
- Points agents to the relevant skill docs before they start exploring
- Requires runbook search, then cross-incident memory search, before using infrastructure tools
SKILL.md
Skill prompt plus an auto-generated Assigned Tools section.
- Stores YAML frontmatter and human-authored skill instructions
- Lists only the tools currently assigned to that skill
- Includes per-tool details and ready-to-use gateway_call(...) examples
- Can reference shared context files through linked assets
Gateway Tool Registry
The MCP Gateway is the runtime contract agents execute against.
- Built-in tool types include SSH, Zabbix, VictoriaMetrics, Catchpoint, PostgreSQL, Grafana, PagerDuty, ClickHouse, NetBox, Kubernetes, Jira, and the credentialless Incidents namespace
- HTTP connectors can add declarative API-backed tools and trigger gateway reloads after CRUD changes
- External MCP servers can be proxied under a custom namespace prefix, with runtime reloads and schema refreshes for discovered tools
- System-level MCP servers can also stay registered across reloads, so platform-wide helper tools remain available without per-skill rewiring
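The reload behavior described above can be pictured with a small in-memory registry. This is only a sketch under stated assumptions: `GatewayRegistry`, `register_system`, `reload`, and the `helpers` and `ext_crm` namespaces are hypothetical names chosen for illustration, not Akmatori's actual implementation.

```python
from typing import Callable, Dict, List

ToolFn = Callable[[dict], object]


class GatewayRegistry:
    """Hypothetical registry mapping 'namespace.tool' names to callables."""

    def __init__(self) -> None:
        self._system: Dict[str, ToolFn] = {}   # kept across reloads
        self._dynamic: Dict[str, ToolFn] = {}  # rebuilt on every reload

    def register_system(self, namespace: str, tools: Dict[str, ToolFn]) -> None:
        """Register platform-wide helper tools that survive reloads."""
        for name, fn in tools.items():
            self._system[f"{namespace}.{name}"] = fn

    def reload(self, sources: Dict[str, Dict[str, ToolFn]]) -> None:
        """Replace all dynamic tools; sources maps a namespace prefix to its tools."""
        rebuilt: Dict[str, ToolFn] = {}
        for namespace, tools in sources.items():
            for name, fn in tools.items():
                full_name = f"{namespace}.{name}"
                if full_name in self._system or full_name in rebuilt:
                    raise ValueError(f"namespace conflict: {full_name}")
                rebuilt[full_name] = fn
        self._dynamic = rebuilt

    def names(self) -> List[str]:
        """Return every currently callable tool name, sorted."""
        return sorted({**self._system, **self._dynamic})


registry = GatewayRegistry()
registry.register_system("helpers", {"ping": lambda args: "pong"})
registry.reload({"postgresql": {"list_tables": lambda args: ["users"]}})
# A later reload swaps the dynamic set; the system-level helper stays registered.
registry.reload({"ext_crm": {"lookup": lambda args: {}}})
tool_names = registry.names()
```

Keeping system-level tools in a separate map is one simple way to make a reload a full replacement of everything else without special-casing individual entries.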
Assigned Tools Pattern
Generated skill docs now include a structured tool section with copyable gateway examples. This keeps the docs aligned with the authorized tool set for that skill and avoids stale import instructions.
## Assigned Tools
### Production PostgreSQL (logical_name: "prod-db", type: postgresql)
Use `gateway_call` with this tool's logical name:
```python
gateway_call("postgresql.list_tables", {}, "prod-db")
gateway_call("postgresql.describe_table", {"table_name": "users"}, "prod-db")
```
For multi-step work, use `execute_script` with built-in `gateway_call()`.
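A multi-step script of the kind mentioned above might chain the two generated calls. The sketch below is illustrative only: the real `gateway_call` is provided by the runtime, so a stand-in is defined here, and its canned return shapes (a list of table names, a dict per table) are assumptions rather than the gateway's documented responses.

```python
# Stand-in for the runtime-provided gateway_call(); the responses are invented
# so the script can run outside the gateway.
def gateway_call(tool, args, logical_name):
    if tool == "postgresql.list_tables":
        return ["users", "orders"]
    if tool == "postgresql.describe_table":
        return {"table": args["table_name"], "columns": ["id"]}
    raise KeyError(f"unknown tool: {tool}")


def describe_all_tables(logical_name):
    """List tables, then describe each one, in a single script run."""
    tables = gateway_call("postgresql.list_tables", {}, logical_name)
    return {
        table: gateway_call(
            "postgresql.describe_table", {"table_name": table}, logical_name
        )
        for table in tables
    }


schemas = describe_all_tables("prod-db")
```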
Generated examples are built around gateway_call, with execute_script available for multi-step workflows.
Investigation Guardrails
The incident manager prompt now makes context recall an explicit preflight. For every incident, agents search the runbook library first, then search cross-incident memory, and only then invoke infrastructure tools through skills or the gateway.
1. Understand the alert.
2. Delegate runbook recall:
subagent({"agent": "runbook-searcher", "task": "<full Original alert text or concise summary>"})
3. Delegate cross-incident memory recall:
subagent({"agent": "memory-searcher", "task": "<full Original alert text or concise summary>"})
4. Read the most relevant returned files.
5. Load skills and call infrastructure tools with runbook and memory context in hand.
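The ordering in the steps above can be sketched as a small preflight helper. The `subagent` callable is passed in because the real one belongs to the agent runtime; the assumption that it returns a list of file paths, and the example file names, are hypothetical.

```python
from typing import Callable, List


def run_preflight(alert_text: str, subagent: Callable[[dict], List[str]]) -> List[str]:
    """Run runbook recall, then memory recall, and return the files to read."""
    files: List[str] = []
    # Order matters: runbooks first, cross-incident memory second.
    for agent in ("runbook-searcher", "memory-searcher"):
        files.extend(subagent({"agent": agent, "task": alert_text}))
    return files


# Fake subagent that records call order and returns invented paths.
calls: List[str] = []
FAKE_RESULTS = {
    "runbook-searcher": ["/akmatori/runbooks/example-runbook.md"],
    "memory-searcher": ["/akmatori/memory/example-incident.md"],
}


def fake_subagent(request: dict) -> List[str]:
    calls.append(request["agent"])
    return FAKE_RESULTS[request["agent"]]


context_files = run_preflight("disk usage above 90% on db-1", fake_subagent)
```

Only after `context_files` has been read would the parent agent load skills and call infrastructure tools.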
runbook-searcher can only inspect /akmatori/runbooks/, while memory-searcher can only inspect /akmatori/memory/. Each may retry up to two times with narrower terms before the parent agent continues.
Scheduled Agents
Cron jobs use the same agent execution path as incidents, but each schedule owns its tool allowlist. A cron job can post to a configured Channel or fall back to the workspace default post channel, and its tool_instance_ids determine exactly which infrastructure tools the cron-agent may call. Empty tool assignments mean the scheduled run still has memory and runbook recall, but no gateway-backed infrastructure tools.
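As a rough illustration of the per-schedule allowlist, the sketch below models a cron job with its own `tool_instance_ids` and an optional channel. The `CronJob` class, the integer ID type, and the channel names are assumptions made for the example, not Akmatori's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CronJob:
    name: str
    tool_instance_ids: List[int] = field(default_factory=list)  # per-schedule allowlist
    channel: Optional[str] = None  # None means "use the workspace default"


def resolve_channel(job: CronJob, workspace_default: str) -> str:
    """Post to the job's configured channel, else the workspace default."""
    return job.channel or workspace_default


def may_call(job: CronJob, tool_instance_id: int) -> bool:
    """A tool is callable only if the schedule explicitly lists its instance ID."""
    return tool_instance_id in job.tool_instance_ids


nightly = CronJob(name="nightly-db-check", tool_instance_ids=[7])
recall_only = CronJob(name="recall-only")  # empty allowlist: no gateway tools
```

With an empty allowlist, `may_call` is false for every tool, which matches the recall-only behavior described above.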
409 Conflict so maintenance schedules survive routine cleanup.
Key Principles
Skill-scoped authorization
Generated docs should reflect only the tools a skill is actually allowed to use, not the entire platform surface area.
Gateway-first execution
Agents should prefer gateway_call and execute_script so routing, auth, rate limiting, and auditing stay centralized.
Generated docs stay close to runtime behavior
Examples should be produced from current tool assignments and tool types, so docs change when the real execution path changes.
Less discovery, more investigation
Good docs remove avoidable exploration while still leaving room for runbook search, memory recall, incident analysis, and targeted debugging.
Generation Flow
- User assigns one or more tool instances to a skill in the UI.
- The backend regenerates SKILL.md with the current Assigned Tools section.
- Each generated block includes the logical tool name and gateway usage examples for that tool type.
- HTTP connector and MCP server changes can trigger gateway reloads so new tools appear without a full rebuild.
- When an incident is created, Akmatori generates AGENTS.md with the current incident workflow and global memory manifest.
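To make the regeneration step concrete, here is a minimal sketch of rendering an Assigned Tools section from a skill's current assignments. The `Assignment` type, `usage_examples`, and `render_assigned_tools` are hypothetical helpers written for this example; the backend's real generator may be structured quite differently.

```python
from typing import List, NamedTuple

FENCE = "`" * 3  # built at runtime so this example contains no nested fence


class Assignment(NamedTuple):
    display_name: str
    logical_name: str
    tool_type: str


def usage_examples(a: Assignment) -> List[str]:
    """Return copyable gateway_call lines for a tool type (only one type shown)."""
    if a.tool_type == "postgresql":
        return [f'gateway_call("postgresql.list_tables", {{}}, "{a.logical_name}")']
    return []  # unknown types get no example in this sketch


def render_assigned_tools(assignments: List[Assignment]) -> str:
    """Render only the tools actually assigned to the skill."""
    lines = ["## Assigned Tools"]
    for a in assignments:
        lines.append(
            f'### {a.display_name} (logical_name: "{a.logical_name}", type: {a.tool_type})'
        )
        examples = usage_examples(a)
        if examples:
            lines.extend([FENCE + "python", *examples, FENCE])
    return "\n".join(lines) + "\n"


section = render_assigned_tools(
    [Assignment("Production PostgreSQL", "prod-db", "postgresql")]
)
```

Because the section is derived from the assignment list on every run, removing an assignment removes its block the next time the file is generated.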
Contributor Checklist
When adding a new built-in tool type
Update tool registration and add a concrete usage example in generateToolUsageExample() so assigned skills get useful gateway examples automatically.
When changing skill docs behavior
Keep generated section markers and stripping logic aligned so user-authored prompt content is preserved while auto-generated blocks stay fresh.
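One way to keep markers and stripping logic aligned is to define both from the same constants, as in the sketch below. The marker strings and function names are invented for illustration; the actual markers used in generated SKILL.md files are not specified here.

```python
import re

# Hypothetical marker text; the real markers may differ.
BEGIN = "<!-- BEGIN ASSIGNED TOOLS -->"
END = "<!-- END ASSIGNED TOOLS -->"
_BLOCK = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END) + r"\n?", re.DOTALL)


def strip_generated(doc: str) -> str:
    """Remove any auto-generated block, leaving user-authored content."""
    return _BLOCK.sub("", doc).rstrip() + "\n"


def regenerate(doc: str, section: str) -> str:
    """Strip the old block, then append a fresh one between the markers."""
    return f"{strip_generated(doc)}\n{BEGIN}\n{section.strip()}\n{END}\n"


user_doc = "---\nname: db\n---\nCheck replication lag first.\n"
once = regenerate(user_doc, "## Assigned Tools\n(old)")
twice = regenerate(once, "## Assigned Tools\n(new)")
```

Regenerating repeatedly leaves exactly one generated block, and stripping it returns the original user-authored text.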
When extending integrations
Decide whether the new capability belongs in a built-in tool type, a declarative HTTP connector, or a proxied external MCP server, then document that runtime path clearly.
Troubleshooting
Assigned tools look wrong
Check the skill-to-tool assignment and regenerate skill docs so SKILL.md is rebuilt from the current mapping.
A connector or MCP server is missing
Verify the HTTP connector or MCP server configuration exists, has a non-conflicting namespace, and that the gateway reload completed successfully.
Agents are still exploring too much
Make sure AGENTS.md points agents to the right skill docs and that the generated gateway examples are specific enough to execute directly.