05.04.2026

Microsoft Agent Framework for Platform Engineering


Platform teams are under pressure to ship AI features without creating a second reliability mess. Microsoft Agent Framework matters because it combines agent runtimes, graph-based workflows, and OpenTelemetry support in one stack for Python and .NET.

What Is Microsoft Agent Framework?

Microsoft Agent Framework is an open source framework for building AI agents and multi-agent workflows. It is the successor to ideas from AutoGen and Semantic Kernel, but the interesting part for SRE and platform engineers is not the branding. It is the operational model.

The framework separates two concerns:

  • Agents for open-ended tasks, tool calls, and conversational flows
  • Workflows for explicit multi-step execution with checkpoints and human review

That split is useful in production. Teams can keep the flexible parts inside agents while putting approval gates, retries, and deterministic routing inside workflows.
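To make the split concrete, here is a minimal plain-Python sketch of the idea. It does not use the Agent Framework API: Proposal, run_with_retries, triage_workflow, and flaky_propose are hypothetical names chosen for illustration. The open-ended part (proposing an action) is a pluggable callable standing in for an agent, while the retry limit and the approval gate live in ordinary, reviewable code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Proposal:
    """What the flexible (agent-like) step hands back to the workflow."""
    action: str
    rationale: str


def run_with_retries(step: Callable[[], Proposal], attempts: int = 3) -> Proposal:
    """Workflow-level retry: re-invoke a flaky step a bounded number of times."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return step()
        except RuntimeError as exc:  # only errors we treat as transient
            last_error = exc
    raise RuntimeError(f"step failed after {attempts} attempts") from last_error


def triage_workflow(
    propose: Callable[[], Proposal],
    approve: Callable[[Proposal], bool],
    execute: Callable[[Proposal], str],
) -> str:
    """Deterministic path: propose (with retries), approval gate, then execute."""
    proposal = run_with_retries(propose)
    if not approve(proposal):
        return f"rejected: {proposal.action}"
    return execute(proposal)


# Hypothetical stand-in for an agent call that times out once, then succeeds.
calls = {"n": 0}


def flaky_propose() -> Proposal:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("model timeout")
    return Proposal("restart api-gateway", "error rate spike")


result = triage_workflow(
    propose=flaky_propose,
    approve=lambda p: p.action.startswith("restart"),  # stand-in for a human reviewer
    execute=lambda p: f"executed: {p.action}",
)
print(result)
```

In a real deployment the approve callable would block on a human decision, and the framework's workflow primitives would take the place of these hand-written functions.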

Key Features

Here are the features that stand out for platform engineering teams:

  • Graph-based workflows for multi-step tasks and multi-agent orchestration
  • OpenTelemetry integration for tracing and debugging agent execution
  • Session-based state management for multi-turn interactions
  • Middleware hooks for policy checks, logging, and request shaping
  • Support for multiple model providers including Azure OpenAI, OpenAI, Anthropic, Ollama, and Microsoft Foundry
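The middleware idea can be illustrated without the framework itself. The sketch below is plain Python, not the Agent Framework middleware API: build_pipeline, logging_middleware, policy_middleware, fake_agent, and BLOCKED_TERMS are all hypothetical names. It shows the general pattern of wrapping a handler so that logging and a policy check run before the request reaches the agent.

```python
from typing import Callable

Handler = Callable[[dict], dict]
Middleware = Callable[[dict, Handler], dict]


def _wrap(middleware: Middleware, call_next: Handler) -> Handler:
    """Bind one middleware to the handler that follows it."""
    def wrapped(request: dict) -> dict:
        return middleware(request, call_next)
    return wrapped


def build_pipeline(middlewares: list, handler: Handler) -> Handler:
    """Compose middlewares so the first in the list runs outermost."""
    pipeline = handler
    for middleware in reversed(middlewares):
        pipeline = _wrap(middleware, pipeline)
    return pipeline


audit_log = []
BLOCKED_TERMS = ("drop database", "rm -rf")  # illustrative policy only


def logging_middleware(request: dict, call_next: Handler) -> dict:
    """Record every request and the status of its response."""
    audit_log.append(f"request: {request['prompt']}")
    response = call_next(request)
    audit_log.append(f"response: {response['status']}")
    return response


def policy_middleware(request: dict, call_next: Handler) -> dict:
    """Short-circuit requests that match a blocked term."""
    prompt = request["prompt"].lower()
    if any(term in prompt for term in BLOCKED_TERMS):
        return {"status": "blocked", "text": "request violates policy"}
    return call_next(request)


def fake_agent(request: dict) -> dict:
    """Stand-in for a real agent invocation."""
    return {"status": "ok", "text": f"handled: {request['prompt']}"}


pipeline = build_pipeline([logging_middleware, policy_middleware], fake_agent)
allowed = pipeline({"prompt": "Summarize last night's alerts"})
blocked = pipeline({"prompt": "Please rm -rf /var/lib"})
print(allowed["status"], blocked["status"])
```

Because logging sits outermost here, blocked requests still show up in the audit log, which is usually what a platform team wants.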

In short, the framework is trying to move agent systems closer to normal software operations: traced, policy-checked, and not tied to a single model provider.

Installation

Python setup is straightforward:

pip install agent-framework

A minimal Python example using Microsoft Foundry looks like this:

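from agent_framework.foundry import FoundryChatClient
from azure.identity import AzureCliCredential

# AzureCliCredential reuses your local `az login` session, which suits development.
credential = AzureCliCredential()

# Replace the placeholder endpoint and model with values from your own Foundry project.
client = FoundryChatClient(
    project_endpoint="https://your-foundry-service.services.ai.azure.com/api/projects/your-foundry-project",
    model="gpt-5.4-mini",
    credential=credential,
)

# Wrap the chat client as a named agent with fixed instructions.
agent = client.as_agent(
    name="HelloAgent",
    instructions="You are a friendly assistant. Keep your answers brief.",
)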

If you are building in .NET instead, the project also ships first-party packages for that stack.

Usage

The main design choice is when to use an agent and when to use a workflow.

Use an agent when the task is open-ended and benefits from tool use and planning. Use a workflow when the execution path should stay controlled and reviewable.

That makes Agent Framework a reasonable fit for cases like these:

  • AI-assisted incident triage with human approval before remediation
  • Multi-step runbook execution with typed routing between services
  • Internal copilots that need traces, policy hooks, and provider flexibility
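As a rough illustration of typed routing in a runbook, the following plain-Python sketch dispatches on the type of an incoming event. It is not the framework's workflow API. The event classes (DiskAlert, DeployFailure), the handlers, and the ROUTES table are hypothetical and exist only to show why a typed, explicit routing table is easier to review than free-form agent decisions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass(frozen=True)
class DiskAlert:
    host: str
    used_percent: float


@dataclass(frozen=True)
class DeployFailure:
    service: str
    version: str


Event = Union[DiskAlert, DeployFailure]


def handle_disk(event: DiskAlert) -> str:
    return f"storage-runbook: clean up {event.host} ({event.used_percent:.0f}% used)"


def handle_deploy(event: DeployFailure) -> str:
    return f"deploy-runbook: roll back {event.service} from {event.version}"


# The routing table is data, so it can be reviewed and tested like any other code.
ROUTES: Dict[type, Callable[..., str]] = {
    DiskAlert: handle_disk,
    DeployFailure: handle_deploy,
}


def route(event: Event) -> str:
    """Send an event to the handler registered for its exact type."""
    handler = ROUTES.get(type(event))
    if handler is None:
        raise TypeError(f"no route for {type(event).__name__}")
    return handler(event)


print(route(DiskAlert(host="db-1", used_percent=93.0)))
print(route(DeployFailure(service="checkout", version="v2.4.1")))
```

An unknown event type fails loudly with a TypeError instead of being handled by guesswork, which is the behavior you want on a remediation path.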

For SRE teams, that last point is the differentiator. Most agent demos stop at prompting. This project puts observability and workflow control much closer to the center.

Operational Tips

A few practical notes before adopting it:

  • Treat workflows as the boundary for approvals, retries, and rollback logic
  • Export OpenTelemetry data early so you can inspect latency and failure paths
  • Use a specific production credential, such as a managed identity, instead of broad development defaults like the AzureCliCredential in the example above, where possible
  • Keep agent prompts small and push deterministic logic into code or workflow nodes
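To show what inspecting latency and failure paths means in practice, here is a standard-library sketch that records a name, status, and duration for each step. It is a stand-in for illustration only and is not the OpenTelemetry API or the framework's telemetry integration; traced and spans are hypothetical names. With real OpenTelemetry export, the same information would come from spans emitted by the framework.

```python
import time
from contextlib import contextmanager

spans = []  # collected records, one per traced step


@contextmanager
def traced(name: str):
    """Record duration and success or failure of the wrapped block."""
    start = time.perf_counter()
    record = {"name": name, "status": "ok"}
    try:
        yield record
    except Exception as exc:
        record["status"] = "error"
        record["error"] = repr(exc)
        raise  # never swallow the failure, only annotate it
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        spans.append(record)


# A step that succeeds.
with traced("fetch_metrics"):
    time.sleep(0.01)

# A step that fails; the caller still sees the exception.
try:
    with traced("call_model"):
        raise TimeoutError("model did not respond")
except TimeoutError:
    pass

failed = [span["name"] for span in spans if span["status"] == "error"]
slowest = max(spans, key=lambda span: span["duration_ms"])["name"]
print("failed steps:", failed)
print("slowest step:", slowest)
```

Having this data from day one makes it possible to answer which step is slow and which step fails before the first production incident, not after it.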

The official overview also makes a good point: if a plain function can solve the problem, use a plain function. That is good engineering advice.

Conclusion

Microsoft Agent Framework looks promising because it speaks the language platform teams already care about: workflows, middleware, state, and telemetry. If you want to experiment with agentic systems without giving up operational control, it is worth a serious look.

You can explore the project on GitHub at microsoft/agent-framework and read the official documentation on Microsoft Learn.

If you are building AI operations workflows, Akmatori helps teams automate incident response and reduce noisy manual handling. If you need global infrastructure for those workloads, Gcore is worth a look as well.
