Formal Verification for AI-Generated Code: Why SRE Teams Should Care

Quick Reference:
```shell
# Static analysis with semgrep
semgrep --config auto ./src

# Type checking with mypy (Python)
mypy --strict ./src

# Property-based testing with hypothesis
pytest --hypothesis-profile=ci

# Formal verification with Dafny
dafny verify program.dfy
```
AI coding assistants now write significant portions of production code. GitHub reports that Copilot generates 46% of code in files where it's enabled. But here's the uncomfortable question: how do you verify code you didn't write and may not fully understand?
Traditional testing catches bugs you anticipate. Formal verification proves properties hold for all possible inputs. As AI-generated code scales, that distinction becomes critical.
The Verification Gap
Consider this AI-generated function:
```python
def calculate_discount(price: float, user_tier: str) -> float:
    if user_tier == "gold":
        return price * 0.8
    elif user_tier == "silver":
        return price * 0.9
    return price
```
Looks correct. Tests pass. Ships to production.
Three months later, someone discovers that a negative price produces a negative result, effectively paying the customer. The AI generated syntactically correct code that violated an implicit business invariant: prices must be non-negative.
This is the verification gap: the space between "code that works for test cases" and "code that's provably correct."
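The gap is easy to demonstrate. A quick randomized check in the spirit of property-based testing, using only the standard library (the function is reproduced from above), finds the negative-price violation immediately:

```python
import random

def calculate_discount(price: float, user_tier: str) -> float:
    # AI-generated function reproduced from above
    if user_tier == "gold":
        return price * 0.8
    elif user_tier == "silver":
        return price * 0.9
    return price

def check_invariants(trials: int = 10_000) -> list[tuple[float, str]]:
    """Sample random inputs and collect violations of the business
    invariant that a discounted price is never negative."""
    rng = random.Random(0)
    violations = []
    for _ in range(trials):
        price = rng.uniform(-1000, 1000)  # note: includes negative prices
        tier = rng.choice(["gold", "silver", "bronze"])
        if calculate_discount(price, tier) < 0:
            violations.append((price, tier))
    return violations

print(f"{len(check_invariants())} invariant violations found")
```

No test case a developer writes by hand will include a negative price unless someone thinks to, which is exactly the point: randomized inputs explore the space you didn't anticipate.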
Why AI Code Amplifies the Problem
AI-generated code has unique verification challenges:
| Challenge | Description |
|---|---|
| Volume | More code generated faster means more potential bugs at scale |
| Opacity | Developers may not deeply understand generated code |
| Edge cases | AI optimizes for common patterns, not corner cases |
| Implicit assumptions | AI doesn't know your business invariants |
| Confidence bias | "AI wrote it" can reduce scrutiny |
When a human writes code, they carry mental models of edge cases and constraints. AI models lack this context unless explicitly provided.
Levels of Verification
Verification exists on a spectrum:
Level 1: Static Analysis
Catch common bugs without running code:
```shell
# Python
ruff check ./src
mypy --strict ./src

# Go
go vet ./...
staticcheck ./...

# JavaScript/TypeScript
eslint --max-warnings 0 ./src
```
Static analysis catches type errors, unused variables, and common anti-patterns. It's fast and should run on every commit.
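Strict typing also lets you push constraints into the type system itself. One illustrative option (an addition of mine, not from the original example) is to constrain the tier with `typing.Literal`, so a typo like `"Gold"` becomes a mypy error instead of a silent fallthrough to no discount:

```python
from typing import Literal

# Restrict tiers to a closed set; mypy --strict rejects any other string
Tier = Literal["gold", "silver", "bronze"]

def calculate_discount(price: float, user_tier: Tier) -> float:
    # calculate_discount(100.0, "Gold") is now a type error, not a bug
    if user_tier == "gold":
        return price * 0.8
    elif user_tier == "silver":
        return price * 0.9
    return price

print(calculate_discount(100.0, "gold"))
```

This is a small example of moving a runtime failure mode into a class of bugs the checker eliminates entirely.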
Level 2: Property-Based Testing
Instead of specific test cases, define properties that must hold:
```python
from hypothesis import given, strategies as st

@given(st.floats(min_value=0), st.sampled_from(["gold", "silver", "bronze"]))
def test_discount_never_increases_price(price, tier):
    result = calculate_discount(price, tier)
    assert result <= price
    assert result >= 0
```
Hypothesis generates hundreds of random inputs, finding edge cases humans miss.
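The `--hypothesis-profile=ci` flag used in the Quick Reference refers to a named profile you register yourself, typically in `conftest.py`. A minimal sketch (the example count of 500 is an arbitrary choice, not a recommendation from Hypothesis):

```python
from hypothesis import HealthCheck, settings

# conftest.py: a "ci" profile that runs more examples than the
# default and tolerates slower generation on shared CI runners
settings.register_profile(
    "ci",
    max_examples=500,
    suppress_health_check=[HealthCheck.too_slow],
)
settings.load_profile("ci")

print(settings.default.max_examples)
```

Registering separate profiles lets you keep local runs fast while CI does the heavier exploration.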
Level 3: Design by Contract
Specify preconditions, postconditions, and invariants:
```python
from icontract import require, ensure

@require(lambda price: price >= 0, "Price must be non-negative")
@ensure(lambda result, price: 0 <= result <= price, "Discount must not exceed price")
def calculate_discount(price: float, user_tier: str) -> float:
    if user_tier == "gold":
        return price * 0.8
    elif user_tier == "silver":
        return price * 0.9
    return price
```
Contracts are checked at runtime (or statically with appropriate tools), catching violations immediately.
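If pulling in icontract is not an option, the same idea can be hand-rolled in a few lines of standard-library Python. This sketch mimics the decorator pattern above; the names and behavior are my own simplification, not icontract's API:

```python
import functools

def require(predicate, message):
    """Minimal precondition check on the first positional argument."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not predicate(args[0]):
                raise ValueError(f"Precondition violated: {message}")
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require(lambda price: price >= 0, "price must be non-negative")
def calculate_discount(price: float, user_tier: str) -> float:
    if user_tier == "gold":
        return price * 0.8
    return price

print(calculate_discount(100.0, "gold"))
try:
    calculate_discount(-5.0, "gold")
except ValueError as exc:
    print(exc)  # the bad input fails loudly at the boundary
```

The payoff is that a bad input fails at the function boundary with a named invariant, rather than propagating a wrong value downstream.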
Level 4: Formal Verification
Prove mathematically that code satisfies specifications:
```dafny
method CalculateDiscount(price: real, tier: string) returns (result: real)
  requires price >= 0.0
  ensures 0.0 <= result <= price
{
  if tier == "gold" {
    result := price * 0.8;
  } else if tier == "silver" {
    result := price * 0.9;
  } else {
    result := price;
  }
}
```
Dafny, TLA+, and similar tools mathematically prove your code meets its specification for all possible inputs.
Practical Implementation for SRE Teams
Full formal verification isn't practical for most codebases. Here's a pragmatic approach:
Tier 1: Everything (CI/CD Required)
```yaml
# .github/workflows/verify.yml
- name: Static Analysis
  run: |
    ruff check .
    mypy --strict src/
- name: Security Scan
  run: semgrep --config auto --error .
```
Tier 2: Critical Paths (Required for Merge)
```yaml
- name: Property Tests
  run: pytest tests/property/ --hypothesis-profile=ci
- name: Contract Verification
  run: pyicontract-hypothesis test -p src/billing/
```
Tier 3: Core Invariants (Weekly/Release)
```yaml
- name: Formal Verification
  run: |
    dafny verify src/core/*.dfy
    java -cp tla2tools.jar tlc2.TLC specs/consensus.tla
```
What to Verify
Focus verification efforts on code that:
- Handles money: Billing, payments, credits
- Controls access: Authentication, authorization
- Manages state: Distributed consensus, caches
- Affects safety: Rate limiting, circuit breakers
For AI-generated code specifically, add verification when:
- The generated code's edge-case behavior was never explicitly specified in the prompt
- You don't fully understand why the code works
- The code will run in production without human review
Tools by Language
| Language | Static | Property Testing | Contracts | Formal |
|---|---|---|---|---|
| Python | mypy, ruff | hypothesis | icontract | CrossHair |
| Go | go vet, staticcheck | rapid | - | - |
| Rust | clippy | proptest | contracts | Prusti |
| TypeScript | tsc, eslint | fast-check | io-ts | - |
| Java | SpotBugs | jqwik | cofoja | KeY |
AI-Specific Verification Patterns
Pattern 1: Specification-First Generation
Instead of generating code directly, have AI generate specifications first:
```text
Prompt: "Write a Dafny specification for a function that calculates shipping cost
based on weight and distance. Include all necessary preconditions and postconditions."
```
Then generate implementation that must satisfy the spec.
Pattern 2: Invariant Extraction
Ask AI to identify invariants in existing code:
```text
Prompt: "What invariants must hold for this billing calculation?
Express them as Python icontract decorators."
```
Pattern 3: Test Oracle Generation
Use AI to generate property-based test oracles:
```text
Prompt: "Generate Hypothesis strategies and properties for testing this
rate limiter implementation. Focus on edge cases around counter wraparound."
```
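As a sketch of the kind of property such a prompt might produce, here is a toy fixed-window limiter with a deliberately small 8-bit counter, checked with a plain randomized loop. The limiter and its bug are hypothetical illustrations of counter wraparound, not code from the original article:

```python
class FixedWindowLimiter:
    """Toy rate limiter whose counter wraps at 256 (8 bits),
    making the wraparound bug easy to exercise."""
    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0

    def allow(self) -> bool:
        self.count = (self.count + 1) % 256  # 8-bit wraparound
        return self.count <= self.limit

# Property: after the limit is reached, no further request in the
# same window may be allowed. Wraparound silently violates this.
limiter = FixedWindowLimiter(limit=10)
violations = 0
for i in range(1000):
    allowed = limiter.allow()
    if i >= 10 and allowed:
        violations += 1
print(f"{violations} requests wrongly allowed after the limit")
```

A unit test that sends 20 requests would never notice; the violation only appears once the counter wraps past 256, which is exactly the region a generated property pushes into.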
Verification in Production
Runtime verification catches what static analysis misses:
```python
from opentelemetry import trace

def calculate_discount(price: float, user_tier: str) -> float:
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("calculate_discount") as span:
        # Runtime invariant check
        if price < 0:
            span.set_attribute("invariant.violation", "negative_price")
            raise ValueError(f"Price invariant violated: {price}")
        result = _calculate_discount_impl(price, user_tier)
        # Post-condition check
        if result > price:
            span.set_attribute("invariant.violation", "discount_exceeds_price")
            raise ValueError(f"Discount invariant violated: {result} > {price}")
        return result
```
Combine with alerting to catch invariant violations in production before they cascade.
The Economics of Verification
Verification has costs:
- Time: Property tests are slower than unit tests
- Expertise: Formal methods require specialized knowledge
- Maintenance: Specifications must evolve with code
But so do bugs:
- Incident response: Hours to days of engineer time
- Customer impact: Revenue loss, trust erosion
- Regulatory risk: Compliance violations, fines
The math favors verification for critical paths. A billing bug that costs $100K to resolve justifies significant verification investment.
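A back-of-the-envelope expected-value calculation makes the tradeoff concrete. All numbers below are hypothetical placeholders chosen for illustration, not figures from the article:

```python
# Hypothetical inputs: annual probability of a critical billing bug
# escaping to production, its total resolution cost, and the
# engineering cost of adding verification to that path.
p_bug = 0.15               # chance per year of an escaped critical bug
incident_cost = 100_000    # resolution plus customer impact (USD)
verification_cost = 8_000  # engineer time for property tests + contracts

expected_loss = p_bug * incident_cost
print(f"Expected annual loss without verification: ${expected_loss:,.0f}")
print(f"Verification pays for itself: {expected_loss > verification_cost}")
```

Even with conservative probabilities, the expected loss on a money-handling path tends to dwarf the one-time cost of verifying it.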
Getting Started
- This week: Add static analysis to CI if missing
- This month: Implement property-based tests for one critical module
- This quarter: Add contracts to AI-generated code in billing/auth
- This year: Evaluate formal verification for core algorithms
Start small, measure bug escape rates, and expand verification where it proves valuable.
Conclusion
As AI generates more code, the verification gap widens. Traditional testing catches anticipated bugs; verification catches unanticipated ones.
You don't need to formally verify everything. But for code that handles money, controls access, or runs unsupervised, verification is becoming table stakes.
The question isn't whether to verify AI-generated code. It's which level of verification matches your risk tolerance.
Want AI agents that help verify and test your code? Akmatori can automate code review, run verification suites, and catch issues before they reach production.
