05.02.2026

Miniray: Lightweight Distributed Python Compute


Running compute-intensive Python workloads on a single machine quickly hits limits. Whether you are training ML models, processing large datasets, or running simulations, scaling to multiple nodes typically requires complex frameworks. Miniray takes a different approach: extreme simplicity backed by Redis.

What is Miniray?

Miniray is an open-source distributed compute library from comma.ai, the autonomous driving company. It dispatches arbitrary Python code to workers across a datacenter using Redis as the message broker. The API mirrors Python's built-in concurrent.futures, making adoption straightforward for anyone familiar with ThreadPoolExecutor or ProcessPoolExecutor.

The project powers comma.ai's internal ML training infrastructure, handling everything from data preprocessing to on-policy reinforcement learning rollouts across 600+ GPUs.

Key Features

  • Familiar API: Uses concurrent.futures patterns with submit(), map(), and as_completed().
  • Redis-backed: Task queuing and result delivery through Redis keeps the architecture simple and debuggable.
  • Zero configuration for workers: Workers pull tasks automatically based on job priority.
  • GPU support: Built-in integration with Triton Inference Server for efficient model serving.
  • Cgroup isolation: Tasks run in isolated cgroups with configurable memory limits and NUMA pinning.
  • Cloudpickle serialization: Send arbitrary Python functions without manual serialization code (sketched below).
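
To make the Redis-backed design concrete, here is a minimal sketch of how a task could round-trip through Redis with cloudpickle. The queue names and wire format are illustrative assumptions, not miniray's actual protocol:

import cloudpickle
import redis

r = redis.Redis()  # assumes a Redis instance on localhost:6379

def dispatch(func, arg, queue='tasks:demo'):
    # Serialize function and argument together; cloudpickle handles
    # lambdas and closures that the stdlib pickle module rejects.
    r.rpush(queue, cloudpickle.dumps((func, arg)))

def work_once(queue='tasks:demo', results='results:demo'):
    # Block until a task arrives, execute it, and push the result back.
    _key, payload = r.blpop(queue)
    func, arg = cloudpickle.loads(payload)
    r.rpush(results, cloudpickle.dumps(func(arg)))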

Installation

Clone the repository and install with pip:

git clone https://github.com/commaai/miniray.git
cd miniray
pip install -e .

You will need a Redis instance reachable from both the executor and the workers.
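
For local experiments, a throwaway Docker container works fine; 6379 is Redis's default port:

docker run -d --name miniray-redis -p 6379:6379 redis:7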

Usage

The executor API feels native to Python developers:

import miniray
from concurrent.futures import as_completed

def process_item(item):
    # Your compute-intensive logic here
    return item * 2

data = range(1000)

with miniray.Executor(job_name='batch_process') as executor:
    # Map style
    results = list(executor.map(process_item, data, chunksize=10))

    # Submit style with futures
    futures = [executor.submit(process_item, x) for x in data]
    for future in as_completed(futures):
        print(future.result())

For ML workloads, miniray supports batching with chunksize to reduce Redis round-trips and automatic function caching to avoid resending pickled code.
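
The function-caching trick is worth understanding on its own: hash the pickled function once, upload it under that hash only if it is not already stored, and let each worker memoize it locally. The sketch below shows the general technique under made-up key names; miniray's internals may differ:

import hashlib
import cloudpickle
import redis

r = redis.Redis()
_seen = {}  # per-worker cache of already-loaded functions

def publish_function(func):
    # Content-address the pickled bytes so identical functions share a key.
    blob = cloudpickle.dumps(func)
    key = 'func:' + hashlib.sha256(blob).hexdigest()
    r.set(key, blob, nx=True)  # upload only if no other client already has
    return key  # tasks can carry this short key instead of the full blob

def fetch_function(key):
    # Each worker downloads and unpickles a given function at most once.
    if key not in _seen:
        _seen[key] = cloudpickle.loads(r.get(key))
    return _seen[key]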

Operational Tips

  • Set appropriate chunksizes: Large chunksizes reduce overhead but increase latency for individual results.
  • Use job priorities: Higher priority jobs get scheduled first when workers become available.
  • Monitor with Redis: Task queues are visible in Redis, making debugging straightforward with redis-cli (see the sketch after this list).
  • Configure timeouts: Set limits.timeout_seconds to prevent runaway tasks from blocking workers.
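
Because all state lives in Redis, plain redis-cli is often enough for a first look. The key pattern below is a placeholder; check your deployment for the names miniray actually uses:

redis-cli --scan --pattern 'miniray:*'   # enumerate keys without blocking Redis
redis-cli LLEN some_queue_key            # pending tasks, if the queue is a list
redis-cli MONITOR                        # stream every command in real time (dev only)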

Conclusion

Miniray proves that distributed compute does not require heavyweight orchestration. If you need to parallelize Python workloads across multiple machines without the complexity of Dask or Ray, miniray offers a battle-tested alternative. It runs comma.ai's entire ML training pipeline, from data processing to model training.

