🌐 OmniDaemon
Universal Event-Driven Runtime Engine for AI Agents
Run any AI agent. Any framework. One event-driven control plane.
Created by Abiola Adeshina • From the team behind OmniCoreAgent
🌊 Why OmniDaemon Exists: The “Single Process” Trap
Most AI frameworks run everything in a single Python process. One crash kills your entire system.
The Problem
Frameworks like LangGraph, CrewAI, and AutoGen are great for building agent logic, but they run as a single process.
- ❌ One agent crashes? The entire process dies.
- ❌ Memory leak? Affects all agents in the process.
- ❌ No fault isolation: each agent shares the same memory space.
The OmniDaemon Solution: Process Isolation
OmniDaemon runs each agent in its own isolated process (like containers), managed by a Supervisor.
- ✅ Fault Isolation: If Agent A crashes, Agent B keeps running.
- ✅ Auto-Recovery: Supervisors automatically restart crashed agents.
- ✅ Resource Safety: Clean memory/CPU boundaries per agent.
- ✅ Production-Ready: Run Python agents (LangGraph, CrewAI, AutoGen, custom) with process isolation.
Each agent runs in its own “container” (a process) but shares the underlying host resources (CPU, memory). One agent’s crash doesn’t affect the others, and the orchestrator (the Supervisor) handles lifecycle management.
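The isolation principle described above can be sketched with plain OS processes. This is an illustration only, not OmniDaemon's actual Supervisor code: one process exits with an error while a sibling process keeps working.

```python
import subprocess
import sys

# Two "agents" as separate OS processes: a crash in one cannot take
# down the other, because they share no memory space.
fragile = subprocess.run(
    [sys.executable, "-c", "import sys; sys.exit(1)"],  # Agent A crashes
)
healthy = subprocess.run(
    [sys.executable, "-c", "print('Agent B still running')"],  # Agent B is unaffected
    capture_output=True,
    text=True,
)

print(fragile.returncode)      # non-zero: Agent A died
print(healthy.stdout.strip())  # Agent B's process kept working
```

Contrast this with a single-process framework, where the equivalent of `sys.exit(1)` in one agent would terminate every agent in the interpreter.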
👉 See the deep dive: OmniDaemon vs Other Frameworks
📚 Event-Driven Architectures: A Primer
In the early days, software systems were monoliths. Everything lived in a single, tightly integrated codebase. While simple to build, monoliths became a nightmare as they grew. Scaling was a blunt instrument: you had to scale the entire application, even if only one part needed it. This inefficiency led to bloated systems and brittle architectures that couldn’t handle growth.

Microservices changed this. By breaking applications into smaller, independently deployable components, teams could scale and update specific parts without touching the whole system. But this created a new challenge: how do all these smaller services communicate effectively? If we connect services through direct RPC or API calls, we create a giant mess of interdependencies. If one service goes down, it impacts all nodes along the connected path.

EDA solved the problem. Instead of tightly coupled, synchronous communication, EDA enables components to communicate asynchronously through events. Services don’t wait on each other — they react to what’s happening in real time. This approach made systems more resilient and adaptable, allowing them to handle the complexity of modern workflows. It wasn’t just a technical breakthrough; it was a survival strategy for systems under pressure.

⚠️ The Rise and Fall of Early Social Giants
The rise and fall of early social networks like Friendster underscore the importance of scalable architecture. Friendster captured massive user bases early on, but their systems couldn’t handle the demand. Performance issues drove users away, and the platform ultimately failed. On the flip side, Facebook thrived not just because of its features but because it invested in scalable infrastructure. It didn’t crumble under the weight of success — it rose to dominate.

Today, we risk seeing a similar story play out with AI agents. Like early social networks, agents will experience rapid growth and adoption. Building agents isn’t enough. The real question is whether your architecture can handle the complexity of distributed data, tool integrations, and multi-agent collaboration. Without the right foundation, your agent stack could fall apart just like the early casualties of social media.

🚀 The Future is Event-Driven Agents
The future of AI isn’t just about building smarter agents — it’s about creating systems that can evolve and scale as the technology advances. With the AI stack and underlying models changing rapidly, rigid designs quickly become barriers to innovation. To keep pace, we need architectures that prioritize flexibility, adaptability, and seamless integration. EDA is the foundation for this future, enabling agents to thrive in dynamic environments while remaining resilient and scalable.

🤝 Agents as Microservices with Informational Dependencies
Agents are similar to microservices: they’re autonomous, decoupled, and capable of handling tasks independently. But agents go further. While microservices typically process discrete operations, agents rely on shared, context-rich information to reason, make decisions, and collaborate. This creates unique demands for managing dependencies and ensuring real-time data flows. For instance, an agent might pull customer data from a CRM, analyze live analytics, and use external tools — all while sharing updates with other agents. These interactions require a system where agents can work independently but still exchange critical information fluidly.

EDA solves this challenge by acting as a “central nervous system” for data. It allows agents to broadcast events asynchronously, ensuring that information flows dynamically without creating rigid dependencies. This decoupling lets agents operate autonomously while integrating seamlessly into broader workflows and systems.

🔓 Decoupling While Keeping Context Intact
Building flexible systems doesn’t mean sacrificing context. Traditional, tightly coupled designs often bind workflows to specific pipelines or technologies, forcing teams to navigate bottlenecks and dependencies. Changes in one part of the stack ripple through the system, slowing innovation and scaling efforts. EDA eliminates these constraints. By decoupling workflows and enabling asynchronous communication, EDA allows different parts of the stack — agents, data sources, tools, and application layers — to function independently.

Take today’s AI stack, for example. MLOps teams manage pipelines like RAG, data scientists select models, and application developers build the interface and backend. A tightly coupled design forces all these teams into unnecessary interdependencies, slowing delivery and making it harder to adapt as new tools and techniques emerge. In contrast, an event-driven system ensures that workflows stay loosely coupled, allowing each team to innovate independently. Application layers don’t need to understand the AI’s internals — they simply consume results when needed. This decoupling also ensures AI insights don’t remain siloed. Outputs from agents can seamlessly integrate into CRMs, CDPs, analytics tools, and more, creating a unified, adaptable ecosystem.

⚡ Scaling Agents with Event-Driven Architecture
EDA is the backbone of this transition to agentic systems. Its ability to decouple workflows while enabling real-time communication ensures that agents can operate efficiently at scale. Platforms like Kafka exemplify the advantages of EDA in an agent-driven system:
- Horizontal Scalability: Distributed design supports the addition of new agents or consumers without bottlenecks, ensuring the system grows effortlessly.
- Low Latency: Real-time event processing enables agents to respond instantly to changes, ensuring fast and reliable workflows.
- Loose Coupling: By communicating through topics rather than direct dependencies, agents remain independent and scalable.
- Event Persistence: Durable message storage guarantees that no data is lost in transit, which is critical for high-reliability workflows.
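The loose coupling described above can be shown with a toy in-memory topic bus. This is an illustration of the pattern only (real deployments use Kafka, Redis Streams, etc. for durability and scale), and the topic and agent names are made up:

```python
from collections import defaultdict


class EventBus:
    """Minimal in-memory topic bus. Publishers and subscribers only share
    a topic name; they never reference each other directly."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The publisher does not know (or care) who is listening.
        for handler in self.subscribers[topic]:
            handler(event)


bus = EventBus()
results = []

# Two independent agents react to the same topic without knowing each other.
bus.subscribe("customer.updated", lambda e: results.append(("crm_agent", e["id"])))
bus.subscribe("customer.updated", lambda e: results.append(("analytics_agent", e["id"])))

bus.publish("customer.updated", {"id": 42})
print(results)  # both agents reacted to one event
```

Adding a third consumer is one more `subscribe` call; nothing about the publisher changes, which is exactly the loose-coupling property the bullets above describe.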
🎯 Event-Driven Agents Will Define the Future of AI
The AI landscape is evolving rapidly, and architectures must evolve with it. And businesses are ready. A Forum Ventures survey found that 48% of senior IT leaders are prepared to integrate AI agents into operations, with 33% saying they’re very prepared. This shows a clear demand for systems that can scale and handle complexity. EDA is the key to building agent systems that are flexible, resilient, and scalable. It decouples components, enables real-time workflows, and ensures agents can integrate seamlessly into broader ecosystems. Those who adopt EDA won’t just survive — they’ll gain a competitive edge in this new wave of AI innovation. The rest? They risk being left behind, casualties of their own inability to scale.

🎯 What is OmniDaemon?
“Kubernetes for AI Agents” - A universal runtime that makes AI agents autonomous, observable, and scalable.

OmniDaemon transforms AI from static reasoning engines into event-driven, self-operating entities that integrate seamlessly across clouds, data streams, and enterprise environments.
In 5 Seconds
- 🤖 Run AI agents in the background (not chatbots, not APIs)
- 📨 Event-driven (agents react to events, not HTTP requests)
- 🔌 Use any AI framework (OmniCore Agent, Google ADK, LangChain, or custom)
- 🚀 Production-ready (retries, DLQ, metrics, scaling built-in)
🌊 Why Event-Driven AI? The Evolution Story
The AI Evolution: Three Waves
AI has progressed through distinct waves, each unlocking new possibilities but also introducing critical limitations.

Wave 1: Predictive Models (Traditional ML)
The first wave focused on narrowly defined, domain-specific tasks.
- ❌ Domain-specific and rigid
- ❌ Required ML expertise for each use case
- ❌ Difficult to repurpose
- ❌ Lacked scalability
Wave 2: Generative Models (LLMs)
Generative AI revolutionized capabilities by training on vast, diverse datasets.
- ❌ Fixed in time (no dynamic information)
- ❌ Expensive to fine-tune
- ❌ No access to private/domain data
- ❌ Generic responses without context
“Recommend an insurance policy tailored to my health history, location, and financial goals.”

The LLM can’t deliver accurate recommendations because it lacks access to your personal data. Without it, responses are either generic or wrong.

Solution: Compound AI (RAG)
Retrieval-Augmented Generation bridges the gap:
- Retrieve user’s health and financial data from database
- Add data to context during prompt assembly
- LLM generates accurate, personalized response
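The three steps above can be sketched end to end. Everything here is a placeholder: the in-memory "database", the field names, and the `generate()` stub standing in for a real LLM call.

```python
# Toy RAG flow: retrieve -> assemble prompt -> generate.
USER_DB = {
    "alice": {"health": "asthma", "location": "Lagos", "goal": "low premiums"},
}


def retrieve(user_id: str) -> dict:
    # Step 1: pull the user's private data from a database.
    return USER_DB[user_id]


def assemble_prompt(question: str, context: dict) -> str:
    # Step 2: inject the retrieved data into the prompt context.
    facts = "; ".join(f"{k}: {v}" for k, v in context.items())
    return f"Context: {facts}\nQuestion: {question}"


def generate(prompt: str) -> str:
    # Step 3: stand-in for an LLM call; a real system would call a model API.
    return f"[LLM answer grounded in: {prompt.splitlines()[0]}]"


prompt = assemble_prompt("Recommend an insurance policy.", retrieve("alice"))
answer = generate(prompt)
print(answer)
```

The key point is step 2: the model only sees personal data that was retrieved and placed into the prompt at request time, which is what lets a fixed-in-time LLM give a personalized answer.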
Wave 3: Agentic AI (Current)
The future of AI lies with autonomous agents — systems that think, adapt, and act independently.
- ✅ Dynamic workflows (figure out next steps on the fly)
- ✅ Context-driven (adapt to the situation)
- ✅ Autonomous (no pre-defined paths needed)
- ✅ Tool use (access external systems)
- ✅ Memory (learn from past interactions)
“Agents are the new apps.” — Dharmesh Shah, HubSpot CTO
“We’ve reached the upper limits of what LLMs can do. The future lies with autonomous agents.” — Marc Benioff, Salesforce CEO (The Wall Street Journal, “Future of Everything” podcast)

Google’s Gemini and OpenAI’s Orion are reportedly hitting limits despite larger training datasets. The next breakthrough isn’t bigger models — it’s agentic systems.
🏗️ Why Agents Need Event-Driven Architecture
The Infrastructure Problem
AI agents aren’t just an AI problem — they’re a distributed systems problem. Agents need:
- 📊 Access to data from multiple sources
- 🔧 Ability to use tools and external systems
- 🤝 Communication with other agents
- 🌐 Outputs available to multiple services
- ⚡ Real-time responsiveness
- 📈 Horizontal scalability
The Tight Coupling Problem
You could connect agents via direct APIs and RPC calls, but that creates tight coupling: a web of point-to-point dependencies where one failing service cascades into the rest.

The Event-Driven Solution

Event-Driven Architecture (EDA) solves this through loose coupling: agents publish and consume events on shared topics instead of calling each other directly.

Agents as Microservices
Like microservices, agents are:
- Autonomous - Operate independently
- Decoupled - Don’t depend on each other
- Scalable - Add more instances for load
- Context-rich - Need shared information to reason
- Tool-enabled - Interact with external systems
- Collaborative - Share insights with other agents
- Adaptive - Modify behavior based on events
🚀 What OmniDaemon Provides
Traditional AI (Request-Driven): a client calls the model over HTTP and blocks until a response comes back.
OmniDaemon (Event-Driven): agents subscribe to topics and react as events arrive; no client waits on the line.
Core Features
| Feature | What It Means |
|---|---|
| 🤖 Run Any AI Agent | OmniCore Agent, Google ADK, LangChain, CrewAI, AutoGen, LlamaIndex, or custom |
| 📨 Event-Driven | Agents listen to topics, not HTTP endpoints |
| 🔄 Auto Retries | Failed tasks retry automatically (configurable) |
| 💀 Dead Letter Queue | Failed messages go to DLQ for analysis |
| 📊 Real-time Metrics | Tasks received, processed, failed, timing |
| 🎛️ Full Control | Beautiful CLI + HTTP API for management |
| ⚖️ Horizontal Scaling | Run multiple agent instances for load balancing |
| 🔌 Pluggable | Swap backends via environment variables (no code changes!) |
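The Auto Retries and Dead Letter Queue rows can be illustrated with a small retry loop. This is a conceptual sketch of the pattern; OmniDaemon's actual retry policy and DLQ message format are its own and are configurable.

```python
def process_with_retry(task, handler, max_retries=3, dlq=None):
    """Run handler(task); retry on failure, and after max_retries attempts
    route the task to a dead letter queue for later analysis."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return handler(task)
        except Exception as exc:
            last_error = exc  # remember why the last attempt failed
    if dlq is not None:
        dlq.append({"task": task, "error": str(last_error), "attempts": max_retries})
    return None


dlq = []
calls = {"n": 0}


def flaky_handler(task):
    calls["n"] += 1
    raise RuntimeError("downstream unavailable")


result = process_with_retry({"id": 7}, flaky_handler, max_retries=3, dlq=dlq)
print(result, len(dlq), calls["n"])  # None 1 3
```

The DLQ entry keeps the original task plus the failure reason, so a failed message is never silently dropped; an operator can inspect and replay it later.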
Pluggable Architecture
The Simple Truth: You provide the URL, OmniDaemon handles EVERYTHING else!

🎯 When to Use OmniDaemon
OmniDaemon is a distributed, event-driven runtime for AI agents and automation. It works seamlessly alongside HTTP, WebSockets, and SSE — and often powers the internal logic behind them.

✅ Perfect For
- Background AI Agents: autonomous agents reacting to events, triggers, or system signals.
- Event-Driven Workflows: multi-step pipelines coordinated through events.
- Distributed Multi-Agent Systems: sub-agents running across different servers, runtimes, or toolsets.
- Async & Long-Running AI Tasks: workloads that shouldn’t block a client request (analysis, ingestion, evaluation).
- Enterprise AI Ops: durable, observable, scalable systems with retries, logs, and monitoring baked in.
- Hybrid Real-Time + Background Work: use SSE/WebSockets for live streaming, while OmniDaemon handles internal agent events and orchestration.
❌ Overkill For (Simpler Alternatives Exist)
- Simple HTTP APIs — FastAPI/Flask are more straightforward.
- Pure Real-Time Chat Only — WebSockets/SSE alone give lower direct latency.
- Strict Synchronous Request→Response — REST/RPC is simpler when no async logic is involved.
- Single-Shot Scripts — A basic Python script is sufficient.
🆚 Compared to Alternatives
| Tool | Use Case | vs OmniDaemon |
|---|---|---|
| Celery | Task queues | ❌ Not AI-first, complex setup, no agent abstraction |
| AWS Lambda | Serverless functions | ❌ Cold starts, time limits, vendor lock-in |
| Temporal | Workflow engine | ❌ Heavy, complex, not AI-optimized |
| Airflow | DAG orchestration | ❌ Batch-oriented, not real-time events |
| OmniDaemon | AI Agent Runtime | ✅ AI-first, event-driven, any framework, production-ready |
🏗️ Architecture
Key Components
- Event Bus (Pluggable) - Message broker for event distribution
  - Currently: Redis Streams
  - Coming: Kafka, RabbitMQ, NATS
- Storage (Pluggable) - Persistent layer for agents, results, metrics
  - Currently: Redis, JSON
  - Coming: PostgreSQL, MongoDB, S3
- Agent Runner - Orchestrates agent execution and lifecycle
- CLI - Beautiful command-line interface (powered by Rich)
- API - RESTful HTTP API (powered by FastAPI)
- SDK - Python SDK for agent integration
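The "pluggable via environment variables" idea can be sketched as a backend registry keyed off one env var. The variable name `EVENT_BUS_BACKEND` and the URLs here are hypothetical, chosen for illustration; check OmniDaemon's configuration reference for the real keys.

```python
import os

# Hypothetical backend registry; Kafka/RabbitMQ/NATS are listed as "coming"
# in the components above, so only the shape matters here, not the values.
BACKENDS = {
    "redis-streams": "redis://localhost:6379",
    "kafka": "kafka://localhost:9092",
}


def select_event_bus():
    # One env var decides the broker; agent code never changes.
    name = os.environ.get("EVENT_BUS_BACKEND", "redis-streams")
    if name not in BACKENDS:
        raise ValueError(f"unknown event bus backend: {name}")
    return name, BACKENDS[name]


os.environ["EVENT_BUS_BACKEND"] = "kafka"
print(select_event_bus())
```

This is the design choice behind "swap backends, no code changes": agent logic depends on the event-bus abstraction, and deployment config picks the concrete broker.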
🏭 Production Mode: Agent Supervisors
Refactoring from “Simple Mode” to “Production Mode” is easy.
When to use Supervisors?
- Simple Mode (`sdk.register_agent`): Great for lightweight tasks, development, and simple logic. Runs in the main process.
- Supervisor Mode (`create_supervisor_from_directory`): REQUIRED for production AI agents. Runs in a separate process with auto-restart, crash protection, and full isolation.
How to implement
- Structure your agent as a package with its own entry point.
- Use the Supervisor in `agent_runner.py`.
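The auto-restart behavior Supervisor Mode provides can be sketched as a restart loop around a child process. This is a simplified stand-in for the idea, not `create_supervisor_from_directory` itself; the child "agent" below deliberately crashes on its first run and succeeds on the second.

```python
import os
import subprocess
import sys
import tempfile

# Child "agent" script: crashes on first run (no flag file yet), then succeeds.
AGENT = """
import os, sys
flag = sys.argv[1]
if not os.path.exists(flag):          # first run: simulate a crash
    open(flag, "w").close()
    sys.exit(1)
print("agent finished cleanly")       # second run: healthy
"""


def supervise(flag_path, max_restarts=3):
    """Run the agent in its own process; restart it after a crash."""
    for attempt in range(1, max_restarts + 2):
        proc = subprocess.run(
            [sys.executable, "-c", AGENT, flag_path],
            capture_output=True, text=True,
        )
        if proc.returncode == 0:
            return attempt, proc.stdout.strip()
    return attempt, None


with tempfile.TemporaryDirectory() as tmp:
    attempts, output = supervise(os.path.join(tmp, "crashed.flag"))

print(attempts, output)  # 2 agent finished cleanly
```

Because the agent lives in its own process, the supervisor survives the crash and can make the restart decision; that separation is what Supervisor Mode adds over Simple Mode's in-process registration.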
📚 See full examples in the Complete Examples guide.
🚀 Quick Start
Get OmniDaemon running in 5 minutes with production-ready process isolation:
1. Create your agent (my_first_agent/agent.py)
2. Export it from the package (my_first_agent/__init__.py)
3. Run it under a Supervisor (agent_runner.py)
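A minimal sketch of what step 1's agent core might look like. The handler signature and payload shape here are illustrative, not OmniDaemon's registration API; consult the SDK reference for the exact interface.

```python
# my_first_agent/agent.py (sketch): framework-agnostic agent logic is just a
# callable that takes an event payload and returns a result.

def handle_event(payload: dict) -> dict:
    """Summarize a text event. In a real agent, this is where you would call
    your framework of choice (OmniCore Agent, LangChain, CrewAI, etc.)."""
    text = payload.get("text", "")
    # Placeholder "summary": truncate long text; a real agent would use an LLM.
    summary = text[:40] + ("..." if len(text) > 40 else "")
    return {"status": "ok", "summary": summary}


result = handle_event(
    {"text": "OmniDaemon runs agents as isolated, supervised processes."}
)
print(result)
```

Keeping the handler a plain function like this is what makes the agent portable across frameworks: the runtime delivers events to it and collects its results, regardless of what happens inside.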
📚 What’s Next?
For New Users
- Getting Started - Understand core concepts
- Quick Start Tutorial - Build your first agent in 10 minutes
- Complete Examples - Real-world agent implementations
For Developers
- How-To Guides - Solve specific problems
- Common Patterns - Production-ready recipes
- API Reference - Complete SDK documentation
For Architects
- Architecture & Design - Deep dive into system design
- Enterprise - Use cases and deployment guide
🌟 Learn More
- Read the README - Comprehensive overview
- Explore Examples - Working code
- Join Community - Get help and contribute