Blitz

Autonomous AI Task Orchestration

An orchestration engine that breaks goals into task DAGs, executes them through AI agents, validates results, and recovers from failures automatically.


How it works

"Add OAuth2 with refresh token rotation"
Planner
LLM → task DAG
Task 1
Task 2
Task 3
Validator
build / test / typecheck
pass
commit checkpoint
open PR
fail
recovery planner
retry with new tasks

Task DAG with dependency resolution

Goals decompose into directed acyclic graphs executed in topological order. When the remaining tasks are all blocked by failed dependencies, deadlock detection skips them instead of hanging the run.
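The scheduling loop above can be sketched roughly as follows. This is an illustrative model, not the actual `conductor.js`/`planner.ts` code; the `Task` shape and function names are assumptions.

```typescript
// Hypothetical task shape for illustrating DAG scheduling.
type TaskId = string;

interface Task {
  id: TaskId;
  deps: TaskId[];
  status: "pending" | "running" | "done" | "failed";
}

// A task is ready when every dependency has completed successfully.
function readyTasks(tasks: Task[]): Task[] {
  const done = new Set(tasks.filter(t => t.status === "done").map(t => t.id));
  return tasks.filter(
    t => t.status === "pending" && t.deps.every(d => done.has(d))
  );
}

// Deadlock: pending work remains, but nothing is ready or running —
// everything left is stuck behind failed or missing dependencies.
function isDeadlocked(tasks: Task[]): boolean {
  const pending = tasks.filter(t => t.status === "pending");
  const running = tasks.some(t => t.status === "running");
  return pending.length > 0 && !running && readyTasks(tasks).length === 0;
}
```

Running `readyTasks` each tick gives topological order for free: a task only surfaces once its ancestors are done, and `isDeadlocked` is the signal to skip or replan the stuck remainder.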

Circuit breaker

Global and per-project circuit breakers open after consecutive failures and auto-reset after a cooldown, preventing cascade failures across the fleet.
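A minimal sketch of that open/cooldown/reset cycle, with made-up thresholds (the real `task-router.js` logic and its defaults may differ):

```typescript
// Illustrative circuit breaker: opens after N consecutive failures,
// rejects new work while open, auto-resets once the cooldown elapses.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private readonly threshold = 5,       // consecutive failures before opening
    private readonly cooldownMs = 60_000, // auto-reset window
  ) {}

  recordSuccess(): void {
    this.failures = 0; // any success resets the consecutive-failure count
  }

  recordFailure(now = Date.now()): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }

  // Open = reject new tasks for this scope (global or per-project).
  isOpen(now = Date.now()): boolean {
    if (this.openedAt === null) return false;
    if (now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: auto-reset and let traffic through again.
      this.openedAt = null;
      this.failures = 0;
      return false;
    }
    return true;
  }
}
```

Keeping one breaker per project plus one global instance means a single misbehaving repo trips only its own breaker, while a fleet-wide problem trips the global one.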

LLM-driven failure recovery

On task failure, the system gathers error context and file diffs, sends them to the planner for root-cause analysis, and injects recovery tasks into the DAG.
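The injection step can be sketched like this. All interfaces here are hypothetical (the real conductor and planner APIs differ); the point is how downstream tasks get rewired to depend on the recovery work instead of the failed task.

```typescript
// Hypothetical failure context handed to the recovery planner.
interface FailureContext {
  taskId: string;
  error: string; // captured stderr / validation output
  diff: string;  // file changes since the last checkpoint
}

interface RecoveryPlan {
  rootCause: string;
  tasks: { id: string; description: string; deps: string[] }[];
}

// In practice this is an LLM call; abstracted here as a function type.
type Planner = (ctx: FailureContext) => Promise<RecoveryPlan>;

async function recover(
  ctx: FailureContext,
  plan: Planner,
  dag: Map<string, string[]>, // taskId -> dependency ids
): Promise<void> {
  const recovery = await plan(ctx);
  // Inject the recovery tasks into the DAG.
  for (const t of recovery.tasks) dag.set(t.id, t.deps);
  const last = recovery.tasks.at(-1);
  if (!last) return;
  // Tasks that waited on the failed task now wait on the last recovery task.
  for (const [id, deps] of dag) {
    if (id !== last.id && deps.includes(ctx.taskId)) {
      dag.set(id, deps.map(d => (d === ctx.taskId ? last.id : d)));
    }
  }
}
```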

Git-based checkpointing

Creates a feature branch per goal, commits after each successful task, and opens a PR with a formatted description on completion.

Dual execution backends

Route tasks to either an embedded AI agent runtime or the Claude Code CLI. Configurable per project, with model and provider selection.
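A per-project routing sketch, assuming a config shape like the one below (the actual `agent-bridge.ts` interface is not shown in this page, so these names are illustrative):

```typescript
// Assumed per-project backend configuration.
type Backend = "embedded" | "claude-code";

interface ProjectConfig {
  backend: Backend;
  model?: string;    // e.g. a model identifier for the embedded runtime
  provider?: string; // e.g. "anthropic" or "openai"
}

// Both backends expose the same minimal contract to the conductor.
interface AgentRunner {
  run(taskPrompt: string): Promise<string>;
}

// The bridge just dispatches on the project's configured backend.
function selectRunner(
  cfg: ProjectConfig,
  runners: Record<Backend, AgentRunner>,
): AgentRunner {
  return runners[cfg.backend];
}
```

Because both backends satisfy the same `AgentRunner` contract, the DAG executor never needs to know which one a project uses.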

Structure

blitzcli/blitz
server/
├── services/
│   ├── planner.js            # Goal → task DAG
│   ├── conductor.js          # DAG execution loop
│   ├── task-router.js        # Priority queue + circuit breaker
│   ├── validation.js         # Build / test / typecheck
│   ├── project-manager.js    # Multi-project lifecycle
│   └── workers/
│       ├── anthropic-worker.js
│       ├── openai-worker.js
│       └── claude-code-worker.js
src/                          # React dashboard
blitzcli/                     # CLI
worker/                       # Remote worker agent
blitzcli/blitz-molt
extensions/fleet/
├── src/
│   ├── types.ts              # Full type system
│   ├── planner.ts            # LLM task decomposition
│   ├── conductor.ts          # Orchestration + recovery
│   ├── task-router.ts        # Circuit breaker
│   ├── validation-engine.ts
│   ├── agent-bridge.ts       # Dual backend
│   ├── state.ts              # Persistent state
│   └── events.ts             # 22 event types
extensions/voice-chat/
├── src/
│   ├── ws-server.ts          # WebSocket audio
│   ├── stt-client.ts         # Whisper STT
│   └── tts-client.ts         # Kokoro TTS

Why this exists

Tools like Claude Code are powerful, but they're still single-shot: you prompt, you wait, you review. What felt fundamentally missing was the ability to run agents continuously, 24/7. That's the real leverage: agents that manage themselves, break down tasks, recover from failures, and run to completion while you sleep.

The shift is that your job becomes managing agents, not doing the work. You define goals, the system decomposes them into task graphs, validates each step, and handles failures through LLM-driven recovery planning.

There was also a cost angle. Instead of racking up hundreds of raw API calls, you can launch Claude Code instances through the orchestrator: same capability, more control over spend.

The more speculative idea: if you can run these in parallel, you can explore multiple approaches to the same task simultaneously — like a DFS across alternative timelines, keeping the best outcome. Run five agents on the same problem, pick the one that passes all validations cleanly.

In practice, the first real application was a Slack integration. You message a channel, it spins up Claude Code instances that work on your codebase and open PRs — similar to what Devin does, but built on top of Claude Code as the execution backend.

Blitz bot completing a task in Slack