Open source · AGPL-3.0 · runs on your laptop in ~60s

The open-source framework for
agents that do the work.

BoringOS runs your agents as the CLI tools you already use — Claude Code, Codex, Gemini, Ollama — and wires them into tasks, workflows, memory, and a multi-tenant backend. Budgets, audit trails, and human approvals live in the execution path, not bolted on after.

The open-source agent framework built by Hebbs. One command boots the whole thing locally.

Run the full stack locally — paste into your coding agent (Claude Code · Cursor · Codex · Gemini CLI), in an empty folder
deploy boringos shell on my localhost
Prefer real commands?
$ git clone https://github.com/BoringOS-dev/boringos && cd boringos && pnpm install && pnpm dev

Boots embedded Postgres and serves the shell at localhost:3000. No Docker, no external services.

What the shell is

One command boots a full agentic OS.

The shell is the reference app that ships with the repo: a complete operating surface for your agents. You don't build any of it to start. Run the command above and this is live at localhost:3000.

your coding agent
boot log
✓ Cloned github.com/BoringOS-dev/boringos
✓ Installed packages · built workspace
✓ Embedded Postgres started on :54321
✓ Registered 9 built-in Modules
✓ Shell live at http://localhost:3000
Open the shell. Sign up. Start talking to your agents.

Agents org

A team of agents with roles, hierarchy, and delegation — already wired.

Copilot

A chat surface in every app that can run tools and edit your code.

Inbox

An email, Slack, and event triage queue that agents work from.

Workflows

Visual DAG runner over the same tools your agents call.

Drive

Tenant-isolated file storage with per-agent ACLs.

Budgets

Spend caps per tenant, agent, and task — enforced at runtime.

CRM

A full example app — deals, contacts, schema, UI — shipped as a Module.

Modules screen

Drop in new apps as signed bundles, live, no restart.

The three primitives

Three concepts. That's the whole model.

Every connector, every app, every built-in capability is made of these. Learn them once, and you can read — or write — any part of the system.

Skills

Behavior, in markdown.

Plain .md files concatenated into the agent's system prompt on every wake. Teach it when to use a tool, the edge cases, your house style. No templating — just words.

skills/deals.md → injected under ## Skills
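As a rough illustration of that concatenation, here is a minimal sketch. The `Skill` shape and `buildSkillsSection` are hypothetical names for this example, not the framework's API, and the files are assumed to be read from disk already.

```typescript
// Hypothetical shape: one entry per skills/*.md file, already loaded.
interface Skill {
  path: string;    // e.g. "skills/deals.md"
  content: string; // raw markdown, used verbatim (no templating)
}

// Join every non-empty skill body under a single "## Skills" heading.
function buildSkillsSection(skills: Skill[]): string {
  const bodies = skills
    .map((s) => s.content.trim())
    .filter((body) => body.length > 0);
  return ["## Skills", ...bodies].join("\n\n");
}

const section = buildSkillsSection([
  { path: "skills/deals.md", content: "Call crm.list_deals before quoting deal status.\n" },
  { path: "skills/style.md", content: "Keep replies under five sentences." },
]);
console.log(section);
```

Because the skills are plain text, changing agent behavior in this sketch means editing a markdown file, not code.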

Tools

Capability, with types.

Zod-typed callables dispatched at POST /api/tools/<module>.<name>. The same handler runs from agents, workflows, routines, or your own routes. Every call is audited.

crm.list_deals({ stage: "blocked" })
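Calling that route from your own code might look like the sketch below. Only the `POST /api/tools/<module>.<name>` path comes from the text above; the JSON body layout, the header, and the `buildToolRequest` helper are assumptions for illustration, and authentication is left out.

```typescript
// Hypothetical helper: describes the HTTP request for one tool call.
interface ToolRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildToolRequest(
  baseUrl: string,
  moduleId: string,
  toolName: string,
  input: unknown,
): ToolRequest {
  return {
    method: "POST",
    // Route shape taken from the text: /api/tools/<module>.<name>
    url: `${baseUrl}/api/tools/${moduleId}.${toolName}`,
    // Assumption: the tool input is sent as the JSON request body.
    headers: { "content-type": "application/json" },
    body: JSON.stringify(input),
  };
}

const toolRequest = buildToolRequest("http://localhost:3000", "crm", "list_deals", {
  stage: "blocked",
});
console.log(toolRequest.url);
```

The resulting object can be passed to any HTTP client, for example `fetch(toolRequest.url, toolRequest)`.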

Modules

Everything, bundled.

One manifest binds skills + tools + schema + workflows + routines + webhooks + OAuth + UI. Built-ins, third-parties, your own — all the same shape. app.module(x) wires the rest.

app.module(crmModule)
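To make the "one manifest" idea concrete, here is a minimal sketch of what such a manifest could hold. The `ModuleManifest` and `ToolDef` field names are illustrative guesses, not the types shipped in @boringos/module-sdk, and the deal data is sample data.

```typescript
// Hypothetical manifest shape; field names are illustrative only.
interface ToolDef {
  name: string;
  handler: (input: Record<string, unknown>) => unknown;
}

interface ModuleManifest {
  id: string;
  skills: string[];     // paths to .md skill files
  tools: ToolDef[];
  migrations: string[]; // schema changes applied on install
  workflows: string[];
  routines: string[];
  webhooks: string[];
}

// Sample data standing in for a real deals table.
const sampleDeals = [
  { name: "Acme", stage: "blocked" },
  { name: "Globex", stage: "blocked" },
  { name: "Example Co", stage: "won" },
];

const crmModule: ModuleManifest = {
  id: "crm",
  skills: ["skills/deals.md"],
  tools: [
    {
      name: "list_deals",
      handler: (input) => sampleDeals.filter((d) => d.stage === input.stage),
    },
  ],
  migrations: [],
  workflows: [],
  routines: [],
  webhooks: [],
};

// Tools are addressed as <module>.<name>, matching the dispatch route.
function toolIds(manifest: ModuleManifest): string[] {
  return manifest.tools.map((t) => `${manifest.id}.${t.name}`);
}

console.log(toolIds(crmModule));
```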
How it runs

From goal to done.

You set the goal. The agent figures out the rest. Every action is a Tool call. Every Tool call is audited.

1

Assign a goal

Create a task, assign it to an agent or a role. A comment on a task is a message; posting one wakes the agent.

2

The agent wakes

It reads its skills, the task, recent comments, and relevant memory — assembled fresh by the context pipeline.

3

It works in a CLI

The framework spawns Claude Code / Codex / Gemini / Ollama as a subprocess. Skills shape behavior; tools execute capability.

4

Guardrails hold

Budget tracked per run. High-risk steps pause for human approval. Every tool call lands in the audit log.

5

It reports back

The run's result auto-posts as a comment. Remaining work re-wakes the agent. Memory persists across runs.
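The guardrail step can be pictured as a loop over the tool calls in a run. This is a simplified sketch, not the framework's engine: `executeRun`, the `ToolCall` shape, the status names, and the `crm.delete_deal` tool are all hypothetical, and costs are kept in integer cents to avoid rounding issues.

```typescript
type RunStatus = "done" | "paused_for_approval" | "budget_exceeded";

interface ToolCall {
  tool: string;
  costCents: number;
  highRisk: boolean;
}

interface AuditRow {
  tool: string;
  outcome: "ok" | "blocked";
}

interface RunResult {
  status: RunStatus;
  spentCents: number;
  audit: AuditRow[];
}

// Process calls in order; stop at the first guardrail that trips.
function executeRun(calls: ToolCall[], budgetCents: number): RunResult {
  const audit: AuditRow[] = [];
  let spentCents = 0;
  for (const call of calls) {
    if (call.highRisk) {
      // High-risk step: record it and hand control to a human.
      audit.push({ tool: call.tool, outcome: "blocked" });
      return { status: "paused_for_approval", spentCents, audit };
    }
    if (spentCents + call.costCents > budgetCents) {
      audit.push({ tool: call.tool, outcome: "blocked" });
      return { status: "budget_exceeded", spentCents, audit };
    }
    spentCents += call.costCents;
    audit.push({ tool: call.tool, outcome: "ok" });
  }
  return { status: "done", spentCents, audit };
}

const runResult = executeRun(
  [
    { tool: "crm.list_deals", costCents: 10, highRisk: false },
    { tool: "crm.delete_deal", costCents: 20, highRisk: true }, // hypothetical tool
  ],
  100,
);
console.log(runResult.status, runResult.spentCents);
```

Every call, allowed or not, leaves an audit row in this sketch, which mirrors the claim that the controls sit in the execution path.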

Agents that delegate.

Agents have a reportsTo field. They break goals into subtasks, assign each to the right teammate, escalate when blocked, and hand off to humans cleanly.

Hierarchy & delegation

CEO sets the goal, CTO breaks it down, engineers execute, QA validates. A next_actor state machine routes work between agents and humans.
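One way to picture escalation is a walk up the reportsTo chain. The sketch below assumes only that each agent has a reportsTo field, as described above; the `Agent` shape, the `escalationPath` helper, and the agent ids are hypothetical.

```typescript
interface Agent {
  id: string;
  reportsTo: string | null; // null marks the top of the org
}

// Follow reportsTo from one agent to the top, guarding against cycles.
function escalationPath(agents: Map<string, Agent>, startId: string): string[] {
  const path: string[] = [];
  const seen = new Set<string>();
  let current: string | null = startId;
  while (current !== null && !seen.has(current)) {
    seen.add(current);
    path.push(current);
    const agent: Agent | undefined = agents.get(current);
    current = agent ? agent.reportsTo : null;
  }
  return path;
}

const org = new Map<string, Agent>([
  ["ceo", { id: "ceo", reportsTo: null }],
  ["cto", { id: "cto", reportsTo: "ceo" }],
  ["engineer", { id: "engineer", reportsTo: "cto" }],
  ["qa", { id: "qa", reportsTo: "cto" }],
]);

console.log(escalationPath(org, "engineer"));
```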

Shared memory

Every run builds context. Pluggable provider — Hebbs out of the box, or your own.

Budget-enforced

Cost tracked per run. Limits per agent, per task, per tenant. No runaway spend.

Workflows you can see.

A DAG that dispatches every node through the tool registry — the same handlers your agents call. Persisted runs, live SSE, replay, fork-from-here, budget gates, and pause-on-approval.

Tool registry as the backend

Each block resolves to a Tool — same Zod validation, same audit log, whether the call comes from an agent, a workflow, or a routine.

Live runs

Every block transition streams via SSE. Watch the DAG light up — no polling, no reconstruction.

Replay & fork

Re-execute past runs. Fork from any block. Compare two runs side by side.
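The idea of a DAG that dispatches each node through a tool registry can be sketched in a few lines. This is a simplified, synchronous illustration: `runDag`, the `Block` shape, and the `report.count` tool are hypothetical, and persistence, SSE, and approvals are left out.

```typescript
type Handler = (inputs: unknown[]) => unknown;

interface Block {
  id: string;
  tool: string;        // registry key, e.g. "crm.list_deals"
  dependsOn: string[]; // ids of upstream blocks
}

// Run blocks in dependency order; each block resolves to a registered tool.
function runDag(blocks: Block[], registry: Map<string, Handler>): Map<string, unknown> {
  const results = new Map<string, unknown>();
  let pending = blocks.slice();
  while (pending.length > 0) {
    const ready = pending.filter((b) => b.dependsOn.every((d) => results.has(d)));
    if (ready.length === 0) throw new Error("cycle or missing dependency");
    for (const block of ready) {
      const handler = registry.get(block.tool);
      if (!handler) throw new Error(`unknown tool: ${block.tool}`);
      // Upstream results are passed in dependsOn order.
      results.set(block.id, handler(block.dependsOn.map((d) => results.get(d))));
    }
    pending = pending.filter((b) => !results.has(b.id));
  }
  return results;
}

const registry = new Map<string, Handler>([
  ["crm.list_deals", () => ["Acme", "Globex"]],
  ["report.count", (inputs) => (inputs[0] as string[]).length], // hypothetical tool
]);

const dagResults = runDag(
  [
    { id: "count", tool: "report.count", dependsOn: ["deals"] },
    { id: "deals", tool: "crm.list_deals", dependsOn: [] },
  ],
  registry,
);
console.log(dagResults.get("count"));
```

Since blocks only name a registry key, the same handler serves a workflow node and an agent's tool call in this sketch.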

Batteries included

The controls are first-class.

Budgets, audit, runtime routing, and approvals aren't add-ons — they sit in the execution path. Here's what @boringos/core ships.

Budget enforcement

Spend caps by tenant, agent, or task. Hard stops or soft alerts. Cost — including Anthropic cache tokens — tracked per run, not estimated after.
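A minimal sketch of how hard stops and soft alerts could differ is shown below. The `Cap` shape and `checkBudget` function are assumptions for illustration, not the @boringos/core API, and amounts are integer cents.

```typescript
type Scope = "tenant" | "agent" | "task";

interface Cap {
  scope: Scope;
  limitCents: number;
  mode: "hard" | "soft"; // hard = stop the call, soft = alert only
}

interface Decision {
  allow: boolean;
  alerts: Scope[];
}

// Check the next call's cost against every cap before it runs.
function checkBudget(
  caps: Cap[],
  spentCents: Record<Scope, number>,
  nextCostCents: number,
): Decision {
  const alerts: Scope[] = [];
  let allow = true;
  for (const cap of caps) {
    const withinLimit = spentCents[cap.scope] + nextCostCents <= cap.limitCents;
    if (withinLimit) continue;
    if (cap.mode === "hard") allow = false;
    else alerts.push(cap.scope);
  }
  return { allow, alerts };
}

const caps: Cap[] = [
  { scope: "tenant", limitCents: 1000, mode: "hard" },
  { scope: "agent", limitCents: 100, mode: "soft" },
];
const spent: Record<Scope, number> = { tenant: 500, agent: 95, task: 0 };

console.log(checkBudget(caps, spent, 10));
```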

Audit ledger

Every tool call writes a row to tool_calls with actor, inputs, and outcome. Run transitions, comments, approvals — all on one timeline you can replay.

Runtime router

Route any task to Claude Code, Codex, Gemini CLI, Ollama, a raw command, or a webhook. Skills and tools stay stable while you swap the backend.

Human approvals

wait-for-human blocks pause a run and create an Actions-queue card. Approve, and execution resumes with your input merged in. Low-risk paths stay autonomous.
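The pause-and-resume behavior can be sketched as two pure state transitions. The `RunState` shape and the `pauseForHuman` and `approve` functions are hypothetical stand-ins; the real block presumably also persists the run and creates the queue card.

```typescript
type RunStatus = "running" | "waiting_for_human";

interface RunState {
  status: RunStatus;
  context: Record<string, unknown>;
  pendingCard: string | null; // text shown on the approval card
}

// Pause the run and record what the human is being asked.
function pauseForHuman(state: RunState, question: string): RunState {
  return { ...state, status: "waiting_for_human", pendingCard: question };
}

// Resume the run with the human's input merged into its context.
function approve(state: RunState, input: Record<string, unknown>): RunState {
  if (state.status !== "waiting_for_human") {
    throw new Error("run is not waiting for approval");
  }
  return {
    status: "running",
    context: { ...state.context, ...input },
    pendingCard: null,
  };
}

const started: RunState = { status: "running", context: { deal: "Acme" }, pendingCard: null };
const paused = pauseForHuman(started, "Approve a discount for Acme?");
const resumed = approve(paused, { discountPercent: 10 });
console.log(resumed);
```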

Multi-tenant by default

Sessions, invitations, team management, device auth — every domain row carries a tenantId. Two tenants never see each other's data.

Signed module installs

Third-party apps ship as Ed25519-signed .hebbsmod bundles. The host verifies, content-addresses, migrates schema, and registers tools on a live process.
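The verify and content-address steps can be illustrated with Node's built-in crypto module. This is a sketch under assumptions: the `verifyBundle` helper is hypothetical, the bundle is a stand-in byte string rather than a real .hebbsmod file, and SHA-256 is assumed as the content-address hash since the text does not name one.

```typescript
import { KeyObject, createHash, generateKeyPairSync, sign, verify } from "node:crypto";

interface VerifyResult {
  ok: boolean;
  contentAddress: string; // hex digest used as the bundle's address
}

// Check the Ed25519 signature, then derive a content address from the bytes.
function verifyBundle(bytes: Buffer, signature: Buffer, publicKey: KeyObject): VerifyResult {
  // For Ed25519, Node expects the algorithm argument to be null.
  const ok = verify(null, bytes, publicKey, signature);
  // Assumption: SHA-256 over the raw bundle bytes.
  const contentAddress = createHash("sha256").update(bytes).digest("hex");
  return { ok, contentAddress };
}

// Demo keys and bundle; a real publisher would hold the private key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const bundleBytes = Buffer.from("manifest + ESM entry + skills + migrations + UI");
const bundleSignature = sign(null, bundleBytes, privateKey);

const verified = verifyBundle(bundleBytes, bundleSignature, publicKey);
const tampered = verifyBundle(Buffer.from("tampered bytes"), bundleSignature, publicKey);
console.log(verified.ok, tampered.ok);
```

Any change to the bytes fails the signature check and also produces a different address, so a verified bundle and its address always refer to the same content.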

Runtimes — any agentic CLI is one
claude · Claude Code CLI
chatgpt · OpenAI Codex CLI
gemini · Google Gemini CLI
ollama · Local Ollama model
command · Any shell command
webhook · HTTP POST to a URL
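Routing across these six runtimes could look like the sketch below. The runtime ids and descriptions come from the list above; the `Dispatch` type and `route` function are hypothetical, and no CLI flags or binary names are assumed.

```typescript
type CliRuntime = "claude" | "chatgpt" | "gemini" | "ollama";
type RuntimeId = CliRuntime | "command" | "webhook";

type Dispatch =
  | { kind: "cli"; runtime: CliRuntime; label: string }
  | { kind: "command"; command: string }
  | { kind: "webhook"; method: "POST"; url: string };

const CLI_LABELS: Record<CliRuntime, string> = {
  claude: "Claude Code CLI",
  chatgpt: "OpenAI Codex CLI",
  gemini: "Google Gemini CLI",
  ollama: "Local Ollama model",
};

// Turn a runtime id (plus an optional target) into a dispatch plan.
function route(runtime: RuntimeId, target?: string): Dispatch {
  switch (runtime) {
    case "command":
      if (!target) throw new Error("command runtime needs a shell command");
      return { kind: "command", command: target };
    case "webhook":
      if (!target) throw new Error("webhook runtime needs a URL");
      return { kind: "webhook", method: "POST", url: target };
    default:
      // Remaining ids are the agentic CLIs.
      return { kind: "cli", runtime, label: CLI_LABELS[runtime] };
  }
}

console.log(route("claude"));
console.log(route("webhook", "https://example.com/hook"));
```

The task, skills, and tools never appear in this function, which is the point: swapping the backend changes only the dispatch plan.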
Built-in isolation

Autonomy without the blast radius.

Agents run with --dangerously-skip-permissions so they don't stop to ask. That's safe because every byte they read or write goes through Drive, the framework's proxy over the filesystem. Tenants can't see each other. Private files stay private.

drive namespace · tenant=<tenantId>
<tenantId>/                  # isolation root — you cannot escape it
├── shared/...               # tenant-wide   · agents read+write
├── users/<userId>/...       # PRIVATE       · agents denied
├── agents/<agentId>/...     # agent home    · own=rw · others=read-only
├── tasks/<taskId>/...       # deliverables  · tenant-shared
└── projects/<projectId>/... # long-running  · tenant-shared

Tenant isolation, structural

Every path is prefixed with the tenant id at the storage layer. No code path reads or writes outside the tenant root. Path traversal is rejected before the storage call.

User space is genuinely private

users/<id>/ returns "private — not accessible to agents" on every agent attempt. Not a permission you forgot to set — the default, in the type system, with a literal error string you can grep.

Per-agent ACLs

Agents read each other's working drafts (transparency by default) but can only write to their own agents/<id>/ folder. Cross-agent writes are rejected before storage.

Every byte goes through Drive

Reads, writes, lists, deletes — all through DriveManager. That means tenant scoping, ACL check, audit row, event fan-out, memory-sync index, every time. No side door.
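The rules above can be condensed into one access check. This is a sketch, not the code in drive-acl.ts: `resolveForAgent` and its result type are hypothetical, and only the "private" error string is taken from the text; the other error messages are made up for the example.

```typescript
type Op = "read" | "write";

type DriveResult =
  | { ok: true; storagePath: string }
  | { ok: false; error: string };

const KNOWN_ROOTS = ["shared", "users", "agents", "tasks", "projects"];

// Decide whether an agent may touch a tenant-relative path.
function resolveForAgent(tenantId: string, agentId: string, path: string, op: Op): DriveResult {
  const parts = path.split("/").filter((p) => p.length > 0);
  // Reject traversal before anything reaches storage.
  if (parts.some((p) => p === "..")) return { ok: false, error: "path traversal rejected" };
  const root = parts[0];
  const owner = parts[1];
  if (root === undefined || KNOWN_ROOTS.indexOf(root) === -1) {
    return { ok: false, error: "unknown drive root" };
  }
  // User space is denied to agents on every attempt.
  if (root === "users") return { ok: false, error: "private — not accessible to agents" };
  // Agents may read each other's folders but write only to their own.
  if (root === "agents" && op === "write" && owner !== agentId) {
    return { ok: false, error: "cross-agent write rejected" };
  }
  // Every stored path is prefixed with the tenant id.
  return { ok: true, storagePath: `${tenantId}/${parts.join("/")}` };
}

console.log(resolveForAgent("t1", "a1", "agents/a2/draft.md", "read"));
console.log(resolveForAgent("t1", "a1", "agents/a2/draft.md", "write"));
console.log(resolveForAgent("t1", "a1", "users/u1/notes.md", "read"));
```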

The agent has full power inside its lane. The lane is the load-bearing part. Source: @boringos/core/src/modules/drive-acl.ts.

9 Modules shipped in the box.

Everything you'd otherwise wire up yourself — already a Module, already installed. Same shape as the one you'll write next.

framework · core

Tasks, comments, agents, runs

memory · core

Pluggable cognitive memory

drive · core

File storage + ACL

workflow · core

DAG runner over the tool registry

inbox · core

Email/Slack/event triage queue

triage · capability

Routes new items to the right agent

copilot · module

Built-in chat surface for every app

google · connector

Gmail + Calendar via OAuth

slack · connector

Channels, DMs, slash commands

Open ecosystem

Extend the shell — without forking it.

A Module is a TypeScript file with a manifest. Bundle it into a .hebbsmod archive, upload it, and hosts install it per-tenant — same flow as a Chrome extension, just for agents.

01

Author

Plain TypeScript. Implement the Module interface. Skills as .md, tools as Zod-typed handlers.

02

Bundle

A .hebbsmod is a signed zip — manifest + ESM entry + skills + migrations + UI. ~100KB–2MB.

03

Upload

Drag it onto the shell's Apps screen. Ed25519 signature verified, bytes content-addressed.

04

Install

Tenants opt in. Schema applied, tools live at /api/tools/<id>.<name>, agents read the new skills next wake.

Every app ships with an AI copilot.

The copilot Module is built in. A chat surface that can call any registered tool and edit your code. Zero config, auto-provisioned per tenant.

Copilot
Show me all blocked deals
AI
Calling crm.list_deals
Found 2 blocked: Acme (waiting on legal), Globex (procurement hold).
Add a priority chart to the dashboard
AI
Edited app/page.tsx — added priority distribution chart.
+14 lines. Refresh to see the change.

14 packages on npm.

The framework underneath the shell. Modules sit on top; you usually only depend on @boringos/core and @boringos/module-sdk.

@boringos/core · Application host
@boringos/agent · Execution engine + dispatcher
@boringos/module-sdk · Module / Tool / Skill types
@boringos/runtime · CLI runtime adapters
@boringos/memory · Cognitive memory
@boringos/drive · File storage + ACL
@boringos/db · Postgres + Drizzle
@boringos/pipeline · Job queue
@boringos/ui · Typed client + React hooks
@boringos/shell · Reference UI shell
@boringos/connector-slack · Slack reference Module
@boringos/connector-google · Gmail + Calendar Module
@boringos/shared · Base types
create-boringos · CLI scaffold

Stop reading.
Run it.

One command. One minute. Your own agentic OS on localhost.

deploy boringos shell on my localhost
Building a fully custom agentic product instead? npx create-boringos my-app