A self-evolving agentic protocol

Persistent behavioral learning for agentic systems. File-native memory protocols. Production-grade instruction synthesis. Scoped, confidence-tracked, decay-aware.

1. Install ButtCore

2. Install FartOS

Core

Signal

Memory

Scope

Evolve

Reflect

Guard

Institutional intelligence

Agents that learn in real-time

Put it in front of ten thousand users. It reads every conversation. It finds what confuses people, where they give up, what actually works. Then it rewrites its own instructions based on what it found.

Week one: five hundred conversations. It notices that thirty percent of users get stuck at the same point and leave. No human caught that. It proposes a fix, runs it on a slice of traffic, the numbers improve, the fix becomes permanent. Nobody wrote a new prompt.

The agent handling your users next quarter is not the one that launched. It learned its way to something better. On its own.

fartos/README.md

# FartOS

A learning + coordination layer for AI agents. Feed it any JSON, and it learns
what's true, proves what works, and improves your agent's instructions over time.

**Models do the thinking. The framework does the bookkeeping.**

## What this is

Bolt FartOS onto any agent. It ingests any data (chat, metrics, on-chain, tickets),
lets models judge what it means (no string matching), remembers what's true (scoped,
never leaking), split-tests behavior with a multi-arm bandit, promotes the winners,
and flags a human when it's unsure. It runs fully offline out of the box — no API key,
no database — and every piece is swappable.

**Not a chatbot. Not a wrapper. A learning engine.**

## Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                        EVOLVE (main API)                         │
├──────────────────────────────────────────────────────────────────┤
│  MIDDLEWARE - the thinking (every judgment is a model call)      │
│  intent · sentiment · contradiction · distill · narrate · embed  │
├──────────────────────────────────────────────────────────────────┤
│  SUBSTRATE - the bookkeeping (deterministic)                     │
│  scope · lessons · semantic recall · confidence · bandit         │
├───────────────────────────┬──────────────────────────────────────┤
│  ORCHESTRATION            │  COMMUNICATION (actuation)           │
│  durable queue · workers  │  promote · escalate · webhook ·      │
│  agents · coordinator     │  code-edit (a broken sender          │
│                           │  never crashes the loop)             │
└───────────────────────────┴──────────────────────────────────────┘
```

## Install

```bash
npm install @buttfart_os/fartos
# optional: a hosted model provider
npm install @buttfart_os/provider-anthropic   # or @buttfart_os/provider-openai
```

`@buttfart_os/fartos` ships an in-memory store and a local embedder, so it runs with zero
external services.

## Quick start

```typescript
import { Evolve } from '@buttfart_os/fartos';
import { AnthropicProvider } from '@buttfart_os/provider-anthropic';

const evolve = new Evolve({
  provider: new AnthropicProvider({ apiKey: process.env.ANTHROPIC_API_KEY! }),
  defaultScope: { tenant: 'my-app' },
});

// 1) Ingest ANY JSON — non-blocking, no adapter required.
await evolve.ingest(
  { userMessage: 'please keep replies short and warm' },
  { scope: { tenant: 'my-app', user: 'user-123' }, type: 'feedback' },
);

// 2) A worker processes the queue: judge → reflect → reconcile → learn.
await evolve.drain();

// 3) Serve evolved instructions (base prompt + learned, active lessons).
const { instructions } = await evolve.getInstructions({
  scope: { tenant: 'my-app', user: 'user-123' },
  baseInstructions: 'You are a helpful customer support agent.',
});
```

Want it fully offline (tests, CI, demos)? Swap the provider for the built-in
`MockProvider` and use the default `LocalEmbedder` — no keys, deterministic output.
See `examples/closed-loop/run.js` for an end-to-end run.

## Split-test behavior (multi-arm bandit)

```typescript
const exp = await evolve.experiments.create({
  scope: { tenant: 'my-app' },
  hypothesis: 'a warmer tone lifts replies',
  arms: [
    { label: 'control', instruction: 'reply normally' },
    { label: 'warm', instruction: 'reply warmly and concisely' },
  ],
  minSamples: 100,
  primaryMetric: 'reply_rate',
  guardrails: [{ name: 'unsubscribe', lowerIsBetter: true, tolerance: 0.05 }],
});

const viewerId = 'viewer-123';  // hypothetical: any stable key identifying the viewer
const a = await evolve.experiments.assign(exp.id, viewerId);     // stable per viewer
await evolve.experiments.recordOutcome(exp.id, viewerId, {       // push your metric
  primary: 0.42,
  guardrails: { unsubscribe: 0.01 },
});

const why = await evolve.experiments.explain(exp.id);            // plain-language rationale
```

Real statistics decide the winner (Welford's online variance, Welch's t-test, and a
two-proportion z-test). A guardrail can veto a "win". The winner auto-promotes into
instructions and the losers retire.
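
As a rough illustration of the statistics named above, the sketch below keeps a running mean and variance with Welford's online algorithm and computes Welch's t statistic for two arms. `RunningStats` and `welchT` are hypothetical names for this example, not FartOS exports. Turning `t` and `df` into a p-value needs a Student's t CDF, which is omitted here.

```typescript
// Running mean and variance via Welford's online algorithm
// (single pass, numerically stable). Hypothetical helper, not a FartOS export.
class RunningStats {
  count = 0;
  mean = 0;
  private m2 = 0; // sum of squared deviations from the current mean

  push(x: number): void {
    this.count += 1;
    const delta = x - this.mean;
    this.mean += delta / this.count;
    this.m2 += delta * (x - this.mean);
  }

  // Unbiased sample variance; NaN for fewer than two samples.
  variance(): number {
    return this.count > 1 ? this.m2 / (this.count - 1) : NaN;
  }
}

// Welch's t statistic with Welch-Satterthwaite degrees of freedom.
// Assumes both arms have at least two samples and non-zero variance.
function welchT(a: RunningStats, b: RunningStats): { t: number; df: number } {
  const va = a.variance() / a.count;
  const vb = b.variance() / b.count;
  const t = (a.mean - b.mean) / Math.sqrt(va + vb);
  const df =
    (va + vb) ** 2 / (va ** 2 / (a.count - 1) + vb ** 2 / (b.count - 1));
  return { t, df };
}
```

Because the statistics update one observation at a time, an arm's raw outcomes never need to be stored to compare it against another arm.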

## API

```typescript
evolve.ingest(json, opts)            // Feed ANY JSON (non-blocking; enqueues)
evolve.drain(max?)                   // Worker: process queued events into lessons
evolve.queueDepth()                  // How many events are waiting
evolve.recall(query, opts)           // Semantic search over learned lessons
evolve.getInstructions(opts)         // Base prompt + active lessons (+ served arm)
evolve.promoteScope(lessonId, scope) // Share a corroborated lesson up-scope
evolve.maintain(opts)                // Evict weak lessons (cadence is yours)
evolve.registerActuator(actuator)    // Add an outbound channel (webhook, etc.)
evolve.recordedErrors()              // Inspect any degraded middleware calls

evolve.experiments.create(cfg)       // Start a multi-arm experiment
evolve.experiments.assign(id, key)   // Stable arm assignment for a key
evolve.experiments.recordOutcome(id, key, metrics)  // Push a success metric
evolve.experiments.explain(id)       // Read why a variant won (plain language)
```

## Core concepts

### Ingest — any data, no adapter
`ingest(json)` accepts any shape. A `context` object is first-class ("what this data
means"). Numeric/on-chain blobs are canonicalized into an embeddable surface, so even
data with no text is recallable. A math-only cost gate throttles high-volume numeric
streams so you don't pay for redundant model calls.
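
One way to picture canonicalization: flatten a JSON value into sorted `path=value` lines, so that key order no longer matters and purely numeric data still yields text an embedder can consume. This is a minimal sketch under that assumption; `canonicalize` here is a hypothetical helper, not the function FartOS ships.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Flatten any JSON value into "path=value" lines. Object keys are sorted,
// so structurally equal inputs always produce the same lines.
// Hypothetical helper for illustration only.
function canonicalize(value: Json, path = ''): string[] {
  if (value === null || typeof value !== 'object') {
    // Leaf value: a bare top-level primitive is reported under "$".
    return [`${path || '$'}=${String(value)}`];
  }
  const lines: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => lines.push(...canonicalize(item, `${path}[${i}]`)));
  } else {
    for (const key of Object.keys(value).sort()) {
      lines.push(...canonicalize(value[key], path ? `${path}.${key}` : key));
    }
  }
  return lines;
}
```

For `{ price: 1.25, pair: { quote: 'USD', base: 'ETH' }, ticks: [1, 2] }` this returns the lines `pair.base=ETH`, `pair.quote=USD`, `price=1.25`, `ticks[0]=1`, and `ticks[1]=2`, which can be joined into a single string before embedding.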

### Intelligence — model-driven, zero strings
Intent, sentiment, contradiction, distillation, and narration are each a model call
with a typed, validated output. Recall is semantic (vector similarity), never
`.includes()`. Every judgment unit is independently configurable — its model, role,
prompt, and on/off switch — via the `middlewares` config, not code. Sentiment is off by default.
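
To make the difference from substring matching concrete, here is a minimal sketch of similarity-based recall. The `StoredLesson` shape and the `cosine` and `recallTopK` helpers are assumptions for illustration, not FartOS types. In practice the vectors would come from an embedder.

```typescript
// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Hypothetical stored shape: lesson text plus its embedding.
interface StoredLesson {
  text: string;
  vector: number[];
}

// Rank lessons by similarity to the query vector and keep the k best.
function recallTopK(query: number[], lessons: StoredLesson[], k: number): StoredLesson[] {
  return lessons
    .map((lesson) => ({ lesson, score: cosine(query, lesson.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.lesson);
}
```

A query and a lesson can share no words at all and still match, as long as their embeddings point in a similar direction.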

### Memory — earned, scoped, decaying
Learned lessons gain confidence by **repetition**, climbing a ladder (`candidate →
testing → active`); only `active` lessons are served. Retrieval is by meaning. Weak
lessons are evicted. Everything is scoped `global > tenant > project > user > session
> agent` with zero cross-scope leakage — nothing becomes global by accident.

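The two serving rules above (only `active` lessons, no cross-scope leakage) can be sketched as a single filter. The `Lesson` and `Scope` shapes and the matching rule are assumptions made for this example: a lesson is treated as visible when every scope level it pins equals the same level on the request, and an empty scope stands in for `global`.

```typescript
type Stage = 'candidate' | 'testing' | 'active';

// Hypothetical scope shape; an empty object stands in for "global".
type Scope = {
  tenant?: string;
  project?: string;
  user?: string;
  session?: string;
  agent?: string;
};

interface Lesson {
  text: string;
  scope: Scope;
  stage: Stage;
}

const LEVELS: (keyof Scope)[] = ['tenant', 'project', 'user', 'session', 'agent'];

// Visible when every level the lesson pins matches the request exactly.
function visibleTo(lessonScope: Scope, request: Scope): boolean {
  return LEVELS.every(
    (level) => lessonScope[level] === undefined || lessonScope[level] === request[level],
  );
}

// Serve only lessons that are both active and visible in the request's scope.
function servable(lessons: Lesson[], request: Scope): Lesson[] {
  return lessons.filter((l) => l.stage === 'active' && visibleTo(l.scope, request));
}
```

Under this rule a lesson learned for `user-456` is never served to `user-123`, while a tenant-wide lesson reaches both.
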
### Experimentation — prove before trust
A multi-arm bandit (epsilon-greedy / UCB / Thompson; explore-vs-exploit is a dial)
with real statistics and guardrail vetoes. Winners auto-promote into instructions;
the agent can read why in plain language.
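
For readers new to bandits, the sketch below shows the simplest of the three policies, epsilon-greedy, with `epsilon` as the explore-vs-exploit dial. `ArmStats`, `chooseArm`, and `recordReward` are hypothetical names for this example and do not describe the FartOS bandit engine.

```typescript
// Hypothetical per-arm bookkeeping.
interface ArmStats {
  label: string;
  pulls: number;
  meanReward: number;
}

// Epsilon-greedy: with probability epsilon pick a uniformly random arm (explore),
// otherwise pick the arm with the best observed mean (exploit).
// Assumes at least one arm; `random` is injectable so the choice can be tested.
function chooseArm(
  arms: ArmStats[],
  epsilon: number,
  random: () => number = Math.random,
): ArmStats {
  if (random() < epsilon) {
    return arms[Math.floor(random() * arms.length)];
  }
  return arms.reduce((best, arm) => (arm.meanReward > best.meanReward ? arm : best));
}

// Incremental mean update after observing one reward for an arm.
function recordReward(arm: ArmStats, reward: number): void {
  arm.pulls += 1;
  arm.meanReward += (reward - arm.meanReward) / arm.pulls;
}
```

Raising `epsilon` shifts traffic toward exploration. UCB and Thompson sampling replace the random draw with an upper confidence bound or a posterior sample, respectively.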

### Communication — act and escalate
Decisions emit typed effects routed to pluggable actuators: promote into the
instruction store (pull), notify a webhook/email (your sender), or apply a code edit
(your writer). The framework never touches the network or filesystem itself, and a
broken sender can never crash the loop. Low-confidence judgments escalate to a human.
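
The "broken sender can never crash the loop" guarantee amounts to isolating each delivery. A minimal sketch, assuming hypothetical `Effect` and `Actuator` shapes (not the FartOS types) and synchronous senders for brevity; with async senders the same `try`/`catch` would wrap an `await`.

```typescript
// Hypothetical shapes for illustration only.
interface Effect {
  type: string;
  payload: unknown;
}

interface Actuator {
  handles: string; // the effect type this actuator accepts
  send(effect: Effect): void;
}

// Deliver each effect to every matching actuator. A sender that throws is
// recorded and skipped; the loop continues with the remaining deliveries.
function dispatch(effects: Effect[], actuators: Actuator[]): string[] {
  const errors: string[] = [];
  for (const effect of effects) {
    for (const actuator of actuators) {
      if (actuator.handles !== effect.type) continue;
      try {
        actuator.send(effect);
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        errors.push(`${effect.type}: ${reason}`);
      }
    }
  }
  return errors;
}
```

Returning the collected errors instead of throwing lets the caller log or escalate failures without interrupting learning.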

### Persistence — in-memory by default, swap to durable
Out of the box everything is in-memory. To persist, implement the small `LessonRepo`,
`VectorStore`, `QueueStore`, and `BanditStore` interfaces and pass them in. Back the
queue and stores with a shared service to run multiple workers. See `EXTENDING.md`.

## Packages

| Package | Description |
|---------|-------------|
| `@buttfart_os/fartos` | The learning engine (in-memory defaults, local embedder, runs offline) |
| `@buttfart_os/provider-anthropic` | Claude provider |
| `@buttfart_os/provider-openai` | GPT provider |

> No adapter packages needed — `ingest(json)` takes raw data of any shape directly.
> Persistence is bring-your-own: implement `LessonRepo` / `VectorStore` / `QueueStore` /
> `BanditStore` to swap the in-memory defaults for a durable backend (see `EXTENDING.md`).

## Extending

Bring your own data, storage, model, metric, and sender — swap any piece without
forking core. See **`EXTENDING.md`** for: custom middleware, custom storage, a custom
outbound channel, and pushing a metric. See **`DECISIONS.md`** for the design
rationale, and **`GOAL-TESTS.md`** for the capability → test coverage map.

## License

MIT

Explorer (83 files)

- `fartos/`: `DECISIONS.md`, `EXTENDING.md`, `GOAL-TESTS.md`, `README.md`, `package.json`, `package-lock.json`, `tsconfig.base.json`
- `fartos/examples/closed-loop/`: `package.json`, `run.js`
- `fartos/packages/core/`: `package.json`, `tsconfig.json`, `vitest.config.ts`
- `fartos/packages/core/src/`: `evolve.ts`, `index.ts`, `types.ts`
- `fartos/packages/core/src/bandit/`: `bandit-engine.ts`, `explain.ts`, `index.ts`, `memory-bandit-store.ts`, `policies.ts`, `stats.ts`, `types.ts`, `welford.ts`
- `fartos/packages/core/src/communication/`: `actuator-registry.ts`, `actuators.ts`, `escalation-evaluator.ts`, `index.ts`, `types.ts`
- `fartos/packages/core/src/exports/`: `core.ts`, `evolve.ts`, `memory.ts`, `scope.ts`, `signal.ts`
- `fartos/packages/core/src/ingest/`: `index.ts`, `json-ingest.ts`, `judgment-gate.ts`
- `fartos/packages/core/src/memory/`: `canonicalize.ts`, `index.ts`, `lesson-repo.ts`, `reconciler.ts`, `scope-enforcer.ts`, `vector-store.ts`
- `fartos/packages/core/src/middleware/`: `base-middleware.ts`, `builtins.ts`, `embed-middleware.ts`, `index.ts`, `registry.ts`, `types.ts`
- `fartos/packages/core/src/orchestration/`: `agent-registry.ts`, `coordinator.ts`, `index.ts`, `queue.ts`
- `fartos/packages/core/src/testing/`: `clock.ts`, `index.ts`, `local-embedder.ts`, `mock-provider.ts`
- `fartos/packages/core/src/utils/`: `confidence.ts`, `scope.ts`, `time.ts`
- `fartos/packages/core/tests/`: `actuation.test.ts`, `bandit.test.ts`, `confidence.ladder.test.ts`, `confidence.test.ts`, `evolve.e2e.test.ts`, `ingest.test.ts`, `middleware.test.ts`, `orchestration.test.ts`, `recall.test.ts`, `reconcile.test.ts`, `scope.test.ts`, `units.test.ts`
- `fartos/packages/providers/anthropic/`: `package.json`, `tsconfig.json`
- `fartos/packages/providers/anthropic/src/`: `anthropic-provider.ts`, `index.ts`
- `fartos/packages/providers/base/`: `package.json`, `tsconfig.json`
- `fartos/packages/providers/base/src/`: `base-provider.ts`, `index.ts`
- `fartos/packages/providers/openai/`: `package.json`, `tsconfig.json`
- `fartos/packages/providers/openai/src/`: `index.ts`, `openai-provider.ts`

buttcore/README.md

# ButtCore

Drop-in markdown files that teach AI agents to learn and evolve from your habits.

## What This Is

ButtCore is a set of markdown files you drop into any project directory. As you work with any AI agent — Claude, GPT, Cursor, Copilot, Codex, Gemini — the system observes, learns, distills, and updates itself.

It maintains a living `SOUL.md` file: a continuously evolving instruction set that reflects your preferences, communication style, technical decisions, corrections, and patterns.

**No database. No API keys. No configuration. Just markdown and intelligence.**

## Install

```bash
npx buttcore init
```

This creates:
- `~/.buttcore/` (global, shared across all projects)
  - `SOUL.md` — Living identity and preferences
  - `PATTERNS.md` — Cross-project patterns
  - `MISTAKES.md` — Process failure lessons
  - `LEXICON.md` — Your personal vocabulary
- `.buttcore/` (local to this project)
  - `PROJECT.md` — Project-specific context
  - `SESSION_LOG.md` — Rolling session history
  - `CORRECTIONS.md` — Correction tracking
- `AGENTS.md` — Cross-tool bridge file

## How It Works

1. **Agent reads** ButtCore files at session start
2. **You work** with the agent normally
3. **When corrected**, the agent logs to CORRECTIONS.md
4. **When patterns emerge** (3+ similar corrections), the Distillation Process triggers
5. **Distilled knowledge** flows to the appropriate file (SOUL.md, MISTAKES.md, LEXICON.md, PATTERNS.md)
6. **Next session**, the agent is better
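
The trigger in step 4 can be sketched as a simple count. In ButtCore the agent itself judges which corrections are "similar" while reading `CORRECTIONS.md`; the `topic` label below is a stand-in for that judgment, and `Correction` and `topicsReadyToDistill` are hypothetical names, not part of the ButtCore CLI.

```typescript
// Hypothetical in-memory view of entries logged to CORRECTIONS.md.
interface Correction {
  topic: string; // stand-in for the agent's own similarity judgment
  note: string;
}

// "3+ similar corrections" from the steps above.
const DISTILL_THRESHOLD = 3;

// Count corrections per topic and return the topics that reached the threshold.
function topicsReadyToDistill(corrections: Correction[]): string[] {
  const counts: Record<string, number> = Object.create(null);
  for (const c of corrections) {
    counts[c.topic] = (counts[c.topic] || 0) + 1;
  }
  return Object.keys(counts)
    .filter((topic) => counts[topic] >= DISTILL_THRESHOLD)
    .sort();
}
```

Topics below the threshold stay in the corrections log as individual incidents until more evidence arrives.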

The learning is real. Not "I'll try harder" — structural, persistent, versioned knowledge that survives across sessions, projects, and even different AI tools.

## Commands

| Command | What it does |
|---------|-------------|
| `buttcore init` | Initialize ButtCore in current project |
| `buttcore onboard` | Interactive questionnaire to pre-populate SOUL.md |
| `buttcore status` | Show current learning state |
| `buttcore compact` | Compress old session logs |
| `buttcore export` | Export all knowledge to a single file |
| `buttcore doctor` | Validate file integrity |

## Philosophy

### Global vs Local

ButtCore enforces a strict scope boundary:

- **Global** (`~/.buttcore/`): Who you are. Applies everywhere.
- **Local** (`.buttcore/`): What this project is. Stays here.

Project-specific data never leaks into global files. Global knowledge is abstract and generalizable.

### Confidence Escalation

Not every observation becomes a rule:

| Signal | Where it goes | Confidence |
|--------|--------------|------------|
| 1 occurrence | SESSION_LOG.md | — |
| 2 occurrences | SOUL.md | low |
| 5 occurrences | SOUL.md | medium |
| 10+ occurrences | SOUL.md | high |
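
Read as code, the table above is a threshold lookup. This is a minimal sketch with one labeled assumption: counts that fall between rows (3 to 4, 6 to 9) keep the lower tier. `placementFor` is a hypothetical helper, not a ButtCore command.

```typescript
type Confidence = 'low' | 'medium' | 'high';

interface Placement {
  file: 'SESSION_LOG.md' | 'SOUL.md';
  confidence: Confidence | null; // null: logged only, no confidence assigned
}

// Map an occurrence count to its destination file and confidence level.
// Assumption: counts between the table's rows stay at the lower tier.
function placementFor(occurrences: number): Placement {
  if (occurrences >= 10) return { file: 'SOUL.md', confidence: 'high' };
  if (occurrences >= 5) return { file: 'SOUL.md', confidence: 'medium' };
  if (occurrences >= 2) return { file: 'SOUL.md', confidence: 'low' };
  return { file: 'SESSION_LOG.md', confidence: null };
}
```

For example, `placementFor(1)` stays in `SESSION_LOG.md` with no confidence, while `placementFor(12)` lands in `SOUL.md` at `high`.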

### The Agent Learns From Its Mistakes

MISTAKES.md doesn't just record what the user wants — it records what the agent got wrong and why. "I defaulted to complex when the user meant simple" is more valuable than "user prefers simple."

## Works With

Any AI agent that can read files in your project:
- Claude Code / Claude Desktop
- Cursor (reads AGENTS.md + .cursor/rules/)
- GitHub Copilot (reads AGENTS.md)
- OpenAI Codex
- Windsurf
- Aider / OpenCode / Goose
- Any custom agent with filesystem access

## License

MIT

Personal intelligence

Agents that evolve and adapt to you

It reads how you work. Your corrections. The things you rebuild. The preferences you reveal by changing what it made. It writes all of it down.

Day one, it rebuilds the same thing four times before getting it right. Day thirty, it gets it right first. Not because it was reconfigured. Because it watched long enough to understand how you think.

Everything it learns lives inside your environment, invisible to everyone else. An evolving intelligence trained entirely on one person. You.

Signal

Reads you, constantly

Corrections. Frustration. Conversations that stop before they finish. Signal the system was already producing that nobody was reading.

Memory

Keeps the lesson, not the log

Three similar corrections become one principle. The incidents fade. What they meant stays, attached to a confidence score that grows each time it is confirmed.

Instinct

Rewrites its own instructions

The instructions it runs on are not static. When a pattern clears the evidence threshold, the instructions update. No new deployment. No human intervention.

Scope

Knows exactly where it is

What it learns about one user stays with that user. What it learns in one context never leaks into another. Isolated at the architecture level.

Evidence

Demands proof

One bad interaction changes nothing. A single correction is noise. It waits for a pattern to repeat across enough independent interactions before it trusts itself.

Control

Asks before it commits

The changes that matter most surface as proposals, evidence attached. Nothing high-stakes applies itself.

What we built

Two Artifacts

One learns a single person. Corrections, patterns, ways of thinking. The longer it runs alongside them, the closer it gets to knowing what they want before they ask.

The other scales to a population. Point it at a product with thousands of users and it watches all of them. Spots what they struggle with. Tests solutions. Updates the product around what it found.

Same architecture underneath. The only variable is what you aim it at.

Begin

It's already learning. The only question is whose.

Give it something live to watch. A system already running, conversations already happening. Come back in thirty days.
