Skip to content

AI agent operates, human verifies

One of Folio’s stated goals is to delegate sheet operations to an AI coding agent (Claude Code, Cursor, Codex, Claude Desktop, …) and make it easy for a human to verify the result and share the sheet. The mechanics for this exist today; this page wires them together into a single recipe so you can see the loop end-to-end on a real example sheet.

The recipe uses examples/research-memory because it ships an agent-audience skill that already exercises the loop. Adapt the principles to any of your own sheets.

The loop in one diagram

┌──────────────┐ ┌─────────────────┐
│ AI agent │ Bash tool, `folio …` │ sheet on disk │
│ (Claude Code,│ ─────────────────────────▶ │ contract + │
│ Cursor, …) │ (default path) │ records + │
│ │ │ provenance │
│ │ MCP (when remote / no │ │
│ │ shell / MCP-native │ │
│ │ runtime) │ │
│ │ ──folio-mcp serve ───────▶ │ │
└──────────────┘ └────────┬────────┘
┌─────────────────┐
│ folio-viewer │
│ (human reviews,│
│ inline edits, │
│ audits hist.) │
└────────┬────────┘
tar czf sheet.tgz
(ship to anyone)

Two ways the agent reaches Folio. The default is the top arrow: the agent uses its built-in Bash tool, runs folio verbs, and folio-agent-skills (via skills.sh) provides the SKILL.md files that tell it which verbs to call. The MCP path is the second arrow and is the right call when the agent has no shell access, runs in an MCP-native runtime, or talks to Folio across a network.

Three guarantees Folio gives you make this work:

  1. Every write records its actor. agent:enrichment, human:alice — whatever string the writer presented to Folio is in provenance.jsonl.
  2. Derivations record their inputs. The input_hash lets a reviewer re-derive the exact value the agent saw, or audit when it last changed.
  3. The sheet is portable. Cache and runtime live outside the directory (ADR‑0008), so tar czf sheet.tgz my-sheet/ is a complete copy — no environment, no DB.

1. Connect an agent to the sheet

For an agent running locally on the developer’s machine, the canonical path is CLI + skills.sh. Reach for MCP only when shell access isn’t available (see the recap at the end of this section).

Default — folio CLI + skills.sh (Claude Code / Cursor / Cline / Codex / …)

The agent already has a Bash tool. Install the folio-agent-skills pack and the agent picks up SKILL.md files that describe which CLI verbs to invoke for each task:

Terminal window
# Project-scoped install (recommended for Folio work)
npx skills add nyuta01/folio
# Or pick a single skill
npx skills add nyuta01/folio --skill folio-quickstart

skills.sh symlinks the files into your runtime’s discovery path (.claude/skills/, .cursor/skills/, etc., across 18+ agents). From that point on the agent — Claude Code, Cursor, Cline, Codex, Windsurf, Aider, Copilot, Gemini, … — discovers the skill, reads it, and shells out to folio directly:

Terminal window
folio upsert ./research-memory --actor agent:research-triage --file findings.jsonl
folio materialize ./research-memory --actor agent:research-triage
folio provenance ./research-memory f_001 status

Every write carries --actor agent:*, which Folio records in provenance.jsonl. No daemon, no extra configuration — and you can re-run any command from your own shell to reproduce what the agent did. See folio-agent-skills for the bundled product-knowledge skills (folio-quickstart, add-derivation-ai, add-derivation-cross-sheet, debug-failed-materialize).

Per-sheet skills under <sheet>/skills/ (e.g. the research-memory:triage-candidates example) are the same idea at sheet scope — folio skill list <sheet> / folio skill show <sheet> <name> surface them at the CLI, and the agent invokes the steps via the same Bash tool.

When MCP earns its keep

Use folio-mcp instead when one of these is true:

  • The agent is remote or runs in a sandbox without shell access — folio-mcp serve --transport http --bind 0.0.0.0 --port 3001 puts Folio behind an MCP/HTTP endpoint.
  • The agent runs in an MCP-native runtime with no shell tool (Claude Desktop’s built-in MCP surface, custom orchestrators built directly on the MCP SDK).
  • The same long-lived agent session performs thousands of writes and the subprocess-startup cost per CLI call becomes the bottleneck.
  • You’re exposing Folio to another service as a structured-tool API rather than scraping stdout.

If you’re connecting Claude Desktop specifically, see Deploy folio-mcp for the JSON config snippet.

2. Let the agent run a packaged skill

A skill is the agent’s operating procedure — a markdown file under <sheet>/skills/<name>.md with YAML frontmatter declaring its audience and which SDK methods it intends to use. See Sheet skills for the file format.

The bundled triage skill is a good example:

Terminal window
$ folio skill show ./examples/research-memory triage-candidates

The skill walks the agent through: pull pending candidates → decide verify/reject/hold → upsert the updates as agent:research-triage → re-check provenance to confirm the write landed.

Per-sheet skills are sibling to the product-knowledge skills in folio-agent-skills: the npm package teaches the agent how to use Folio, the per-sheet skills teach the agent how to operate a specific dataset. Both are plain markdown that the agent reads and follows.

If you’re going the MCP route instead, the same per-sheet skill body also surfaces as a prompt named <sheet-id>:<skill-name> (here research-memory:triage-candidates) via the standard prompts/list

  • prompts/get MCP methods.

3. The writes go through provenance

Every write the agent performs appends one line to provenance.jsonl:

{"record_id":"f_001","field":"status","source":"human",
"actor":"agent:research-triage","at":"2026-05-11T12:00:00Z",
"input_hash":""}

source: "human" because the value came from a write (the provenance source field reflects how the value was produced, not who did it — the actor is recorded separately).

Derivation outputs look different:

{"record_id":"f_001","field":"domain","source":"python",
"actor":"agent:research-triage","at":"2026-05-11T12:00:05Z",
"input_hash":"sha256:ce82…"}

The non-empty input_hash lets a reviewer re-derive the value deterministically: feed the same url to the same script, get the same domain.

4. Human opens the Viewer to verify

Terminal window
folio serve ./research-memory --port 3000 --actor human:alice
# → http://127.0.0.1:3000/

What the human sees:

  • Records grid — every row the agent wrote, with cell-level provenance dots (python for derived, human for written).
  • History timeline — click a cell, see the append-only history: agent wrote → who → when → from what input hash.
  • Inline edit — if the human disagrees, edit the cell. The Viewer writes a new provenance line with source: "human" and actor: human:alice. Subsequent agent materialize runs will respect that override (respect_human_override defaults to true; see Materialize lifecycle).

The human’s edits and the agent’s writes coexist on the same audit log; nobody can silently overwrite anyone else.

5. Ship the sheet

Folio sheets are designed to travel:

Terminal window
tar czf research-memory-$(date +%Y%m%d).tgz research-memory/

The recipient untars, runs folio validate, and gets the same provenance history the human reviewer saw — no DB, no environment, no vendor lock-in.

For machine-readable interop (BI tools, dbt, data warehouses), Folio ships a Frictionless Data Package export:

Terminal window
folio export datapackage ./research-memory --stdout > datapackage.json

What is and isn’t validated today

StepStatus
Agent writes via MCP using agent:* actor✓ covered by tests/test_mcp.py
Materialize records actor, input_hash, model, cost_usd✓ covered by tests/test_provenance.py
respect_human_override blocks agent rewrites of human-edited cells✓ same test file
Skills surface as MCP prompts✓ covered by tests/test_mcp.py (prompts/list, prompts/get)
Sheet stays portable (cache + runtime outside the dir)✓ enforced by ADR‑0008
End-to-end “agent runs skill → human opens Viewer” smokeCurrently the offline scripts/_mcp_smoke.py validates the agent side; the human-Viewer side is not automated. Track as a known gap.

Adapt to your own sheet

The pattern generalises:

  1. Author a contract that declares which fields are agent-derived (x-derived) versus human-editable (x-editable-by).
  2. Add at least one skill under skills/ with audience: agent and explicit tools:. Skills are how the agent learns what’s safe to do.
  3. Run folio serve whenever you want to inspect what the agent did.
  4. Tar the sheet to share it.

If you’re brand-new, start with Quickstart with folio init to scaffold a runnable sheet, then come back here when you want to hand the loop over to an agent.

See also