# `ai` derivation

The `ai` kind uses an LLM to fill derived fields. Folio's implementation
routes every call through an `AIClient` Protocol: the
`AnthropicClientAdapter` is the only module that imports the `anthropic` SDK
(ADR-0009), and a deterministic `StubAIClient` ships for offline tests and
demos.
## Minimal example

```yaml
targets: [industry_tag]
inputs: [company_name]
kind: ai
model: claude-sonnet-4-6
prompt: |
  Industry of {{ company_name }} in one word.
output: text
```

When `folio materialize` reaches a record, Folio:
- Resolves the `prompt` template (substituting `{{ field }}` placeholders).
- Computes `input_hash` over the inputs and the resolved prompt body.
- Cache hit → done. Cache miss → calls the `AIClient`, writes the value,
  and appends a provenance line that includes `model` and `cost_usd`.
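The exact hashing scheme is not spelled out here; as a rough sketch of the
caching step, assuming SHA-256 over a canonical JSON encoding of the inputs
plus the resolved prompt text (the function name and encoding details are
assumptions, not Folio's actual internals):

```python
import hashlib
import json

def input_hash(inputs: dict, resolved_prompt: str) -> str:
    """Hypothetical sketch: hash the input field values and the
    resolved prompt body so either change invalidates the cache."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    h = hashlib.sha256()
    h.update(canonical.encode("utf-8"))
    h.update(resolved_prompt.encode("utf-8"))
    return h.hexdigest()

# Editing the prompt (or any input value) changes the hash,
# which is what forces a fresh AIClient call.
a = input_hash({"company_name": "Acme"}, "Industry of Acme in one word.")
b = input_hash({"company_name": "Acme"}, "Industry of Acme in two words.")
assert a != b
```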
## Fields

| Field | Required | Notes |
|---|---|---|
| `targets` | yes | Fields this derivation writes. |
| `inputs` | yes | Fields whose values the prompt may read. |
| `kind` | yes | Always `ai`. |
| `model` | yes | Any string the `AIClient` accepts. The default adapter passes it through to Anthropic. |
| `prompt` | one of | Inline template with `{{ field }}` placeholders. |
| `prompt_ref` | one of | Path (relative to the sheet) of a markdown / text file. Mutually exclusive with `prompt`. |
| `output` | no | `text` (default) or `json`. |
| `output_schema` | when multi-target | `{name: type}` map describing the expected JSON keys. |
| `materialization` | no | See below. |
## prompt_ref

Use a separate file when the prompt is more than a few lines:

```yaml
prompt_ref: prompts/enrich-industry.md
```

The file lives inside the sheet (so it travels in the tarball). The file's
content goes into the `input_hash`, so editing the prompt invalidates the
cache.
## `output: json` and `output_schema`

For multi-target derivations, the model must return a JSON object whose keys
match `output_schema`. The default adapter sends a structured-output hint to
the model; the `StubAIClient` honours `output_schema` literally.

```yaml
targets: [industry, employee_count]
inputs: [company_name, country]
kind: ai
model: claude-sonnet-4-6
prompt_ref: prompts/enrich.md
output: json
output_schema:
  industry: string
  employee_count: integer
```

If the model returns an extra key, Folio writes only the keys in
`output_schema` and ignores the rest. If a declared key is missing or null
in the reply, the field stays null and the derivation logs a failure.
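That key-filtering rule can be sketched as follows (`apply_output_schema`
is a hypothetical helper written for illustration, not Folio's actual
internal name):

```python
import json

def apply_output_schema(raw: str, output_schema: dict) -> tuple[dict, list]:
    """Sketch of the behaviour described above: keys outside
    output_schema are dropped; missing or null keys are left as None
    and reported as failures."""
    reply = json.loads(raw)
    values, failures = {}, []
    for key in output_schema:
        if reply.get(key) is None:
            values[key] = None
            failures.append(key)
        else:
            values[key] = reply[key]
    return values, failures

values, failures = apply_output_schema(
    '{"industry": "Manufacturing", "extra": 1, "employee_count": null}',
    {"industry": "string", "employee_count": "integer"},
)
# "extra" is ignored; employee_count stays null and is logged as a failure.
```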
## materialization

Optional sub-block tuning the loop:

```yaml
materialization:
  respect_human_override: true   # default
  retries: 0                     # default
  retry_delay_seconds: 1.0
```

Retries apply to `AIClient` errors only; deterministic content errors (a
missing `output_schema` key) do not retry.
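In outline, the retry rule might look like this (`AIClientError` and
`call_with_retries` are illustrative names, not part of Folio's API):

```python
import time

class AIClientError(Exception):
    """Transport-level failure from the AI client (hypothetical name)."""

def call_with_retries(call, retries: int = 0, retry_delay_seconds: float = 1.0):
    """Sketch of the rule above: only AIClient errors retry;
    anything else (e.g. a content/schema error) surfaces immediately."""
    for attempt in range(retries + 1):
        try:
            return call()
        except AIClientError:
            if attempt == retries:
                raise
            time.sleep(retry_delay_seconds)
```

A content error such as a missing `output_schema` key would be raised as a
different exception type and therefore never enters the retry loop.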
## Prompt template syntax

A tiny `{{ field }}` substitution. Whitespace inside the braces is allowed:
`{{ company_name }}` and `{{company_name}}` both work. The substituted value
is the JSON-encoded form of the field: strings get quotes, integers stay
bare, arrays become bracket lists. This keeps prompts safe against fields
that contain quotes or newlines.
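A minimal sketch of that substitution, assuming a single regex pass (the
real implementation may differ):

```python
import json
import re

def render_prompt(template: str, fields: dict) -> str:
    """Replace each {{ field }} placeholder with the JSON-encoded
    value of the named field, as described above."""
    def sub(match: re.Match) -> str:
        return json.dumps(fields[match.group(1)])
    # \s* allows whitespace inside the braces.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", sub, template)

# Embedded quotes in the field value come out escaped, so the
# surrounding prompt text stays well-formed.
print(render_prompt("Industry of {{ company_name }}?", {"company_name": 'Acme "Q"'}))
```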
## Cost reporting

Every successful `ai` call appends `cost_usd` to its provenance line. Folio
keeps a small `PRICE_TABLE_USD` in `_ai_kind.py` for known models; unknown
models report `cost_usd: null` rather than fabricating a price (ADR-0009).
The `total_cost` on the materialize envelope is the sum of all known costs
in the run.
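The lookup rule can be sketched as below. The table entry, its per-token
fields, and the function signature are all placeholders for illustration,
not Folio's real table or real prices:

```python
# Hypothetical price table: per-million-token rates for known models.
PRICE_TABLE_USD = {
    "example-model": {"input_per_mtok": 3.0, "output_per_mtok": 15.0},
}

def cost_usd(model: str, input_tokens: int, output_tokens: int):
    """Known models get a computed cost; unknown models return None
    (rendered as cost_usd: null) rather than a fabricated price."""
    prices = PRICE_TABLE_USD.get(model)
    if prices is None:
        return None
    return (input_tokens * prices["input_per_mtok"]
            + output_tokens * prices["output_per_mtok"]) / 1_000_000
```

Summing `total_cost` then simply skips the `None` entries, so a run mixing
known and unknown models still reports the costs it can verify.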
## Running offline with the StubAIClient

For tests and the offline materialize smoke, inject a `StubAIClient`:

```python
from folio import open_sheet
from folio._ai_kind import StubAIClient

stub = StubAIClient()
stub.prepare("Industry of Acme", "Manufacturing")  # canned response

sheet = open_sheet("./customers", actor="agent:demo")
result = sheet.materialize(ai_client=stub)
```

`StubAIClient.prepare(substring, value)` matches by substring against the
resolved prompt. A fallback responder lets you return canned shapes for any
prompt:

```python
stub = StubAIClient(default_responder=lambda prompt: {"industry": "Unknown"})
```

## When to use `ai` vs other kinds
- Use `ai` when the answer is fundamentally fuzzy: classifying free text,
  summarizing, extracting from messy inputs.
- Use `python` or `sql` when the answer is deterministic. The cache is
  cheaper, the result is reproducible, the test story is trivial, and
  there's no API key to manage.
- Use `import` when the answer already exists in a local file.
A common pattern is a deterministic `python` / `sql` first pass to handle the
easy cases, with an `ai` fallback for the long tail.
## Where to next

- AIClient SDK reference: Protocol, adapter, stub.
- Custom AI provider: implement the Protocol against any model.
- provenance.jsonl: what an `ai` provenance line looks like, including
  `cost_usd` and `model`.