agent-schema.json
30 additions & 0 deletions
@@ -398,6 +398,10 @@
     "$ref": "#/definitions/HooksConfig",
     "description": "Lifecycle hooks for executing shell commands at various points in the agent's execution"
   },
+  "cache": {
+    "$ref": "#/definitions/CacheConfig",
+    "description": "Optional response cache: when the same user question is asked again, replay the previous answer instead of calling the model."
+  },
   "skills": {
     "description": "Enable skills discovery for this agent. Set to true to load all discovered skills from local filesystem sources; false disables skills. A list can mix sources (\"local\" or an HTTP/HTTPS URL) and/or skill names to include. If only names are given, local sources are loaded and filtered to just those skills.",
     "oneOf": [
@@ -480,6 +484,32 @@
     },
     "additionalProperties": false
   },
+  "CacheConfig": {
+    "type": "object",
+    "description": "Configuration for the agent's response cache. When enabled, the assistant response produced for a given user question is stored and replayed verbatim the next time the same question is asked, skipping the model entirely. Two normalization options control what 'same question' means: case_sensitive (default false) requires an exact-case match when true (matching is case-insensitive otherwise), and trim_spaces (default false) strips leading/trailing whitespace before comparison when true. Set 'path' to persist entries to a JSON file (relative paths resolve against the agent config directory); leave it empty to keep entries in memory only.",
+    "properties": {
+      "enabled": {
+        "type": "boolean",
+        "description": "Set to true to enable the cache. When false (or when the cache section is omitted), no caching is performed.",
+        "default": false
+      },
+      "case_sensitive": {
+        "type": "boolean",
+        "description": "When true, questions must match exactly (including case) to hit the cache. Default: false (case-insensitive matching).",
+        "default": false
+      },
+      "trim_spaces": {
+        "type": "boolean",
+        "description": "When true, leading and trailing whitespace is stripped from questions before they are compared. Default: false.",
+        "default": false
+      },
+      "path": {
+        "type": "string",
+        "description": "Path to a JSON file used to persist cache entries across runs. Relative paths are resolved against the agent's config directory. When empty, the cache lives only in memory."
+      }
+    },
+    "additionalProperties": false
+  },
   "HooksConfig": {
     "type": "object",
     "description": "Lifecycle hooks configuration for an agent. Hooks allow running shell commands at various points in the agent's execution lifecycle.",
docs/configuration/agents/index.md
47 additions & 0 deletions
@@ -48,6 +48,11 @@ agents:
     structured_output: # Optional: constrain output format
       name: string
       schema: object
+    cache: # Optional: response cache (skip the model on repeat questions)
+      enabled: boolean
+      case_sensitive: boolean
+      trim_spaces: boolean
+      path: string
 ```
 
 <div class="callout callout-tip" markdown="1">
@@ -83,6 +88,7 @@ agents:
 | `handoffs` | array | ✗ | List of agent names this agent can hand off the conversation to. Enables the `handoff` tool. See [Handoffs Routing]({{ '/concepts/multi-agent/#handoffs-routing' | relative_url }}). |
 | `hooks` | object | ✗ | Lifecycle hooks for running commands at various points. See [Hooks]({{ '/configuration/hooks/' | relative_url }}). |
 | `structured_output` | object | ✗ | Constrain agent output to match a JSON schema. See [Structured Output]({{ '/configuration/structured-output/' | relative_url }}). |
+| `cache` | object | ✗ | Response cache. When the same user question is asked again, the previous answer is replayed verbatim and the model is not called. See [Response Cache](#response-cache) below. |

The response cache short-circuits the model when the same user question is asked again. The first time a question is asked, the agent calls the model normally and stores the assistant's reply. Subsequent identical questions skip the model entirely and replay the stored reply verbatim.

| `enabled` | boolean | `false` | Master switch. When `false` (or when the `cache` section is omitted), no caching is performed. |
| `case_sensitive` | boolean | `false` | When `true`, questions must match exactly (including case) to hit the cache. |
| `trim_spaces` | boolean | `false` | When `true`, leading and trailing whitespace is stripped from the question before it is compared. |
| `path` | string | _empty_ | When set, cache entries are persisted to a JSON file at the given path and reloaded on startup so the cache survives restarts. Relative paths resolve against the agent config directory. When empty, the cache lives in memory only. |

**How it works**

- The cache key is the latest user message in the session, normalized according to `case_sensitive` and `trim_spaces`.
- On a hit, the cached reply is added to the session as the assistant message and stop hooks fire normally; the rest of the agent (tools, sub-agents, the model) is bypassed.
- On a miss, the agent runs normally; the final assistant message produced by the first stop of the run is then stored under the question's key.
- Only the response to the original user question of a run is cached; follow-up turns inside the same `RunStream` are not.
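
The hit/miss flow described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the project's code: `ResponseCache`, `answer`, and `call_model` are hypothetical names, and using `casefold()` for case-insensitive matching is an assumption (the real implementation may fold case differently).

```python
from typing import Callable, Dict, Optional


class ResponseCache:
    """Hypothetical in-memory question -> reply cache."""

    def __init__(self, case_sensitive: bool = False, trim_spaces: bool = False) -> None:
        self.case_sensitive = case_sensitive
        self.trim_spaces = trim_spaces
        self._entries: Dict[str, str] = {}

    def key(self, question: str) -> str:
        """Normalize a question according to the two options."""
        if self.trim_spaces:
            question = question.strip()
        if not self.case_sensitive:
            question = question.casefold()  # assumption: how case is folded
        return question

    def lookup(self, question: str) -> Optional[str]:
        return self._entries.get(self.key(question))

    def store(self, question: str, reply: str) -> None:
        self._entries[self.key(question)] = reply


def answer(cache: ResponseCache, question: str, call_model: Callable[[str], str]) -> str:
    """Replay a cached reply on a hit; call the model and store on a miss."""
    cached = cache.lookup(question)
    if cached is not None:
        return cached                # hit: the model is not called
    reply = call_model(question)     # miss: run normally
    cache.store(question, reply)
    return reply
```

With `trim_spaces=True` and the default `case_sensitive=False`, `"  Hello "` and `"hello"` normalize to the same key, so the second question would be served from the cache.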

**File-backed storage**

When `path` is set, every `Store` rewrites the entire cache file. Writes are **atomic**: the new content is written to a sibling temp file, `fsync`'d, and renamed over the destination, so a concurrent reader (or a later reader, if the writer crashes mid-write) always sees either the previous content or the new content in full, never a partially written file. The parent directory is also `fsync`'d after the rename so the rename itself is durable.
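
The write-temp, `fsync`, rename, `fsync`-directory sequence can be sketched as follows. This is a minimal Python illustration of the general technique, not the project's code: `atomic_write_json` and the temp-file naming are hypothetical, and the directory sync is attempted only where `os.O_DIRECTORY` exists (Unix).

```python
import json
import os
import tempfile


def atomic_write_json(path: str, entries: dict) -> None:
    """Replace `path` with `entries` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays on
    # one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cache-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(entries, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())   # data is on disk before the rename
        os.replace(tmp_path, path)   # atomic rename over the destination
    except BaseException:
        # Best-effort cleanup of the temp file if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the parent directory so the rename itself survives a crash.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Because the destination is only ever touched by the rename, a reader opening `path` at any moment gets one complete version of the file.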

**Cross-process sharing**

Multiple processes can share the same `path:` cache file safely. Every `Store` takes an exclusive advisory lock on a sibling `<path>.lock` file (`flock(2)` on Unix, `LockFileEx` on Windows), reloads the current on-disk state under the lock, merges the new entry, and writes back atomically. Two processes that store *different* keys at the same time both see their writes preserved on disk; the lock window is short (one read + one fsync'd write).

`Lookup` watches the file's modification time and reloads the in-memory map when that time has advanced since the last load, so writes from a sibling process become visible without a restart. The `<path>.lock` sentinel file is created on first write and never deleted: removing it would let two processes lock different inodes and lose mutual exclusion.
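
A modification-time check of this kind could be sketched as below. Again this is an assumption-laden Python illustration, not the project's code: `FileBackedLookup` is a hypothetical class, and comparing `st_mtime_ns` is one plausible way to detect that the file has advanced.

```python
import json
import os
from typing import Dict, Optional


class FileBackedLookup:
    """Hypothetical reader that reloads its map when the file's mtime advances."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._entries: Dict[str, str] = {}
        self._loaded_mtime_ns: Optional[int] = None

    def _reload_if_changed(self) -> None:
        try:
            mtime_ns = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            return  # nothing persisted yet; keep the current map
        if self._loaded_mtime_ns is not None and mtime_ns <= self._loaded_mtime_ns:
            return  # file has not advanced since the last load
        with open(self.path, encoding="utf-8") as f:
            self._entries = json.load(f)
        self._loaded_mtime_ns = mtime_ns

    def lookup(self, key: str) -> Optional[str]:
        self._reload_if_changed()
        return self._entries.get(key)
```

The `stat` call is cheap compared with re-reading and parsing the file, so checking on every lookup keeps the common unchanged case fast.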