runtime: extract image-stripping into a registered MessageTransform by dgageot · Pull Request #2573 · docker/docker-agent

dgageot · 2026-04-28T11:25:55Z

Summary

Extracts the inline stripImageContent call from runStreamLoop into a registered, runtime-private message-transform mechanism that opens the door to a family of message-mutating builtins (PII redactors, secret scrubbers, prompt-prefix injectors, …).

Changes

New mechanism — `MessageTransform` (in-process before_llm_call rewrites)

New MessageTransform type and WithMessageTransform("name", fn) option in pkg/runtime/transforms.go.
Transforms are intentionally a runtime-private contract: the cost of JSON-roundtripping a full conversation through the cross-process hook protocol would be prohibitive, so command/model hooks cannot rewrite messages. By design.
Transforms run after the standard before_llm_call gate — a hook that wants to abort the call should target the gate, not a transform.
Fail-soft: a transform that returns an error logs at warn level and the chain continues with the previous slice. A transform must never break the run loop.
Chain order = registration order. Per-agent scoping (if needed) lives in the transform body via hooks.Input.AgentName.
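
The contract above can be sketched in a few dozen lines. This is a minimal illustration, not the code from pkg/runtime/transforms.go: the Message, Input, Runtime, and Option types, the MessageTransform signature, and the applyTransforms, appendTag, and alwaysFail helpers are all assumptions made for the example. It shows registration-order chaining, the fail-soft rule, and per-agent scoping inside the transform body.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
)

// Message and Input are simplified stand-ins for the runtime's chat message
// and hooks.Input types; the real definitions are not shown in this PR.
type Message struct {
	Role    string
	Content string
}

type Input struct {
	AgentName string
}

// MessageTransform rewrites the outgoing messages. Signature is assumed.
type MessageTransform func(in Input, msgs []Message) ([]Message, error)

type namedTransform struct {
	name string
	fn   MessageTransform
}

// Runtime keeps transforms in registration order.
type Runtime struct {
	transforms []namedTransform
}

type Option func(*Runtime)

// WithMessageTransform registers fn under name (hypothetical shape).
func WithMessageTransform(name string, fn MessageTransform) Option {
	return func(r *Runtime) {
		r.transforms = append(r.transforms, namedTransform{name: name, fn: fn})
	}
}

func New(opts ...Option) *Runtime {
	r := &Runtime{}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

// applyTransforms runs the chain in registration order. A failing transform
// is logged at warn level and skipped, so the previous slice is kept.
func (r *Runtime) applyTransforms(in Input, msgs []Message) []Message {
	for _, t := range r.transforms {
		out, err := t.fn(in, msgs)
		if err != nil {
			slog.Warn("message transform failed", "transform", t.name, "error", err)
			continue
		}
		msgs = out
	}
	return msgs
}

// appendTag appends tag to every message, but only for the agent named
// "root": per-agent scoping lives in the transform body.
func appendTag(tag string) MessageTransform {
	return func(in Input, msgs []Message) ([]Message, error) {
		if in.AgentName != "root" {
			return msgs, nil
		}
		out := make([]Message, len(msgs)) // copy so the input is not mutated
		for i, m := range msgs {
			m.Content += tag
			out[i] = m
		}
		return out, nil
	}
}

// alwaysFail simulates a broken transform.
func alwaysFail(Input, []Message) ([]Message, error) {
	return nil, errors.New("boom")
}

func main() {
	rt := New(
		WithMessageTransform("tag_a", appendTag("[a]")),
		WithMessageTransform("broken", alwaysFail),
		WithMessageTransform("tag_b", appendTag("[b]")),
	)
	out := rt.applyTransforms(Input{AgentName: "root"}, []Message{{Role: "user", Content: "hi"}})
	fmt.Println(out[0].Content) // prints "hi[a][b]"
}
```

The broken transform in the middle only produces a warning; the transforms registered before and after it still apply, in order.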

First built-in transform — `strip_unsupported_modalities`

New pkg/runtime/strip_modalities.go hosts BuiltinStripUnsupportedModalities, the transform body, and the stripImageContent helper (moved from streaming.go).
The inline `if m != nil && len(m.Modalities.Input) > 0 && !slices.Contains(...)` block in runStreamLoop is gone. The loop now calls executeBeforeLLMCallHooks (gate) followed by applyBeforeLLMCallTransforms (rewrite). Because transforms run only after the gate has allowed the call and are fail-soft, a transform failure cannot waste the gate's allow verdict.
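
The gate-then-rewrite ordering can be sketched as follows. This is an illustration under stated assumptions, not the body of runStreamLoop: preLLMCall, gateFunc, transformFunc, modelFunc, and the three helper functions are hypothetical names standing in for executeBeforeLLMCallHooks, applyBeforeLLMCallTransforms, and the model call.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a simplified stand-in for the runtime's chat message type.
type Message struct {
	Role    string
	Content string
}

type (
	gateFunc      func(msgs []Message) error
	transformFunc func(msgs []Message) ([]Message, error)
	modelFunc     func(msgs []Message) (string, error)
)

// preLLMCall mirrors the three logical steps: gate verdict, transforms, model.
func preLLMCall(gate gateFunc, transforms []transformFunc, model modelFunc, msgs []Message) (string, error) {
	// 1. Gate: the only place where the call can be aborted.
	if err := gate(msgs); err != nil {
		return "", fmt.Errorf("before_llm_call gate: %w", err)
	}
	// 2. Rewrite: fail-soft, a failing transform keeps the previous slice.
	for _, t := range transforms {
		if out, err := t(msgs); err == nil {
			msgs = out
		}
	}
	// 3. Call the model with whatever survived the chain.
	return model(msgs)
}

func allowGate([]Message) error { return nil }

func denyGate([]Message) error { return errors.New("denied by policy") }

func brokenTransform([]Message) ([]Message, error) {
	return nil, errors.New("transform failed")
}

func main() {
	echo := func(msgs []Message) (string, error) {
		return fmt.Sprintf("model saw %d message(s)", len(msgs)), nil
	}
	msgs := []Message{{Role: "user", Content: "hello"}}

	reply, err := preLLMCall(allowGate, []transformFunc{brokenTransform}, echo, msgs)
	fmt.Println(reply, err)

	_, err = preLLMCall(denyGate, nil, echo, msgs)
	fmt.Println(err)
}
```

In this sketch a denied gate never reaches the transforms or the model, while a failing transform after an allowed gate still lets the model call proceed with the unmodified messages.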

Correctness fix — alloy mode + per-tool model override

New ModelID field on hooks.Input, populated by runStreamLoop with the model the loop actually picked (post per-tool override, post alloy-mode random selection).
The strip transform now keys its modality lookup off in.ModelID instead of calling agent.Model() again — which would re-randomize the alloy pick or miss a per-tool override and consult the wrong modalities.
Pinned by TestStripUnsupportedModalitiesTransform_UsesInputModelID, which uses an ID-keyed model store to prove the lookup keys off ModelID rather than the agent.
The same ModelID is now also surfaced to user-authored before_llm_call hooks for free.
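
A minimal sketch of the ModelID-keyed lookup, assuming a simple ID-to-modalities map in place of the real model store. Part, Message, Input, modelStore, newStripTransform, demoStore, and demoMessages are hypothetical; only the stripImageContent name and the decision logic (strip when the model is known, declares input modalities, and does not list images) come from the description above.

```go
package main

import (
	"fmt"
	"slices"
)

// Part, Message and Input are simplified stand-ins for the runtime types.
type Part struct {
	Type string // "text" or "image"
	Data string
}

type Message struct {
	Role  string
	Parts []Part
}

type Input struct {
	ModelID string // the model the loop actually picked
}

// modelStore is a hypothetical ID-keyed lookup of supported input modalities.
type modelStore map[string][]string

// stripImageContent returns a copy of msgs without image parts.
func stripImageContent(msgs []Message) []Message {
	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		kept := make([]Part, 0, len(m.Parts))
		for _, p := range m.Parts {
			if p.Type != "image" {
				kept = append(kept, p)
			}
		}
		out = append(out, Message{Role: m.Role, Parts: kept})
	}
	return out
}

// newStripTransform keys the lookup off in.ModelID, never off the agent, so
// it sees the same model the loop is about to call.
func newStripTransform(store modelStore) func(Input, []Message) ([]Message, error) {
	return func(in Input, msgs []Message) ([]Message, error) {
		inputs, known := store[in.ModelID]
		// Unknown model, empty ModelID, or no declared modalities: pass through.
		if !known || len(inputs) == 0 || slices.Contains(inputs, "image") {
			return msgs, nil
		}
		return stripImageContent(msgs), nil
	}
}

func demoStore() modelStore {
	return modelStore{
		"text-only": {"text"},
		"vision":    {"text", "image"},
	}
}

func demoMessages() []Message {
	return []Message{{Role: "user", Parts: []Part{
		{Type: "text", Data: "what is in this picture?"},
		{Type: "image", Data: "<bytes>"},
	}}}
}

func main() {
	strip := newStripTransform(demoStore())
	for _, id := range []string{"text-only", "vision", "unknown", ""} {
		out, _ := strip(Input{ModelID: id}, demoMessages())
		fmt.Printf("%q -> %d part(s)\n", id, len(out[0].Parts))
	}
}
```

Only the "text-only" entry loses its image part; the multimodal, unknown, and empty-ID cases pass through unchanged.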

What's preserved

All previous user-facing behavior:

Strip-when-text-only: identical decision logic.
"Unknown model → pass through": identical fall-through.
The add_date / add_environment_info / add_prompt_files / cache_response builtins are untouched.
hooks.Input field additions are backward-compatible (omitempty JSON tags; existing handlers ignore unknown fields).
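
The backward-compatibility claim rests on two standard encoding/json behaviors, sketched below. The struct is a cut-down stand-in for hooks.Input and the JSON key names (agent_name, model_id) are assumptions for the example; legacyInput models a Go handler written before the new field existed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Input is a cut-down stand-in for hooks.Input; key names are assumed.
type Input struct {
	AgentName string `json:"agent_name,omitempty"`
	ModelID   string `json:"model_id,omitempty"`
}

// legacyInput models a handler that predates the ModelID field.
type legacyInput struct {
	AgentName string `json:"agent_name,omitempty"`
}

// encode marshals an Input; empty omitempty fields are left out entirely.
func encode(in Input) (string, error) {
	b, err := json.Marshal(in)
	return string(b), err
}

// decodeLegacy unmarshals into the old shape; unknown keys are ignored.
func decodeLegacy(payload string) (legacyInput, error) {
	var out legacyInput
	err := json.Unmarshal([]byte(payload), &out)
	return out, err
}

func main() {
	without, _ := encode(Input{AgentName: "root"})
	with, _ := encode(Input{AgentName: "root", ModelID: "some-model"})
	fmt.Println(without) // {"agent_name":"root"}
	fmt.Println(with)    // {"agent_name":"root","model_id":"some-model"}

	old, err := decodeLegacy(with)
	fmt.Println(old.AgentName, err) // root <nil>
}
```

Handlers written in other languages need the same tolerance for unknown keys; the sketch only demonstrates the Go side.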

What's not preserved (intentional)

An earlier revision of this PR briefly experimented with auto-injecting transforms as {type: builtin, command: name} entries into agent hook configs (with a no-op BuiltinFunc shim and dedup logic). This was simplified away because users couldn't actually control transforms through YAML (auto-injection always won), so the YAML coupling was internal plumbing for a control surface that didn't exist. The simplification dropped ~340 net lines without losing any user-facing capability.

Why this matters

The payoff isn't in code we deleted today (the strip is the only candidate currently inline). The payoff is shrinking the diff for future message-rewriting features:

PII redactor: ~30-line transform + WithMessageTransform("redact_pii", fn). 0 lines in the run loop.
"Drop large tool outputs from old turns": same shape.
"Inject team-policy prefix": same shape.

Without this mechanism, each of those would have grown a new branch in runStreamLoop. With it, the loop's pre-LLM-call section stays at three logical lines: get gate verdict, run transforms, call model.
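
To make the "same shape" claim concrete, here is a rough sketch of what a PII redactor transform could look like. It is not part of this PR: the Message and Input types, the redactPII name, the transform signature, and the email-only regular expression are all assumptions, and a real redactor would need far broader patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// Message and Input are simplified stand-ins for the runtime types.
type Message struct {
	Role    string
	Content string
}

type Input struct {
	AgentName string
}

// emailPattern is a deliberately simple matcher used for illustration only.
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// redactPII returns a copy of msgs with email addresses masked. It would be
// registered with something like WithMessageTransform("redact_pii", redactPII).
func redactPII(_ Input, msgs []Message) ([]Message, error) {
	out := make([]Message, len(msgs)) // copy so the caller's slice is untouched
	for i, m := range msgs {
		m.Content = emailPattern.ReplaceAllString(m.Content, "[REDACTED_EMAIL]")
		out[i] = m
	}
	return out, nil
}

func main() {
	msgs := []Message{{Role: "user", Content: "mail alice@example.com now"}}
	out, _ := redactPII(Input{AgentName: "root"}, msgs)
	fmt.Println(out[0].Content) // mail [REDACTED_EMAIL] now
}
```

Nothing in this sketch touches the run loop; the feature lives entirely in the transform body and its registration.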

Validation

mise lint ✓ (golangci-lint run: 0 issues, internal lint checker: no offenses, go mod tidy --diff: clean)
mise test ✓ (full suite passes)
New tests cover: text-only / multimodal / unknown-model branches, empty ModelID, registration-order chain semantics, fail-soft contract, end-to-end strip via RunStream, end-to-end transform-error survival, input validation, alloy / per-tool override correctness.

Commits

extract strip_unsupported_modalities into a registered before_llm_call transform
simplify message transforms: drop the YAML auto-injection plumbing
fix strip transform reading wrong model in alloy / per-tool override mode

Assisted-By: docker-agent

…mode The transform was calling agent.Model() which re-randomizes alloy picks and ignores per-tool overrides — it could end up consulting modalities for a different model than the one the loop was actually about to call. Pass the resolved modelID through hooks.Input.ModelID instead.

dgageot added 3 commits April 28, 2026 11:11

extract strip_unsupported_modalities into a registered before_llm_cal…

2b50abc

…l transform

simplify message transforms: drop the YAML auto-injection plumbing

48e2b71

dgageot requested a review from a team as a code owner April 28, 2026 11:25

gtardif mentioned this pull request Apr 28, 2026

feat(fetch): add allowed_domains and blocked_domains filters #2572

Merged

gtardif approved these changes Apr 28, 2026

dgageot merged commit e59e163 into docker:main Apr 28, 2026
9 checks passed

BrewTestBot mentioned this pull request Apr 29, 2026

docker-agent 1.54.0 Homebrew/homebrew-core#280008

Merged

dgageot merged 3 commits into docker:main from dgageot:board/extracting-runtime-features-into-builtin-d52e607b

2 participants
