
fix: reasoning models reject tool_choice=required; bump to 1.0.2 (closes #503, #505)#508

Merged
0xallam merged 2 commits into main from fix/reasoning-tool-choice-1.0.2
May 28, 2026
@0xallam 0xallam commented May 28, 2026

Member

Fixes #503 and #505.

Issue

Every scan against a reasoning-capable model fails immediately with provider-side 400s:

  • Anthropic Claude with thinking: "Thinking may not be enabled when tool_choice forces tool use"
  • DeepSeek V4 thinking: "Thinking mode does not support this tool_choice"

Affects anthropic/claude-sonnet-4-6, anthropic/claude-opus-4-7, DeepSeek V4 thinking, and any future reasoning-capable provider.

Root cause

strix/core/inputs.py:make_model_settings() unconditionally sets tool_choice="required" AND attaches Reasoning(effort=...). This combination is rejected provider-side. Quoting Anthropic's docs:

"Tool use with thinking only supports tool_choice: {"type": "auto"} (the default) or tool_choice: {"type": "none"}. Using tool_choice: {"type": "any"} … will result in an error because these options force tool use, which is incompatible with extended thinking."

LiteLLM has declined to normalize this client-side (BerriAI/litellm#8883). LangChain and pydantic-ai both shipped client-side fixes for the same constraint (langchain#35544, pydantic-ai#3611).
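
The quoted rule can be written as a small predicate. This is a minimal sketch of the provider contract as stated in the quote, not code from Strix or from any provider SDK; the function name `is_accepted` and the constant are hypothetical, and the sketch models only the thinking rule (it assumes every `tool_choice` is accepted when thinking is off).

```python
from typing import Optional

# Per the Anthropic docs quoted above, these are the only tool_choice
# types accepted while thinking is enabled.
THINKING_SAFE_TOOL_CHOICES = {"auto", "none"}


def is_accepted(thinking_enabled: bool, tool_choice_type: Optional[str]) -> bool:
    """Hypothetical helper: True if the combination passes the quoted rule.

    tool_choice_type=None stands for "not sent", which the docs describe
    as the default (auto).
    """
    if not thinking_enabled:
        # Assumption: the quoted rule only restricts requests with thinking on.
        return True
    return tool_choice_type is None or tool_choice_type in THINKING_SAFE_TOOL_CHOICES
```

Under this sketch, `is_accepted(True, "any")` is `False`, which corresponds to the forced-tool-use case that produces the 400s listed under "Issue".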

Fixes

  • `make_model_settings()` — drop `tool_choice="required"` when reasoning is enabled. Use SDK default (`None` → auto). Lifecycle-tool termination is still enforced by `_finish_tool_use_behavior` in the agent factory.
  • `STRIX_REASONING_EFFORT="none"` crash — previously crashed with `AttributeError: 'NoneType' object has no attribute 'get'` because the literal string `"none"` was truthy and made it into LiteLLM's reasoning path. Now treated as "do not attach `Reasoning(...)`" — restores the documented escape hatch.
  • System prompt — reinforced that text-only turns terminate the scan, since the SDK considers a text-only response the final output when `tool_choice` isn't forced. One new explicit line near existing strong language.
  • Runner defensive check — when a non-interactive scan ends without `finish_scan` being called, log a clear error pointing at the text-ended turn. Replaces silent half-scans.
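
The runner check in the last bullet could look roughly like the sketch below. It is an illustration under assumptions, not the code in `strix/core/runner.py`: the function names, the logger name, and the `finish_called` flag are hypothetical, and the PR text does not show how Strix actually detects that `finish_scan` ran. The sketch only logs, and it accepts both `str` and `dict` values for `final_output`.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("strix.runner.sketch")  # hypothetical logger name


def _preview(final_output: Any, limit: int = 120) -> str:
    """Build a short, loggable preview for str, dict, or any other output."""
    if isinstance(final_output, dict):
        text = ", ".join(f"{key}={value!r}" for key, value in final_output.items())
    elif isinstance(final_output, str):
        text = final_output
    else:
        text = repr(final_output)
    text = text.strip()
    return text if len(text) <= limit else text[:limit] + "..."


def check_scan_finished(
    final_output: Any, finish_called: bool, interactive: bool
) -> Optional[str]:
    """Log and return an error message if a non-interactive scan ended early.

    Returns None when nothing is wrong. Control flow is never altered.
    """
    if interactive or finish_called:
        return None
    message = (
        "Scan ended without finish_scan being called; the model most likely "
        "ended its turn with text only. Last output: " + _preview(final_output)
    )
    logger.error(message)
    return message
```

Returning the message as well as logging it is a choice made here so the behavior is easy to assert on in a test; a real runner might only log.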

Behavior matrix after the fix

| `STRIX_REASONING_EFFORT` | `tool_choice` | `reasoning` attached |
| --- | --- | --- |
| (unset → default `high`) | `None` (auto) | yes |
| `low` / `medium` / `high` / `xhigh` | `None` (auto) | yes |
| `"none"` | `"required"` | no (literal `"none"` is now treated as off) |
| `None` (Python) | `"required"` | no |
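
The matrix above can be reproduced with a few lines of Python. This is a minimal sketch of the decision logic only, assuming a stand-in dataclass instead of the SDK's real settings and `Reasoning` types; `SketchSettings` and `make_settings_sketch` are hypothetical names, and the real `make_model_settings()` in `strix/core/inputs.py` may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchSettings:
    """Stand-in for the real model settings (hypothetical)."""

    tool_choice: Optional[str]       # None means the SDK default (auto)
    reasoning_effort: Optional[str]  # None means no Reasoning(...) attached


def make_settings_sketch(reasoning_effort: Optional[str] = "high") -> SketchSettings:
    # Both Python None and the literal string "none" switch reasoning off.
    use_reasoning = reasoning_effort is not None and reasoning_effort != "none"
    if use_reasoning:
        # Reasoning on: leave tool_choice unset so the provider accepts the request.
        return SketchSettings(tool_choice=None, reasoning_effort=reasoning_effort)
    # Reasoning off: keep the "required" safety net and attach no reasoning.
    return SketchSettings(tool_choice="required", reasoning_effort=None)
```

For example, `make_settings_sketch("none")` yields `tool_choice="required"` with no reasoning attached, matching the third row of the matrix.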

Provider impact

| Provider | Before | After |
| --- | --- | --- |
| Anthropic Claude + thinking | 400 immediately | works |
| DeepSeek V4 thinking | 400 immediately | works |
| OpenAI o-series | works | works (loses `required` safety net, but follows instructions reliably) |
| OpenAI gpt-5.x non-reasoning | works | works |
| Any model with reasoning off | works | works (unchanged) |

Version

Bumped to 1.0.2. After merge → tag `v1.0.2` → CI binaries + PyPI publish.

Docker image stays at `:1.0.0` (unaffected).

Every reasoning-capable provider (Anthropic with thinking, DeepSeek
V4 thinking, etc.) rejects the combination of ``tool_choice="required"``
and reasoning/thinking enabled. Strix sets both unconditionally, so
scans fail immediately with provider-side 400s:

  Anthropic: "Thinking may not be enabled when tool_choice forces tool use"
  DeepSeek:  "Thinking mode does not support this tool_choice"

This is a provider contract — Anthropic's extended-thinking docs state
the rule explicitly, and LiteLLM has declined to normalize it
(BerriAI/litellm#8883). LangChain and pydantic-ai both shipped
client-side normalization for the same constraint.

Fixes:

1. make_model_settings(): when reasoning is enabled, leave tool_choice
   unset (model self-selects). When reasoning is off, keep the existing
   "required" safety net. The lifecycle-tool requirement is still
   enforced by _finish_tool_use_behavior in the agent factory.

2. STRIX_REASONING_EFFORT="none" used to crash with
   AttributeError: 'NoneType' object has no attribute 'get' because the
   literal string "none" was truthy and made it into LiteLLM's reasoning
   path. Treat it as "do not attach Reasoning(...)" — restores the
   documented escape hatch.

3. system_prompt: reinforce that text-only turns terminate the scan,
   so the agent leans harder on the lifecycle tools now that the
   forced-tool-choice safety net is gone for reasoning runs.

4. runner: surface a clear error log when a non-interactive scan ends
   without finish_scan being called (text-ended turn), so users see
   what happened instead of a silent half-scan.

Closes #503, #505

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

greptile-apps Bot commented May 28, 2026

Contributor

Greptile Summary

This PR fixes provider-side 400 errors that occurred when tool_choice="required" was combined with an active reasoning/thinking configuration — a combination explicitly rejected by Anthropic and DeepSeek. It also fixes a crash introduced by passing the literal string "none" as a reasoning effort.

  • make_model_settings() now drops tool_choice="required" whenever reasoning is enabled, and skips attaching Reasoning(...) whenever reasoning_effort is None or "none", restoring the documented escape hatch and unblocking Claude + DeepSeek thinking models.
  • runner.py gains a defensive post-loop check that logs a clear error when a non-interactive scan returns without finish_scan having been called, surfacing the text-only-turn failure mode introduced by removing the forced tool_choice.
  • System prompt gets one reinforcing line making the "text-only turn = immediate silent termination" contract explicit for reasoning models that no longer have tool_choice forcing them into tool use.

Confidence Score: 5/5

Safe to merge — the changes are well-scoped and directly fix documented provider incompatibilities without altering unrelated paths.

The use_reasoning guard is logically correct for all values in the ReasoningEffort literal. The runner defensive check only fires in non-interactive mode and only logs — it does not alter control flow. No existing non-reasoning paths are changed.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| strix/core/inputs.py | Core fix: adds a use_reasoning guard that drops tool_choice=required when reasoning is enabled and skips attaching Reasoning(effort=...) when reasoning_effort is None or the literal string none. Logic is correct and handles both documented edge cases. |
| strix/core/runner.py | Defensive check added post-loop to detect when finish_scan was not called; logs a clear error. Handles str and dict final_output shapes. |
| strix/agents/prompts/system_prompt.jinja | Adds one explicit line reinforcing that text-only turns immediately end the scan without a report; fits naturally with the adjacent autonomous-behavior rules. |
| pyproject.toml | Version bump from 1.0.1 to 1.0.2; reflected in uv.lock as expected. |

Reviews (2): Last reviewed commit: "Also handle dict final_output in scan-co..."

Comment thread strix/core/runner.py
Greptile review on PR #508 flagged that the post-run defensive check
only treated string final_output. If the SDK ever returns a structured
dict (depending on output_type configuration), the check would
false-positive and log "scan ended without finish_scan" on every
successful reasoning-model scan.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@0xallam 0xallam merged commit d032151 into main May 28, 2026
2 checks passed
@0xallam 0xallam deleted the fix/reasoning-tool-choice-1.0.2 branch May 28, 2026 18:55