
Strands Agents tools invocation passing on the whole context every time


I am building a GenAI application using Bedrock and a Strands agent running in Lambda. The agent uses the Strands-native calculator tool. While processing a single request, the calculator tool is invoked a number of times. Reviewing the logs, I noticed that every time the calculator tool is invoked, the inputBodyJson messages contain the full context being passed to Bedrock, including user_prompt, system_prompt, and all the preceding tool results. I am trying to understand whether there is a way to reduce the size of the payload used for tool invocations, in order to reduce cost (obviously apart from caching).

1 Answer

What you're seeing when using Strands Agents with the calculator tool is the expected behavior. Strands follows a model-driven approach where the full context is maintained throughout the agent loop to ensure the model has complete information for decision making.

In the Strands agentic loop, each time a tool is invoked, the full context (including the system prompt, user prompt, and previous interactions) is passed to the model. This is by design to maintain coherence and allow the model to make informed decisions based on the complete conversation history.

There are a few approaches you could consider to optimize this:

  1. Context management techniques: Strands provides context management capabilities, including sliding-window approaches (keeping only the most recent messages) and summarization techniques that preserve key information while reducing token count. See the first sketch after this list.

  2. Use the use_agent function instead of use_llm: There is a migration path from use_llm to use_agent, which provides equivalent functionality but may offer more efficient context handling. Migrating is recommended in any case, since use_llm is being deprecated in the next major release.

  3. Custom tool implementation: You could create a custom calculator tool that maintains its own state and exposes only the calculation parameters you actually need, rather than using the built-in calculator tool. See the second sketch below.

  4. Agent architecture optimization: Consider if your agent architecture could be restructured to minimize repeated tool calls or to batch calculations together.
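For option 1, here is a minimal sketch of the sliding-window approach. The class and import names (SlidingWindowConversationManager, strands_tools.calculator) reflect the strands-agents Python SDK as I recall it; verify them against your installed version.

```python
# Minimal sketch: cap the conversation history replayed to Bedrock on each
# iteration of the agent loop using a sliding window. Names assume the
# strands-agents SDK; check your installed version.
from strands import Agent
from strands.agent.conversation_manager import SlidingWindowConversationManager
from strands_tools import calculator

agent = Agent(
    tools=[calculator],
    # Keep only the most recent 10 messages in the context sent to the model,
    # trading some conversational memory for a smaller per-request payload.
    conversation_manager=SlidingWindowConversationManager(window_size=10),
)

result = agent("What is (148 * 37) + 295?")
print(result)
```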
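For option 3, here is a minimal sketch of a custom tool using the SDK's @tool decorator. The tool name basic_calculator and its single expression parameter are illustrative, not part of the SDK. Note that this mainly trims the tool-definition tokens included in every request (the built-in calculator's spec covers many modes, so a leaner spec is cheaper); the conversation history itself is still replayed.

```python
# Minimal sketch of a leaner custom calculator tool. The @tool decorator is
# from the strands-agents SDK; the tool name and parameter are illustrative.
from strands import Agent, tool

@tool
def basic_calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression, e.g. "(3 + 5) * 12"."""
    # eval with empty builtins keeps this sketch short; prefer a real
    # expression parser (e.g. ast-based) in production code.
    return str(eval(expression, {"__builtins__": {}}, {}))

agent = Agent(tools=[basic_calculator])
result = agent("What is (3 + 5) * 12?")
print(result)
```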

The model-driven approach of Strands is designed to give the model the most relevant context for making good decisions in your specific use case, but this comes with the tradeoff of larger payloads being sent to the model.
Sources
use_llm in GitHub repo https://github.com/strands-agents | AWS re:Post
Strands Agents and the Model-Driven Approach | AWS Open Source Blog
Introducing Strands Agents, an Open Source AI Agents SDK | AWS Open Source Blog

answered 5 days ago