Python + AI Weekly Office Hours: Recordings & Resources #280
Replies: 133 comments 3 replies
2026/01/06: How do you set up Entra OBO (On-Behalf-Of) flow for Python MCP servers? 📹 5:48 The demo showed how to use the Graph API with the OBO flow to find out the groups of a signed-in user and use that to decide whether to allow access to a particular tool. The flow works as follows:
For the authentication dance, FastMCP handles the DCR (Dynamic Client Registration) flow since Entra itself doesn't support DCR natively. To test from scratch:
Links shared: |
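The group-based gating from the demo can be sketched in plain Python. This is a hypothetical sketch: the group IDs and tool names are made up, and in a real server the payload would come from a Graph API `/me/memberOf` call made with the token acquired via the OBO flow.

```python
# Hypothetical sketch: gate MCP tools on the signed-in user's Entra group
# membership. Group IDs and tool names are illustrative placeholders.

TOOL_ALLOWED_GROUPS = {
    # tool name -> Entra group object IDs allowed to call it
    "query_sales_data": {"grp-analysts", "grp-admins"},
    "echo": set(),  # empty set = open to any authenticated user
}

def extract_group_ids(member_of_payload: dict) -> set:
    """Pull group object IDs out of a Graph /me/memberOf-style payload."""
    return {
        obj["id"]
        for obj in member_of_payload.get("value", [])
        if obj.get("@odata.type") == "#microsoft.graph.group"
    }

def is_tool_allowed(tool: str, user_groups: set) -> bool:
    """Deny unknown tools; allow open tools; otherwise require a shared group."""
    allowed = TOOL_ALLOWED_GROUPS.get(tool)
    if allowed is None:
        return False
    if not allowed:
        return True
    return bool(allowed & user_groups)

payload = {"value": [
    {"@odata.type": "#microsoft.graph.group", "id": "grp-analysts"},
    {"@odata.type": "#microsoft.graph.directoryRole", "id": "role-x"},
]}
user_groups = extract_group_ids(payload)
```

Note that `memberOf` also returns directory roles, so the sketch filters on the `@odata.type` discriminator to keep only groups.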
2026/01/06: Which MCP inspector should I use for testing servers with Entra authentication? 📹 20:24 The standard MCP Inspector doesn't work well with Entra authentication because it doesn't do the DCR (Dynamic Client Registration) dance properly. MCP Jam is recommended instead because it properly handles the OAuth flow with DCR. To set it up:
MCP Jam also has nice features like:
One note: enum values in tools don't yet show as dropdowns in MCP Jam (issue to be filed). Links shared:

What's the difference between MCP Jam and LM Studio? 📹 34:19 LM Studio is primarily for playing around with LLMs locally. MCP Jam has some overlap since it includes a chat interface with access to models, but its main purpose is to help you develop MCP servers and apps. It's focused on the development workflow rather than just chatting with models.
2026/01/06: How do you track LLM usage tokens and costs? 📹 28:04 For basic tracking, Azure portal shows metrics for token usage in your OpenAI accounts. You can see input tokens and output tokens in the metrics section. You can also:
If you use multiple providers, you need a way to consolidate the tracking. OpenTelemetry metrics could work but you'd need a way to hook into each system. |
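One lightweight way to consolidate usage across providers is to record the token counts from each response and price them per model. A minimal sketch; the model names and per-1K prices are made-up illustrations, since real list prices change and differ per deployment:

```python
# Consolidated token/cost tracking sketch. Prices are illustrative only.
PRICES_PER_1K = {  # model -> (input USD per 1K tokens, output USD per 1K tokens)
    "gpt-4.1-mini": (0.0004, 0.0016),
    "claude-haiku": (0.0008, 0.004),
}

class UsageTracker:
    def __init__(self):
        self.totals = {}  # model -> [input_tokens, output_tokens]

    def record(self, model, input_tokens, output_tokens):
        """Add one response's usage numbers to the running totals."""
        t = self.totals.setdefault(model, [0, 0])
        t[0] += input_tokens
        t[1] += output_tokens

    def cost(self):
        """Price the accumulated totals across all models."""
        total = 0.0
        for model, (inp, out) in self.totals.items():
            pi, po = PRICES_PER_1K[model]
            total += inp / 1000 * pi + out / 1000 * po
        return total

tracker = UsageTracker()
tracker.record("gpt-4.1-mini", 1000, 500)
tracker.record("claude-haiku", 2000, 1000)
```

The `record` call is the piece you would hook into each provider's client (or into OpenTelemetry metrics, as mentioned above).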
2026/01/06: How do you keep yourself updated with all the new changes related to AI? 📹 30:32 Several sources recommended:
Particularly recommended:
Links shared: |
2026/01/06: How do you build a Microsoft Copilot agent in Python with custom API calls? 📹 36:30 For building agents that work with Microsoft 365 Copilot (which appears in Windows Copilot and other Microsoft surfaces):
The agent framework team is responsive if there are issues. Links shared: |
2026/01/06: As a backend developer with a non-CS background, how do I learn about AI from scratch? 📹 46:39 Recommended approach:
Links shared: |
2026/01/06: What's new with the RAG demo (azure-search-openai-demo) after the SharePoint data source was added? 📹 49:50 The main work is around improving ACL (Access Control List) support. The cloud ingestion feature was added recently, but it doesn't yet support ACLs. The team is working on making ACLs compatible with all features including:
A future feature idea: adding an MCP server to the RAG repo for internal documentation use cases, leveraging the Entra OBO flow for access control. |
2026/01/06: Do you think companies will create internal MCP servers for AI apps to connect to? 📹 53:53 Yes, this is already happening quite a bit. Common use cases include:
A particularly valuable use case is data science/engineering teams creating MCP servers that enable less technical folks (marketing, PMs, bizdev) to pull data safely without needing to write SQL. The pattern often starts with an engineer building an MCP server for themselves, sharing it with colleagues, adding features based on their needs, and growing from there. Links shared: |
2026/01/13: What advantages do other formats have over .txt for prompts? How do you improve prompts with DSPy and evals? 📹 4:55 Prompty is a template format that mixes Jinja and YAML together. The YAML goes at the top for metadata, and the rest is Jinja templating. Jinja is the most common templating system for Python (used by Flask, etc.). The nice thing about Jinja is you can pass in template variables—useful for customization, passing in citations, etc. Prompty turns the file into a Python list of chat messages with roles and contents. However, we're moving from Prompty to plain Jinja files because:
Recommendation: Keep prompts separate from code when possible, especially long system prompts. Use plain .txt or .md if you don't need variables, or Jinja if you want to render variables. With agents and tools, some LLM-facing text (like tool descriptions in docstrings) will inevitably live in your code—that's fine. For iterating on prompts: Run evaluations, change the prompt, and see whether it improves things. There are tools like DSPy and Agent Framework's Lightning that do automated prompt optimization/fine-tuning. Lightning says it "fine-tunes agents" but may actually be doing prompt changes. Most of the time, prompt changes don't make a huge difference, but sometimes they might. Links shared: |
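The "separate prompt file plus rendered variables" pattern can be sketched with just the standard library. `string.Template` stands in for Jinja here (Jinja adds loops, conditionals, and filters on top of this kind of substitution); the prompt text and message shape are illustrative:

```python
# Sketch: render template variables (citations, question) into the chat-message
# list an LLM client expects. In practice the template would live in its own
# .txt/.jinja file rather than inline.
from string import Template

PROMPT_TEMPLATE = Template(
    "Answer using only these sources:\n$citations\n\nQuestion: $question"
)

def build_messages(question: str, citations: list) -> list:
    """Substitute variables into the prompt and wrap it as chat messages."""
    user_content = PROMPT_TEMPLATE.substitute(
        citations="\n".join(f"- {c}" for c in citations),
        question=question,
    )
    return [
        {"role": "system", "content": "You are a helpful RAG assistant."},
        {"role": "user", "content": user_content},
    ]

messages = build_messages("What is the refund policy?", ["policy.md#refunds"])
```

Swapping in Jinja means loading the template with `jinja2.Template` and calling `.render(...)` instead of `.substitute(...)`; the surrounding message-building code stays the same.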
2026/01/13: What is the future of AI and which specialization should I pursue? 📹 11:54 If you enjoy software engineering and full-stack engineering, it's more about understanding the models so you understand why they do what they do, but it's really about how you're building on top of those models. There's lots of interesting stuff to learn, and it really depends on you and what you're most interested in doing. |
2026/01/13: Which livestream series should I follow to build a project using several tools and agents, and should I use a framework? 📹 13:33 Everyone should understand tool calling before moving on to agents. From the original 9-part Python + AI series, start with tool calling, then watch the high-level agents overview. The upcoming six-part series in February will dive deeper into each topic, especially how to use Agent Framework. At the bare minimum, you should understand LLMs, tool calling, and agents. Then you can decide whether to do everything yourself with plain tool calling (using an LLM that supports it) or adopt an agent framework like LangChain or Agent Framework if you think it offers enough benefits for you. It's important to understand that agents are built on tool calling—it's the foundation of agents, and their success or failure comes down to how well LLMs can use tools. Links shared:
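To make "agents are based on tool calling" concrete, here is a minimal dispatch loop in plain Python. The model is stubbed out; in a real app the tool-call dict would come from the tool-call entries in an LLM response, but the dispatch step looks the same:

```python
# Minimal sketch of the tool-calling core that agents are built on.
import json

def get_weather(city: str) -> str:
    """A toy tool the 'model' can ask to invoke."""
    data = {"Seattle": "rainy", "Cairo": "sunny"}
    return data.get(city, "unknown")

TOOLS = {"get_weather": get_weather}

def run_tool_call(call: dict) -> str:
    """Dispatch one model-issued tool call to the matching Python function."""
    fn = TOOLS[call["name"]]
    args = json.loads(call["arguments"])  # LLM APIs send arguments as JSON text
    return fn(**args)

# Pretend the model asked to call get_weather for Seattle:
result = run_tool_call({"name": "get_weather", "arguments": '{"city": "Seattle"}'})
```

An agent is essentially this loop run repeatedly: send messages, execute any requested tool calls, append the results, and ask the model again until it stops requesting tools.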
2026/01/13: How does Azure manage the context window? How do I maintain a long conversation with a small context window? 📹 15:21 There are three general approaches:
With today's large context windows (128K, 256K), it's often easier to just wait for an error and tell the user to start a new chat, or do summarization when the error occurs. This approach is most likely to work across models since every model should throw an error when you're over the context window. Links shared: |
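The trimming variant of these approaches can be sketched as follows. The `len(text) // 4` token estimate and the message shapes are rough assumptions; a real app would use a tokenizer (e.g. tiktoken) and the model's documented context limit:

```python
# Sketch: keep the system prompt plus the most recent turns that fit in budget.
def approx_tokens(message: dict) -> int:
    return len(message["content"]) // 4  # crude chars-to-tokens heuristic

def trim_to_fit(messages: list, max_tokens: int) -> list:
    """Drop the oldest non-system turns until the estimate fits."""
    system, rest = messages[0], messages[1:]
    budget = max_tokens - approx_tokens(system)
    kept = []
    for msg in reversed(rest):          # walk newest-first
        cost = approx_tokens(msg)
        if cost > budget:
            break                        # everything older is dropped too
        kept.append(msg)
        budget -= cost
    return [system] + list(reversed(kept))

system = {"role": "system", "content": "a" * 40}   # ~10 tokens
turns = [{"role": "user", "content": "b" * 40} for _ in range(5)]
trimmed = trim_to_fit([system] + turns, max_tokens=35)
```

The same skeleton works for the summarize-on-error approach: catch the context-length error from the API, trim (or summarize) the history, and retry once.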
2026/01/13: How do we deal with context rot and how do we summarize context using progressive disclosure techniques? 📹 19:17 Read through Kelly Hong's (Chroma researcher) blog post on context rot. The key point is that even with a 1 million token context window, you don't have uniform performance across that context window. She does various tests to see when performance starts getting worse, including tests on ambiguity, distractors, and implications. A general tip for coding agents with long-running tasks: use a main agent that breaks the task into subtasks and spawns sub-agents for each one, where each sub-agent has its own focused context. This is the approach used by the LangChain Deep Agents repo. You can also look at how different projects implement summarization. LangChain's summarization middleware is open source—you can see their summary prompt and approach. They do approximate token counting and trigger summarization when 80% of the context is reached. Links shared:
How do I deal with context issues when using the Foundry SDK with a single agent? 📹 25:03 If you're using the Foundry SDK with a single agent (hosted agent), you can implement something like middleware through hooks or events. Another approach is the LangChain Deep Agents pattern: implement sub-agents as tools where each tool has a limited context and reports back a summary of its results to the main agent. For the summarization approach with Foundry agents, you'd need to figure out what events, hooks, or middleware systems they have available. |
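The 80%-threshold trigger described above can be sketched like this. `summarize()` is stubbed with a constant string here, whereas LangChain's middleware makes an LLM call with its summary prompt; the token heuristic is also approximate:

```python
# Sketch: replace older turns with a summary once 80% of the context is used.
def approx_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic, like approximate token counting

def maybe_summarize(messages, context_limit, keep_recent=2,
                    summarize=lambda msgs: "summary of earlier turns"):
    """No-op under the threshold; otherwise compact all but the recent turns."""
    used = sum(approx_tokens(m["content"]) for m in messages)
    if used < 0.8 * context_limit:
        return messages                       # under threshold: leave as-is
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [{"role": "system", "content": summarize(old)}] + recent

history = [{"role": "user", "content": "x" * 40} for _ in range(5)]  # ~50 tokens
untouched = maybe_summarize(history, context_limit=100)  # 50 < 80: no-op
compacted = maybe_summarize(history, context_limit=60)   # 50 >= 48: summarize
```

Checking the threshold on every turn (rather than waiting for an error) is what makes this a middleware-style approach.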
2026/01/13: Have you seen or implemented anything related to AG-UI or A2UI? 📹 29:02 AG-UI (Agent User Interaction Protocol) is an open standard introduced by the CopilotKit team that standardizes how front-end applications communicate with AI agents. Both Pydantic AI and Microsoft Agent Framework support AG-UI—they provide adapters to convert messages to the AG-UI format. The advantage of standardization is that if people agree on a protocol between backend and frontend, you can build reusable front-end components that know how to talk to that backend. Agent Framework also supports other UI event stream protocols, including Vercel AI (though Vercel is a competitor, so support may be limited). These are adapters—you can always adapt output into another format if needed, but it's nice when it's built in. A2UI was created by Google together with CopilotKit and relates to A2A (Agent-to-Agent). A2UI appears to be newer, with less support currently in Agent Framework, though A2A is supported. Links shared:
2026/04/13: Are there good resources to dig deeper on AI agents deployment? 📹 52:41 For Foundry hosted agents specifically, Pamela recommends waiting about two weeks for the upcoming live stream series (Host your agents on Foundry, Apr 27-30), since the SDKs are actively being redone and things are changing rapidly. In the meantime, she recommends starting with the seattle-hotel-agent AZD example repo and the corresponding blog post about azd AI agent debugging. That's what she used as the basis for her own hosted agent, and what she had a colleague use to get started. If you run into issues, file them in the AZD repo. Links shared: |
2026/04/13: Announcements 📹 00:57
- Responses API migration: The azure-openai-to-responses migration agent is live. Pamela has now migrated pretty much every sample over to the Azure Responses API, including the large RAG sample. The Responses API enables easy access to built-in code interpreter and web search tools.
- Copilot CLI remote control from mobile: You can now monitor and steer a running Copilot CLI session from your phone.
- Copilot CLI multi-model reflection (rubber duck): The new rubber duck feature has Copilot CLI use a different model family to provide a second opinion and critique on plans and implementations.
- VS Code agent customizations: A new VS Code window shows all your agent customizations in one place — AGENTS.md files, custom instructions, skills, and more.
- Agent-first development video series: Gwen's introduction to agent-first development covers building apps with VS Code and Copilot using an agent-first approach.
- ParseBench document parsing benchmark: ParseBench is a new benchmark with 2,000+ human-verified pages and 167K test rules for evaluating document OCR quality across tables, charts, formatting, and more.
- DSPy meetup and talks: A recent DSPy meetup featured talks on reasoning models and the GEPA optimize_anything approach. Dropbox also presented on search relevance with DSPy.
- MCP conformance suite: A tool for testing whether your MCP server complies with the MCP specification.
- Review PR comments skill: Pamela built a review-pr-comments Copilot CLI skill that reviews comments on an active pull request and decides whether to accept, iterate, or reject the suggested changes.
- Personal projects:
- Anthropic — Project Glasswing: Discussed Project Glasswing, Anthropic's new initiative to secure critical software for the AI era.
2026/04/20: What's a good workflow for pulling entities out of PDFs? Is MarkItDown a good library? 📹 4:02 Pamela demonstrated several approaches for extracting data from PDFs, starting with MarkItDown — a Microsoft open-source library that converts documents (DOCX, PDF, etc.) to Markdown. She showed an entity extraction example where a Word document was converted to Markdown and then sent to an LLM to extract fields like title, author, and headings. She then compared MarkItDown vs. PyMuPDF. For documents with images, Pamela demonstrated MarkItDown's OCR plugin, which uses an LLM to describe images found in documents. The best results came from Azure Document Intelligence, which she demonstrated through her RAG application. Document Intelligence extracted far more figures and structural information from the PDF. Combined with an LLM for image descriptions (using a prompt like "describe the image with no more than five sentences"), this approach produced the richest output — including both text content and detailed figure descriptions that go beyond simple OCR text extraction. She also mentioned Azure Content Understanding as a newer alternative hosted service worth exploring, and noted that Pablo shared an Azure Content Understanding MCP server (built in .NET) for quick experimentation. Links shared:
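A hedged sketch of the "convert to Markdown, then extract fields" step. The demo sent the Markdown to an LLM for extraction; this stand-in pulls the title and headings with a regex instead, and the Markdown is inlined here rather than produced by MarkItDown:

```python
# Sketch: extract outline fields from Markdown (the kind of output a
# document-to-Markdown converter like MarkItDown produces).
import re

def extract_outline(markdown: str) -> dict:
    """Pull the title (first H1) and section headings out of Markdown text."""
    headings = re.findall(r"^(#{1,6})\s+(.*)$", markdown, flags=re.MULTILINE)
    title = next((text for level, text in headings if len(level) == 1), None)
    return {
        "title": title,
        "headings": [text for level, text in headings if len(level) > 1],
    }

doc = """# Quarterly Report
Some intro text.
## Revenue
## Outlook
"""
outline = extract_outline(doc)
```

The advantage of converting to Markdown first is exactly this: downstream extraction (regex or LLM) works on one predictable text format regardless of the source document type.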
How do you access PDFs stored in SharePoint? 📹 35:39 If you just want to ask questions about a document, you could use Work IQ (which can query SharePoint content). But if you need the full document — for example, to run your own extraction pipeline — you'll need to use the Microsoft Graph API. In the future, Work IQ may add more Graph API functionality, but currently it's limited for full document retrieval. |
2026/04/20: Bug report: Sporadic 400 errors from the Azure AI Search vectorization endpoint 📹 30:43 A community member reported intermittent 400 (Bad Request) errors when using text-embedding-3-small with Azure AI Search's integrated vectorization. Pamela first suggested checking RBAC permissions — specifically, the search service's managed identity needs the Cognitive Services OpenAI User role assigned to it. She showed this setup in her Bicep templates. However, since the error was intermittent (working sometimes, failing other times), Pamela suspected it might be a rate limit error that isn't surfacing clearly. She messaged the Azure AI Search PM directly and asked the community member to share their search service ID, subscription, and timestamp of the error so the team could check the logs. Links shared:
2026/04/20: Bug report: Authentication succeeds but tool calls fail with the Foundry Atlassian MCP server 📹 36:35 A community member reported that OAuth authentication succeeds for the Atlassian MCP server added from the Foundry catalog to a prompt agent, but tool calls fail. Pamela acknowledged there are known issues with remote MCP servers on Foundry, showing a similar internal server error she encountered. For debugging, she recommended:
The new version of Foundry hosted agents is expected to ship this week or the following week. |
2026/04/20: Any tips for the Vancouver Web Summit hackathon? 📹 42:25 A community member based in Vancouver mentioned they planned to submit to the Microsoft Vancouver Web Summit GitHub Copilot SDK Hackathon. Pamela suggested looking at the recently announced Agents League hackathon winners for inspiration on what judges look for. Links shared: |
2026/04/20: Announcements 📹 0:42
- GitHub Copilot pricing changes: New signups for GitHub Copilot Pro, Pro+, and Student plans are paused due to high demand. The free tier (with rate limits) is still available, and Business/Enterprise plans are unaffected. Additionally, Opus models have been removed from Pro — only Pro+ gets Opus 4.7.
- VS Code 1.116 updates: Copilot is now built into VS Code, Claude Opus 4.7 is GA in Copilot, thinking effort is configurable in Copilot CLI, and the Agent Host Protocol now supports subagents and teams.
- Foundry hosted agents livestream series: Starting April 27-30, covering Agent Framework agents on Foundry, LangChain/LangGraph agents on Foundry, and evaluation/safety.
- New Microsoft certifications: Two new AI certifications were shared — AI Agent Builder Associate (Copilot Studio focused) and Azure AI Apps and Agents Developer Associate (Python/Azure AI focused, with an 80% discount for the first 300 people before May 7th).
- PyCon US 2026: Pamela will be giving an MCP tutorial on Wednesday, a tutorial at the Edu Summit on Thursday, and a sponsored session on Thursday or Friday. The Microsoft booth will be open Thursday evening through Saturday.
- Upcoming events:
2026/04/28: Update: Do Foundry evaluations stay in your tenant? 📹 1:03 Pamela followed up on a question from the previous day's office hours. She confirmed with the Foundry evaluation team that you must bring your own storage container if you want evaluation data to stay in your tenant. This is a hard requirement, not just a recommendation. Links shared: |
2026/04/28: How could we use GraphRAG from Cosmos DB in a hosted agent for memory and knowledge? 📹 4:52 The term "graph RAG" gets used in different ways — the Microsoft Research GraphRAG project versus any RAG approach that does a graph query. The Cosmos DB Conf session covered the approach described in this Cazton blog post, which benchmarks four AI agent memory strategies (including an entity graph approach using Cosmos DB and OpenAI) across recall, token cost, and latency — the entity graph strategy achieved 100% recall. For integrating any Azure service (including Cosmos DB) into a hosted agent using keyless auth:
For the memory use case, implement a custom context provider for Agent Framework — context providers are called on every agent invocation to inject memory. Look at the existing Redis or Mem0 context provider implementations as a starting point and ask GitHub Copilot to adapt one for Cosmos DB. For the knowledge retrieval use case, implement a tool instead — tools are better for knowledge because you typically want the agent to decide when to query, whereas memory should always be checked. Links shared: |
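The context-provider-vs-tool distinction can be sketched in plain Python. `MemoryStore` is an in-memory stand-in for Cosmos DB, and the function signatures are illustrative, not Agent Framework's actual interfaces:

```python
# Sketch: memory as an always-injected context provider vs. knowledge as an
# on-demand tool. The store is an in-memory stand-in for Cosmos DB.

class MemoryStore:
    """Per-user facts, keyed by user ID."""
    def __init__(self):
        self._facts = {}

    def add(self, user_id, fact):
        self._facts.setdefault(user_id, []).append(fact)

    def all_facts(self, user_id):
        return list(self._facts.get(user_id, []))

    def search(self, user_id, query):
        return [f for f in self._facts.get(user_id, []) if query.lower() in f.lower()]

def memory_context(store, user_id):
    """Context-provider shape: runs on every invocation and always injects."""
    facts = store.all_facts(user_id)
    return "Known about user:\n" + "\n".join(facts) if facts else ""

def lookup_knowledge(store, user_id, query):
    """Tool shape: the agent decides when to call this, and with what query."""
    return store.search(user_id, query)

store = MemoryStore()
store.add("u1", "Prefers aisle seats")
```

The key design difference is visible in the signatures: the provider takes no query (it injects unconditionally on every turn), while the tool takes a query the agent chose to issue.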
2026/04/28: Which model is best for RAG-based chatbots? 📹 19:17 Avoid GPT-4o. The GPT-5.5 prompting guide recommends treating GPT-5.5 as an entirely new model family to tune for, not just an incremental upgrade — worth reading if you're planning to migrate. Links shared:
2026/04/28: How come I can't deploy the Mistral OCR model anymore? 📹 21:52 There is a known open issue where Mistral models are not showing up in the Foundry catalog — the Foundry team is actively working on it. This is not a deprecation. The model should reappear once the issue is resolved. Links shared: |
2026/04/28: I'm getting 408 timeouts when asking the model to query multiple tools at once — is it a prompt issue or a model issue? 📹 35:08 The "The operation was timeout." error message is a known Azure OpenAI error. A few things to investigate:
The key is to gather more data before guessing at the root cause — look at token counts, check whether the timeout happens before or after a response is received, and narrow down which tool or model call is failing. Links shared: |
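One way to gather that data is to wrap every tool and model call so its duration and outcome get logged; the failing call then stands out in the log. A sketch in which the call names and the simulated timeout are made up:

```python
# Sketch: instrument each call to narrow down which one is timing out.
import time

call_log = []  # one entry per call: name, outcome, duration in seconds

def timed_call(name, fn, *args, **kwargs):
    """Run fn, recording how long it took and whether it timed out."""
    start = time.perf_counter()
    status = "error"
    try:
        result = fn(*args, **kwargs)
        status = "ok"
        return result
    except TimeoutError:
        status = "timeout"
        raise
    finally:
        call_log.append({"call": name, "status": status,
                         "seconds": round(time.perf_counter() - start, 3)})

def failing_search():
    raise TimeoutError("The operation was timeout.")  # simulated Azure 408

timed_call("lookup_order", lambda: "order-123")
try:
    timed_call("slow_search", failing_search)
except TimeoutError:
    pass
```

With token counts added to each log entry, the same wrapper also answers whether the timeout correlates with unusually large requests.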
2026/04/28: Any inputs on PageIndex vs. vector RAG? 📹 39:59 Based on feedback from Pamela's colleague who specializes in retrieval: PageIndex does work, but it's document-type dependent. It tends to perform best on long documents where traditional chunking and vector search struggle. It may not be a universal improvement. The recommendation is to set up your own evaluations with your actual data and compare retrieval quality. There is no formal Microsoft support for PageIndex in any of the current RAG demos, but it's worth experimenting with if you have long-document use cases. Links shared: |
2026/04/28: Announcements 📹 1:56
- Foundry Hosted Agents public preview launched: The new hosted agents platform (with fast microVM infrastructure) launched last week. It's in public preview and still stabilizing — some roughness expected.
- GitHub Copilot moving to usage-based billing: Starting June 1, Copilot usage will consume GitHub AI Credits. Pamela noted she's been trying to use smaller models (Sonnet, Haiku, GPT-4.1) as a result. Strategies to manage costs: choose models intentionally, use auto mode (VS Code is improving task-based routing), or bring your own API keys.
- GPT-5.5 now in Azure Foundry: Available as of April 23rd.
- MAI-Image-2: Microsoft's new in-house text-to-image model, available in Foundry. Pamela demonstrated it generating a photorealistic Jedi costume image from a photo of her face — impressive facial likeness quality.
- Pydantic Monty $5K sandbox escape bounty: Pydantic is running a competition to find exploits in the Monty Python sandbox. A good example of open-source security hardening through incentivized bug hunting.
- FastMCP 3.2: Full MCP Apps support released.
- GitHub merge queues deep-dive: A useful blog post for maintainers who merge many PRs concurrently — merge queues test PRs in order before merge to ensure compatibility.
- Deploying Anthropic (Claude) to Foundry: Pamela showed a demo repo with Bicep for deploying Anthropic models to Foundry. The Bicep is similar to OpenAI model deployments but requires an organization name, country code, and industry. Not available on internal Microsoft accounts, but works on personal/customer subscriptions.
- presentation-skills repo: Pamela published a new repo collecting all her Copilot skills for working with presentations (creating and writing up talks).
- Azure Cosmos DB Conf: Happening April 28 (today), live stream available.
- Upcoming events:
For teams moving from single-agent Python prototypes to multi-agent production, the Python patterns that seem natural become anti-patterns at scale. A few common issues:
- Sharing OpenAI/Anthropic clients across agents — fine in prototypes, breaks in production. Each agent should have its own client (or at least its own rate limit tracking), because a single starved agent can exhaust the shared client's rate limits, silently degrading all other agents.
- Using Python asyncio for agent concurrency — the event loop becomes a bottleneck. Agent LLM calls are I/O-bound, but the response processing (context assembly, memory updates) can be CPU-bound. Better: use separate worker processes with message queues between them.
- Context = conversation history — the natural pattern in Python AI code is to append every turn to a list and pass it as messages. At 100+ turns, this becomes a context explosion. You need progressive compaction: summarize older turns, keep recent ones verbatim.
- No budget tracking = production accidents — Python makes it easy to …
- Naive retry logic amplifies costs — …

These patterns are from production: https://blog.kinthai.ai/agent-wallet-economic-models-autonomous-agents
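The budget-tracking point in the comment above can be made concrete with a small guard that refuses a call once a per-run spend limit would be exceeded. A sketch with illustrative prices and limits:

```python
# Sketch: hard-stop an agent run before it exceeds a spend limit.
class BudgetExceeded(RuntimeError):
    pass

class BudgetGuard:
    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, input_tokens, output_tokens,
               in_price_per_1k, out_price_per_1k):
        """Raise instead of spending past the limit; otherwise record the cost."""
        cost = (input_tokens / 1000 * in_price_per_1k
                + output_tokens / 1000 * out_price_per_1k)
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(f"run would exceed ${self.limit_usd:.2f} budget")
        self.spent_usd += cost

guard = BudgetGuard(limit_usd=0.01)
guard.charge(2000, 1000, 0.0004, 0.0016)        # ~$0.0024: within budget
try:
    guard.charge(20000, 10000, 0.0004, 0.0016)  # ~$0.024: would blow the limit
    blocked = False
except BudgetExceeded:
    blocked = True
```

Checking before spending (rather than after) is what turns a cost dashboard into an actual safeguard, and the same check naturally caps naive retry loops too.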
Each week, we hold office hours about all things Python + AI in the Foundry Discord.
Join the Discord here: http://aka.ms/aipython/oh
This thread lists the recordings of each office hours session, along with any other resources that come out of the sessions. The questions and answers are automatically posted (based on the transcript) as comments in this thread.
April 28, 2026
Topics covered:
April 20, 2026
Topics covered:
April 13, 2026
Topics covered:
April 7, 2026
Topics covered:
March 31, 2026
Topics covered:
March 24, 2026
Topics covered:
March 17, 2026
Topics covered:
February 17, 2026
Topics covered:
February 10, 2026
Topics covered:
February 3, 2026
Topics covered:
January 27, 2026
Topics covered:
January 20, 2026
Topics covered:
January 13, 2026
Topics covered:
January 6, 2026
Topics covered: