!!! tip "Looking for example notebooks?"

    For example notebooks, check out `examples/ai/chat` on our
    GitHub.
/// marimo-embed
    size: large

```python
@app.cell
def __():
    def simple_echo_model(messages, config):
        return f"You said: {messages[-1].content}"

    mo.ui.chat(
        simple_echo_model,
        prompts=["Hello", "How are you?"],
        show_configuration_controls=True,
    )
    return
```

///
The chat UI element provides an interactive chatbot interface for conversations. It can be customized with different models, including built-in AI models from popular providers or custom functions.
::: marimo.ui.chat
Here's a simple example using a custom echo model:

```python
import marimo as mo


def echo_model(messages, config):
    return f"Echo: {messages[-1].content}"


chat = mo.ui.chat(echo_model, prompts=["Hello", "How are you?"])
chat
```

Here, `messages` is a list of [ChatMessage][marimo.ai.ChatMessage] objects,
each of which has `role` (`"user"`, `"assistant"`, or `"system"`) and
`content` (the message string) attributes; `config` is a
[ChatModelConfig][marimo.ai.ChatModelConfig] object with various
configuration parameters, which you are free to ignore.
marimo has first-class support for pydantic-ai. Use the `Agent` class to build your chatbot, and the chat UI will display reasoning steps, tool calls, and more.

```python
from pydantic_ai import Agent

import marimo as mo

assistant = Agent(
    "openai:gpt-5",
    system_prompt="You are a helpful assistant.",
)

chat = mo.ui.chat(mo.ai.llm.pydantic_ai(assistant))
chat
```

You can use marimo's built-in AI models, such as OpenAI's GPT:
```python
import marimo as mo

chat = mo.ui.chat(
    mo.ai.llm.openai(
        "gpt-4",
        system_message="You are a helpful assistant.",
    ),
    show_configuration_controls=True,
)
chat
```

You can access the chat history using the `value` attribute:
```python
chat.value
```

This returns a list of [ChatMessage][marimo.ai.ChatMessage] objects, each
containing `id`, `role`, `parts`, and `metadata` attributes. The `content` and
`attachments` attributes are supported for basic models.
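As an illustrative, marimo-free sketch of working with this history, the helper below formats basic-model messages (those with `role` and `content` attributes) into a plain-text transcript; the `Msg` class is a hypothetical stand-in, and in a notebook you would pass `chat.value` directly:

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical stand-in for a basic-model chat message."""

    role: str  # "user", "assistant", or "system"
    content: str  # the message text


def transcript(messages) -> str:
    # Render a chat history as one "role: content" line per message.
    return "\n".join(f"{m.role}: {m.content}" for m in messages)


history = [
    Msg(role="user", content="Hello"),
    Msg(role="assistant", content="Echo: Hello"),
]
print(transcript(history))
```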
???+ note

    For pydantic-ai, the messages are mapped to [Vercel UI messages](https://github.com/pydantic/pydantic-ai/blob/9aa6dd40efafd93c04c19c2ef5596a454906ca53/pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py). To convert them to Pydantic messages, use the adapter function:

    ```python
    from pydantic_ai.ui.vercel_ai import VercelAIAdapter

    messages = VercelAIAdapter.load_messages(chat.value)
    ```
::: marimo.ai.ChatMessage
Here's an example of a custom model that uses additional context:

```python
import marimo as mo


def rag_model(messages, config):
    question = messages[-1].content
    docs = find_relevant_docs(question)
    context = "\n".join(docs)
    prompt = f"Context: {context}\n\nQuestion: {question}\n\nAnswer:"
    response = query_llm(prompt, config)
    return response


mo.ui.chat(rag_model)
```

This example demonstrates how you can implement a Retrieval-Augmented
Generation (RAG) model within the chat interface.
You can pass sample prompts to `mo.ui.chat` to allow users to select from a
list of predefined prompts. By including a `{{var}}` in the prompt, you can
dynamically insert values into the prompt; a form will be generated to allow
users to fill in the variables.

```python
mo.ui.chat(
    mo.ai.llm.openai("gpt-4o"),
    prompts=[
        "What is the capital of France?",
        "What is the capital of Germany?",
        "What is the capital of {{country}}?",
    ],
)
```

You can allow users to upload attachments to their messages by passing an
`allow_attachments` parameter to `mo.ui.chat`.

```python
mo.ui.chat(
    rag_model,
    allow_attachments=["image/png", "image/jpeg"],
    # or True for any attachment type
    # allow_attachments=True,
)
```

Chatbots can stream responses in real time, creating a more interactive
experience similar to ChatGPT, where you see the response appear word by word
as it's generated.
Responses from built-in models (OpenAI, Anthropic, Google, Groq, Bedrock) are streamed by default.
marimo uses delta-based streaming, which follows the industry-standard pattern used by OpenAI, Anthropic, and other AI providers. Your generator function should yield individual chunks (deltas) of new content, which marimo automatically accumulates and displays progressively.
For custom models, you can use either regular (sync) or async generator functions that yield delta chunks:
Sync generator (simpler):
```python
import marimo as mo
import time


def streaming_model(messages, config):
    """Stream responses word by word."""
    response = "This response will appear word by word!"
    words = response.split()
    for word in words:
        yield word + " "  # Yield delta chunks
        time.sleep(0.1)  # Simulate processing delay


chat = mo.ui.chat(streaming_model)
chat
```

Async generator (for async operations):
```python
import marimo as mo
import asyncio


async def async_streaming_model(messages, config):
    """Stream responses word by word asynchronously."""
    response = "This response will appear word by word!"
    words = response.split()
    for word in words:
        yield word + " "  # Yield delta chunks
        await asyncio.sleep(0.1)  # Async processing delay


chat = mo.ui.chat(async_streaming_model)
chat
```

Each yield sends a new chunk (delta) to marimo, which accumulates and displays
the progressively building response in real time.
!!! tip "Delta vs Accumulated"

    Yield deltas, not accumulated text. Each yield should be new content only:

    ✅ **Correct (delta mode):**

    ```python
    yield "Hello"
    yield " "
    yield "world"
    # Result: "Hello world"
    ```

    ❌ **Incorrect (accumulated mode, deprecated):**

    ```python
    yield "Hello"
    yield "Hello "
    yield "Hello world"
    # Inefficient: sends duplicate content
    ```

    Delta mode is more efficient (reduces bandwidth by ~99% for long responses) and aligns with standard streaming APIs.
!!! tip "See streaming examples"

    For complete working examples, check out:

    - [`openai_example.py`](https://github.com/marimo-team/marimo/blob/main/examples/ai/chat/openai_example.py) - OpenAI chatbot with streaming (default)
    - [`streaming_custom.py`](https://github.com/marimo-team/marimo/blob/main/examples/ai/chat/streaming_custom.py) - Custom streaming chatbot
marimo provides several built-in AI models that you can use with the chat UI element.
```python
import marimo as mo
from pydantic_ai import Agent

assistant = Agent(
    "openai:gpt-5",
    system_prompt="You are a helpful assistant.",
)

mo.ui.chat(mo.ai.llm.pydantic_ai(assistant))
```

::: marimo.ai.llm.pydantic_ai
```python
import marimo as mo

mo.ui.chat(
    mo.ai.llm.openai(
        "gpt-4o",
        system_message="You are a helpful assistant.",
        api_key="sk-proj-...",
    ),
    show_configuration_controls=True,
)
```

::: marimo.ai.llm.openai
```python
import marimo as mo

mo.ui.chat(
    mo.ai.llm.anthropic(
        "claude-3-5-sonnet-20240620",
        system_message="You are a helpful assistant.",
        api_key="sk-ant-...",
    ),
    show_configuration_controls=True,
)
```

::: marimo.ai.llm.anthropic
```python
import marimo as mo

mo.ui.chat(
    mo.ai.llm.google(
        "gemini-1.5-pro-latest",
        system_message="You are a helpful assistant.",
        api_key="AI..",
    ),
    show_configuration_controls=True,
)
```

::: marimo.ai.llm.google
```python
import marimo as mo

mo.ui.chat(
    mo.ai.llm.groq(
        "llama-3.1-70b-versatile",
        system_message="You are a helpful assistant.",
        api_key="gsk-...",
    ),
    show_configuration_controls=True,
)
```

::: marimo.ai.llm.groq
Chatbots can be implemented with a function that receives a list of
[ChatMessage][marimo.ai.ChatMessage] objects and a
[ChatModelConfig][marimo.ai.ChatModelConfig].
::: marimo.ai.ChatMessage
::: marimo.ai.ChatModelConfig
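As a minimal, dependency-free sketch of this contract: a chat model is any callable taking `(messages, config)` and returning a response. The `Msg` class and the dict-style config below are hypothetical stand-ins for `marimo.ai.ChatMessage` and `marimo.ai.ChatModelConfig`, used only so the example runs outside a notebook:

```python
from typing import Any


class Msg:
    """Hypothetical stand-in for marimo.ai.ChatMessage."""

    def __init__(self, role: str, content: str) -> None:
        self.role = role
        self.content = content


def word_limit_model(messages: list, config: Any) -> str:
    # Echo the latest message, capped at a word budget read from a
    # dict-like config (illustrative only; real configs are
    # ChatModelConfig objects).
    limit = (config or {}).get("max_tokens", 5)
    return " ".join(messages[-1].content.split()[:limit])


history = [Msg("user", "one two three four five six seven")]
print(word_limit_model(history, {"max_tokens": 3}))
```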
[mo.ui.chat][marimo.ui.chat] can be instantiated with an initial
configuration, passed as a dictionary conforming to
[ChatModelConfig][marimo.ai.ChatModelConfig].
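For instance, such a dictionary might look like the sketch below; the field names (`temperature`, `max_tokens`) are assumed here, so check the `ChatModelConfig` reference for the exact set:

```python
# Hedged sketch: an initial configuration as a plain dictionary.
# In a notebook you would pass it to mo.ui.chat via its config option.
initial_config = {
    "temperature": 0.7,
    "max_tokens": 512,
}
print(sorted(initial_config))
```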
ChatMessages can also include attachments.
::: marimo.ai.ChatAttachment
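To illustrate how a custom model might inspect attachments, here is a hedged, marimo-free sketch; the attribute names (`attachments`, `name`, `content_type`, `url`) mirror `ChatAttachment`, and the stand-in classes are hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class FakeAttachment:
    """Hypothetical stand-in for marimo.ai.ChatAttachment."""

    name: str
    content_type: str
    url: str


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a message carrying attachments."""

    role: str
    content: str
    attachments: list = field(default_factory=list)


def describe_attachments(messages) -> str:
    # Summarize each attachment on the latest message as "name (type)".
    last = messages[-1]
    if not last.attachments:
        return "no attachments"
    return ", ".join(f"{a.name} ({a.content_type})" for a in last.attachments)


msg = FakeMessage(
    role="user",
    content="What's in this image?",
    attachments=[FakeAttachment("cat.png", "image/png", "data:...")],
)
print(describe_attachments([msg]))
```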
We support any OpenAI-compatible endpoint. If you want a specific provider added explicitly (one that doesn't abide by the standard OpenAI API format), you can file a feature request.

Normally, overriding the `base_url` parameter should work. Here are some examples:
/// tab | Cerebras

```python
chatbot = mo.ui.chat(
    mo.ai.llm.openai(
        model="llama3.1-8b",
        api_key="csk-...",  # insert your key here
        base_url="https://api.cerebras.ai/v1/",
    ),
)
chatbot
```

///
/// tab | Groq

```python
chatbot = mo.ui.chat(
    mo.ai.llm.openai(
        model="llama-3.1-70b-versatile",
        api_key="gsk_...",  # insert your key here
        base_url="https://api.groq.com/openai/v1/",
    ),
)
chatbot
```

///
/// tab | xAI

```python
chatbot = mo.ui.chat(
    mo.ai.llm.openai(
        model="grok-beta",
        api_key=key,  # insert your key here
        base_url="https://api.x.ai/v1",
    ),
)
chatbot
```

///
!!! note

    We have added examples for Groq and Cerebras. These providers offer free
    API keys and are great for trying out Llama models (from Meta). You can
    sign up on their platforms and easily integrate them with marimo's AI
    features. For more information, refer to the
    [AI completion documentation in marimo](../../guides/editor_features/ai_completion.md).