A production-ready chatbot example built with LangGraph and SQLite persistence, demonstrating multi-user, multi-thread conversation management with streaming support.
This example implements a persistent chatbot using:
- LangGraph for agent orchestration
- SQLite for conversation persistence
- LangChain OpenAI for LLM integration
- RunAgent for deployment and management
The chatbot supports multiple users, conversation threads, and real-time streaming responses.
- Persistent Memory: Conversations are stored in SQLite databases and persist across sessions
- Multi-User Support: Each user gets isolated conversation storage
- Multi-Thread Conversations: Users can maintain multiple independent conversation threads
- Streaming Responses: Real-time token-by-token streaming using LangGraph's native `stream_mode="messages"`
- Conversation History: Retrieve full conversation history for any thread
- Thread Management: List all conversation threads for a user
The agent exposes four entrypoints:
- `chat` - Non-streaming chat with persistent memory
- `chat_stream` - Streaming chat with real-time token output
- `get_history` - Retrieve conversation history for a thread
- `list_threads` - List all conversation threads for a user
```python
from typing import Annotated

from typing_extensions import TypedDict
from langgraph.graph.message import add_messages


class ChatState(TypedDict):
    """State schema for the chat agent."""
    messages: Annotated[list, add_messages]
```

The graph uses LangGraph's `add_messages` reducer to automatically manage message history.
- Storage Location: `chat_storage/` directory (mapped to `/persistent/chat_storage` by RunAgent)
- Database: SQLite database (`conversations.db`)
- Checkpointing: Uses LangGraph's `SqliteSaver` for state persistence
- Thread Isolation: Each `thread_id` maintains separate conversation state
START → chat_node → END
The graph is simple but powerful:
- Receives user message
- Adds system message if needed
- Invokes LLM with full conversation history
- Returns AI response
- Automatically persists state via checkpointer
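A minimal sketch of how such a graph might be assembled, assuming the `ChatState` schema above; the model name and the system-message handling are placeholders, and the real example may differ in detail:

```python
import sqlite3

from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import StateGraph, START, END

llm = ChatOpenAI(model="gpt-4o-mini")  # placeholder model name

def chat_node(state: ChatState) -> dict:
    messages = state["messages"]
    # Prepend a system message if one isn't already present
    if not any(isinstance(m, SystemMessage) for m in messages):
        messages = [SystemMessage(content="You are a helpful assistant.")] + messages
    # add_messages appends the AI reply to the persisted history
    return {"messages": [llm.invoke(messages)]}

# check_same_thread=False is required for LangGraph compatibility
conn = sqlite3.connect("chat_storage/conversations.db", check_same_thread=False)

builder = StateGraph(ChatState)
builder.add_node("chat", chat_node)  # node name "chat" matches the streaming filter shown later
builder.add_edge(START, "chat")
builder.add_edge("chat", END)
graph = builder.compile(checkpointer=SqliteSaver(conn))
```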
- Python 3.8+
- OpenAI API key
- RunAgent CLI installed

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Set environment variables:

   ```bash
   export OPENAI_API_KEY="your-api-key"
   ```

3. Deploy with RunAgent:

   ```bash
   runagent deploy
   ```
```python
from runagent import RunAgentClient

# Initialize client
client = RunAgentClient(
    agent_id="your-agent-id",
    entrypoint_tag="chat",
    local=False,
    user_id="user123",
    persistent_memory=True
)

# Non-streaming chat
result = client.run(
    message="Hello, how are you?",
    user_id="user123",
    thread_id="conversation_001"
)
print(result['response'])

# Streaming chat
stream_client = RunAgentClient(
    agent_id="your-agent-id",
    entrypoint_tag="chat_stream",
    local=False,
    user_id="user123",
    persistent_memory=True
)

for chunk in stream_client.run(
    message="Tell me a story",
    user_id="user123",
    thread_id="conversation_001"
):
    if chunk.get('type') == 'content':
        print(chunk['content'], end='', flush=True)
```

```rust
use runagent::{RunAgentClient, RunAgentClientConfig};
use serde_json::json;

let client = RunAgentClient::new(RunAgentClientConfig {
    agent_id: "your-agent-id".to_string(),
    entrypoint_tag: "chat".to_string(),
    local: Some(false),
    user_id: Some("user123".to_string()),
    persistent_memory: Some(true),
    ..RunAgentClientConfig::default()
}).await?;

let result = client.run(&[
    ("message", json!("Hello, how are you?")),
    ("user_id", json!("user123")),
    ("thread_id", json!("conversation_001")),
]).await?;
```

Function: `chat_response()`
Parameters:
- `message` (str): User's input message
- `user_id` (str): Unique user identifier
- `thread_id` (str): Conversation thread ID
Returns:

```json
{
  "status": "success",
  "response": "AI response text",
  "user_id": "user123",
  "thread_id": "conversation_001",
  "message_count": 5
}
```

Function: `chat_response_stream()`
Parameters: Same as `chat`

Yields: Dictionary chunks with:
- `type: "session_info"` - Initial session metadata
- `type: "content"` - Token chunks from the LLM
- `type: "complete"` - Completion metadata
- `type: "error"` - Error information
Example:

```python
for chunk in stream_client.run(...):
    if chunk.get('type') == 'content':
        print(chunk['content'], end='', flush=True)
```

Function: `get_conversation_history()`
Parameters:
- `user_id` (str): User identifier
- `thread_id` (str): Thread identifier
Returns:

```json
{
  "status": "success",
  "user_id": "user123",
  "thread_id": "conversation_001",
  "messages": [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"}
  ],
  "message_count": 2
}
```
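Calling this entrypoint through the client follows the same pattern as the chat examples above; a sketch, assuming the same keyword-style `run()` call:

```python
from runagent import RunAgentClient

history_client = RunAgentClient(
    agent_id="your-agent-id",
    entrypoint_tag="get_history",
    local=False,
    user_id="user123",
    persistent_memory=True
)

history = history_client.run(user_id="user123", thread_id="conversation_001")
for msg in history["messages"]:
    print(f"{msg['role']}: {msg['content']}")
```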
Function: `list_user_threads()`

Parameters:
- `user_id` (str): User identifier
Returns:

```json
{
  "status": "success",
  "user_id": "user123",
  "threads": ["conversation_001", "conversation_002"],
  "thread_count": 2
}
```
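And likewise for listing threads, assuming the same client pattern:

```python
from runagent import RunAgentClient

threads_client = RunAgentClient(
    agent_id="your-agent-id",
    entrypoint_tag="list_threads",
    local=False,
    user_id="user123",
    persistent_memory=True
)

result = threads_client.run(user_id="user123")
print(f"{result['thread_count']} thread(s): {result['threads']}")
```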
Run the Python test suite:

```bash
cd test_scripts/python
python client_test_langgraph_sqlite.py
```

Run the Rust test suite:
1. Navigate to the test folder:

   ```bash
   cd runagent/test_scripts/rust/test_langgraph_sqlite
   ```

2. Export the API key (required for remote deployment):

   ```bash
   export RUNAGENT_API_KEY="your-api-key"
   ```

3. Run the tests:

   ```bash
   cargo run
   ```
Note: The `RUNAGENT_API_KEY` environment variable must be set before running, as it is required for connecting to the remote agent.
Key configuration options:
- `persistent_folders: ["chat_storage"]` - Folders to persist across deployments
- `entrypoints` - Defines the four entrypoints (`chat`, `chat_stream`, `get_history`, `list_threads`)
- `agent_id` - Unique identifier for the deployed agent
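A sketch of how these options might look in `runagent.config.json`. Only the field names above come from this example; the surrounding structure (for instance, how each entrypoint is declared) is an assumption and may differ from the actual schema:

```json
{
  "agent_id": "your-agent-id",
  "persistent_folders": ["chat_storage"],
  "entrypoints": ["chat", "chat_stream", "get_history", "list_threads"]
}
```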
- `OPENAI_API_KEY`: Required for LLM access
- `RUNAGENT_API_KEY`: Required for remote deployment (optional for local)
The streaming implementation uses LangGraph's native `stream_mode="messages"`:

```python
for message_chunk, metadata in graph.stream(
    input_state,
    config=config,
    stream_mode="messages"  # Stream LLM tokens token-by-token
):
    if metadata.get("langgraph_node") == "chat":
        yield {"type": "content", "content": message_chunk.content}
```

This provides true token-by-token streaming directly from the LLM, even when using `.invoke()` in the graph node.
Each `user_id` maintains separate conversation storage. Users cannot access each other's conversations.
Within a user, different `thread_id` values create separate conversation contexts:
```python
# Thread 1: Personal
client.run(message="I'm planning a vacation", thread_id="personal")

# Thread 2: Work
client.run(message="I need to debug code", thread_id="work")

# Back to Thread 1 - remembers vacation
client.run(message="Where was I going?", thread_id="personal")
```

- Storage: SQLite database in `chat_storage/conversations.db`
- Checkpointing: LangGraph's `SqliteSaver` handles state persistence
- Thread Safety: Database connection uses `check_same_thread=False` for LangGraph compatibility
- Persistence: Data persists across:
  - Agent restarts
  - Client reconnections
  - Server deployments (when using persistent storage)
- Customer Support Chatbot: Multi-user support with conversation history
- Personal Assistant: Multiple conversation threads (work, personal, etc.)
- Educational Tutor: Persistent learning sessions per student
- Multi-Tenant SaaS: Isolated conversations per tenant/user
- Ensure the entrypoint tag ends with `_stream` for streaming entrypoints
- Check that `stream_mode="messages"` is used in the graph
- Verify the WebSocket connection is established
- Check that the `chat_storage/` directory exists and is writable
- Verify `persistent_folders` in `runagent.config.json`
- Ensure `user_id` and `thread_id` are consistent across calls
- SQLite databases grow with conversation history
- Consider implementing conversation pruning for old threads (see the sketch below)
- Monitor database size in `chat_storage/`
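A hypothetical pruning helper, assuming the default table names used by `langgraph-checkpoint-sqlite` (`checkpoints` and `writes`); verify the schema of your installed version before running anything like this:

```python
import sqlite3

def prune_thread(db_path: str, thread_id: str) -> None:
    """Delete all persisted checkpoints for one conversation thread."""
    conn = sqlite3.connect(db_path)
    with conn:  # commit on success, roll back on error
        # Table names are assumptions based on the default
        # langgraph-checkpoint-sqlite schema; check them with:
        #   SELECT name FROM sqlite_master WHERE type='table'
        conn.execute("DELETE FROM checkpoints WHERE thread_id = ?", (thread_id,))
        conn.execute("DELETE FROM writes WHERE thread_id = ?", (thread_id,))
    conn.close()

# Thread key format follows the assumption used in the streaming sketch above
prune_thread("chat_storage/conversations.db", "user123:conversation_001")
```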
- `langchain`: Core LangChain framework
- `langchain-openai`: OpenAI integration
- `langgraph`: Graph-based agent orchestration
- `langgraph-checkpoint-sqlite`: SQLite persistence
- `openai`: OpenAI API client
Part of the RunAgent examples collection.
This is an example implementation. For improvements or issues, please refer to the main RunAgent repository.