A r.uby.dev project.
Welcome to the canonical llm.rb repository.
llm.rb is not a library, framework or toolkit but an advanced runtime for building highly capable AI applications on CRuby. By default it has zero runtime dependencies although certain functionality – such as ActiveRecord support – require optional dependencies that are opt-in.
The runtime supports OpenAI, OpenAI-compatible endpoints, Anthropic, Google Gemini, DeepSeek, DeepInfra, xAI, Z.ai, AWS Bedrock, Ollama, and llama.cpp. It has first-class support for streaming, tool calls, MCP and A2A, embeddings, vector stores and the RAG pattern.
There are multiple HTTP backends to choose from, tools can be run concurrently or in parallel via threads, async tasks, fibers, ractors, and fork, and it is also possible to make a tool call while the model is still streaming.
The runtime builds on top of three core concepts: providers, contexts, and agents, so once you learn the fundamentals, everything else falls into place naturally. And once you learn llm.rb, you will also be able to use mruby-llm and wasm-llm because the API is pretty much identical.
gem install llm.rbThe LLM::Agent class is the default high-level interface,
and it is recommended for most use-cases. It manages tool execution
automatically, guards against infinite loops, manages conversation
state, and much more.
require "llm"
llm = LLM.deepseek(key: ENV["KEY"])
agent = LLM::Agent.new(llm, stream: $stdout)
agent.talk "Hello world"The LLM::Context class is at the heart of the runtime
and it is what LLM::Agent uses under the hood.
It requires that the tool call loop be managed manually -
sometimes that can be useful, but usually for advanced use-cases.
If you're new to llm.rb, try LLM::Agent first.
require "llm"
llm = LLM.deepseek(key: ENV["KEY"])
ctx = LLM::Context.new(llm, stream: $stdout)
ctx.talk "Hello world"Subclasses of LLM::Tool are plain Ruby classes with
an optional set of typed parameters.
The model can choose to
call them on your behalf, and they're one of the most powerful features
for extending the feature set or abilities of a model.
class ReadFile < LLM::Tool
name "read-file"
description "Read a file"
parameter :path, String, "The filename or path"
required %i[path]
def call(path:)
{contents: File.read(path)}
end
endStreams can be simple IO objects or subclasses of
LLM::Stream with structured callbacks for content,
reasoning, tool calls, tool returns, and compaction.
class MyStream < LLM::Stream
def on_content(content)
print content
end
def on_reasoning_content(content)
warn content
end
end
llm = LLM.deepseek(key: ENV["KEY"])
agent = LLM::Agent.new(llm, stream: MyStream.new)
agent.talk "Explain Ruby fibers."The Model Context Protocol (MCP) has first-class support
in llm.rb. The stdio and http transports work out of the
box. MCP tools are translated into subclasses of
LLM::Tool that can be used with LLM::Context
or LLM::Agent.
require "llm"
llm = LLM.deepseek(key: ENV["KEY"])
mcp = LLM::MCP.stdio(argv: ["ruby", "server.rb"])
agent = LLM::Agent.new(llm, stream: $stdout, tools: mcp.tools)
agent.talk "Run the tool"The Agent 2 Agent (A2A) protocol has first-class support
in llm.rb. The http and jsonrpc transports work out of the
box. A2A skills are translated into subclasses of
LLM::Tool that can be used with LLM::Context
or LLM::Agent.
require "llm"
llm = LLM.deepseek(key: ENV["KEY"])
a2a = LLM::A2A.rest(url: "https://remote-agent.example.com")
agent = LLM::Agent.new(llm, stream: $stdout, tools: a2a.skills)
agent.talk "Run the skill"Most providers offer an embedding model that can be used for semantic search, or similarity search. An embedding model can generate embeddings that can then be stored in a database that is optimized for storing and querying vectors, such as SQLite's sqlite-vec or PostgreSQL's pg-vector.
llm.rb also includes support for OpenAI's vector store API. It provides a vector database as a HTTP service but we won't cover that here.
require "llm"
llm = LLM.openai(key: ENV["KEY"])
body = "llm.rb is Ruby's capable AI runtime."
embedding = llm.embed([body]).embeddings.first
Document.create!(
title: "llm.rb",
body:,
embedding:,
)The runtime supports five different concurrency strategies that have different attributes. The choice between all of them often depends on the requirements of your application.
IO-bound tools are a good fit for the :task, :thread,
and :fiber strategies while true parallelism can be achieved
with the :fork and :ractor strategies. The
:fork strategy also provides a separate process that offers
isolation from its parent.
require "llm"
llm = LLM.deepseek(key: ENV["KEY"])
tools = [FetchNews, FetchStocks, FetchFeeds]
agent = LLM::Agent.new(llm, tools:, concurrency: :fork)
agent.talk "Run the tools in parallel"Because both LLM::Context, and LLM::Agent
can be serialized to JSON and stored in a simple string, both ActiveRecord
and Sequel support can be implemented within a single column on a single row.
The runtime includes first-class support for both ActiveRecord and Sequel, and
for both Rack-based applications and Rails-based applications. On databases
where it is supported, such as PostgreSQL, the column can be optimized by using
the jsonb type.
require "active_record"
require "llm"
require "llm/active_record"
class Agent < ApplicationRecord
acts_as_agent do |agent|
agent.model "deepseek-v4-pro"
agent.instructions "solve the user's query"
agent.tools [Research, FinalizeResearch, ActOnResearch]
end
private
# By convention, this method defines the provider for a model.
# If necessary, it can be renamed with: provider: :your_method.
def set_provider
LLM.deepseek(key: ENV["KEY"])
end
# By convention, this method returns the context options given
# to LLM::Context or LLM::Agent.
def set_context
{}
end
end
agent = Agent.create!
agent.talk "perform research"What providers does llm.rb support?
China-based
- DeepSeek
- zAI
US-based
- OpenAI
- Google (Gemini)
- xAI
- AWS bedrock
- DeepInfra
- Anthropic
Openweights
- DeepSeek
- zAI
- DeepInfra
- AWS bedrock
Host your own
- Ollama
- Llamacpp
I have a limited budget. What should I do?
There a few options. The first option is to host your own model, and use the ollama or llamacpp providers. This can be diffilcult though because a capable model requires hardware that can match it. If you have the ability to self-host, this would be my first option.
The second option is DeepSeek.
The deepseek-v4-flash model costs pennies to use.
And llm.rb has been optimized for deepseek. For example,
DeepSeek does not have image generation capabilities
but on the llm.rb runtime it does (vector graphics only,
though).
The same is true for structured outputs. DeepSeek does not support structured outputs in the same way as OpenAI or Google, but the llm.rb runtime makes it appear as though it does, through the `json_object` response type.
If you're on a budget, DeepSeek is hard to beat.Can I download llm.rb via a decentralized network?
You can!
We are on the radicle.network
Every commit that lands on GitHub also lands on Radicle.
Our repository ID is z2PtfQ6dYwyYaW2aGrztG1sMyDmCE.
Browse on the web.
If you like what you read so far, check out the deepdive.md to learn more. Unfortunately it wasn't possible to cover every feature without the README becoming a small book. The r.uby.dev homepage also includes more learning material and resources.
Business Source License 1.1
Commercial production use requires a commercial license.
Each version converts to the BSD Zero Clause
four years after its first public release.
Contact robert@r.uby.dev for a commercial license.
Waivers are automatically granted for:
- Personal use
- Students
- Teachers
- Evaluation, development, and testing
- Non-profits and charities
- Companies with less than or equal to 50 employees