sierra-research/tau-bench
Commit History (branch: main, all users, all time)
Commits on Aug 28, 2025
Merge pull request #64 from noiji/patch-1
noahshinn
authored
4754e6b
Commits on Aug 21, 2025
handle None response cost
noiji
authored
a2d3283
Commits on Jul 13, 2025
Merge pull request #60 from sierra-research/victorb-sierra-patch-1
victorb-sierra
authored
9ed9fd9
Update README.md
victorb-sierra
authored
385d1fe
Merge pull request #59 from dhh1995/main
victorb-sierra
authored
c765667
update README with information about the new tau2
dhh1995
committed
6f193ff
Commits on Jan 22, 2025
Merge pull request #27 from brianzq/brianz/airline-get-user-details-fix
noahshinn
authored
14bf0ef
provide key detail in airline get_user_details function desc
brianzq
committed
edad7fc
Commits on Jan 21, 2025
Merge pull request #25 from adampauls/u/adpauls/revert-t-shirt
noahshinn
authored
d631fa0
Merge pull request #24 from adampauls/fix-province
noahshinn
authored
054ae2f
Merge pull request #22 from adampauls/main
noahshinn
authored
5c90278
Merge pull request #26 from brianzq/brianz/retail-better-func-desc
noahshinn
authored
64ff9b3
Commits on Jan 19, 2025
provide key detail in retail get_user_details function desc
brianz-openai
committed
c4b38de
Commits on Jan 17, 2025
Revert t-shirt change.
adampauls
committed
4709449
Commits on Jan 16, 2025
Remove all references to province
adampauls
committed
404f33e
Make the user ID clear for airline.
adampauls
committed
fc90b45
Commits on Jan 11, 2025
Merge pull request #19 from eaplatanios/u/eaplatanios/retail-fix
noahshinn
authored
e1a16cd
Merge pull request #11 from acrmp/add-platform-to-auto-error-id-cmd
noahshinn
authored
563b3e6
Merge pull request #21 from abhishekkumawat23/main
noahshinn
authored
5e7f938
Making run.py part of tau_bench package
abhishekkumawat23
committed
07191f1
Commits on Dec 20, 2024
Merge pull request #17 from GregoireMialon/gm/add_manifest
noahshinn
authored
3ff5dd3
Commits on Dec 12, 2024
.
eaplatanios
committed
7dd6d0d
[Easy] Fix number of t-shirts in retail tasks.
eaplatanios
committed
cca6acd
Commits on Dec 10, 2024
Merge pull request #15 from eaplatanios/test-task-fixes
noahshinn
authored
77c01fc
Merge pull request #16 from eaplatanios/metric-change
noahshinn
authored
67e6b12
Commits on Dec 3, 2024
also includes json
GregoireMialon
committed
ed09ea9
added manifest to include wiki.md s
GregoireMialon
committed
ee8f7a9
Commits on Nov 23, 2024
Add note about historical trajectories
Noah Shinn
committed
b581e61
Add historical trajectories
Noah Shinn
committed
7362a17
Commits on Nov 21, 2024
Add a few shot agent
Noah Shinn
committed
a50b6c1
Commits on Nov 11, 2024
The reward should be zero if the database state is incorrect.
eaplatanios
committed
718ffa1
Fixes for some of the retail test tasks.
eaplatanios
committed
fbc1c58
Commits on Oct 23, 2024
Add --platform flag to auto error id command line
acrmp
committed
0ed3bd8
Commits on Oct 22, 2024
Add the new claude 3.5 sonnet model to the leaderboard
Noah Shinn
committed
977d520
Commits on Oct 17, 2024
Make compatible with python3.10
Noah Shinn
committed
1c9cf19