evaleval / every_eval_ever Public

Fork 42
Star 87

Pull requests: evaleval/every_eval_ever

14 Open 121 Closed

Pull requests list

Add BenchPress score-matrix adapter (utils/benchpress)

#197 opened Jun 30, 2026 by borgr Collaborator

Move validator business logic into every_eval_ever and update to de-duplication logic

#194 opened Jun 29, 2026 by nelaturuharsha Collaborator

[Submission] Add tau-bench leaderboard adapter

#192 opened Jun 22, 2026 by benshi34

[DRAFT] feat: add generalized Kaggle Community Benchmarks adapter

#191 opened Jun 21, 2026 by mrshu Contributor • Draft

[Adapter] Add AlpacaEval 1.0 and 2.0 leaderboard adapter

#190 opened Jun 20, 2026 by karthikchundi-commits Contributor

Add BountyBench converter to utils

#188 opened Jun 15, 2026 by borgr Collaborator

arc_agi adapter for flat storage with manifest and instance_level indexes for hf

#184 opened Jun 11, 2026 by DeepLumiere

Add LEXam public leaderboard converter

#160 opened Jun 8, 2026 by JoelNiklaus

Add Vectara Hallucination Leaderboard adapter

#157 opened Jun 3, 2026 by mohammadrezakarami

[Feature] Add text-to-image modality support (label: stale)

#137 opened May 18, 2026 by felifri

5 tasks done

Fix LLM Stats evaluator provenance schema

#136 opened May 16, 2026 by tommasocerruti Member

[DRAFT] Transparent Compression

#133 opened May 8, 2026 by Erotemic Collaborator

Canonical identity and schema upgrade tooling

#116 opened Apr 24, 2026 by yananlong Contributor

Draft Proposal: Agent Session Result Layer (label: stale)

#70 opened Mar 17, 2026 by elronbandel Contributor • Draft
