SearchAgent-Zero

SearchAgent-Zero is built on Qwen3-8B and trained with the Verl framework using pure reinforcement learning. Among models of comparable size, it achieves a state-of-the-art (SOTA) score of 37.95 on the BrowseComp-Plus dataset, surpassing many large commercial models (such as Gemini 2.5 Pro and Kimi-K2).

The model performs multi-turn search, averaging more than 20 searches on the training set and more than 40 searches on the test set, and it generalizes well to shallow search tasks.
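To make "multi-turn search" concrete, here is a minimal sketch of a search loop in which a policy repeatedly chooses between issuing another query and giving a final answer. It is not the SearchAgent-Zero implementation, whose details have not been published yet. All names (`run_search_episode`, `toy_policy`, `toy_search`) and the `max_turns` budget are hypothetical, and the policy and search backend are stubs standing in for the model and a real retriever.

```python
from typing import Callable, List, Optional, Tuple

# Each history entry is a (query, search_result) pair from one search turn.
History = List[Tuple[str, str]]
# A policy returns ("search", query) or ("answer", final_answer).
Policy = Callable[[str, History], Tuple[str, str]]
SearchFn = Callable[[str], str]


def run_search_episode(
    question: str,
    policy: Policy,
    search: SearchFn,
    max_turns: int = 64,  # assumed budget, chosen only for illustration
) -> Tuple[Optional[str], History]:
    """Run one multi-turn search episode and return (answer, history)."""
    history: History = []
    for _ in range(max_turns):
        action, payload = policy(question, history)
        if action == "answer":
            return payload, history
        # Any other action is treated as a search query.
        history.append((payload, search(payload)))
    # Budget exhausted without a final answer.
    return None, history


def toy_policy(question: str, history: History) -> Tuple[str, str]:
    """Stub policy: search three times, then answer."""
    if len(history) < 3:
        return "search", f"{question} (refinement {len(history) + 1})"
    return "answer", f"answer based on {len(history)} searches"


def toy_search(query: str) -> str:
    """Stub search backend that echoes the query."""
    return f"stub result for: {query}"


answer, history = run_search_episode("example question", toy_policy, toy_search)
print(answer, len(history))
```

In a real agent the policy would be the language model, deciding at each turn from the question and the accumulated results, and the number of turns per question would vary rather than being fixed as in the stub.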

This model is still under development; technical details will be published here and on Zhihu later.

https://www.zhihu.com/people/li-jia-cheng-63-47
