Pinned

  1. OpenRLHF/OpenRLHF Public

    An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

    Python, 8.5k stars, 822 forks

  2. ToolAlpaca Public

    Forked from tangqiaoyu/ToolAlpaca

    ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases

    Python