The codebase for the preprint "Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model"
Danield21/Dual-Policy-Preference-Optimization