Danield21/Dual-Policy-Preference-Optimization

Dual-Policy-Preference-Optimization

The codebase for the preprint "Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model".
