Keith Truongcao’s Post

REDUCE ONLINE DPO MEMORY CONSUMPTION WITH UNSLOTH QLORA (OPEN SOURCE)

My open source contributions, along with details on how you can reduce VRAM usage for Online DPO fine-tuning, are in my Substack blog post linked below. (All the code is in the post!) I believe more official Online RLHF support for Unsloth will be coming out soon. Special thanks to the Unsloth AI team (Daniel Han and Michael Han (Unsloth)), Edward Kim, and Costa Huang for helping me bring this project together!

Blog: https://lnkd.in/eCej2J5q

this is huge for the employed

Handaru Sakti

Building Primebone | Disciplined Forecasting for Fixed Income Investments

10mo

Such an imbalanced y_train 😂


Very informative, Keith!


Very interesting VRAM optimization with Unsloth. Great approach!


Stellar work, Keith! Props to you and the Unsloth AI team for your collective perseverance.
