DolbyUUU

Pinned

  1. DeepEnlighten (Public)

    Pure RL post-training of base models for social reasoning. A lightweight replication of DeepSeek-R1-Zero on the Social IQa dataset.

    Python 38

  2. Logic-RL-Lite (Public)

    A lightweight replication study of DeepSeek-R1-Zero. Findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".

    Python 49

  3. Awesome-LLM-Interview-Questions-and-Answers (Public)

    Interviews for LLM algorithm engineers and LLM agent development engineers: common questions and answers.

    17 1