issues Search Results · repo:LLMBook-zh/LLMBook-zh.github.io language:Python
53 results (76 ms)
Image: https://github.com/user-attachments/assets/72768b15-f209-4950-b444-ebe1a4e84bae
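p. 91, Eq. (5.27): the p in the second term is missing its transpose.
p. 92: "and is multiplied with the corresponding queries and keys for fusion."
p. 145: "In terms of training, instruction tuning is fairly similar to pre-training; many settings, including the data organization format, can [adopt] the techniques used in the pre-training stage (see Chapters 4 and 6). This section mainly introduces training strategies specific to instruction tuning."
p. 146: "The optimizer settings (AdamW or Adafactor), training stabilization techniques (weight decay and gradient clipping), and training techniques (3D ...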
https://github.com/LLMBook-zh/LLMBook-zh.github.io/blob/main/slides/%E7%AC%AC%E4%BA%8C%E8%AF%BE%20%E6%A8%A1%E5%9E%8B%E6%9E%B6%E6%9E%84/2.3%20%E9%95%BF%E4%B8%8A%E4%B8%8B%E6%96%87%E6%A8%A1%E5%9E%8B%E5%92%8C%E6%96%B0%E5%9E%8B%E6%9E%B6%E6%9E%84.pdf ...
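Do you plan to release an English version? Thank you for your response.
On page 23, in the sentence "Furthermore, the improvement brought by chain-of-thought is more 明细 on the 540B-parameter PaLM model," the final word "明细" (itemized) should be "明显" (evident).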
