Paper list of alexine
Here is a list of papers I keep updating as I read them.
I take notes following this method.
- [NeurIPS 2017] Attention Is All You Need, arXiv
- [NAACL 2019] BERT, arXiv
- [arXiv 2025.10] Less is More: Recursive Reasoning with Tiny Networks, arXiv
- [arXiv 2020.01] Scaling Laws for Neural Language Models, arXiv
- [SC 2017] Efficient Large-Scale Deep Learning on GPU Clusters Using Megatron’s Ring-AllReduce, arXiv
- [arXiv 2022.05] FlashAttention, arXiv
- [Blog 2025.03] Memory is Slow, Disk is Fast, blog
- [arXiv 2019.09] Megatron-LM, arXiv
- [arXiv 2022.09] Monolith, arXiv
- [arXiv 2025.10] The Art of Scaling Reinforcement Learning Compute for LLMs, arXiv
- [NeurIPS 2017] Deep RL from Human Preferences (RLHF), arXiv
- [Target 2025] Self-Rewarding Vision-Language Model via Reasoning Decomposition, arXiv
- [arXiv 2023.09] SeeClick: Harnessing Web Interfaces for Generalizable Reinforcement Learning Agents, arXiv
- [arXiv 2020.09] Learning to Summarize from Human Feedback, arXiv
- [arXiv 2024.04] RL Razor, arXiv
- [UIST 2023] Generative Agents: Interactive Simulacra of Human Behavior, arXiv