
Organizations

@thu-ml

jt-zhang/README.md

Hi 😊

I am a first-year PhD student in the CS Dept. at Tsinghua University, focusing on efficient training and inference of large models.

  • WeChat ID: Zjt_Tete

Pinned Loading

  1. thu-ml/SageAttention (Public)

    [ICLR 2025, ICML 2025, NeurIPS 2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.

    CUDA · 2.8k stars · 274 forks

  2. thu-ml/SpargeAttn (Public)

    [ICML 2025] SpargeAttention: a training-free sparse attention method that accelerates inference for any model.

    CUDA · 796 stars · 67 forks

  3. CardinalityEstimationTestbed (Public)

    Python · 49 stars · 14 forks

  4. Sparse_Attention_API (Public)

    Python · 64 stars · 7 forks

  5. attention-survey/Efficient_Attention_Survey (Public)

    A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention

    240 stars · 5 forks