Yiming992
🎯
Focusing

Pinned

  1. DeepEP Public

    Forked from deepseek-ai/DeepEP

    DeepEP: an efficient expert-parallel communication library

    Cuda

  2. DeepGEMM Public

    Forked from deepseek-ai/DeepGEMM

    DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

    Cuda

  3. sglang Public

    Forked from sgl-project/sglang

    SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

    Python

  4. cutlass Public

    Forked from NVIDIA/cutlass

    CUDA Templates for Linear Algebra Subroutines

    C++

  5. nanochat Public

    Forked from karpathy/nanochat

    The best ChatGPT that $100 can buy.

    Python

  6. SageAttention Public

    Forked from thu-ml/SageAttention

    Quantized attention that achieves speedups of 2.1-3.1x over FlashAttention2 and 2.7-5.1x over xformers without losing end-to-end metrics across various models.

    Cuda