Stars
⏰ 🔥 A TCP proxy to simulate network and system conditions for chaos and resiliency testing
rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
DeepEP: an efficient expert-parallel communication library
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Synthesizer for optimal collective communication algorithms
Ongoing research training transformer models at scale
A large-scale simulation framework for LLM inference
liangyuRain / Nanoflow
Forked from efeslab/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4
An implementation of a deep learning recommendation model (DLRM)
Official inference library for Mistral models
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
The official Python library for the OpenAI API
kunpengcompute / hyperscan
Forked from intel/hyperscan
A high-performance regular expression matching library
Fast and accurate DRAM power and energy estimation tool
Artifact for the OSDI '23 paper: "Ensō: A Streaming Interface for NIC-Application Communication"
Mica is a web portal for epidemiological study consortia.


