⏰ 🔥 A TCP proxy to simulate network and system conditions for chaos and resiliency testing

Go 11,694 482 Updated Dec 1, 2025
C 2 Updated Oct 24, 2025

rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.

C++ 129 40 Updated Dec 1, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,769 1,010 Updated Nov 25, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 10,083 1,684 Updated Dec 1, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 40,874 4,651 Updated Nov 26, 2025

Synthesizer for optimal collective communication algorithms

Python 121 27 Updated Apr 8, 2024

Ongoing research training transformer models at scale

Python 14,366 3,332 Updated Dec 1, 2025

A large-scale simulation framework for LLM inference

Python 488 91 Updated Jul 25, 2025
C++ 90 28 Updated Aug 27, 2025

NCCL Profiling Kit

Python 149 11 Updated Jul 1, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 1 Updated Nov 19, 2024

Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4

C 945 106 Updated Nov 10, 2025
Python 1,491 219 Updated Jun 26, 2025

An implementation of a deep learning recommendation model (DLRM)

Python 3,996 869 Updated Oct 2, 2025

Official inference library for Mistral models

Jupyter Notebook 10,555 992 Updated Nov 21, 2025
TeX 69 115 Updated Jan 11, 2022

TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches

Python 77 10 Updated Jul 25, 2023

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,270 1,904 Updated Dec 1, 2025

Verbs on DPDK

C 105 26 Updated Sep 5, 2022

Efficient RPCs for datacenter networks

C++ 888 145 Updated May 9, 2024

The official Python library for the OpenAI API

Python 29,395 4,442 Updated Dec 1, 2025

Yahoo! Cloud Serving Benchmark

Java 5,159 2,314 Updated Nov 10, 2025

A high-performance regular expression matching library

C++ 95 30 Updated Oct 12, 2025

Fast and accurate DRAM power and energy estimation tool

C++ 186 53 Updated Oct 6, 2025

Artifact for the OSDI '23 paper: "Ensō: A Streaming Interface for NIC-Application Communication"

C++ 4 1 Updated Jun 1, 2023

Mica is a web portal for epidemiological study consortia.

Java 11 15 Updated Nov 25, 2025

Intel® Performance Counter Monitor (Intel® PCM)

C++ 3,136 510 Updated Nov 14, 2025