  • DaoCloud
  • Shanghai

Sponsors

@nekomeowww

Organizations

@CrazyForCode @DaoCloud @istio @servicemesher @merbridge

An open-source, self-hosted AI model hub with Hugging Face compatibility, accelerating vLLM/SGLang performance.

Go 56 12 Updated Mar 2, 2026

🐹 Deep clean and optimize your Mac.

Shell 37,578 1,031 Updated Mar 2, 2026

A production-grade Feishu/Lark channel plugin for Moltbot (Clawdbot).

TypeScript 7 2 Updated Feb 3, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 244,324 47,238 Updated Mar 2, 2026

The open source coding agent.

TypeScript 113,795 11,455 Updated Mar 2, 2026

High-performance distributed data shuffling (all-to-all) library for MoE training and inference

Python 112 11 Updated Feb 28, 2026

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,703 170 Updated Feb 10, 2026

Performance-optimized AI inference on your GPUs. Unlock superior throughput by selecting and tuning engines like vLLM or SGLang.

Python 4,559 461 Updated Mar 2, 2026

NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments

Go 192 49 Updated Mar 2, 2026

A browser extension for insights into GitHub, Gitee projects and developers.

TypeScript 399 103 Updated Feb 28, 2026

A list of open source games.

Python 12,147 955 Updated Feb 25, 2026

AI-Powered Photos App for the Decentralized Web 🌈💎✨

Go 39,407 2,210 Updated Mar 1, 2026

ArcticInference: vLLM plugin for high-throughput, low-latency inference

Python 403 50 Updated Feb 24, 2026

🤗 A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.

Python 1,059 63 Updated Mar 2, 2026

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 9,772 977 Updated Feb 25, 2026

A framework for efficient model inference with omni-modality models

Python 2,863 470 Updated Mar 2, 2026

Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining.

Python 49 8 Updated Oct 29, 2025

Go 2 Updated Nov 27, 2025

An early research stage expert-parallel load balancer for MoE models based on linear programming.

Python 499 33 Updated Nov 19, 2025

slime is an LLM post-training framework for RL Scaling.

Python 4,508 585 Updated Mar 2, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,030 346 Updated Feb 27, 2026

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

TypeScript 22,482 2,245 Updated Feb 28, 2026

FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang/triton.

C++ 213 40 Updated Mar 1, 2026

Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

Python 797 87 Updated Feb 27, 2026

Offline optimization of your disaggregated Dynamo graph

Python 195 67 Updated Mar 2, 2026

Public repository for Agent Skills

Python 80,585 8,446 Updated Feb 25, 2026

developers.events is a community-driven platform listing developer/tech conferences and Calls for Papers (CFPs) worldwide with a list, a calendar and a map view. It helps organizers, speakers, spon…

JavaScript 1,939 493 Updated Mar 2, 2026

HPC tutorials covering collective communication (MPI, NCCL), CUDA programming, SIMD vectorization, RDMA communication, and more.

Cuda 261 25 Updated Feb 14, 2026

Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.

Python 428 30 Updated Oct 21, 2025