DaoCloud (Shanghai) · Stars
An open-source, self-hosted AI model hub with Hugging Face compatibility, accelerating vLLM/SGLang performance.
A production-grade Feishu/Lark channel plugin for Moltbot (Clawdbot).
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
High-performance distributed data shuffling (all-to-all) library for MoE training and inference
MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B
Performance-optimized AI inference on your GPUs. Unlock superior throughput by selecting and tuning engines like vLLM or SGLang.
NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments
A browser extension for insights into GitHub, Gitee projects and developers.
AI-Powered Photos App for the Decentralized Web 🌈💎✨
ArcticInference: vLLM plugin for high-throughput, low-latency inference
🤗 A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
A framework for efficient model inference with omni-modality models
Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining.
An early research stage expert-parallel load balancer for MoE models based on linear programming.
slime is an LLM post-training framework for RL Scaling.
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang/triton.
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
Offline optimization of your disaggregated Dynamo graph
developers.events is a community-driven platform listing developer/tech conferences and Calls for Papers (CFPs) worldwide with a list, a calendar and a map view. It helps organizers, speakers, spon…
An HPC tutorial covering collective communication (MPI, NCCL), CUDA programming, SIMD vectorization, RDMA communication, and more.
Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.
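Several entries above touch on MoE serving; the expert-parallel load balancer in particular frames placement as a linear program. As a minimal sketch of that general idea (the formulation, numbers, and function name here are illustrative assumptions, not that project's actual code), one can split each expert's token load fractionally across GPUs so that the busiest GPU's total load is minimized:

```python
# LP-based expert load balancing sketch: minimize the maximum per-GPU load t,
# subject to each expert's placement fractions summing to 1.
import numpy as np
from scipy.optimize import linprog

def balance_experts(loads, n_gpus):
    """Return (fractions, max_load): fractions[e, g] is the share of
    expert e's load placed on GPU g; max_load is the optimal value of t."""
    n = len(loads)
    n_vars = n * n_gpus + 1           # x[e, g] fractions, plus t
    c = np.zeros(n_vars)
    c[-1] = 1                         # objective: minimize t

    # Equality: each expert's fractions sum to 1.
    A_eq = np.zeros((n, n_vars))
    b_eq = np.ones(n)
    for e in range(n):
        A_eq[e, e * n_gpus:(e + 1) * n_gpus] = 1

    # Inequality: each GPU's weighted load must not exceed t.
    A_ub = np.zeros((n_gpus, n_vars))
    b_ub = np.zeros(n_gpus)
    for g in range(n_gpus):
        for e in range(n):
            A_ub[g, e * n_gpus + g] = loads[e]
        A_ub[g, -1] = -1              # ... minus t <= 0

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
    return res.x[:-1].reshape(n, n_gpus), res.fun

# Four experts with uneven token loads, two GPUs: the optimum splits the
# total load (11) evenly, so the busiest GPU carries 5.5.
fractions, max_load = balance_experts([5, 3, 2, 1], n_gpus=2)
```

A production balancer would add replication counts, integrality, and migration costs; the LP above only shows the core min-max placement objective.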