noiji

Pinned

  1. hiyouga/LLaMA-Factory (Public)

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    Python · 59.7k stars · 7.3k forks

  2. vllm-project/vllm (Public)

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 59.4k stars · 10.5k forks

  3. lm-sys/FastChat (Public)

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

    Python · 39.1k stars · 4.8k forks

  4. BerriAI/litellm (Public)

    Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

    Python · 29.5k stars · 4.3k forks

  5. NVIDIA/TensorRT-LLM (Public)

    TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

    C++ · 11.8k stars · 1.8k forks

  6. microsoft/LLMLingua (Public)

    [EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

    Python · 5.5k stars · 326 forks