Vinkle-hzt/README.md

[GitHub stats card]

  • 🔭 I’m currently working on LLM inference architecture development
  • 🌱 Specializing in multi-GPU parallelism and model acceleration
  • 💻 Proficient in C++, Python, and CUDA programming
  • 🚀 Experienced in optimizing large-scale model inference
  • 🧠 Supporting various model architectures for efficient deployment
  • 💡 Passionate about pushing the boundaries of AI efficiency
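The multi-GPU parallelism mentioned above can be illustrated with a toy sketch of tensor parallelism, where a weight matrix is split column-wise across devices so each computes a partial result in parallel. This is purely illustrative (plain-Python lists standing in for GPU shards, not code from rtp-llm):

```python
def matmul(a, b):
    # Naive dense matmul over nested lists.
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

def split_columns(w, num_devices):
    # Column-wise shard of the weight matrix: one shard per "device".
    chunk = (len(w[0]) + num_devices - 1) // num_devices
    return [[row[d * chunk:(d + 1) * chunk] for row in w]
            for d in range(num_devices)]

def tensor_parallel_matmul(x, w, num_devices=2):
    # Each device multiplies x by its own column shard of w...
    partials = [matmul(x, shard) for shard in split_columns(w, num_devices)]
    # ...then the partial outputs are concatenated column-wise (an all-gather
    # in a real multi-GPU setup).
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[5, 6, 7], [8, 9, 10]]
assert tensor_parallel_matmul(x, w) == matmul(x, w)
```

Splitting columns means no communication is needed until the final gather, which is why large linear layers in LLM inference parallelize this way.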

Pinned

  1. alibaba/rtp-llm — RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications. (CUDA, 1.1k stars, 179 forks)