SuperMarioYL/README.md

Hi, I'm Leo 👋


AI Infrastructure · Inference Acceleration · Cloud Native

Building high-performance AI systems · LLM optimization · Multimodal inference · Scalable ML infrastructure





🎯 Core Expertise

🏗️ AI Infrastructure

Building scalable ML systems and training pipelines

PyTorch · TensorFlow · CUDA · Triton

⚡ Inference Acceleration

Optimizing model serving and reducing latency

vLLM · TensorRT · ONNX · Quantization

☁️ Cloud Native

Deploying and orchestrating at scale

Kubernetes · Docker · Ray · Istio



🛠️ Tech Stack

Deep Learning Frameworks

Python · PyTorch · TensorFlow · JAX

Inference & Optimization

CUDA · TensorRT · ONNX Runtime · vLLM · Triton Inference Server

Cloud Native & DevOps

Kubernetes · Docker · Ray · Helm · ArgoCD · Prometheus

Languages & Tools

C++ · Go · Rust · Git · Linux

📊 View Detailed Language Statistics →


My Projects

| Project | Description |
| --- | --- |
| Bison | Enterprise GPU Resource Billing & Multi-Tenant Management Platform |

Technical Docs

| Docs | Description |
| --- | --- |
| Cloud Native Cookbook | Cloud Native Technical Deep Dive |
| Inference Cookbook | Inference Framework Deep Dive |


📈 GitHub Stats

GitHub Contribution Graph


💭 About Me

Passionate about pushing the boundaries of AI performance.
When not optimizing inference pipelines, you'll find me cycling, exploring photography, or traveling.



© 2025 Leo · Powered by passion for AI and open source

Pinned

  1. trouve

    trouve: a general-purpose, built-in component for Spring projects that integrates service discovery, service registration, and service forwarding

    Java · 30 stars · 9 forks

  2. Bison

    Enterprise GPU Resource Billing & Multi-Tenant Management Platform

    TypeScript · 6 stars