A CLI tool for estimating GPU VRAM requirements for Hugging Face models, supporting various data types, parallelization strategies, and fine-tuning scenarios like LoRA.

gpu-memory vram huggingface pipeline-parallelism memory-estimation huggingface-models hugging-face-transformers huggingface-datasets vram-monitoring vram-calculator vram-memory-estimation

Updated Oct 22, 2025
Python

manishklach / ghostkv-lab

Research harness for evaluating query-time bounded elimination of reconstructable KV-cache witnesses in long-context transformer inference workloads. Related provisional filing: IN 202641062451.

transformer gpu-memory memory-systems kv-cache cxl long-context llm-inference transformer-memory ai-infrastructure flashattention transformer-optimization systems-research long-context-inference attention-optimization

Updated May 18, 2026
Python

manishklach / kv_deadline_scheduler

Deadline-aware KV-cache scheduling that protects decode-critical request state under long-context LLM inference pressure.

inference gpu-memory memory-management nvme hbm kv-cache memory-tiering cxl llm long-context vllm pagedattention ai-infrastructure systems-research

Updated Jun 19, 2026
Python

sina-masnadi / nvidia-mg

📊 A command-line monitoring tool (with graphs) for NVIDIA GPUs

terminal monitoring graph gpu cuda nvidia gpu-memory monitoring-tool gpu-utilization

Updated Apr 22, 2020
Python

junkyard22 / holster-memory

Tiered GPU memory architecture for consumer AI inference. VRAM as execution cache, system RAM as passive staging layer.

inference pytorch transformer gpu-memory memory-management offloading vram llm local-ai consumer-gpu vram-optimization

Updated Jun 11, 2026
Python

mnisperuza / hcgk-kernels

Hardware Control GateKeeper Kernels for AI inference within frameworks.

python machine-learning ai deep-learning gpu cuda inference pytorch gpu-memory memory-management system-monitor hardware-detection vram model-loading ml-ops hardware-validation ai-infrastructure resource-validation

Updated Dec 19, 2025
Python

JonSnow1807 / gradient-cache

GPU memory-efficient training for PyTorch - 90%+ memory savings through gradient compression

machine-learning deep-learning pytorch neural-networks gpu-memory memory-optimization gradient-compression training-optimization

Updated Aug 6, 2025
Python

alyssapowell / mlx-halo

Kernel panic prevention for MLX on Apple Silicon. Five pre-flight safety checks before model loading — because Metal doesn't warn you, it just reboots.

machine-learning metal gpu-memory safety mlx kernel-panic apple-silicon gpu-memory-calculator

Updated May 7, 2026
Python

Olajide-Badejo / PyTorch-Training-Inspector

Production-grade PyTorch training monitor. Wraps your loop in one context manager to track loss, gradients, LR, GPU memory and throughput, with real-time alerts for NaN, explosions, plateaus, and OOM.

deep-learning pytorch gpu-memory gradient-checking

Updated Apr 11, 2026
Python

gpu-memory

Here are 14 public repositories matching this topic...

parasj / checkmate

LiyuanLucasLiu / Torch-Scope

Lin-Mao / DrGPUM

eklitzke / tf-slice

obisin / dgls

joe0731 / hf_vram_calc

manishklach / ghostkv-lab

manishklach / kv_deadline_scheduler

sina-masnadi / nvidia-mg

junkyard22 / holster-memory

mnisperuza / hcgk-kernels

JonSnow1807 / gradient-cache

alyssapowell / mlx-halo

Olajide-Badejo / PyTorch-Training-Inspector
