Training neural networks in TensorFlow 2.0 with 5x less memory
-
Updated
Feb 21, 2022 - Python
Training neural networks in TensorFlow 2.0 with 5x less memory
A Toolkit for Training, Tracking, Saving Models and Syncing Results
A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.
Demonstration of generating mini-batches in Tensorlfow from GPU memory.
Dynamic GPU Layer Swapping: Train large models on consumer GPUs with intelligent memory management
A CLI tool for estimating GPU VRAM requirements for Hugging Face models, supporting various data types, parallelization strategies, and fine-tuning scenarios like LoRA.
Research harness for evaluating query-time bounded elimination of reconstructable KV-cache witnesses in long-context transformer inference workloads. Related provisional filing: IN 202641062451.
Deadline-aware KV-cache scheduling for protecting decode-critical request-state under long-context LLM inference pressure.
📊 A command line monitoring tool (graph) for NVIDIA GPUs
Tiered GPU memory architecture for consumer AI inference. VRAM as execution cache, system RAM as passive staging layer.
Hardware Control GateKeeper Kernels for AI inference within frameworks.
GPU memory-efficient training for PyTorch - 90%+ memory savings through gradient compression
Kernel panic prevention for MLX on Apple Silicon. Five pre-flight safety checks before model loading — because Metal doesn't warn you, it just reboots.
Production-grade PyTorch training monitor. Wraps your loop in one context manager to track loss, gradients, LR, GPU memory and throughput, with real-time alerts for NaN, explosions, plateaus, and OOM.
Add a description, image, and links to the gpu-memory topic page so that developers can more easily learn about it.
To associate your repository with the gpu-memory topic, visit your repo's landing page and select "manage topics."