notwitcheer/README.md

AI Practitioner & Data-Driven Growth Specialist

building local LLM infrastructure, benchmarking models, publishing results

Twitter HuggingFace Telegram



What I Do

I build local LLM inference stacks from source on consumer hardware, benchmark models systematically, and publish datasets on HuggingFace. I also build analytics dashboards and have scaled a tech community to 20,000+ members.

Current Focus:

  • local inference optimisation (llama.cpp, CUDA, etc.)
  • systematic benchmarks across dense, MoE, and hybrid architectures
  • quantisation testing (GGUF Q4_K_M, IQ4_XS, turboquant turbo2/turbo3)
  • context window scaling analysis and VRAM profiling
  • publishing benchmark datasets on HuggingFace

πŸ› οΈ Tech Stack

AI / ML

CUDA llama.cpp HuggingFace Python

data & analytics

Dune Analytics SQL

frontend

React Next.js TypeScript

infra & tools

Linux Bash Git


📊 Background

  • AI / ML Practitioner - local LLM inference, model evaluation, HuggingFace contributor
  • Growth Lead @ Yari Finance - DeFi protocol growth, partnerships, on-chain analytics
  • Founder @ BeraLand - built a 20K+ member blockchain community from zero
  • 15+ Dune dashboards tracking $1B+ in trading volume
  • Master's in Corporate & Market Finance - KPMG background

🎓 Learning Journey

Boot.dev Profile


I write about AI infrastructure, local inference, and model evaluation on 𝕏

Pinned

  1. llm-bench-rig (Public)

    Dual-engine (llama.cpp + vLLM) LLM benchmarking pipeline for GGUF & safetensors on NVIDIA GPUs: speed, quality, live dashboard, publishable cards.

    Python, 21 stars, 2 forks

  2. hermes-recipes (Public)

    tested hermes agent recipes: configs, deploys, mcp, automations. copy, run, build.

    Shell, 98 stars, 6 forks