levalencia/README.md

Luis Valencia

Microsoft MVP AI (2015-2025) | Author: "Mastering Scikit-Learn and PyTorch" | Senior AI Engineer | Multi-Agent Systems



🔥 Currently Working On

🧠 Production AI Systems (Professional Work)

| Project Type | Business Problem Solved | Tech Stack |
| --- | --- | --- |
| Hybrid Redaction Engine | Privacy compliance: automatically detects and redacts PII and legal and financial entities from documents with 99%+ accuracy, combining regex with LLM reasoning | Azure OpenAI, Regex, Docker, Azure DevOps, CI/CD |
| Agentic RAG Orchestrator | Enterprise knowledge access: an autonomous multi-agent system that retrieves, ranks, and synthesizes answers from private knowledge bases | OpenAI Agents SDK, FastAPI, Azure AI Search, Redis, Semver |
| Multi-Agent PDF Pipeline | Data extraction from complex documents: transforms unstructured biotech PDFs (clinical trials, regulatory filings) into structured datasets for analysis | Azure Document Intelligence, Azure OpenAI, Databricks, Docker |
| Editorial AI Platform | Publishing automation: real-time document redaction for editorial teams with sub-second responsiveness | SvelteKit, FastAPI, Docker, Azure DevOps, CI/CD |
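The regex-plus-LLM combination described for the Hybrid Redaction Engine can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the production design: the two patterns, the `Span` tuple, and the `llm_detector` callback (a stand-in for whatever Azure OpenAI call the real system makes) are all hypothetical.

```python
import re
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end, label)

# Illustrative patterns only; a real rule set would be far broader.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def regex_spans(text: str) -> List[Span]:
    """Find entities that deterministic patterns can catch."""
    return [
        (m.start(), m.end(), label)
        for label, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Sort spans and drop any that overlap an earlier kept span."""
    merged: List[Span] = []
    for span in sorted(spans):
        if not merged or span[0] >= merged[-1][1]:
            merged.append(span)
    return merged


def redact(text: str,
           llm_detector: Optional[Callable[[str], List[Span]]] = None) -> str:
    """Replace detected spans with [LABEL] placeholders.

    llm_detector is a hypothetical hook: any callable that returns extra
    spans (for example, entities a language model flagged).
    """
    spans = list(regex_spans(text))
    if llm_detector is not None:
        spans.extend(llm_detector(text))
    pieces, cursor = [], 0
    for start, end, label in merge_spans(spans):
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

Running the cheap regex pass first and treating the model as an optional second detector keeps the deterministic rules testable on their own.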

Engineering Practices Across All Projects:

  • ✅ CI/CD Pipelines (Azure DevOps)
  • ✅ Semantic Versioning with automated git tagging
  • ✅ Docker containerization (ACR deployment)
  • ✅ Infrastructure-as-Code
  • ✅ Clean Architecture (Domain-first, SOLID principles)
  • ✅ OpenTelemetry observability
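As a rough illustration of semantic versioning with automated tagging, the sketch below bumps a `MAJOR.MINOR.PATCH` version based on a commit message and builds the matching `git tag` command. The commit-message convention (a `!` marker or `BREAKING CHANGE` for major, `feat` for minor, anything else for patch) and the function names are assumptions; the actual pipelines may use different rules.

```python
import re
from typing import List

SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def bump(version: str, part: str) -> str:
    """Return the next version after bumping 'major', 'minor', or 'patch'."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


def part_from_commit(message: str) -> str:
    """Map a commit message to a bump level (assumed convention)."""
    header = message.splitlines()[0] if message else ""
    if "BREAKING CHANGE" in message or re.match(r"^\w+(\([^)]*\))?!:", header):
        return "major"
    if header.startswith("feat"):
        return "minor"
    return "patch"


def tag_command(version: str) -> List[str]:
    """Build (but do not run) an annotated git tag command."""
    return ["git", "tag", "-a", f"v{version}", "-m", f"Release v{version}"]
```

A CI step could pass the result of `tag_command` to `subprocess.run` after a successful build; returning the command instead of executing it keeps the logic easy to test.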

🛠️ Personal Projects

| Project | Problem Solved | Tech |
| --- | --- | --- |
| HarmoniqHub | Music organization for DJs: manual playlist building is time-consuming, track ordering is inconsistent, and there are no intelligent suggestions. Solves: AI-powered playlist generation, automatic track ordering (energy/key compatibility), wave visualization, set management, duplicate detection via acoustic fingerprinting | SwiftUI, SwiftData, CoreML, Chromaprint, Azure Table Storage |
| SuperTradingGodMode | Strategy optimization is to trading what hyperparameter tuning is to ML: manual backtesting is slow and prone to overfitting. Solves: Parameterized strategy definition, automated walk-forward optimization, anti-lookahead validation, IS/OOS regime detection | React, TypeScript, FastAPI, Redis, RQ, Parquet, Pytest, Docker |
| apple-silicon-llm-stack | Running LLaMA/Mistral locally on a Mac is slow and context windows are limited. Solves: Custom Metal GPU shaders for 8x speedup, Q4 quantization for 70B models in 24GB RAM, LoRA fine-tuning via MLX | Python, MLX, C++, Metal, Go, SvelteKit |
| DidListen | Tracking what you listen to during the day is hard because existing apps don't capture speaker context. Solves: Real-time speech-to-text, speaker identification, turn detection for meeting notes, voice activity detection | Swift 6, WhisperKit, ShazamKit, Clean Architecture |

🛠️ Tech Stack

Python | PyTorch | MLX | LangChain | HuggingFace | vLLM | Go | React | TypeScript | FastAPI | Swift | CoreML | Azure ML | Redis


📂 Featured Projects

🏆 apple-silicon-llm-stack

Problem: Cloud LLM inference is expensive (billed per hour), local inference on a Mac is slow, context windows are limited, and 70B models normally require expensive GPUs. Solution: Hardware/software co-design for Apple Silicon that runs 70B models in 24GB RAM through Q4 quantization and custom GPU kernels.

Technical Implementation

  • Custom Metal Shaders: CUDA-equivalent GPU compute kernels for 8x inference speedup
  • Q4 Quantization: compresses 70B model to fit in 24GB unified memory
  • LoRA/QLoRA Fine-tuning: 99% memory reduction via MLX
  • Go API Gateway: sub-millisecond latency with CGO zero-copy bridge
  • SvelteKit Telemetry UI: real-time SSE streaming dashboard
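To make the Q4 idea concrete, here is a minimal pure-Python sketch of block-wise symmetric 4-bit quantization: each block of weights shares one floating-point scale, and each weight is stored as a 4-bit integer, two per byte. The block size of 32 and the exact scheme are assumptions for illustration; the project's actual quantization format is not described in this README.

```python
from typing import List, Tuple

Block = Tuple[float, bytes, int]  # (scale, packed nibbles, weight count)


def quantize_q4(weights: List[float], block: int = 32) -> List[Block]:
    """Quantize weights to 4 bits each, with one scale per block."""
    blocks: List[Block] = []
    for i in range(0, len(weights), block):
        chunk = weights[i:i + block]
        max_abs = max(abs(w) for w in chunk)
        scale = max_abs / 7 if max_abs else 1.0
        # Map each weight to an integer in [-7, 7], then shift by 8 so it
        # fits in an unsigned 4-bit nibble.
        q = [max(-7, min(7, round(w / scale))) + 8 for w in chunk]
        if len(q) % 2:
            q.append(8)  # pad with the encoding of zero
        packed = bytes((q[j] << 4) | q[j + 1] for j in range(0, len(q), 2))
        blocks.append((scale, packed, len(chunk)))
    return blocks


def dequantize_q4(blocks: List[Block]) -> List[float]:
    """Recover approximate weights from the packed representation."""
    out: List[float] = []
    for scale, packed, count in blocks:
        values: List[float] = []
        for byte in packed:
            values.append(((byte >> 4) - 8) * scale)
            values.append(((byte & 0x0F) - 8) * scale)
        out.extend(values[:count])
    return out
```

The reconstruction error per weight is bounded by half the block's scale, which is why smaller blocks trade extra scale storage for better accuracy.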

🧠 HarmoniqHub (macOS App)

Problem: DJs waste hours manually organizing playlists, with inconsistent track ordering, no intelligent suggestions, and no way to visualize energy or waveforms for a set. Solution: AI-powered playlist generation with energy/key compatibility, automatic track ordering, and wave visualization.

Technical Implementation

  • AI Playlist Generation: intelligent curation based on energy, key (Camelot wheel), and mood
  • CoreML Classification: genre, mood, theme detection trained on 500K+ tracks
  • Wave Visualization: real-time audio waveform display for set planning
  • ShazamKit + Chromaprint: acoustic fingerprinting for duplicate detection
  • Meta-tagging Automation: automatic album/artist/label cleaning
  • Azure Table Storage: <100ms cache response times
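The energy/key ordering mentioned above can be sketched with the commonly cited Camelot wheel rule: two keys mix well if they share a code, sit one step apart on the wheel with the same letter, or share a number across the A/B letters. The greedy `order_set` function and the `(title, camelot_code, energy)` track tuples are hypothetical, shown here in Python rather than the app's Swift.

```python
from typing import List, Tuple

Track = Tuple[str, str, float]  # (title, Camelot code such as "8A", energy 0..1)


def parse_camelot(code: str) -> Tuple[int, str]:
    """Split a Camelot code like '12B' into (12, 'B')."""
    number, letter = int(code[:-1]), code[-1].upper()
    if not 1 <= number <= 12 or letter not in ("A", "B"):
        raise ValueError(f"invalid Camelot code: {code!r}")
    return number, letter


def compatible(a: str, b: str) -> bool:
    """Same code, adjacent number with same letter, or same number A<->B."""
    num_a, let_a = parse_camelot(a)
    num_b, let_b = parse_camelot(b)
    if let_a == let_b:
        return (num_a - num_b) % 12 in (0, 1, 11)
    return num_a == num_b


def order_set(tracks: List[Track]) -> List[Track]:
    """Greedy ordering: start at the lowest energy, then repeatedly pick a
    key-compatible track with the closest energy (any track if none fits)."""
    remaining = sorted(tracks, key=lambda t: t[2])
    ordered = [remaining.pop(0)] if remaining else []
    while remaining:
        current = ordered[-1]
        best = min(
            remaining,
            key=lambda t: (not compatible(current[1], t[1]),
                           abs(t[2] - current[2])),
        )
        remaining.remove(best)
        ordered.append(best)
    return ordered
```

A greedy pass is simple and predictable; a real playlist generator would likely weigh mood and tempo as well.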

⚑ SuperTradingGodMode

Problem: Manual backtesting of trading strategies is slow, prone to overfitting, and lacks proper in-sample/out-of-sample (IS/OOS) validation. Strategy optimization is to trading what hyperparameter tuning is to ML. Solution: Parameterized strategy definition with automated walk-forward optimization and anti-lookahead validation.

Technical Implementation

  • Frontend: React · TypeScript · Vite · lightweight-charts · TanStack Query · Zustand
  • Backend: FastAPI · Pydantic v2 · SQLAlchemy
  • Data: Parquet · Redis · RQ worker
  • Infrastructure: Docker Compose · Pytest · Vitest
  • Architecture: Clean Architecture (Domain-first, SOLID principles)
  • Validation: Anti-lookahead backtesting, walk-forward IS/OOS sweep mode, 36+ passing tests
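Walk-forward IS/OOS validation can be sketched as follows: each fold tunes parameters on an in-sample window and evaluates them on the window that immediately follows, so the optimizer never sees future bars. The function names, the `score` callback, and the window sizes are assumptions for illustration and are not taken from the project's code.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple


def walk_forward_splits(n_bars: int, is_size: int, oos_size: int,
                        step: Optional[int] = None) -> Iterator[Tuple[range, range]]:
    """Yield (in_sample, out_of_sample) index ranges that roll forward.

    Every out-of-sample range starts right after its in-sample range,
    which is what prevents lookahead.
    """
    step = oos_size if step is None else step
    if min(is_size, oos_size, step) <= 0:
        raise ValueError("window sizes and step must be positive")
    start = 0
    while start + is_size + oos_size <= n_bars:
        split = start + is_size
        yield range(start, split), range(split, split + oos_size)
        start += step


def walk_forward(prices: Sequence[float], candidates: Sequence[float],
                 score: Callable[[Sequence[float], float], float],
                 is_size: int, oos_size: int) -> List[Tuple[float, float]]:
    """For each fold, pick the best candidate in-sample and report its
    out-of-sample score."""
    results = []
    for is_idx, oos_idx in walk_forward_splits(len(prices), is_size, oos_size):
        is_data = prices[is_idx.start:is_idx.stop]
        oos_data = prices[oos_idx.start:oos_idx.stop]
        best = max(candidates, key=lambda p: score(is_data, p))
        results.append((best, score(oos_data, best)))
    return results
```

Comparing the in-sample winner's out-of-sample score across folds is one simple way to spot parameters that only fit a single regime.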

🎧 DidListen (iOS App)

Problem: Existing "what you listened to" apps capture only audio, not speaker context: who spoke, when, and what they said. Solution: Real-time speaker identification with turn detection, built on a hybrid STT pipeline on device.

Technical Implementation

  • Speech-to-Text Pipeline (3 backends, switchable at runtime):

    • Local: Whisper (Tiny 39MB / Base 150MB / Medium 500MB via WhisperKit)
    • Cloud: Azure AI Speech API
    • On-device: Apple SFSpeechRecognizer
  • Voice AI Features:

    • VAD: RMS energy threshold with state machine
    • Turn Detection: SILENCE ↔ SPEAKING transition detection
    • Speaker Recognition: 256-dim deterministic embeddings + cosine matching
    • Pre-roll Buffer: 1-second circular buffer captures utterance start
  • Architecture:

    • Swift 6 Strict Concurrency (async/await, @MainActor, Sendable)
    • Clean Architecture (Domain/Data/Presentation layers)
    • MVVM + ObservableObject pattern
    • SwiftData persistence
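The VAD, turn detection, and pre-roll ideas listed under Voice AI Features can be combined in one small state machine. This is a Python sketch rather than the app's Swift code, and the threshold, hangover count, and frame-based pre-roll length are assumed values (the README specifies a 1-second buffer but not the frame size).

```python
import math
from collections import deque
from enum import Enum
from typing import List, Optional, Sequence


class State(Enum):
    SILENCE = "silence"
    SPEAKING = "speaking"


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class TurnDetector:
    """SILENCE <-> SPEAKING state machine with a circular pre-roll buffer."""

    def __init__(self, threshold: float = 0.02, hangover_frames: int = 3,
                 preroll_frames: int = 5) -> None:
        self.threshold = threshold
        self.hangover_frames = hangover_frames
        self.preroll: deque = deque(maxlen=preroll_frames)  # oldest frames drop off
        self.state = State.SILENCE
        self.utterance: List[Sequence[float]] = []
        self.quiet_run = 0

    def push(self, frame: Sequence[float]) -> Optional[List[Sequence[float]]]:
        """Feed one frame; return the finished utterance when a turn ends."""
        loud = rms(frame) >= self.threshold
        if self.state is State.SILENCE:
            if loud:
                # Prepend buffered frames so the utterance start is not clipped.
                self.utterance = list(self.preroll) + [frame]
                self.preroll.clear()
                self.quiet_run = 0
                self.state = State.SPEAKING
            else:
                self.preroll.append(frame)
            return None
        self.utterance.append(frame)
        self.quiet_run = 0 if loud else self.quiet_run + 1
        if self.quiet_run >= self.hangover_frames:
            self.state = State.SILENCE
            finished, self.utterance = self.utterance, []
            return finished
        return None
```

The hangover counter keeps short pauses inside a sentence from splitting one turn into several.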

🧩 agent-god-mode

Problem: Loading 2,300+ AI skills into an agent causes "Death by a Thousand Skills": 30K+ tokens consumed before you type anything, skyrocketing API costs, and confusing responses. Solution: Local RAG with just-in-time skill injection. The agent searches the vault, retrieves only what it needs, and saves 30K+ tokens per session.

Technical Implementation

  • Local Embeddings: @xenova/transformers (CPU-only, no API keys required)
  • RAG Architecture: Isolated skill vault hidden from agent's default prompt
  • Search: Background Node.js worker calculates cosine similarity outside sandbox
  • Just-in-Time Injection: Agent searches → gets top 3 matches → reads only relevant SKILL.md
  • Integration: OpenCode and Claude Code compatible
  • Scale: 2,300+ curated skills (Azure, ML, DevOps, etc.)
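The retrieval step can be sketched as a cosine-similarity ranking that returns only the top 3 skills. The project itself runs this in a Node.js worker with @xenova/transformers embeddings; the Python below only illustrates the ranking, and the skill names and three-dimensional vectors are toy stand-ins for real embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_skills(query: Sequence[float],
               vault: Dict[str, Sequence[float]],
               k: int = 3) -> List[Tuple[str, float]]:
    """Rank every skill in the vault against the query and keep the best k."""
    scored = [(name, cosine(query, vector)) for name, vector in vault.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

Only the names returned by `top_skills` would then have their SKILL.md files read, which is what keeps the rest of the vault out of the prompt.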

🔍 DuplicateFinder

Problem: Simple filename matching misses true duplicates: renamed tracks, resized images, and recompressed files go undetected. Solution: Multi-modal AI fingerprinting that detects acoustic and visual similarity beyond filenames.

Technical Implementation

  • Audio: Chromaprint acoustic fingerprinting (matches even with different filenames/bitrates)
  • Images:
    • pHash: perceptual hashing for near-identical images
    • CLIP (ViT-L/14): vision-language model for semantic similarity ("this photo looks like that one")
  • Vector Search: FAISS for billion-scale similarity in milliseconds
  • UI: Streamlit interactive dashboard with side-by-side preview
  • Safety: One-click undo, trash manager, exportable reports
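To show how perceptual hashing finds near-identical images, the sketch below uses average hash (aHash), a simpler relative of the DCT-based pHash the project names, and compares hashes by Hamming distance. It is a stand-in, not the project's implementation: the input is assumed to be an already downscaled grayscale grid, and the distance threshold of 5 bits is an arbitrary illustrative value.

```python
from typing import List, Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Build a bit string where each bit says whether a pixel is brighter
    than the grid's mean (compared in integers to avoid float rounding)."""
    flat: List[int] = [p for row in pixels for p in row]
    total, count = sum(flat), len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p * count > total else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def near_duplicate(a: Sequence[Sequence[int]], b: Sequence[Sequence[int]],
                   max_distance: int = 5) -> bool:
    """Treat two grids as duplicates if their hashes differ in few bits."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance
```

Because the hash depends on brightness relative to the mean, a uniformly brightened copy of an image produces the same hash, while an inverted copy flips every bit.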

📫 Contact

📝 Blog: medium.com/@luisevalencia
💼 Career: linkedin.com/in/levalencia
