Microsoft MVP AI (2015-2025) | Author: "Mastering Scikit-Learn and PyTorch" | Senior AI Engineer | Multi-Agent Systems
| Project Type | Business Problem Solved | Tech Stack |
|---|---|---|
| Hybrid Redaction Engine | Privacy compliance - automatically detect and redact PII, legal, financial entities from documents with 99%+ accuracy using combined Regex + LLM reasoning | Azure OpenAI, Regex, Docker, Azure DevOps, CI/CD |
| Agentic RAG Orchestrator | Enterprise knowledge access - autonomous multi-agent system that retrieves, ranks, and synthesizes answers from private knowledge bases | OpenAI Agents SDK, FastAPI, Azure AI Search, Redis, Semver |
| Multi-Agent PDF Pipeline | Data extraction from complex documents - transforms unstructured biotech PDFs (clinical trials, regulatory filings) into structured datasets for analysis | Azure Document Intelligence, Azure OpenAI, Databricks, Docker |
| Editorial AI Platform | Publishing automation - real-time document redaction for editorial teams with sub-second responsiveness | SvelteKit, FastAPI, Docker, Azure DevOps, CI/CD |
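
The "combined Regex + LLM reasoning" idea in the Hybrid Redaction Engine row can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two regex patterns, the `Span` tuple, and the `llm_detector` callable (a hypothetical stand-in for a model call that returns extra spans) are not taken from the actual project.

```python
import re
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end, label); hypothetical shape

# Illustrative patterns only; a production rule set would be much larger.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def regex_spans(text: str) -> List[Span]:
    """Cheap deterministic pass: collect every pattern match as a span."""
    return [
        (m.start(), m.end(), label)
        for label, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]


def redact(text: str,
           llm_detector: Optional[Callable[[str], List[Span]]] = None) -> str:
    """Merge regex spans with spans from an optional second detector."""
    spans = regex_spans(text)
    if llm_detector is not None:
        spans += llm_detector(text)  # assumed to return (start, end, label)
    out, cursor = [], 0
    for start, end, label in sorted(spans):
        if start < cursor:  # overlaps a span that was already redacted
            continue
        out.append(text[cursor:start])
        out.append(f"[{label}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

Calling `redact("Mail jane@example.com, SSN 123-45-6789.")` replaces both matches with bracketed labels. Passing a detector lets a second pass flag entities, such as names, that no regex would catch.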
Engineering Practices Across All Projects:
- CI/CD Pipelines (Azure DevOps)
- Semantic Versioning with automated git tagging
- Docker containerization (ACR deployment)
- Infrastructure-as-Code
- Clean Architecture (Domain-first, SOLID principles)
- OpenTelemetry observability
| Project | Problem Solved | Tech |
|---|---|---|
| HarmoniqHub | Music organization for DJs: manual playlist building is time-consuming, inconsistent track ordering, no intelligent suggestions. Solves: AI-powered playlist generation, automatic track ordering (energy/key compatibility), wave visualization, set management, duplicate detection via acoustic fingerprinting | SwiftUI, SwiftData, CoreML, Chromaprint, Azure Table Storage |
| SuperTradingGodMode | Strategy optimization is to trading what hyperparameter tuning is to ML, yet manual backtesting is slow and prone to overfitting. Solves: Parameterized strategy definition, automated walk-forward optimization, anti-lookahead validation, IS/OOS regime detection | React, TypeScript, FastAPI, Redis, RQ, Parquet, Pytest, Docker |
| apple-silicon-llm-stack | Running LLaMA/Mistral locally on a Mac is slow and context windows are limited. Solves: Custom Metal GPU shaders for 8x speedup, Q4 quantization for 70B models in 24GB RAM, LoRA fine-tuning via MLX | Python, MLX, C++, Metal, Go, SvelteKit |
| DidListen | Existing apps that track what you listen to during the day don't capture speaker context. Solves: Real-time speech-to-text, speaker identification, turn detection for meeting notes, voice activity detection | Swift 6, WhisperKit, ShazamKit, Clean Architecture |
Python | PyTorch | MLX | LangChain | HuggingFace | vLLM | Go | React | TypeScript | FastAPI | Swift | CoreML | Azure ML | Redis
Problem: Cloud LLM inference is expensive (billed per hour), running locally on a Mac is slow, context windows are limited, and 70B models require expensive GPUs. Solution: Hardware/software co-design for Apple Silicon that runs 70B models in 24GB RAM through aggressive optimization.
Technical Implementation
- Custom Metal Shaders: CUDA-equivalent GPU compute kernels for 8x inference speedup
- Q4 Quantization: compresses 70B model to fit in 24GB unified memory
- LoRA/QLoRA Fine-tuning: 99% memory reduction via MLX
- Go API Gateway: sub-millisecond latency with CGO zero-copy bridge
- SvelteKit Telemetry UI: real-time SSE streaming dashboard
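
To show what 4-bit quantization does to a weight tensor, here is a minimal pure-Python sketch of symmetric group-wise quantization. It is not the project's Metal or MLX code: the group size of 32, the signed range [-8, 7], and the function names are assumptions chosen for illustration.

```python
from typing import List, Tuple

Group = Tuple[float, List[int]]  # (scale, 4-bit integer codes)


def quantize_q4(weights: List[float], group_size: int = 32) -> List[Group]:
    """Store each group as one float scale plus integers in [-8, 7]."""
    groups = []
    for i in range(0, len(weights), group_size):
        chunk = weights[i:i + group_size]
        max_abs = max(abs(w) for w in chunk)
        # Map the largest magnitude in the group onto the code 7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        codes = [max(-8, min(7, round(w / scale))) for w in chunk]
        groups.append((scale, codes))
    return groups


def dequantize_q4(groups: List[Group]) -> List[float]:
    """Recover approximate weights: code * scale for every element."""
    return [scale * code for scale, codes in groups for code in codes]
```

Each weight then costs 4 bits plus its share of the per-group scale, and the round-trip error per element stays within half a quantization step (`scale / 2`).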
Problem: DJs waste hours manually organizing playlists, track ordering is inconsistent, there are no intelligent suggestions, and there is no way to visualize energy or waveforms for a set. Solution: AI-powered playlist generation with energy/key compatibility, automatic track ordering, and wave visualization.
Technical Implementation
- AI Playlist Generation: intelligent curation based on energy, key (Camelot wheel), and mood
- CoreML Classification: genre, mood, theme detection trained on 500K+ tracks
- Wave Visualization: real-time audio waveform display for set planning
- ShazamKit + Chromaprint: acoustic fingerprinting for duplicate detection
- Meta-tagging Automation: automatic album/artist/label cleaning
- Azure Table Storage: <100ms cache response times
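
The key-compatibility part of playlist generation can be sketched with the commonly cited Camelot wheel rules: two keys mix well if they share a number (same key or relative major/minor) or share a letter and sit one step apart on the 12-position wheel. The function names are hypothetical, and the app's actual scoring, which also weighs energy and mood, is not shown.

```python
from typing import Tuple


def parse_camelot(code: str) -> Tuple[int, str]:
    """Split a code such as '8A' into (8, 'A'), validating both parts."""
    number, letter = int(code[:-1]), code[-1].upper()
    if not 1 <= number <= 12 or letter not in ("A", "B"):
        raise ValueError(f"not a Camelot code: {code!r}")
    return number, letter


def is_compatible(a: str, b: str) -> bool:
    """Apply the basic harmonic-mixing rules of the Camelot wheel."""
    num_a, let_a = parse_camelot(a)
    num_b, let_b = parse_camelot(b)
    if num_a == num_b:
        return True  # same key, or relative major/minor (8A and 8B)
    step = (num_a - num_b) % 12
    # One position clockwise or counter-clockwise, same ring (12 wraps to 1).
    return let_a == let_b and step in (1, 11)
```

For example, `is_compatible("12B", "1B")` holds because the wheel wraps around, while `is_compatible("8A", "10A")` does not.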
Problem: Manual trading backtesting is slow, prone to overfitting, and lacks proper in-sample/out-of-sample (IS/OOS) validation. Strategy optimization plays the same role in trading that hyperparameter tuning plays in ML. Solution: Parameterized strategy definition with automated walk-forward optimization and anti-lookahead validation.
Technical Implementation
- Frontend: React · TypeScript · Vite · lightweight-charts · TanStack Query · Zustand
- Backend: FastAPI · Pydantic v2 · SQLAlchemy
- Data: Parquet · Redis · RQ worker
- Infrastructure: Docker Compose · Pytest · Vitest
- Architecture: Clean Architecture (Domain-first, SOLID principles)
- Validation: Anti-lookahead backtesting, walk-forward IS/OOS sweep mode, 36+ passing tests
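
Walk-forward IS/OOS validation can be sketched as a generator of rolling index windows: parameters are tuned on an in-sample slice and then scored on the out-of-sample slice that immediately follows it, so no future bar leaks into the fit. The function name and the choice to advance by one OOS block per step are assumptions, not the project's actual sweep logic.

```python
from typing import Iterator, Tuple


def walk_forward_windows(n_bars: int, is_size: int,
                         oos_size: int) -> Iterator[Tuple[range, range]]:
    """Yield (in_sample, out_of_sample) index ranges over n_bars of data."""
    start = 0
    while start + is_size + oos_size <= n_bars:
        split = start + is_size
        in_sample = range(start, split)
        out_of_sample = range(split, split + oos_size)  # strictly later bars
        yield in_sample, out_of_sample
        start += oos_size  # roll forward by one out-of-sample block


if __name__ == "__main__":
    for is_idx, oos_idx in walk_forward_windows(10, is_size=4, oos_size=2):
        print(list(is_idx), "->", list(oos_idx))
```

Because every out-of-sample range starts where its in-sample range ends, the anti-lookahead property is structural rather than something each strategy has to remember to enforce.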
Problem: Existing "what you listened to" apps capture only audio, not speaker context: who spoke, when, and what was said. Solution: Real-time speaker identification with turn detection, using a hybrid STT pipeline on device.
Technical Implementation
- Speech-to-Text Pipeline (3 backends, switchable at runtime):
  - Local: Whisper (Tiny 39MB / Base 150MB / Medium 500MB via WhisperKit)
  - Cloud: Azure AI Speech API
  - On-device: Apple SFSpeechRecognizer
- Voice AI Features:
  - VAD: RMS energy threshold with state machine
  - Turn Detection: SILENCE/SPEAKING transition detection
  - Speaker Recognition: 256-dim deterministic embeddings + cosine matching
  - Pre-roll Buffer: 1-second circular buffer captures utterance start
- Architecture:
  - Swift 6 Strict Concurrency (async/await, @MainActor, Sendable)
  - Clean Architecture (Domain/Data/Presentation layers)
  - MVVM + ObservableObject pattern
  - SwiftData persistence
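
The VAD, turn detection, and pre-roll ideas listed above fit together as a small state machine. The sketch below is in Python rather than the app's Swift, and every concrete value (the 0.02 threshold, the 3-frame hangover, the 2-frame pre-roll) and the class name `EnergyVAD` are assumptions for illustration only.

```python
import math
from collections import deque
from enum import Enum
from typing import Optional, Sequence


class State(Enum):
    SILENCE = "silence"
    SPEAKING = "speaking"


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


class EnergyVAD:
    def __init__(self, threshold: float = 0.02, hangover_frames: int = 3,
                 preroll_frames: int = 2) -> None:
        self.threshold = threshold
        self.hangover_frames = hangover_frames
        self.state = State.SILENCE
        self._quiet_run = 0
        # Circular buffer: keeps only the most recent silent frames.
        self.preroll: deque = deque(maxlen=preroll_frames)

    def process(self, frame: Sequence[float]) -> Optional[str]:
        """Feed one frame; return 'turn_start', 'turn_end', or None."""
        loud = rms(frame) >= self.threshold
        if self.state is State.SILENCE:
            if loud:
                self.state, self._quiet_run = State.SPEAKING, 0
                return "turn_start"  # caller prepends self.preroll here
            self.preroll.append(frame)
            return None
        if loud:
            self._quiet_run = 0
            return None
        self._quiet_run += 1
        if self._quiet_run >= self.hangover_frames:
            self.state = State.SILENCE
            self.preroll.clear()  # drop frames that predate this turn
            return "turn_end"
        return None
```

The hangover counter keeps short pauses inside a sentence from ending the turn, and the `deque(maxlen=...)` pre-roll lets the caller recover audio from just before the threshold was crossed.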
Problem: Loading 2,300+ AI skills into an agent causes "Death by a Thousand Skills": 30K+ tokens consumed before you type anything, skyrocketing API costs, and confusing responses. Solution: Local RAG with just-in-time skill injection that searches the vault, retrieves only what it needs, and saves 30K+ tokens per session.
Technical Implementation
- Local Embeddings: @xenova/transformers (CPU-only, no API keys required)
- RAG Architecture: Isolated skill vault hidden from agent's default prompt
- Search: Background Node.js worker calculates cosine similarity outside sandbox
- Just-in-Time Injection: Agent searches → gets top 3 matches → reads only the relevant SKILL.md
- Integration: OpenCode and Claude Code compatible
- Scale: 2,300+ curated skills (Azure, ML, DevOps, etc.)
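
The retrieval step behind just-in-time injection amounts to ranking skill embeddings by cosine similarity and keeping the top 3. The sketch below is in Python for brevity (the project itself uses a Node.js worker), and the tiny three-dimensional vectors and skill names are made-up placeholders, not real embeddings or vault entries.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_skills(query_vec: Sequence[float],
                 vault: Dict[str, Sequence[float]],
                 k: int = 3) -> List[Tuple[str, float]]:
    """Score every skill against the query and keep the k best matches."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in vault.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical vault: skill name -> embedding of its description.
    vault = {
        "azure-deploy": [1.0, 0.0, 0.0],
        "ml-training": [0.0, 1.0, 0.0],
        "devops-ci": [0.9, 0.1, 0.0],
        "sql-tuning": [0.0, 0.0, 1.0],
    }
    for name, score in top_k_skills([1.0, 0.05, 0.0], vault):
        print(f"{name}: {score:.3f}")
```

Only the files behind the returned names would then be read into the prompt, which is where the token savings come from.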
Problem: Simple filename matching misses true duplicates: renamed tracks, resized images, and recompressed files are invisible. Solution: Multi-modal AI fingerprinting that detects acoustic and visual similarity beyond filenames.
Technical Implementation
- Audio: Chromaprint acoustic fingerprinting (matches even with different filenames/bitrates)
- Images:
  - pHash: perceptual hashing for near-identical images
  - CLIP (ViT-L/14): vision-language model for semantic similarity ("this photo looks like that one")
- Vector Search: FAISS for billion-scale similarity in milliseconds
- UI: Streamlit interactive dashboard with side-by-side preview
- Safety: One-click undo, trash manager, exportable reports
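
For the perceptual-hashing step, a DCT-based hash in the spirit of pHash can be sketched in pure Python. This is a simplified illustration, not the library the project uses: it assumes the image has already been converted to a square grayscale grid (for example 32x32), uses an unnormalized DCT, and skips the DC term when computing the median.

```python
import math
from typing import List, Sequence


def dct_lowfreq(pixels: Sequence[Sequence[float]],
                keep: int = 8) -> List[List[float]]:
    """Low-frequency keep x keep block of the 2-D DCT-II (unnormalized)."""
    n = len(pixels)
    basis = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
             for u in range(keep)]
    # The 2-D DCT is separable: transform the rows, then the columns.
    rows = [[sum(row[x] * basis[u][x] for x in range(n)) for u in range(keep)]
            for row in pixels]
    return [[sum(rows[y][u] * basis[v][y] for y in range(n))
             for u in range(keep)] for v in range(keep)]


def phash(pixels: Sequence[Sequence[float]], keep: int = 8) -> int:
    """One bit per coefficient: 1 if above the median, else 0."""
    coeffs = [c for row in dct_lowfreq(pixels, keep) for c in row]
    ac = sorted(coeffs[1:])  # ignore the DC term (overall brightness)
    median = ac[len(ac) // 2]
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | int(c > median)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two images would be treated as near-duplicates when the Hamming distance between their hashes falls under a small threshold. The threshold is a tuning choice that this sketch leaves open.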
Blog: medium.com/@luisevalencia
Career: linkedin.com/in/levalencia




