I'm an ML engineer and independent consultant at Parlance Labs. I spend most of my time helping teams build AI products. Previously, I did applied ML at GitHub and Airbnb.
I'm working to bring data science back to AI: helping teams debug, analyze, and measure their systems. I call this "evals." After doing it across 35+ AI products, I co-authored Evals for AI Engineers (O'Reilly), which covers error analysis, LLM-as-a-judge, synthetic data, production monitoring, and building data flywheels. I also co-teach a course on evals on Maven.
I write about what I learn at hamel.dev. Some recent posts:
I've contributed to tools across ML infrastructure, developer experience, and data science workflows: machine learning frameworks, workflow orchestration, Jupyter tooling, and code search. Full list here.