lizecheng.hashnode.dev
Multi-Agent Systems Hit Production Scale — and Created a New Problem Nobody Is Talking About
Originally published at lizecheng.net
The benchmark dropped on February 5th with Claude Opus 4.6: 16 parallel agents, two weeks, a 100,000-line C compiler written in Rust. The compiler passes 99% of GCC's test suite. It compiles the Linux 6.9 kernel,...
8h ago · 5 min read

lizecheng.hashnode.dev
When Open Source Beats Claude Sonnet: What Qwen3.5's MoE Architecture Actually Means for Your Stack
Originally published at lizecheng.net
Something happened this week that deserves more technical attention than it's getting. Alibaba released Qwen3.5-122B-A10B under Apache 2.0. The benchmarks: 85% on AIME 2026 mathematical reasoning, 76.9 on MMMU-P...
1d ago · 5 min read

lizecheng.hashnode.dev
AI Technology Watch | 2026-02-27
title: "Inference-Time Orchestration vs. Raw Scale: A Technical Breakdown of Poetiq's ARC-AGI-2 Win"
subtitle: "Six employees, no model training, beat Gemini 3 Deep Think at 40% of the cost — here's what the architecture actually looks like"
tags: ai...
3d ago · 6 min read

lizecheng.hashnode.dev
When Senior Engineers Stop Writing Code: What Spotify's Honk System Reveals About AI-Native Development
Originally published at lizecheng.net
Spotify co-CEO Gustav Söderström said something remarkable during last week's Q4 earnings call: the company'...
4d ago · 5 min read