Gadi Evron’s Post


Building a world-class AI security company at Knostic | CISO-in-Residence for Cloud Security Alliance

LLMs don’t give the same answer to the same input. Until now. Like a calculator that sometimes says 2+2 is five, depending on when you hit the keys, LLMs have long behaved in ways we came to accept as unpredictable. Now, Mira Murati’s team at Thinking Machines Lab has fixed it.

The issue was not “randomness” or “creativity,” but noise in the infrastructure. The same request can return different results depending on what else is batched with it on the server: the GPU kernels change their reduction strategy with batch size, and those tiny floating-point differences ripple forward and change completions even when the input is the same. Their fix, naturally explained in jargon, was to design batch-invariant kernels for matmul, attention, and RMSNorm, and the bottom line is clear: same input equals same output, every time.

Why it matters:
- Reliability in high-stakes fields: health and finance cannot accept answers that drift with server load.
- Operational savings: deterministic outputs mean caching works, cutting GPU burn.
- Open-source transparency: the fix is published for anyone to use and build on.

This does not solve the correctness of answers, since wrong outputs can still be wrong consistently. But it clears away one of the biggest barriers to making LLMs reliable at scale. Just days ago this was another open mystery in how LLMs work. Now it is progress: another win for science and engineering.

Read more here: https://lnkd.in/d7iNCjzj
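The underlying effect is easy to see for yourself. Here is a minimal sketch in PyTorch, illustrating the batch-dependence the team describes (not their actual kernels; on some hardware and backends the two results happen to match):

```python
import torch

torch.manual_seed(0)
# Any reasonably large matmul will do; the shapes here are arbitrary.
a = torch.randn(2048, 2048)
b = torch.randn(2048, 2048)

# The same row of `a`, multiplied by `b` alone vs. inside a full batch.
row_alone = a[:1] @ b        # "batch of 1"
row_in_batch = (a @ b)[:1]   # the same row, computed as part of a batch

# Float addition is not associative, and kernels pick different
# reduction strategies for different batch shapes, so these can differ.
print(torch.equal(row_alone, row_in_batch))    # often False
print((row_alone - row_in_batch).abs().max())  # tiny, but nonzero
```

Note that each run of this script gives the same numbers: the divergence is between batch shapes, not between runs, which is exactly why server load can change completions.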

Vikram Nayyar

Graduate Software Engineer at Sage | AI Enthusiast | Computer Science Graduate

2w

This sounds very interesting! Also, from my experience in tech, a model that is ‘reliably wrong’ is much better to work with than one that is intermittently wrong!

Paul Jacques, CISSP

CISO: Security Infrastructure Compliance

2w

Reminds me of the old computing adage "garbage in, garbage out". A little worrying if AI needs to be treated with kid gloves; it is, after all, aimed at the masses.

John Parsons

Passionate about shaping engineering excellence, mentoring teams and using AI in a Principal Engineer capacity. Hard work, persistence, pushing boundaries and taking people on a journey. Having ideas and a vision

2w

Determinism would make testing easier
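It would. A minimal sketch of the kind of exact-match "golden" test that determinism makes meaningful; the `complete` function is a hypothetical stand-in for a call to an endpoint running batch-invariant kernels with greedy or seeded decoding:

```python
def complete(prompt: str) -> str:
    # Hypothetical stub: imagine a real client call to a deterministic
    # inference endpoint here.
    canned = {"2+2=": "4"}
    return canned[prompt]

def test_completion_is_stable():
    # Without determinism, this assertion could flake with server load.
    # With same input -> same output, exact string match is a valid test.
    assert complete("2+2=") == "4"

test_completion_is_stable()
print("golden test passed")
```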

Joseph Costantini

SME- Retired (1/31/2024)

2w

So now LLMs can be reliably wrong... actually, that is no surprise - so can people; and we believe that "uncertainty is built into the fabric of the universe."

Peter Kaloroumakis

D3FEND Creator/Lead at MITRE

2w

Determinism across LLM versions?

Stephen Kearney

Strategic Automation Adviser @ Secure Minded 🤖 Translating complex technology into simple business growth through AI & Microsoft Power Platform. Trusted Partner, Process Expert & Business Intelligence Specialist

2w

Very interesting. The divergence between creative and mathematical models has meant the latter haven't been at the forefront of this wave. LLMs have seemed more like creative models that speak and understand maths. This could be a massive improvement.

Super interesting and completely logical!

Kieron Seymour-Howell

IT Consultant & Technical Services

2w

Interesting 🤔

Chris H.

CEO @ Aquia | Chief Security Advisor @ Endor Labs | 2x Author | Veteran | Advisor

2w

I’m curious whether we will see this implemented at scale, or whether frontier model providers will hold back, concerned about impacting the creativity or randomness that some are even calling a feature, not a bug, of LLMs.

Chip Block

CEO/CTO of Kiwi Futures, LLC

2w

My biggest issue with this article is the gap between the stated goal of determinism and what the paper is actually describing. To understand it, you need to get your math hat on (and maybe a few beers) and dig in. What the paper describes is how to minimize the variance of mathematical calculations by controlling input batches. This isn't making the models deterministic; it is eliminating variability through input-processing control, which takes out some of the mathematical processing variance.

First, I don't want my GenAI engine to be deterministic. That is what supercomputers do today: straight-line and deterministic. In some cases, I want the outlier probability to show me things I never considered.

Second, the AI engines are acting just like people: if you ask one question in a focused discussion, you get a focused answer. If you ask the same question mixed with 15 other questions in a noisy bar, you get a less focused answer.

Also, if there is a goal of determinism in LLMs, the answer is eventually going to come down to restricting responses to only the highest-probability response (or something limited by the developers), which is the scariest outcome.
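One nuance worth noting on that last fear: kernel-level determinism fixes the logits, not the decoding policy. Temperature sampling stays available, and a fixed seed merely makes it reproducible. A minimal sketch in PyTorch, with a hypothetical `sample` helper for illustration (not from the paper):

```python
import torch

def sample(logits, temperature=1.0, seed=None):
    # Softmax over temperature-scaled logits, then draw one token.
    probs = torch.softmax(logits / temperature, dim=-1)
    gen = torch.Generator().manual_seed(seed) if seed is not None else None
    return torch.multinomial(probs, num_samples=1, generator=gen).item()

# Stand-in logits; batch-invariant kernels guarantee these are identical
# for identical inputs, regardless of server load.
logits = torch.tensor([2.0, 1.0, 0.5, 0.1])

print(sample(logits, temperature=0.8, seed=42))  # reproducible every run
print(sample(logits, temperature=0.8, seed=42))  # same token again
print(sample(logits, temperature=0.8))           # still stochastic if desired
```

So determinism need not mean "highest-probability response only"; it means the randomness, when you want it, is under your control rather than the scheduler's.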
