When finetuning LLMs on 100k+-token examples in private cloud environments becomes a commodity, that is when the real benefits of large language models for business will start to appear. Putting instructions into a 100k+-token prompt and hoping for the best isn't a viable business strategy. Finetuning on such long prompts (especially when they include the hidden reasoning segment) is currently available only to the LLM builders themselves, because the talent capable of setting up such expensive distributed computing systems is scarce. If you are waiting for what's next in AI: the democratization of long-context finetuning is what's next.
Do you really need distributed compute for only 100k tokens? It seems to me you would be better off paying for a reasonably large server and running locally.
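A back-of-envelope sketch of where that intuition comes from: the Python below assumes a 7B-parameter model trained in bf16 with Adam and gradient checkpointing, on a single 100k-token sequence. Every number is an illustrative assumption, not a measurement.

```python
# Rough memory estimate for full finetuning with 100k-token sequences.
# All defaults are illustrative assumptions (7B model, bf16, Adam,
# gradient checkpointing), not measurements of any specific model.

def finetune_memory_gb(
    n_params_b: float = 7.0,   # model size in billions of parameters (assumption)
    seq_len: int = 100_000,    # tokens in one training example
    n_layers: int = 32,        # transformer depth, typical for a 7B model
    hidden: int = 4096,        # hidden size, typical for a 7B model
    bytes_per_val: int = 2,    # bf16
) -> dict:
    """Very rough per-sequence GPU memory breakdown in GB."""
    gb = 1024 ** 3
    params = n_params_b * 1e9
    weights = params * bytes_per_val / gb
    # Adam keeps fp32 master weights plus two moment tensors: ~12 extra bytes/param.
    optimizer = params * 12 / gb
    gradients = params * bytes_per_val / gb
    # With gradient checkpointing, roughly a few (seq_len, hidden) tensors
    # are kept per layer; this is a crude lower bound, linear in seq_len.
    activations = n_layers * 4 * seq_len * hidden * bytes_per_val / gb
    total = weights + optimizer + gradients + activations
    return {"weights": weights, "optimizer": optimizer,
            "gradients": gradients, "activations": activations, "total": total}

if __name__ == "__main__":
    for name, size in finetune_memory_gb().items():
        print(f"{name:>11}: {size:7.1f} GB")
```

With these defaults it prints roughly 13 GB of weights, ~78 GB of optimizer state, ~13 GB of gradients, and ~100 GB of activations: more than a single 80 GB A100, but on the scale of one multi-GPU server rather than hundreds of cards. A larger base model or parameter-efficient tuning shifts the numbers considerably.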
Democratizing LLMs transforms the landscape for businesses. Access to tools fosters innovation and creativity. 🌍 #AIRevolution
Interesting! Do you mean something like synthetic data generation for building HRM applications, or something different? Andriy Burkov, I would love to hear if there is a specific field or application we could take as an example or analogy.
Making long-context finetuning accessible will be a game-changer for companies looking to leverage LLMs effectively.
What's next in AI needs to be a new architecture. There is only so much shit you can pile on top of shit before it collapses.
Private environments? What does that mean, a local DC? How many NVIDIA A100s would it take to handle 100k tokens? 200 of them?
That's also a great argument for getting smaller models with extended context windows to work for you on premises. For business purposes, no one actually needs a general-purpose model when they could tune a smaller one to make their work easier.
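For the on-premises angle, here is a minimal sketch of what tuning a smaller model can look like, assuming the Hugging Face transformers and peft libraries; the checkpoint name and LoRA hyperparameters are placeholders, not recommendations.

```python
# Minimal sketch of tuning a smaller open model on premises with LoRA adapters.
# The checkpoint name, rank, and target modules below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "some-org/small-long-context-model"  # placeholder, not a real checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Train low-rank adapters instead of the full weight matrices, so the
# optimizer state stays small even when training sequences are very long.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the base model
```

Because only the low-rank adapters are trained, the optimizer state stays tiny, which is part of what makes long-context tuning plausible on a single on-prem server.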
Isn't it better and cheaper to fund R&D toward neuromorphic and cognition-based computing instead of suggesting the impossible?
What infrastructure challenges block wider access to 100k-token tuning?