Andriy Burkov’s Post

Andriy Burkov

PhD in AI, author of 📖 The Hundred-Page Language Models Book and 📖 The Hundred-Page Machine Learning Book

When finetuning LLMs on 100k+ token examples in private cloud environments becomes a commodity, that is when the real benefits of large language models for business will start to be seen. Putting instructions into a 100k+ token prompt and hoping for the best isn't a viable business strategy. Finetuning on such long examples (especially ones that include the hidden reasoning segment) is currently available only to the LLM builders themselves, because of the scarcity of talent capable of setting up the expensive distributed computing systems it requires. If you are waiting for what's next in AI: the democratization of long-context finetuning is what's next.
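
For a sense of what such a setup involves, here is a minimal sketch of long-context supervised finetuning with PyTorch and Hugging Face transformers. The model name, sequence length, and training text are placeholder assumptions, and on real 100k-token examples the model and optimizer states have to be sharded across many GPUs (FSDP, DeepSpeed ZeRO, or tensor parallelism), which is exactly the scarce expertise the post refers to.

```python
# Minimal sketch of single-example, long-context supervised finetuning with
# PyTorch + Hugging Face transformers. Model name, sequence length, and the
# training text are illustrative assumptions; at 100k tokens this must be
# sharded across many GPUs (FSDP/ZeRO) rather than run on one device.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B"        # assumed long-context base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,               # halves parameter/activation memory
    attn_implementation="flash_attention_2",  # memory-efficient attention, GPU only
).to("cuda")
model.gradient_checkpointing_enable()         # trade recompute for activation memory

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# One training example: long instructions + (hidden) reasoning + answer,
# packed into a single sequence of up to ~100k tokens (placeholder text here).
long_example = "instructions ... reasoning ... final answer"
batch = tokenizer(long_example, return_tensors="pt",
                  truncation=True, max_length=100_000).to("cuda")
labels = batch["input_ids"].clone()           # standard causal-LM objective

loss = model(**batch, labels=labels).loss     # loss over the full context
loss.backward()
optimizer.step()
optimizer.zero_grad()
```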

Paolo Perrone

No BS AI/ML Content | ML Engineer with a Plot Twist 🥷50M+ Views 📝

1mo

What infrastructure challenges block wider access to 100k-token tuning?

Theodore Seeber

Enterprise Data Architect | Systems Architect | Project Manager | Fractional CTO | The SQL Unicorn you’ve been looking for

1mo

Do you really need distributed compute for only 100k tokens? It seems to me you would be better off paying for a reasonably large server and running locally.

Ibrahim Errbibi

CTO | AI automation, digital marketing

1mo

Democratizing LLMs transforms the landscape for businesses. Access to tools fosters innovation and creativity. 🌍 #AIRevolution

Sumit Kumar Jha

Researcher |Data Science | AI and ML| Intelligent Automation

1mo

Interesting! Is this something like synthetic data generation for building an HRM-style application, or something different? Andriy Burkov, I would love to hear if there is a specific field or application we can take as an example or analogy.

Sheheryar Abbas

Data Analyst & Consultant | Helping Startups & Businesses Grow Smarter with Data-Driven Insights

1mo

Making long-context finetuning accessible will be a game-changer for companies looking to leverage LLMs effectively.

Adam Keith Milton-Barker

CogniTech Systems LTD / Peter Moss Leukaemia MedTech Research CIC / Intel Software Innovator / NVIDIA Jetson AI Specialist / Jetson AI Ambassador / Edge Impulse Expert

1mo

What's next in AI needs to be a new architecture. There is only so much shit you can pile on top of shit before it collapses.

Oleg Abrosimov, PMP®

Project Manager | PMP® | AI Enthusiast | Agile Delivery | Product Owner | Telecommunications | DataCenter & Infrastructure

1mo

Private environments? What does that mean, a local DC? How many NVIDIA A100s are needed to finetune on 100k tokens? 200 of them?
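
As a rough, hedged back-of-envelope (all sizes assumed: a Llama-7B-scale model, mixed-precision AdamW, gradient checkpointing), the answer is closer to a handful of 80 GB GPUs with sharding than to 200:

```python
# Back-of-envelope training-memory estimate for full finetuning of a
# Llama-7B-scale model on one 100k-token sequence. All numbers are rough
# assumptions, not measurements.
GB = 1e9

n_params    = 7e9       # assumed parameter count
hidden_size = 4096      # Llama-7B-scale width
n_layers    = 32

# Mixed-precision AdamW: bf16 weights + bf16 grads + fp32 master weights
# + fp32 momentum + fp32 variance = 16 bytes per parameter.
model_states = n_params * (2 + 2 + 4 + 4 + 4)

# With gradient checkpointing, roughly one bf16 (seq_len, hidden) tensor is
# kept per layer; the rest of the activations are recomputed in backward.
seq_len = 100_000
activations = seq_len * hidden_size * n_layers * 2

print(f"model states: {model_states / GB:.0f} GB")   # ~112 GB
print(f"activations : {activations / GB:.0f} GB")    # ~26 GB
# Total ~140 GB: more than one 80 GB A100, so sharding (FSDP/ZeRO) across a
# few GPUs is needed -- but nowhere near 200 of them.
```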

Georgi N.

Senior AI Business Analyst at SoftServe | Agility & Product Strategy

1mo

That's also a great argument for getting smaller models with extended context windows working for you on premises. For business purposes, no one actually needs a general-purpose model; you need one tuned to make your work easier.

Raam Venkatesan

Head of Ecosystems, Strategic Partnerships, Business Development, Strategy & Marketing| Ambidexterity| Business Models| Digital Transformation| CVC| M&A| Innovation| Marketing| Insights| DEI

1mo

Isn't it better and cheaper to fund R&D towards neuromorphic and cognition-based computing instead of suggesting the impossible?
