AI models are reasoning, creating, and evolving. The evidence is no longer theoretical; it's peer-reviewed, measurable, and, in some domains, superhuman. In the last 18 months, we’ve seen LLMs move far beyond next-token prediction. They’re beginning to demonstrate real reasoning, hypothesis generation, long-horizon planning, and even scientific creativity. Here are six breakthroughs that redefine what these models can do:

Superhuman Clinical Reasoning (Nature Medicine, 2025)
In a rigorous test across 12 specialties, GPT-4 scored 89% on the NEJM Knowledge+ medical reasoning exam, outperforming the average physician score of 74%. This wasn’t just Q&A; it involved multi-hop reasoning, risk evaluation, and treatment planning. That’s structured decision-making in high-stakes domains.

Creative Research Ideation (Zhou et al., 2024 – arXiv:2412.10849)
Across 10 fields from physics to economics, GPT-4 and Claude generated research questions rated more creative than human-generated ones in 53% of cases. This wasn’t trivia; domain experts blindly compared ideas from AI and researchers. In over half the cases, the AI won.

Falsifiable Hypotheses from Raw Data (Nemati et al., 2024)
GPT-4o was fed raw experimental tables from biology and materials science and asked to propose novel hypotheses. 46% of them were judged publishable by experts, outperforming PhD students (29%) on the same task. That’s not pattern matching, that’s creative scientific reasoning from scratch.

Self-Evolving Agents (2024)
LLM agents that reflect, revise memory, and re-prompt themselves improved their performance on coding benchmarks from 21% → 34% in just four self-corrective cycles, without retraining. This is meta-cognition in action: learning from failure, iterating, and adapting over time.

Long-Term Agent Memory (A-MEM, 2025)
Agents equipped with dynamic long-term memory (inspired by Zettelkasten) achieved 2× higher success on complex web tasks, planning across multiple steps with context continuity.

Emergent Social Reasoning (AgentSociety, 2025)
In a simulation of 1,000 LLM-driven agents, researchers observed emergent social behaviors: rumor spreading, collaborative planning, and even economic trade. No hardcoding. Just distributed reasoning, goal propagation, and learning-by-interaction.

These findings span healthcare, science, software engineering, and multi-agent simulations. They reveal systems that generate, reason, and coordinate, not just predict. So when some argue that “AI is only simulating thought,” we should ask: Are the tests capturing how real reasoning happens? The Tower of Hanoi isn’t where science, medicine, or innovation happens. The real test is:
1. Can a model make a novel discovery?
2. Can it self-correct across steps?
3. Can it outperform domain experts in structured judgment?
And increasingly, the answer is: yes. Let’s not confuse symbolic puzzles with intelligence. Reasoning is already here, and it’s evolving.
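To make the "reflect, revise memory, and re-prompt" idea concrete, here is a minimal Python sketch of a self-corrective cycle. It is not taken from any of the cited papers: `self_correct`, `toy_attempt`, and `toy_evaluate` are hypothetical names, and the "model" is a hard-coded stand-in. The point is only the shape of the loop, where feedback from a failed attempt is stored and fed into the next attempt with no retraining.

```python
from typing import Callable, List, Tuple


def self_correct(
    attempt: Callable[[str, List[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    task: str,
    max_cycles: int = 4,
) -> Tuple[str, int, List[str]]:
    """Run attempt -> evaluate -> reflect cycles; no weights are updated."""
    memory: List[str] = []  # reflections carried from one cycle to the next
    candidate = ""
    for cycle in range(1, max_cycles + 1):
        candidate = attempt(task, memory)  # re-prompt with the notes so far
        ok, feedback = evaluate(candidate)
        if ok:
            return candidate, cycle, memory
        memory.append(feedback)  # the "revise memory" step
    return candidate, max_cycles, memory


# Hypothetical stand-ins for a model call and a unit test.
def toy_attempt(task: str, memory: List[str]) -> str:
    # The first try has an off-by-one bug; the fix appears once the
    # relevant feedback is present in memory.
    if any("off by one" in note for note in memory):
        return "def total(n): return sum(range(n + 1))"
    return "def total(n): return sum(range(n))"


def toy_evaluate(code: str) -> Tuple[bool, str]:
    scope: dict = {}
    exec(code, scope)  # acceptable in a toy; unsafe for untrusted code
    got = scope["total"](3)
    if got == 6:
        return True, ""
    return False, f"expected 6, got {got}: off by one?"


if __name__ == "__main__":
    code, cycles, notes = self_correct(toy_attempt, toy_evaluate, "sum 0..n")
    print(cycles, notes, code)
```

In a real agent, `attempt` would wrap a model call and `evaluate` would run a benchmark's tests; the cap of four cycles mirrors the number quoted in the post, nothing more.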
Innovative Approaches in Scientific Machine Learning
-
How do materials fail, and how can we design stronger, tougher, and more resilient ones? Published in #PNAS, our physics-aware AI model combines advanced reasoning, rational thinking, and strategic planning capabilities with the ability to write and execute code, perform atomistic simulations to solicit new physics data from “first principles”, and conduct visual analysis of graphed results and molecular mechanisms. By employing a multiagent strategy, these capabilities are combined into an intelligent system designed to solve complex scientific analysis and design tasks, as applied here to alloy design and discovery. This is significant because our model overcomes the limitations of traditional data-driven approaches by integrating diverse AI capabilities (reasoning, simulations, and multimodal analysis) into a collaborative system, enabling autonomous, adaptive, and efficient solutions to complex, multiobjective materials design problems that were previously slow, expert-dependent, and domain-specific. Wonderful work by my postdoc Alireza Ghafarollahi! Background: The design of new alloys is a multiscale problem that requires a holistic approach: retrieving relevant knowledge, applying advanced computational methods, conducting experimental validations, and analyzing the results, a process that is typically slow and reserved for human experts. Machine learning can help accelerate this process, for instance, through the use of deep surrogate models that connect structural and chemical features to material properties, or vice versa. However, existing data-driven models often target specific material objectives, offer limited flexibility to integrate out-of-domain knowledge, and cannot adapt to new, unforeseen challenges. Our model overcomes these limitations by leveraging the distinct capabilities of multiple AI agents that collaborate autonomously within a dynamic environment to solve complex materials design tasks. 
The proposed physics-aware generative AI platform, AtomAgents, synergizes the intelligence of LLMs and the dynamic collaboration among AI agents with expertise in various domains, including knowledge retrieval, multimodal data integration, physics-based simulations, and comprehensive results analysis across modalities. The concerted effort of the multiagent system allows for addressing complex materials design problems, as demonstrated by examples that include autonomously designing metallic alloys with enhanced properties compared to their pure counterparts. We demonstrate accurate prediction of key characteristics across alloys and highlight the crucial role of solid solution alloying to steer the development of alloys. Paper: https://lnkd.in/enusweMf Code: https://lnkd.in/eWv2eKwS MIT Schwarzman College of Computing MIT Civil and Environmental Engineering MIT Department of Mechanical Engineering (MechE) MIT Industrial Liaison Program MIT School of Engineering
-
Some of the most exciting breakthroughs happen when we step back and ask ourselves: Are we solving this problem the right way - or just the way it’s always been solved? Early in my journey, I learned the power of first principles thinking - stripping away assumptions and breaking problems down to their simplest truths. This mindset has stuck with me, and it’s a driving force behind how we think about innovation at GreyOrange. Lately, I’ve been fascinated by the potential of Agentic AI - not just as a tool to improve what we do, but as a way to rethink the very foundation of how we solve problems. Here’s what I mean: There’s a class of problems called NP-Hard problems, the kind that make most optimization challenges look like a walk in the park. Finding the optimal solution to these problems within a bounded time isn’t just tough - it’s often considered impossible. Until recently, we’ve had to rely on approximations, accepting “good enough” as the best we could do. But a combination of supervised learning and reinforcement learning is changing the game. Instead of heuristic-based algorithms, we’re now building systems that are dynamic - learning, adapting, and strengthening themselves over time. What started with AlphaGo has come a long way! And here’s the truly exciting part: it’s not just about solving problems better, it’s about reshaping the very process of optimization. Imagine a world where algorithms don’t just calculate - they innovate. That’s what current AI models allow us to do. When I think about this, I can’t help but reflect on how rare it is to start with a completely new way of thinking. It’s not often we get the chance to rewrite the rules, and that’s exactly what’s happening here. For me, this is the heart of innovation: challenging what we think we know and daring to ask, what if? What problems could we tackle differently if we embraced this approach more often? #firstprinciples #agenticAI #genAI #AI #AIML #NPHard #DeepMind
-
Google DeepMind’s AI Co-Scientist paper was just released, and you should check it out! It represents a paradigm shift in scientific discovery, leveraging a multi-agent system built on Gemini 2.0 to autonomously generate, refine, and validate new research hypotheses. 🔹How does it work? The system uses a generate, debate, and evolve framework, where distinct agents called Generation, Reflection, Ranking, Evolution, Proximity, and Meta-Review collaborate in an iterative hypothesis refinement loop. 🔹Some key innovations that pop out include an asynchronous task execution framework, which enables dynamic allocation of computational resources, and a tournament-based Elo ranking system that continuously optimizes hypothesis quality through simulated scientific debates. 🔹The agentic orchestration accelerates hypothesis validation for processes that can take humans decades in some instances. For example, empirical validation in biomedical applications, such as drug repurposing for acute myeloid leukemia (AML) and epigenetic target discovery for liver fibrosis, quickly helped researchers generate clinically relevant insights. What should we all get from this? 🔸Unlike traditional AI-assisted research tools, AI Co-Scientist doesn’t just summarize existing knowledge but instead proposes experimentally testable, original hypotheses, fundamentally reshaping the research paradigm by acting as an intelligent collaborator that augments human scientific inquiry. Do take some time this Sunday to read! #genai #technology #artificialintelligence
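For readers curious what a tournament-based Elo ranking looks like in practice, here is a minimal Python sketch using the standard Elo update. It is an illustration, not the paper's implementation: the starting rating of 1200, the K-factor of 32, the round-robin schedule, and the `judge` callable (a stand-in for a simulated debate between two hypotheses) are all assumptions.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(r_a: float, r_b: float, a_won: bool, k: float = 32.0) -> Tuple[float, float]:
    """Return the new ratings for A and B after one pairwise comparison."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    # B's score and expectation are the complements of A's.
    return r_a + k * (s_a - e_a), r_b + k * ((1.0 - s_a) - (1.0 - e_a))


def run_tournament(
    hypotheses: List[str],
    judge: Callable[[str, str], bool],  # True if the first hypothesis wins
    rounds: int = 3,
    start: float = 1200.0,
) -> Dict[str, float]:
    """Round-robin tournament; every pair meets once per round."""
    ratings = {h: start for h in hypotheses}
    for _ in range(rounds):
        for a, b in combinations(hypotheses, 2):
            ratings[a], ratings[b] = update(ratings[a], ratings[b], judge(a, b))
    return ratings


if __name__ == "__main__":
    # Made-up quality scores stand in for the outcome of a debate.
    quality = {"H1": 0.9, "H2": 0.5, "H3": 0.1}
    ratings = run_tournament(list(quality), lambda a, b: quality[a] > quality[b])
    for name in sorted(ratings, key=ratings.get, reverse=True):
        print(name, round(ratings[name], 1))
```

Because each update moves the two ratings by equal and opposite amounts, the total rating pool stays constant, so a hypothesis can only climb by beating others in comparisons.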
-
🔬 Exciting Progress in AI for Science this week as Google Unveils AI Co-Scientist - A New Era of Accelerated Scientific Discovery! Key takeaways from this new paper published yesterday: 🤖 Introduction of AI Co-Scientist: Google has developed an AI system named "AI Co-Scientist," built on Gemini 2.0, designed to function as a virtual collaborator for scientists. This system aims to assist in generating novel hypotheses and accelerating scientific and biomedical discoveries. 👨👩👦👦 Multi-Agent Architecture: The AI Co-Scientist employs a multi-agent framework that mirrors the scientific method. It utilizes a "generate, debate, and evolve" approach, allowing for flexible scaling of computational resources and iterative improvement of hypothesis quality. 🧬 Biomedical Applications: In its initial applications, the AI Co-Scientist has demonstrated potential in several areas: 1. Drug Repurposing: Identified candidates for acute myeloid leukemia that exhibited tumor inhibition in vitro at clinically relevant concentrations. 2. Novel Target Discovery: Proposed new epigenetic targets for liver fibrosis, validated by anti-fibrotic activity and liver cell regeneration in human hepatic organoids. 3. Understanding Bacterial Evolution: Recapitulated unpublished experimental results by discovering a novel gene transfer mechanism in bacterial evolution through in silico methods. 🤝 Collaborative Enhancement: The system is designed to augment, not replace, human researchers. By handling extensive literature synthesis and proposing innovative research directions, it allows scientists to focus more on experimental validation and creative problem-solving. 💡 Implications for Future Research: The AI Co-Scientist represents a significant advancement in AI-assisted research, potentially accelerating the pace of scientific breakthroughs and fostering deeper interdisciplinary collaboration. 
This development underscores the transformative role AI can play in scientific inquiry, offering tools that enhance human ingenuity and expedite the journey from hypothesis to discovery.
-
Here’s a truly impactful AI multi-agent application that I’m excited to share! Imagine a world where the boundaries of scientific research are pushed beyond traditional limits, not just by human intelligence but with the help of AI Agents. That's exactly what the Virtual Lab is doing! At the heart of this innovation lie large language models (LLMs) that are reshaping how we approach interdisciplinary science. These LLMs have recently shown an impressive ability to aid researchers across diverse domains by answering scientific questions. 𝐅𝐨𝐫 𝐦𝐚𝐧𝐲 𝐬𝐜𝐢𝐞𝐧𝐭𝐢𝐬𝐭𝐬, 𝐚𝐜𝐜𝐞𝐬𝐬𝐢𝐧𝐠 𝐚 𝐝𝐢𝐯𝐞𝐫𝐬𝐞 𝐭𝐞𝐚𝐦 𝐨𝐟 𝐞𝐱𝐩𝐞𝐫𝐭𝐬 𝐜𝐚𝐧 𝐛𝐞 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐢𝐧𝐠. But with Virtual Lab, a few Stanford researchers turned that dream into reality by creating an AI-human research collaboration. 𝐇𝐞𝐫𝐞'𝐬 𝐡𝐨𝐰 𝐢𝐭 𝐰𝐨𝐫𝐤𝐬: → The Virtual Lab is led by an LLM principal investigator agent. → This agent guides a team of LLM agents, each with a distinct scientific expertise. → A human researcher provides high-level feedback to steer the project. → Team meetings are held by agents to discuss scientific agendas. → Individual agent meetings focus on specific tasks assigned to each agent. 𝐖𝐡𝐲 𝐢𝐬 𝐭𝐡𝐢𝐬 𝐚 𝐠𝐚𝐦𝐞𝐜𝐡𝐚𝐧𝐠𝐞𝐫? The Stanford team applied the Virtual Lab to tackle the complex problem of designing nanobody binders for SARS-CoV-2 variants. This requires expertise ranging from biology to computer science. The results? A novel computational design pipeline that churned out 92 new nanobodies. Among these, two exhibit improved binding to new variants while maintaining efficacy against the ancestral virus, making them promising candidates for future studies and treatments. This is not just a theoretical exercise. It's a real-world application that holds significant promise for scientific discovery and medical advancements. AI isn't just a tool anymore; it's becoming a partner in discovery. Isn't it time we embrace the future of collaborative research? What do you think about the potential of AI in revolutionizing science? Let's discuss! 
Read the full research here: https://lnkd.in/eBxUQ7Zy #aiagents #scientificrevolution #artificialintelligence
-
Generative Multiagent Systems are accelerating scientific discovery by overcoming traditional research barriers and igniting a revolution in interdisciplinary innovation. 🤖 In today’s rapidly evolving research landscape, interdisciplinary collaboration is key to solving complex scientific challenges. 🔬 Yet, many scientists lack ready access to experts across all relevant domains related to their scientific inquiries. 🔭 This is where Generative Multiagent Systems, that are powered by large language models, are poised to make a transformative impact. 🌟 Imagine a specialized team of computational experts composed and orchestrated by a research leader and guided by incisive human insight and prescience. 💡 This bold fusion of #GenerativeAI with #AgenticAI and human ingenuity is transforming research by turbocharging scientific discovery. 💎 1️⃣ Imagine a research system where an ensemble of LLMs acts as a principal investigator that builds and manages a team of specialized research agents. 2️⃣ Each AI agent brings domain-specific expertise to the table, engaging in both collective “team meetings” and focused individual sessions. 3️⃣ During team meetings, AI agents deliberate on a scientific agenda, iterating hypotheses and aligning on research strategies. 4️⃣ In individual sessions, each AI agent tackles targeted tasks, from experimental design to computational modeling and rigorous self-critique. 5️⃣ Throughout this process, a human researcher provides overall direction and strategic oversight, ensuring that the system’s outputs align with real-world scientific priorities. By harnessing the diverse perspectives of specialized agents under a unified, intelligent framework, Generative Multiagent Systems can rapidly generate novel insights and accelerate the discovery process. 
💫 This human and #AI research collaboration not only enhances efficiency but also broadens the scope of scientific inquiry, opening pathways for breakthroughs in areas such as drug discovery and beyond. ✨ I was delighted to welcome my dear friend and globally renowned thought leader, Professor James Zou, to the Synthetic Intelligence Forum for a discussion about Virtual Lab. ⚡ In this talk, Professor Zou describes Virtual Lab which is a Generative Multiagent System for scientific research. 🖥️ As an Associate Professor of Biomedical Data Science, with courtesy appointments in Computer Science and Electrical Engineering departments, at Stanford University, Professor Zou is reputed for his high impact research in computational biology, data science, machine learning, and public health. 📚 During our session, Professor Zou offered a roadmap for extending and expanding the coverage of Virtual Lab across multiple scientific disciplines. 🔦 Special thanks to my distinguished partner in the Synthetic Intelligence Forum, Olga, for her esteemed collaboration in convening this thoughtful and thought provoking discussion. 🚀 Recording: https://lnkd.in/eEN6UPpP 🌐
Generative Multiagent Systems for Advancing Scientific Research (James Zou, PhD)
-
The same AI models that generate images can now design new materials, and they are doing it as a team. Microsoft Research announced two specialized materials AI agents that work together: MatterGen is the brainstormer. It uses diffusion models – similar to the algorithms powering image generation – to design novel molecular structures and predict their fundamental properties. MatterSim is the critic, assessing the physical stability and viability of MatterGen's proposed structures by applying fundamental quantum-mechanical principles. This agentic AI workflow can massively accelerate the materials discovery timeline compared to the guess-and-check methods we’re stuck with today. Beyond the speed, there's a deeper insight here relevant to the advancement of AI in science: the power of general machine learning architectures. A general approach is yet again proving highly effective for complex, specialized problems. Here it eliminates the need for intricate, computationally intensive, domain-specific Field Theory. The adaptability that allows these models to excel at tasks from image creation to atomic-scale simulation underscores their potential in material science. We wonder: could a model like this be applied to polymers? Rampi Ramprasad Chiho Kim. Who's all in? Who's skeptical? Timothy McGee David Breslauer, PhD Nikolaus Mackay Joanna Pool, PhD, PMP Each week Kir Titievsky 🇺🇦 and I have been diving into new research and applications for AI and material science. Our thesis? AI is shifting new materials from art to science. Follow for more! --- Amazing researchers behind the work: Tian Xie, Ziheng LU, Claudio Zeni, Robert Pinsler, Daniel Zugner, Andrew Fowler, Matthew Horton, Ryota Tomioka, and many more. #ArtificialIntelligence #MaterialsScience #Innovation #DeepTech #MachineLearning #ComputationalChemistry #DigitalTransformation #Investment #Research #AIinScience #AICoE
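The brainstormer-and-critic pattern described in this post can be sketched as a simple propose-then-screen loop. The Python below is a toy under loud assumptions: it does not call MatterGen or MatterSim, `propose` just samples random two-component compositions, and `toy_energy` is a made-up score where lower means "more stable". Only the workflow shape is meant to carry over.

```python
import random
from typing import Callable, List, Tuple

# Toy "material": fractions of two components that sum to 1.
Candidate = Tuple[float, float]


def propose(rng: random.Random, n: int) -> List[Candidate]:
    """Stand-in generator: sample n random binary compositions (x, 1 - x)."""
    candidates = []
    for _ in range(n):
        x = rng.random()
        candidates.append((x, 1.0 - x))
    return candidates


def toy_energy(c: Candidate) -> float:
    """Stand-in critic: an invented score, lowest near x = 0.3."""
    x, _ = c
    return (x - 0.3) ** 2


def screen(
    candidates: List[Candidate],
    energy: Callable[[Candidate], float],
    threshold: float,
) -> List[Candidate]:
    """Keep candidates the critic scores at or below threshold, best first."""
    scored = [(energy(c), c) for c in candidates]
    kept = [(e, c) for e, c in scored if e <= threshold]
    return [c for _, c in sorted(kept)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    shortlist = screen(propose(rng, 50), toy_energy, threshold=0.01)
    print(len(shortlist), shortlist[:3])
```

In a real pipeline the generator would be a trained generative model and the critic a physics-based or learned evaluator; the screening step is where expensive simulation time would go.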