Shital Shah
7,298 posts
Shital Shah
@sytelus
Mostly research and code. If the universe is an optimizer, what is its loss function? All opinions are my own.
Redmond, WA
shital.com
Joined July 2007
13.4K
Following
13.9K
Followers
  • Pinned
    Shital Shah
    @sytelus
    Jun 3
    We are so happy to announce our new model Aion 1.0 today! Our team at the AI Frontiers Lab at Microsoft Research has been cooking hard on this for quite a while. Aion 1.0 is a 14B model that can run locally with reasoning + tool calling capabilities. You can choose whatever agentic
    94K
  • Shital Shah
    @sytelus
    Dec 2, 2022
    ChatGPT was dropped on us just a bit over 24 hours ago. It's like you wake up to the news of the first nuclear explosion and you don't know yet what to think about it, but you know the world will never be the same again. Here are some interesting snapshots of this "explosion"🧵:
  • Shital Shah
    @sytelus
    Nov 3, 2024
    Now that we are done with counting r in strawberry…
    ChatGPT o1-preview
    You said: Name the state in USA which has letter q in it.
    ChatGPT (thought for 4 seconds): The U.S. state that contains the letter "q" in its name is Connecticut.
    422K
  • Shital Shah
    @sytelus
    Jan 27, 2025
    Do people even understand that the majority of chip buys are for inference and not training? Inference needs are going to grow exponentially YoY no matter how much juice we try to squeeze out.
    574K
  • Shital Shah
    @sytelus
    Sep 14, 2024
    Terence Tao’s grading:
    GPT-4o: Completely incompetent graduate student
    o1-preview: Mediocre but not completely incompetent graduate student
    A step change.
    304K
  • Shital Shah
    @sytelus
    Jan 4, 2024
    So, this robot was made for under $32k. It’s driven by a cheap laptop with a mobile 3070 Ti. It has 2 low-res cameras on the wrists and one front-facing camera (+ proprioception from the arms). Models are tiny ResNet18 backbones. The key insight is that co-training improves performance! 1/3
    Zipeng Fu
    @zipengfu
    Jan 3, 2024
    Introduce 𝐌𝐨𝐛𝐢𝐥𝐞 𝐀𝐋𝐎𝐇𝐀🏄 -- Learning! With 50 demos, our robot can autonomously complete complex mobile manipulation tasks: - cook and serve shrimp🦐 - call and take elevator🛗 - store a 3lbs pot to a two-door cabinet Open-sourced! Co-led @tonyzzhao, @chelseabfinn
    422K
  • Shital Shah
    @sytelus
    Oct 21, 2023
    The RL community should be in awe and shock at the Eureka paper🫨. The idea here is that you feed the source code of the environment to GPT-4 and ask it to write code for the reward function itself! Then you evaluate this reward function in simulation and provide your evaluation results
    Jim Fan
    @DrJimFan
    Oct 20, 2023
    Can GPT-4 teach a robot hand to do pen spinning tricks better than you do? I'm excited to announce Eureka, an open-ended agent that designs reward functions for robot dexterity at super-human level. It’s like Voyager in the space of a physics simulator API! Eureka bridges the
    940K
  • Shital Shah
    @sytelus
    Feb 26, 2025
    After we learned that DeepSeek folks were using undocumented PTX instructions, now we are learning that they are using stuff that probably even NVIDIA people don't know. 🫡
    162K
  • Shital Shah
    @sytelus
    Dec 13, 2023
    Mistral-7B is cool but you know what's cooler? A more powerful model in just 1/3rd of the size! Welcome to Phi-2. This is something our team at Microsoft Research had been tirelessly working on and now we have more numbers comparing with Llama-7B, 13B, 70B and Gemini Nano. 👇
    568K
  • Shital Shah
    @sytelus
    Sep 12, 2024
    wow.... so ChatGPT o1 is getting 80% on my privately held benchmark. The previous best was 30% by Sonnet 3.5 and 20% by GPT-4o. Before folks jump to the conclusion that there is some simple new algo waiting to be replicated, let's take time to appreciate that this was a research
    235K
  • Shital Shah
    @sytelus
    Dec 13, 2024
    Are you ready for an early Christmas present from our team at Microsoft Research? Introducing the most powerful smol model ever built in the world! Welcome to Phi-4! 👇
    216K
  • Shital Shah
    @sytelus
    Dec 9, 2023
    I think we haven’t fully grasped the impact of the Mamba paper that was dropped just this week. From the results so far, it is very likely that Mamba might just be the architecture that finally unseats attention from its long-held grip on the throne.🧵
    426K
  • Shital Shah
    @sytelus
    Oct 21, 2024
    DeepMind's chess paper has sharply divided the AI community: some are pointing to it as evidence that LLMs can do reasoning and planning, while others say it's just a lookup table/memorization. In reality, I think the paper uncovers something else if you look into the details! 🧵
    411K
  • Shital Shah
    @sytelus
    Jan 6, 2023
    Just about to wrap up my day and saw VALL-E! Wow!! This model takes a 3-second speech sample from a person and can synthesize text-to-speech in the same voice with unbelievable fidelity. It can even maintain the emotion and acoustic environment of the sample. valle-demo.github.io
    249K
