📙Devstral 2 - How to Run Guide

Guide for running Mistral's Devstral 2 models locally: Devstral-2-123B-Instruct-2512 and Devstral-Small-2-24B-Instruct-2512.

Devstral 2 models are Mistral's new coding and agentic LLMs for software engineering, available in 24B and 123B sizes. The 123B model achieves SOTA results on SWE-bench and in coding, tool-calling, and agentic use cases. The 24B model fits in 25GB of RAM/VRAM, and the 123B fits in 128GB.

Devstral 2 has vision capabilities and a 256K context window, and uses the same architecture as Ministral 3. You can now run and fine-tune both models locally with Unsloth.

All Devstral 2 uploads use our Unsloth Dynamic 2.0 methodology, delivering the best performance on Aider Polyglot and 5-shot MMLU benchmarks.

Devstral 2 - Unsloth Dynamic GGUFs:

Devstral-Small-2-24B-Instruct-2512
Devstral-2-123B-Instruct-2512

🖥️ Running Devstral 2

See our step-by-step guides for running the Devstral 24B and the larger Devstral 123B models. Both models support vision, but vision is not currently supported in llama.cpp.

⚙️ Usage Guide

Here are the recommended settings for inference (an example command follows the list):

  • Temperature ~0.15

  • Min_P of 0.01 (optional, but 0.01 works well; llama.cpp's default is 0.1)

  • Use --jinja to enable the system prompt.

  • Max context length = 262,144

  • Recommended minimum context: 16,384

  • Install the latest llama.cpp since a December 13th 2025 pull request fixes issues.
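
Putting these settings together, a minimal llama-cli invocation might look like the sketch below. The Hugging Face repo name and quant suffix are assumptions; the model-specific sections below cover downloading in detail.

```bash
./llama.cpp/llama-cli \
    -hf unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF:UD-Q4_K_XL \
    --jinja \
    --temp 0.15 \
    --min-p 0.01 \
    --ctx-size 16384
```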

🎩Devstral-Small-2-24B

The full precision (Q8) Devstral-Small-2-24B GGUF will fit in 25GB RAM/VRAM. Text only for now.

✨ Run Devstral-Small-2-24B-Instruct-2512 in llama.cpp

  1. Obtain the latest llama.cpp on GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inference.

  2. If you want llama.cpp to load the model directly, you can pull it straight from Hugging Face, where the suffix (:Q4_K_XL) selects the quantization type:

  3. Alternatively, download the model first (after installing the tooling via pip install huggingface_hub hf_transfer). You can choose UD-Q4_K_XL or other quantized versions.

  4. Run the model in conversation mode:
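
As a rough sketch, steps 1-4 might look like the below. It assumes the GGUF upload lives at unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF and that you pick the UD-Q4_K_XL quant; adjust names, paths, and flags to your setup.

```bash
# 1. Build llama.cpp (switch -DGGML_CUDA=ON to OFF for CPU-only inference)
git clone https://github.com/ggml-org/llama.cpp
cmake llama.cpp -B llama.cpp/build -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON
cmake --build llama.cpp/build --config Release -j --target llama-cli llama-server
cp llama.cpp/build/bin/llama-* llama.cpp/

# 2. Pull the quant straight from Hugging Face (:UD-Q4_K_XL picks the quantization)
./llama.cpp/llama-cli -hf unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF:UD-Q4_K_XL \
    --jinja --temp 0.15 --min-p 0.01 --ctx-size 16384

# 3. Or download the GGUF manually first
pip install huggingface_hub hf_transfer
huggingface-cli download unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF \
    --include "*UD-Q4_K_XL*" --local-dir Devstral-Small-2-24B-Instruct-2512-GGUF

# 4. Run the downloaded model in conversation mode
#    (adjust the .gguf filename to whatever was actually saved)
./llama.cpp/llama-cli \
    --model Devstral-Small-2-24B-Instruct-2512-GGUF/Devstral-Small-2-24B-Instruct-2512-UD-Q4_K_XL.gguf \
    --jinja --temp 0.15 --min-p 0.01 --ctx-size 16384 --n-gpu-layers 99
```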

👀Devstral and vision

  1. To play with Devstral's image capabilities, let's first download an image, such as the FP8 Reinforcement Learning with Unsloth graphic below:

  2. We get the image via wget https://unsloth.ai/cgi/image/fp8grpolarge_KharloZxEEaHAY2X97CEX.png?width=3840%26quality=80%26format=auto -O unsloth_fp8.png which will save the image as "unsloth_fp8.png".

  3. Then load the image in via /image unsloth_fp8.png after the model is loaded as seen below:

  4. We then prompt it Describe this image and get the below:
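
Putting those steps together, a sketch of the workflow is below; it assumes your llama.cpp build and the loaded GGUF actually expose Devstral's vision support.

```bash
# Download the example image
wget "https://unsloth.ai/cgi/image/fp8grpolarge_KharloZxEEaHAY2X97CEX.png?width=3840%26quality=80%26format=auto" \
    -O unsloth_fp8.png

# Then, in the interactive chat once the model is loaded, attach the image and prompt it:
#   /image unsloth_fp8.png
#   Describe this image
```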

🚚Devstral-2-123B

The full precision (Q8) Devstral-2-123B GGUF will fit in 128GB RAM/VRAM. Text only for now.

Run Devstral-2-123B-Instruct-2512 Tutorial

  1. Obtain the latest llama.cpp on GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inference.

  2. You can directly pull from Hugging Face via:

  3. Alternatively, download the model first (after installing the tooling via pip install huggingface_hub hf_transfer). You can choose UD-Q4_K_XL or other quantized versions.

  4. Run the model in conversation mode:
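
The workflow mirrors the 24B sketch above, assuming the GGUF upload is unsloth/Devstral-2-123B-Instruct-2512-GGUF; the build step is identical, and the exact .gguf filenames are illustrative.

```bash
# Pull the quant straight from Hugging Face
./llama.cpp/llama-cli -hf unsloth/Devstral-2-123B-Instruct-2512-GGUF:UD-Q4_K_XL \
    --jinja --temp 0.15 --min-p 0.01 --ctx-size 16384

# Or download first, then run in conversation mode
pip install huggingface_hub hf_transfer
huggingface-cli download unsloth/Devstral-2-123B-Instruct-2512-GGUF \
    --include "*UD-Q4_K_XL*" --local-dir Devstral-2-123B-Instruct-2512-GGUF

# If the quant is split across multiple .gguf files, point --model at the first shard
# (filename below is illustrative; use the actual downloaded name)
./llama.cpp/llama-cli \
    --model Devstral-2-123B-Instruct-2512-GGUF/Devstral-2-123B-Instruct-2512-UD-Q4_K_XL-00001-of-00002.gguf \
    --jinja --temp 0.15 --min-p 0.01 --ctx-size 16384 --n-gpu-layers 99
```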

🦥 Fine-tuning Devstral 2 with Unsloth

Just like Ministral 3, Unsloth supports Devstral 2 fine-tuning. Training is 2x faster, uses 70% less VRAM, and supports 8x longer context lengths. Devstral 2 fits comfortably on a 24GB VRAM L4 GPU.

Unfortunately, Devstral 2 slightly exceeds the memory limits of a 16GB VRAM GPU, so fine-tuning it for free on Google Colab isn't possible for now. However, you can fine-tune the model for free using our Kaggle notebook, which offers access to dual GPUs. Just change the notebook's Magistral model name to the unsloth/Devstral-Small-2-24B-Instruct-2512 model.

  • Ministral-3B-Instruct Vision notebook (vision) (Change model name to Devstral 2)

  • Ministral-3B-Instruct GRPO notebook (Change model name to Devstral 2)

Devstral Vision finetuning notebook

Devstral Sudoku GRPO RL notebook
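
As a minimal sketch of the model-name swap those notebooks need, loading Devstral 2 in Unsloth might look like the below; the LoRA hyperparameters are illustrative, not the notebooks' exact values.

```python
from unsloth import FastLanguageModel

# Load Devstral 2 (24B) in 4-bit for QLoRA fine-tuning
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Devstral-Small-2-24B-Instruct-2512",
    max_seq_length=8192,   # raise if you have spare VRAM
    load_in_4bit=True,     # 4-bit QLoRA keeps the 24B model inside ~24GB VRAM
)

# Attach LoRA adapters before training
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```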

😎Llama-server serving & deployment

To deploy Devstral 2 for production, we use llama-server. In a new terminal (for example inside tmux), deploy the model via:
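
For example, a sketch using the 24B quant (swap in the 123B repo if you have the memory for it):

```bash
# Serve an OpenAI-compatible API (llama-server listens on port 8080 by default)
./llama.cpp/llama-server \
    -hf unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF:UD-Q4_K_XL \
    --jinja --temp 0.15 --min-p 0.01 \
    --ctx-size 16384 \
    --n-gpu-layers 99 \
    --host 0.0.0.0
```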

When you run the above, llama-server will start up and listen for OpenAI-compatible requests (on port 8080 by default).

Then in a new terminal, after doing pip install openai, do:
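
A minimal sketch; the "2+2" question is an assumption chosen to match the expected answer of 4.

```python
from openai import OpenAI

# llama-server exposes an OpenAI-compatible API; the API key can be any placeholder
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="sk-no-key-required")

response = client.chat.completions.create(
    model="devstral-2",  # llama-server serves whichever model it loaded; the name is just a label
    messages=[{"role": "user", "content": "What is 2+2? Answer with just the number."}],
    temperature=0.15,
)
print(response.choices[0].message.content)
```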

Which will simply print 4.

🧰Tool Calling with Devstral 2 Tutorial

After following Llama-server serving & deployment, we can then load up some tools and see Devstral in action! Let's make some tools - copy, paste, and execute them in Python.
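
The original tools aren't shown here, so the below is a sketch with two hypothetical stand-in tools and their OpenAI-style schemas.

```python
# Hypothetical stand-in tools; replace with whatever functions you want to expose
def get_current_weather(city: str) -> str:
    """Toy weather lookup; a real tool would call a weather API."""
    return f"The weather in {city} is sunny and 22C."

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

# JSON schemas the model sees, in the OpenAI tools format
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "multiply",
            "description": "Multiply two numbers",
            "parameters": {
                "type": "object",
                "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
                "required": ["a", "b"],
            },
        },
    },
]

# Lookup table used later when executing the model's tool calls
available_functions = {"get_current_weather": get_current_weather, "multiply": multiply}
```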

We then ask a simple question from a random list of possible messages to test the model:
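
For instance, a sketch that reuses the client from the serving section and the tools defined above; the candidate questions are illustrative.

```python
import random

# A few candidate test questions; one is picked at random
possible_messages = [
    "What's the weather like in Paris right now?",
    "What is 12.5 multiplied by 8?",
    "Check the weather in Tokyo and multiply 3 by 7.",
]

messages = [{"role": "user", "content": random.choice(possible_messages)}]
response = client.chat.completions.create(
    model="devstral-2",
    messages=messages,
    tools=tools,
    temperature=0.15,
)
```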

We then use the below functions (copy, paste, and execute), which will parse the function calls automatically - Devstral 2 might make multiple calls in tandem!
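
A sketch of such a parser, again assuming the hypothetical tools above; the real notebook's helper functions may differ.

```python
import json

def execute_tool_calls(response, messages):
    """Run every tool call in the response and append the results to the chat."""
    message = response.choices[0].message
    messages.append(message.model_dump(exclude_none=True))
    for tool_call in message.tool_calls or []:
        fn = available_functions[tool_call.function.name]
        args = json.loads(tool_call.function.arguments)
        result = fn(**args)  # Devstral 2 may emit several calls in tandem
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": str(result),
        })
    return messages

messages = execute_tool_calls(response, messages)

# Send the tool results back so the model can write its final answer
final = client.chat.completions.create(model="devstral-2", messages=messages, temperature=0.15)
print(final.choices[0].message.content)
```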

And after 1 minute, we get:

Or in JSON form:
