💧 Liquid LFM2.5: How To Run & Fine-tune

Run and fine-tune LFM2.5 Instruct and Vision locally on your device!

Liquid AI has released LFM2.5, including instruct and vision models. LFM2.5-1.2B-Instruct is a 1.17B-parameter hybrid reasoning model trained on 28T tokens and refined with RL, delivering best-in-class performance at the 1B scale for instruction following, tool use, and agentic tasks. See Hugging Face Jobs for using Codex to train LFM!

LFM2.5 runs in under 1 GB of RAM and achieves 239 tok/s decode on an AMD CPU. You can also fine-tune it locally with Unsloth.


Model Specifications:

  • Parameters: 1.17B

  • Architecture: 16 layers (10 double-gated LIV convolution blocks + 6 GQA blocks)

  • Training Budget: 28T tokens

  • Context Length: 32,768 tokens

  • Vocabulary Size: 65,536

  • Languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish

⚙️ Usage Guide

Liquid AI recommends these settings for inference:

  • temperature = 0.1

  • top_k = 50

  • top_p = 0.1

  • repetition_penalty = 1.05

  • Maximum context length: 32,768
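A minimal sketch of applying these settings with Hugging Face transformers (the repo ID and prompt are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Instruct"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize RAG in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Liquid AI's recommended sampling settings
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    top_p=0.1,
    repetition_penalty=1.05,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```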

Chat Template Format

LFM2.5 uses a ChatML-like format.

LFM2.5 chat template:
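A hedged sketch of the rendered format, using the ChatML-style tokens from the LFM2 family (the exact system text is illustrative):

```
<|startoftext|><|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
Paris is the capital of France.<|im_end|>
```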

Tool Use

LFM2.5 supports function calling with special tokens <|tool_call_start|> and <|tool_call_end|>. Provide tools as a JSON object in the system prompt:
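A hedged sketch of the flow, following the LFM2 family's documented convention (the <|tool_list_start|>/<|tool_list_end|> wrappers and the Pythonic call syntax are assumed to carry over; the weather tool is hypothetical):

```
<|im_start|>system
List of tools: <|tool_list_start|>[{"name": "get_weather",
"description": "Get the current weather for a city",
"parameters": {"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]}}]<|tool_list_end|><|im_end|>
<|im_start|>user
What's the weather in Paris?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_weather(city="Paris")]<|tool_call_end|><|im_end|>
```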

🖥️ Run LFM2.5-1.2B-Instruct

📖 llama.cpp Tutorial (GGUF)

1. Build llama.cpp

Obtain the latest llama.cpp from GitHub. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU.
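A sketch of the usual build recipe (targets and flags follow the pattern used across these guides):

```bash
apt-get update
apt-get install pciutils build-essential cmake curl libcurl4-openssl-dev -y
git clone https://github.com/ggml-org/llama.cpp
cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON
cmake --build llama.cpp/build --config Release -j --clean-first \
    --target llama-cli llama-server llama-mtmd-cli
cp llama.cpp/build/bin/llama-* llama.cpp
```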

2. Run directly from Hugging Face
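llama-cli can pull a GGUF straight from the Hub. The repo name and quant tag below are assumptions; substitute the upload you want:

```bash
./llama.cpp/llama-cli \
    -hf unsloth/LFM2.5-1.2B-Instruct-GGUF:Q4_K_XL \
    --jinja -ngl 99 \
    --temp 0.1 --top-k 50 --top-p 0.1 --repeat-penalty 1.05
```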

3. Or download the model first
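A sketch using huggingface_hub (repo name assumed, as above):

```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="unsloth/LFM2.5-1.2B-Instruct-GGUF",  # assumed repo name
    local_dir="LFM2.5-1.2B-Instruct-GGUF",
    allow_patterns=["*Q4_K_XL*"],  # pick the quant you want
)
```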

4. Run in conversation mode
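Point llama-cli at the downloaded file and enable conversation mode, carrying over the recommended sampling settings (the filename is an assumption; match the quant you downloaded):

```bash
./llama.cpp/llama-cli \
    --model LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q4_K_XL.gguf \
    --ctx-size 32768 \
    --temp 0.1 --top-k 50 --top-p 0.1 --repeat-penalty 1.05 \
    -cnv
```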

🦥 Fine-tuning LFM2.5 with Unsloth

Unsloth supports fine-tuning LFM2.5 models. The 1.2B model fits comfortably on a free Colab T4 GPU. Training is 2x faster with 50% less VRAM.

Free Colab Notebook:

LFM2.5 is recommended for agentic tasks, data extraction, RAG, and tool use. It is not recommended for knowledge-intensive tasks or programming.

Unsloth Config for LFM2.5
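A minimal sketch, assuming an Unsloth upload named unsloth/LFM2.5-1.2B-Instruct and standard LoRA targets:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/LFM2.5-1.2B-Instruct",  # assumed repo name
    max_seq_length=32768,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    # Usual attention/MLP projections; LFM2.5's convolution blocks may
    # use different module names, so adjust if Unsloth reports a mismatch.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)
```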

Training Setup
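A short SFT run with TRL, as a sketch (the dataset is illustrative and assumed to be pre-formatted into a "text" column with the chat template):

```python
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

dataset = load_dataset("mlabonne/FineTome-100k", split="train")  # example dataset

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=60,          # short demo run; raise for real training
        learning_rate=2e-4,
        logging_steps=1,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()
```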

Save and Export
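Typical Unsloth export paths, sketched (output directory names are arbitrary):

```python
# Save just the LoRA adapters
model.save_pretrained("lfm2.5-lora")
tokenizer.save_pretrained("lfm2.5-lora")

# Merge to 16-bit safetensors for deployment
model.save_pretrained_merged("lfm2.5-merged", tokenizer, save_method="merged_16bit")

# Export to GGUF for llama.cpp
model.save_pretrained_gguf("lfm2.5-gguf", tokenizer, quantization_method="q4_k_m")
```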

🎉 llama-server Serving & Deployment

To deploy LFM2.5 for production with an OpenAI-compatible API:
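A sketch of serving the GGUF with llama-server (filename and port are assumptions):

```bash
./llama.cpp/llama-server \
    --model LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q4_K_XL.gguf \
    --ctx-size 32768 \
    --jinja \
    --port 8001
```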

Test with OpenAI client:
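A minimal check against the server above:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8001/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="lfm2.5-1.2b-instruct",  # llama-server accepts any model name
    messages=[{"role": "user", "content": "Extract the city from: 'Ship to Berlin.'"}],
    temperature=0.1,
    top_p=0.1,
)
print(response.choices[0].message.content)
```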

📊 Benchmarks

LFM2.5-1.2B-Instruct delivers best-in-class performance at the 1B scale and offers fast CPU inference with low memory usage.

💧 Liquid LFM2.5-1.2B-VL Guide

LFM2.5-VL-1.6B is a vision LLM built on top of LFM2.5-1.2B-Base and tuned for stronger real-world performance. You can now fine-tune it locally with Unsloth.


Dynamic GGUF and 16-bit instruct uploads are available on Hugging Face.

Model Specifications:

  • LM Backbone: LFM2.5-1.2B-Base

  • Vision encoder: SigLIP2 NaFlex shape-optimized 400M

  • Context length: 32,768 tokens

  • Vocabulary size: 65,536

  • Languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish

  • Native resolution processing: Handles images up to 512×512 pixels without upscaling and preserves non-standard aspect ratios without distortion

  • Tiling strategy: Splits large images into non-overlapping 512×512 patches and includes thumbnail encoding for global context

  • Inference-time flexibility: User-tunable maximum image tokens and tile count for speed/quality tradeoff without retraining

⚙️ Usage Guide

Liquid AI recommends these settings for inference:

  • Text: temperature=0.1, min_p=0.15, repetition_penalty=1.05

  • Vision: min_image_tokens=64, max_image_tokens=256, do_image_splitting=True
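A minimal sketch of wiring these settings through the Hugging Face processor (the repo ID is assumed; min_image_tokens, max_image_tokens, and do_image_splitting are processor arguments on the LFM2-VL family, which this assumes carries over):

```python
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "LiquidAI/LFM2.5-VL-1.6B"  # assumed repo ID
processor = AutoProcessor.from_pretrained(
    model_id,
    min_image_tokens=64,
    max_image_tokens=256,
    do_image_splitting=True,
)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/receipt.png"},
        {"type": "text", "text": "What is the total on this receipt?"},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

# Liquid AI's recommended text settings
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True,
                         temperature=0.1, min_p=0.15, repetition_penalty=1.05)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```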

Chat Template Format

LFM2.5-VL uses a ChatML-like format.

LFM2.5-VL chat template:
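Rather than hardcoding the special tokens, you can print the rendered template from the processor itself (reusing the processor from the sketch above):

```python
# Inspect the rendered chat template without tokenizing
demo = [{
    "role": "user",
    "content": [{"type": "image"},
                {"type": "text", "text": "Describe this image."}],
}]
print(processor.apply_chat_template(demo, add_generation_prompt=True, tokenize=False))
```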

🖥️ Run LFM2.5-VL-1.6B

📖 llama.cpp Tutorial (GGUF)

1. Build llama.cpp

Obtain the latest llama.cpp from GitHub. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU.
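The same recipe as the instruct build applies; make sure llama-mtmd-cli is among the targets, since vision GGUFs run through the multimodal CLI:

```bash
git clone https://github.com/ggml-org/llama.cpp
cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON
cmake --build llama.cpp/build --config Release -j \
    --target llama-mtmd-cli llama-server
cp llama.cpp/build/bin/llama-* llama.cpp
```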

2. Run directly from Hugging Face
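A sketch with llama-mtmd-cli (repo name and quant tag are assumptions; the mmproj vision projector is fetched alongside the GGUF):

```bash
./llama.cpp/llama-mtmd-cli \
    -hf unsloth/LFM2.5-VL-1.6B-GGUF:Q4_K_XL \
    --image your_image.png \
    -p "Describe this image." \
    --temp 0.1 --min-p 0.15 --repeat-penalty 1.05
```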

🦥 Fine-tuning LFM2.5-VL with Unsloth

Unsloth supports fine-tuning LFM2.5 models. The 1.6B model fits comfortably on a free Colab T4 GPU. Training is 2x faster with 50% less VRAM.

Free Colab Notebook:

Unsloth Config for LFM2.5
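A minimal sketch with FastVisionModel (repo name assumed):

```python
from unsloth import FastVisionModel

model, processor = FastVisionModel.from_pretrained(
    "unsloth/LFM2.5-VL-1.6B",  # assumed repo name
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)

model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,     # LoRA on the vision encoder too
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    r=16,
    lora_alpha=16,
)
```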

Training Setup
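A vision SFT sketch with TRL's trainer and Unsloth's vision collator (the dataset is assumed to hold chat-style "messages" with embedded images):

```python
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTConfig, SFTTrainer

FastVisionModel.for_training(model)  # switch to training mode

trainer = SFTTrainer(
    model=model,
    tokenizer=processor,
    data_collator=UnslothVisionDataCollator(model, processor),
    train_dataset=dataset,  # your image+text chat dataset
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=30,                  # short demo run
        learning_rate=2e-4,
        optim="adamw_8bit",
        output_dir="outputs",
        remove_unused_columns=False,   # keep image columns for the collator
        dataset_text_field="",
        dataset_kwargs={"skip_prepare_dataset": True},
    ),
)
trainer.train()
```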

Save and Export
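Export follows the text-model pattern (directory names are arbitrary):

```python
# Save the LoRA adapters
model.save_pretrained("lfm2.5-vl-lora")
processor.save_pretrained("lfm2.5-vl-lora")

# Merge to 16-bit safetensors for deployment
model.save_pretrained_merged("lfm2.5-vl-merged", processor, save_method="merged_16bit")
```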

📊 Benchmarks

LFM2.5-VL-1.6B delivers best-in-class performance:

| Model | MMStar | MM-IFEval | BLINK | InfoVQA (Val) | OCRBench (v2) | RealWorldQA | MMMU (Val) | MMMB (avg) | Multilingual MMBench (avg) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LFM2.5-VL-1.6B | 50.67 | 52.29 | 48.82 | 62.71 | 41.44 | 64.84 | 40.56 | 76.96 | 65.90 |
| LFM2-VL-1.6B | 49.87 | 46.35 | 44.50 | 58.35 | 35.11 | 65.75 | 39.67 | 72.13 | 60.57 |
| InternVL3.5-1B | 50.27 | 36.17 | 44.19 | 60.99 | 33.53 | 57.12 | 41.89 | 68.93 | 58.32 |
| FastVLM-1.5B | 53.13 | 24.99 | 43.29 | 23.92 | 26.61 | 61.56 | 38.78 | 64.84 | 50.89 |

📚 Resources
