Conversation

@zucchini-nlp
Member

What does this PR do?

Fixes #40964

In most vision models, `output.hidden_states` are the hidden states taken right after the encoder blocks, i.e. before the final layernorm. Therefore, for these models, `output.hidden_states[-1] != output.last_hidden_state`.

Currently, `check_model_inputs` assumes that the last hidden state is the correct one to return, which is true only for language models. This PR adds a kwarg to `check_model_inputs` that decides whether to replace the last hidden state or not.

TBH, I think the way it is done in LMs is the ultimately correct version, and we probably need to "break" vision models. But I can't think of another way to obtain the pre-norm last hidden states, which are needed for some VLMs.
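To illustrate the mismatch described above, here is a minimal, framework-free sketch (hypothetical names, not the actual transformers implementation) of a vision encoder where the final layernorm is applied only to the returned `last_hidden_state`, so the last collected hidden state stays pre-norm:

```python
def layernorm(x, eps=1e-5):
    # Layer normalization over a plain list of floats.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + eps) ** 0.5 for v in x]

def encoder_forward(x, blocks):
    """Collect hidden states after each block; norm only the final output."""
    hidden_states = [x]
    for block in blocks:
        x = block(x)
        hidden_states.append(x)
    # Post-norm output: what the vision model returns as last_hidden_state.
    last_hidden_state = layernorm(x)
    return hidden_states, last_hidden_state

# Two toy "encoder blocks": double every value, then add 1.
blocks = [lambda x: [v * 2 for v in x], lambda x: [v + 1 for v in x]]
hiddens, last = encoder_forward([1.0, 2.0, 3.0], blocks)
print(hiddens[-1])  # pre-norm: [3.0, 5.0, 7.0]
print(last)         # post-norm: normalized, so last != hiddens[-1]
```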

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp zucchini-nlp removed the request for review from ArthurZucker September 19, 2025 10:54
@jiqing-feng
Contributor

Hi @zucchini-nlp. Are you still working on this PR? Please let me know when it's ready; I'd like to verify it. Thanks!

@zucchini-nlp
Member Author

It should be working with VLMs; I just need to fix CI and code style. I'll get back to it after finishing one huge PR I'm currently working on.

Comment on lines 185 to 202
@unittest.skip(
reason="This architecture seems to not compute gradients for the last vision-layernorm because the model uses hidden states pre-norm"
)
def test_training_gradient_checkpointing(self):
pass

@unittest.skip(
reason="This architecture seems to not compute gradients for the last vision-layernorm because the model uses hidden states pre-norm"
)
def test_training_gradient_checkpointing_use_reentrant(self):
pass

@unittest.skip(
reason="This architecture seems to not compute gradients for the last vision-layernorm because the model uses hidden states pre-norm"
)
def test_training_gradient_checkpointing_use_reentrant_false(self):
pass

Member Author

Verified against both the previous version and the model as originally released: it never used vision features after the norm. So the test was probably re-activated at some point while we were returning incorrect hidden_states from the vision tower.

@zucchini-nlp
Member Author

I hate rebasing; it looks like new models were added to check_model_inputs. I will fix them soon.

Member

@Cyrilvallez Cyrilvallez left a comment


Nice! I would just like to change the name of the arg, i.e. I don't really understand what post_ln_hiddens means. Let's try to find something clearer, and let's also document what it is in the decorator definition!
Feel free to merge afterwards! 🤗
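For reference, a hypothetical sketch of how such a kwarg could be documented in the decorator definition. The decorator body, the kwarg name `replace_last_hidden_state`, and the dict-based outputs are illustrative placeholders, not the real transformers implementation:

```python
import functools

def check_model_inputs(replace_last_hidden_state=True):
    """Decorator for model forward methods (illustrative sketch only).

    Args:
        replace_last_hidden_state (bool):
            If True (language models), overwrite `last_hidden_state` with the
            final entry of `hidden_states`. If False (vision models), keep the
            post-layernorm `last_hidden_state` untouched, since vision
            `hidden_states` are collected pre-norm.
    """
    def decorator(forward):
        @functools.wraps(forward)
        def wrapper(*args, **kwargs):
            output = forward(*args, **kwargs)
            if replace_last_hidden_state and output.get("hidden_states"):
                output["last_hidden_state"] = output["hidden_states"][-1]
            return output
        return wrapper
    return decorator

# Usage: the vision path keeps its post-norm output, the LM path replaces it.
@check_model_inputs(replace_last_hidden_state=False)
def vision_forward():
    return {"hidden_states": [[1.0], [2.0]], "last_hidden_state": [9.0]}

@check_model_inputs(replace_last_hidden_state=True)
def lm_forward():
    return {"hidden_states": [[1.0], [2.0]], "last_hidden_state": [9.0]}
```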

Collaborator

@ArthurZucker ArthurZucker left a comment


LGTM, but yeah, let's rename the arg; it's not explicit!

@github-actions
Contributor

github-actions bot commented Oct 6, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: aimv2, albert, apertus, arcee, aria, audio_spectrogram_transformer, aya_vision, bert, bert_generation, bitnet, blip, blip_2, blt, camembert, cohere, cohere2

@zucchini-nlp zucchini-nlp merged commit 9db58ab into huggingface:main Oct 6, 2025
25 checks passed
AhnJoonSung pushed a commit to AhnJoonSung/transformers that referenced this pull request Oct 12, 2025
* update all models

* fix copies

* skip aria tests

* update other models

* skip should be in test, not tester

* i think this is more descriptive as a name

* find and replace for new models
@hmellor hmellor added the for patch Tag issues / labels that should be included in the next patch label Nov 12, 2025
Cyrilvallez pushed a commit that referenced this pull request Nov 24, 2025
* update all models

* fix copies

* skip aria tests

* update other models

* skip should be in test, not tester

* i think this is more descriptive as a name

* find and replace for new models