support mtp for gemma4 #1316

Open
WANDY666 wants to merge 61 commits into main from gemma4_mtp

Conversation

@WANDY666

Contributor

No description provided.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces comprehensive support for the Gemma-4 model family, including multimodal vision capabilities and Multi-Token Prediction (MTP) assistant models. Key technical additions include heterogeneous attention mechanisms for sliding window and full attention layers, tanh-approximate GELU activations in MoE kernels, and a specialized eagle_frozen_kv MTP mode. The implementation also features a new reasoning parser for Gemma-4's Harmony-like format and updates to various Triton kernels. Feedback on the code changes suggests adopting more idiomatic PyTorch advanced indexing for row selection in the MTP post-layer inference and improving robustness by replacing bare except blocks with except Exception in configuration utilities.
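For readers unfamiliar with the tanh-approximate GELU mentioned in the summary, here is a minimal pure-Python sketch of the standard formula next to the exact erf-based GELU. It is not the PR's Triton kernel; the function names `gelu_exact` and `gelu_tanh` are hypothetical and exist only for illustration.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (the usual 0.044715 cubic correction)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute gap between the two variants on a grid over [-6, 6].
max_gap = max(
    abs(gelu_exact(v / 10) - gelu_tanh(v / 10)) for v in range(-60, 61)
)
print(f"max |exact - tanh| on [-6, 6]: {max_gap:.2e}")
```

The two variants are close but not bit-identical, so a kernel presumably has to use the same variant the checkpoint was trained with. In PyTorch the tanh form is selected with `torch.nn.functional.gelu(x, approximate="tanh")`.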

Comment on lines +68 to +70
token_num, num_selected, H
)
# Sparse logits: dot product per token vs its selected rows.

medium

Using advanced indexing is more idiomatic and readable than index_select followed by a view when selecting rows from a weight matrix. PyTorch's advanced indexing handles this pattern efficiently.
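A small sketch of the equivalence this comment relies on. The shapes, the random data, and the `hidden` tensor are assumptions made for illustration; only `lm_head_w`, `selected_vocab`, `token_num`, `num_selected`, and `H` come from the diff, and the surrounding PR code may differ.

```python
import torch

# Assumed toy sizes; the real values come from the model config.
token_num, num_selected, H, vocab_size = 3, 4, 8, 16

lm_head_w = torch.randn(vocab_size, H)  # [vocab, H]
selected_vocab = torch.randint(0, vocab_size, (token_num, num_selected))

# Pattern flagged in the review: index_select needs a 1-D index,
# so the ids are flattened and the result is viewed back to 3-D.
via_index_select = lm_head_w.index_select(
    0, selected_vocab.reshape(-1)
).view(token_num, num_selected, H)

# Suggested pattern: advanced indexing with the 2-D id tensor
# yields the [token_num, num_selected, H] result directly.
via_indexing = lm_head_w[selected_vocab]

# Sparse logits: dot product per token vs its selected rows.
hidden = torch.randn(token_num, H)  # hypothetical hidden states
sparse_logits = torch.einsum("th,tkh->tk", hidden, via_indexing)
```

Both paths gather the same rows, so the change is about readability rather than results. Whether it affects performance in the actual kernel path would need to be measured.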

Suggested change
token_num, num_selected, H
)
# Sparse logits: dot product per token vs its selected rows.
selected_embeddings = lm_head_w[selected_vocab]

return [eos_token_id]
elif isinstance(eos_token_id, list):
return list(eos_token_id)
except:

medium

Avoid using bare except: as it can catch unexpected errors like KeyboardInterrupt or SystemExit, making debugging difficult. Use except Exception: instead.
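To illustrate the difference, here is a minimal sketch. The helper names (`swallows_everything`, `swallows_errors_only`, `request_exit`, `bad_config`) are hypothetical and are not taken from the PR.

```python
def swallows_everything(action):
    """Bare except: also traps SystemExit and KeyboardInterrupt."""
    try:
        action()
    except:  # noqa: E722 (deliberately bare for the demonstration)
        return "swallowed"
    return "ok"


def swallows_errors_only(action):
    """except Exception: traps ordinary errors, lets exit signals through."""
    try:
        action()
    except Exception:
        return "swallowed"
    return "ok"


def request_exit():
    raise SystemExit(1)


def bad_config():
    raise KeyError("eos_token_id")


print(swallows_everything(request_exit))  # the exit request is lost
print(swallows_errors_only(bad_config))   # ordinary error still handled

try:
    swallows_errors_only(request_exit)
except SystemExit:
    print("SystemExit propagated")
```

`SystemExit` and `KeyboardInterrupt` derive from `BaseException` rather than `Exception`, which is why the narrower clause lets them propagate while still handling a malformed config.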

Suggested change
except:
except Exception:

if model_type in ["gemma4"]:
logger.info("Gemma4 uses tanh-approximate-gelu for FFN")
return True
except:

medium

Avoid using bare except: as it can catch unexpected errors. Use except Exception: instead to follow best practices for error handling.

Suggested change
except:
except Exception:
Labels

None yet

3 participants