support mtp for gemma4 #1316

Open
WANDY666 wants to merge 60 commits into main from gemma4_mtp


@WANDY666
Contributor

No description provided.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces comprehensive support for the Gemma-4 model family, including multimodal vision capabilities and Multi-Token Prediction (MTP) assistant models. Key technical additions include heterogeneous attention mechanisms for sliding window and full attention layers, tanh-approximate GELU activations in MoE kernels, and a specialized eagle_frozen_kv MTP mode. The implementation also features a new reasoning parser for Gemma-4's Harmony-like format and updates to various Triton kernels. Feedback on the code changes suggests adopting more idiomatic PyTorch advanced indexing for row selection in the MTP post-layer inference and improving robustness by replacing bare except blocks with except Exception in configuration utilities.

Comment on lines +68 to +70
token_num, num_selected, H
)
# Sparse logits: dot product per token vs its selected rows.

medium

Using advanced indexing is more idiomatic and readable than index_select followed by a view when selecting rows from a weight matrix. PyTorch's advanced indexing handles this pattern efficiently.

Suggested change
- token_num, num_selected, H
- )
- # Sparse logits: dot product per token vs its selected rows.
+ selected_embeddings = lm_head_w[selected_vocab]
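To make the suggestion above concrete, here is a minimal sketch comparing the two row-selection patterns. It is not the PR's code: the sizes, the `hidden` tensor, and the assumption that `selected_vocab` is a 2-D integer tensor of shape `[token_num, num_selected]` are all hypothetical and chosen only for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
token_num, num_selected, H, vocab_size = 3, 4, 8, 32

lm_head_w = torch.randn(vocab_size, H)   # [vocab_size, H]
hidden = torch.randn(token_num, H)       # [token_num, H], assumed input
# Assumed layout: one row of candidate vocab ids per token (int64).
selected_vocab = torch.randint(0, vocab_size, (token_num, num_selected))

# Pattern described in the comment: flat index_select followed by a view.
via_index_select = torch.index_select(
    lm_head_w, 0, selected_vocab.reshape(-1)
).view(token_num, num_selected, H)

# Suggested pattern: advanced indexing with the 2-D index tensor directly.
selected_embeddings = lm_head_w[selected_vocab]  # [token_num, num_selected, H]

# Sparse logits: dot product per token vs its selected rows.
sparse_logits = torch.einsum("th,tkh->tk", hidden, selected_embeddings)

# Reference: compute dense logits, then gather the selected columns.
dense_logits = hidden @ lm_head_w.T
reference = torch.gather(dense_logits, 1, selected_vocab)
```

Under these assumptions both patterns return the same `[token_num, num_selected, H]` tensor, so the choice is mainly about readability. Whether either form is faster inside the actual MTP post-layer would need to be measured there.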

        return [eos_token_id]
    elif isinstance(eos_token_id, list):
        return list(eos_token_id)
except:

medium

Avoid using bare except: as it can catch unexpected errors like KeyboardInterrupt or SystemExit, making debugging difficult. Use except Exception: instead.

Suggested change
- except:
+ except Exception:
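The difference can be shown with a small self-contained sketch. The function name `normalize_eos_token_id`, the dict-style config, and the `InterruptingConfig` class are hypothetical stand-ins, not the PR's actual utilities. The sketch relies only on the fact that `KeyboardInterrupt` and `SystemExit` derive from `BaseException` rather than `Exception`.

```python
def normalize_eos_token_id(config):
    """Return eos_token_id as a list, or [] if it cannot be read."""
    try:
        eos_token_id = config["eos_token_id"]
        if isinstance(eos_token_id, int):
            return [eos_token_id]
        elif isinstance(eos_token_id, list):
            return list(eos_token_id)
    except Exception:
        # Ordinary failures (KeyError, TypeError, ...) are swallowed here.
        pass
    return []


class InterruptingConfig(dict):
    """Hypothetical config whose lookup simulates Ctrl-C arriving mid-read."""

    def __getitem__(self, key):
        raise KeyboardInterrupt


def interrupt_propagates():
    """True if KeyboardInterrupt escapes the `except Exception` handler."""
    try:
        normalize_eos_token_id(InterruptingConfig())
    except KeyboardInterrupt:
        return True
    return False
```

With `except Exception:`, a missing key still falls back to the default, while an interrupt reaches the caller. With a bare `except:`, the interrupt would be swallowed as well.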

    if model_type in ["gemma4"]:
        logger.info("Gemma4 uses tanh-approximate-gelu for FFN")
        return True
except:

medium

Avoid using bare except: as it can catch unexpected errors. Use except Exception: instead to follow best practices for error handling.

Suggested change
- except:
+ except Exception:
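For context on the log message in the snippet above, the following sketch shows what "tanh-approximate GELU" usually refers to: the standard tanh approximation of GELU, compared against the exact erf-based form. This is a scalar illustration in plain Python, not the PR's Triton kernel. The function names are hypothetical.

```python
import math


def gelu_tanh(x: float) -> float:
    """Standard tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_exact(x: float) -> float:
    """Exact GELU, x * Phi(x), written with the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Largest absolute gap between the two forms on a grid over [-6, 6].
max_gap = max(
    abs(gelu_tanh(i / 100) - gelu_exact(i / 100)) for i in range(-600, 601)
)
```

The two forms agree closely but not exactly, which is presumably why the configuration utility needs to report which variant a model expects.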
