Change dump output format to dict with value and metadata #18879

fzyzcjy merged 3 commits into sgl-project:main
Conversation
Summary of Changes

Hello @fzyzcjy, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request refactors the debugging dumper utility to standardize its output format and improve its configurability. Dumped data is now stored as a dictionary containing both the actual value and associated metadata, making the debugging information more comprehensive. The dumper's initialization has been made more explicit and robust, and helper functions for environment variable parsing have been introduced. These changes improve the dumper's utility, maintainability, and testability.
Code Review
This pull request refactors the dumper to save output as a dictionary containing both the value and metadata, which is a good improvement for enriching the dumped data. The changes are well-implemented across the dumper, loader, and comparator. The refactoring of the _Dumper class to separate configuration from instantiation is a nice improvement for testability. The test suite has been updated thoroughly to reflect these changes and to verify the new output format. I have one minor suggestion regarding a type hint in python/sglang/srt/environ.py.
python/sglang/srt/environ.py
Outdated
```python
def temp_set_env(*, allow_sglang: bool = False, **env_vars: dict[str, Any]):
    for key in env_vars:
        if key.startswith("SGLANG_") or key.startswith("SGL_"):
            raise ValueError("temp_set_env should not be used for sglang env vars")
```
The type hint for **env_vars is incorrect. The syntax **kwargs: dict[str, Any] is not the standard way to type keyword arguments. The type hint should specify the type of the values of the keyword arguments. Since the values are converted to strings later with str(value), Any would be an appropriate type here.
```diff
-def temp_set_env(*, allow_sglang: bool = False, **env_vars: dict[str, Any]):
+def temp_set_env(*, allow_sglang: bool = False, **env_vars: Any):
```
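To see why `**env_vars: Any` is the right annotation, here is a minimal sketch of a `temp_set_env`-style context manager. The body is an illustrative assumption, not the PR's actual implementation; only the signature and the sglang-prefix guard come from the snippet above. Note that `**env_vars: Any` types each keyword *value* as `Any`, which matches the later `str(value)` conversion, whereas `dict[str, Any]` would wrongly claim each value is itself a dict.

```python
import os
from contextlib import contextmanager
from typing import Any

@contextmanager
def temp_set_env(*, allow_sglang: bool = False, **env_vars: Any):
    # Guard against accidentally overriding sglang's own configuration,
    # unless the caller explicitly opts in.
    if not allow_sglang:
        for key in env_vars:
            if key.startswith("SGLANG_") or key.startswith("SGL_"):
                raise ValueError("temp_set_env should not be used for sglang env vars")
    # Remember the previous values so they can be restored on exit.
    saved = {key: os.environ.get(key) for key in env_vars}
    try:
        for key, value in env_vars.items():
            os.environ[key] = str(value)  # values may be any type; stringify here
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old
```

Used as `with temp_set_env(MY_VAR=123): ...`, the variable is set to `"123"` inside the block and restored afterwards.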
Force-pushed from 44b7ac5 to a6ad5b0 (compare)
…et_env for distributed tests

- Extract _Dumper constructor to accept explicit keyword params instead of reading env vars directly
- Add from_env() classmethod as the singleton factory
- Add allow_sglang parameter to temp_set_env for SGLANG_DUMPER_* env vars
- Replace _reload_dumper hack with temp_set_env context manager in all distributed tests
Force-pushed from a6ad5b0 to c2285b8 (compare)
@gemini-code-assist review
Code Review
The pull request successfully transitions the dump output format to a dictionary containing both the value and metadata, which significantly improves the utility of the debug dumps. The refactoring of the _Dumper class to use environment variable helpers and the updates to the test suite are well-implemented. However, a regression was identified in the _torch_save fallback logic for non-pickleable torch.nn.Parameter objects, which needs to be addressed to ensure that dumps do not fail for these types.
```python
output_data = {
    "value": value,
    "meta": dict(**full_kwargs),
}
_torch_save(output_data, str(path))
```
The fallback logic in _torch_save (lines 174-184) is now broken for the dump call path. _torch_save expects its first argument to be a torch.nn.Parameter to handle pickling errors by falling back to .data. However, it is now being passed a dictionary output_data. If the tensor being dumped is a non-pickleable Parameter subclass, torch.save(output_data, path) will fail, but the isinstance(value, torch.nn.Parameter) check inside _torch_save will be false because value is now the dictionary wrapper. It is recommended to handle the Parameter conversion before wrapping it in the dictionary.
```diff
-output_data = {
-    "value": value,
-    "meta": dict(**full_kwargs),
-}
-_torch_save(output_data, str(path))
+output_data = {
+    "value": value.data if isinstance(value, torch.nn.Parameter) else value,
+    "meta": full_kwargs,
+}
+_torch_save(output_data, str(path))
```
Save each dump as {"value": tensor, "meta": {name, rank, ...}} instead
of just the raw tensor. This co-locates metadata with the data and
lets downstream tools (dump_comparator, dump_loader) access context
without parsing filenames.
Update dump_comparator._load_object and DumpLoader.load to extract
the value from the new dict format while staying backward-compatible
with raw tensor files.
Force-pushed from c2285b8 to 9159018 (compare)
No description provided.