[AutoDeploy] Merge Feature Branch Week 3 #5054
* add an arg to load checkpoint on cpu
  Signed-off-by: Frida Hou <[email protected]>
* fix pre-commit
  Signed-off-by: Frida Hou <[email protected]>
* remove ckpt-device from factory
  Signed-off-by: Frida Hou <[email protected]>
---------
Signed-off-by: Frida Hou <[email protected]>
* example of inductor pattern matcher for RoPE with explicit cos/sin matcher
  Signed-off-by: Frida Hou <[email protected]>
* move to utils
  Signed-off-by: Frida Hou <[email protected]>
* add usage of scalar_workaround, support op_ignore_type
  Signed-off-by: Ubuntu <[email protected]>
* minor
  Signed-off-by: Ubuntu <[email protected]>
* update all 3 types of RoPE matcher to use inductor pattern matcher
  Signed-off-by: Frida Hou <[email protected]>
* address feedback and refine code/doc
  Signed-off-by: Frida Hou <[email protected]>
* minor
  Signed-off-by: Ubuntu <[email protected]>
* fix 2e2 for llama4 and ds rope, remove legalize_graph in canonicalize_graph, update ds rope impl to match with the exported graph
  Signed-off-by: Frida Hou <[email protected]>
* deprecate previous rope matcher
  Signed-off-by: Ubuntu <[email protected]>
---------
Signed-off-by: Frida Hou <[email protected]>
Signed-off-by: Ubuntu <[email protected]>
Pull Request Overview
This PR merges the Week 3 auto-deploy feature branch, unifying RoPE pattern handling, extending the test harness, and adding a `checkpoint_device` option across user APIs.
- Replace separate `match_explicit_rope`/`match_complex_rope` calls with a single `match_rope_pattern` that returns match counts
- Extend `run_test` to accept and verify `check_num_matches` and update all tests accordingly
- Introduce `checkpoint_device` in CLI args, transformation pipeline, example configs, and build scripts
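The first two points can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the graph is a toy list of op names rather than an FX `GraphModule`, and `match_rope_pattern_sketch`, `run_test_sketch`, and the op names are hypothetical stand-ins. It only shows the idea of a transform that reports how many patterns it replaced and a test helper that optionally checks that count.

```python
from typing import Callable, List, Optional, Tuple

# Toy stand-in for an FX graph: just a list of op names (assumption for illustration).
Graph = List[str]


def match_rope_pattern_sketch(graph: Graph) -> Tuple[Graph, int]:
    """Rewrite every hypothetical 'rope_explicit' op and report the match count."""
    num_matches = sum(1 for op in graph if op == "rope_explicit")
    rewritten = ["rope_fused" if op == "rope_explicit" else op for op in graph]
    return rewritten, num_matches


def run_test_sketch(
    graph: Graph,
    transform: Callable[[Graph], Tuple[Graph, int]],
    check_num_matches: Optional[int] = None,
) -> Graph:
    """Apply the transform; if an expected count is given, verify it."""
    rewritten, num_matches = transform(graph)
    if check_num_matches is not None:
        assert num_matches == check_num_matches, (
            f"expected {check_num_matches} matches, got {num_matches}"
        )
    return rewritten


# Usage: two explicit RoPE ops are expected to be matched and replaced.
result = run_test_sketch(
    ["linear", "rope_explicit", "rope_explicit", "matmul"],
    match_rope_pattern_sketch,
    check_num_matches=2,
)
```

Passing `check_num_matches=None` skips the count check, which mirrors how the quantization tests in this PR pass `None` placeholders for the new parameter.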
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
tests/unittest/_torch/auto_deploy/unit/singlegpu/transformations/library/test_rope_transformation.py | Switch to match_rope_pattern, consolidate test branches, and pass expected match counts |
tests/unittest/_torch/auto_deploy/unit/singlegpu/transformations/library/test_quantization.py | Add None placeholders for new dynamic_shapes and check_num_matches parameters |
tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py | Split shared freqs into freqs_q/freqs_k for clarity in apply_rotary_pos_emb_complex |
tests/unittest/_torch/auto_deploy/_utils_test/_graph_test_helpers.py | Extend run_test signature with check_num_matches and branch on it |
tensorrt_llm/llmapi/llm_args.py | Add optional checkpoint_device argument to LLM CLI args |
tensorrt_llm/_torch/auto_deploy/utils/pattern_matcher.py | New FX pattern-matcher utilities for registering and applying Inductor patterns |
tensorrt_llm/_torch/auto_deploy/transformations/transform.py | Replace legacy RoPE match calls with match_rope_pattern, respect checkpoint_device on load |
tensorrt_llm/_torch/auto_deploy/transformations/_graph.py | Remove obsolete legalize_graph invocation during GraphModule cleanup |
examples/auto_deploy/simple_config.py | Introduce checkpoint_device field in example config |
examples/auto_deploy/build_and_run_ad.py | Forward checkpoint_device from config into the deployment builder |
Comments suppressed due to low confidence (2)
tests/unittest/_torch/auto_deploy/_utils_test/_graph_test_helpers.py:35
- The docstring for `run_test` doesn’t mention the new `check_num_matches` parameter; please update the header comment to explain this argument.
def run_test(
examples/auto_deploy/simple_config.py:29
- `Optional` is not imported in this file; add `from typing import Optional` to avoid a `NameError`.
checkpoint_device: Optional[str] = None # Device on which to load the model checkpoint
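For reference, the second suppressed comment amounts to something like the sketch below. `SimpleConfigSketch` is a hypothetical stand-in (the real class in `simple_config.py` may be defined differently); only the `checkpoint_device` field and its comment come from the PR. Without the `typing` import, the annotation would be evaluated when the class body runs and raise a `NameError`, unless annotation evaluation is postponed.

```python
from dataclasses import dataclass
from typing import Optional  # the import the review comment asks for


@dataclass
class SimpleConfigSketch:
    # Hypothetical config class; only this field is taken from the PR.
    checkpoint_device: Optional[str] = None  # Device on which to load the model checkpoint


# Default leaves the device unset; callers may opt in to CPU loading.
config = SimpleConfigSketch()
cpu_config = SimpleConfigSketch(checkpoint_device="cpu")
```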
/bot run --disable-fail-fast
PR_Github #8165 [ run ] triggered by Bot
PR_Github #8165 [ run ] completed with state
/bot skip --comment "all relevant tests passed in pipeline #8165"
PR_Github #8328 [ skip ] triggered by Bot
PR_Github #8328 [ skip ] completed with state
Signed-off-by: Frida Hou <[email protected]>
Merging the Week 3 feature branch. See the individual PRs and the Copilot summary for more information.