
[AutoDeploy] Merge Feature Branch Week 3 #5054


Merged 3 commits into NVIDIA:main on Jun 10, 2025

Conversation

@lucaslie (Member) commented Jun 9, 2025

Merging the Week 3 feature branch. See the individual PRs and the Copilot summary for more details.

Fridah-nv added 2 commits June 9, 2025 13:55
* add an arg to load checkpoint on cpu

Signed-off-by: Frida Hou <[email protected]>

* fix pre-commit

Signed-off-by: Frida Hou <[email protected]>

* remove ckpt-device from factory

Signed-off-by: Frida Hou <[email protected]>

---------

Signed-off-by: Frida Hou <[email protected]>
* example of inductor pattern matcher for RoPE with explicit cos/sin matcher

Signed-off-by: Frida Hou <[email protected]>

* move to utils

Signed-off-by: Frida Hou <[email protected]>

* add usage of scalar_workaround, support op_ignore_type

Signed-off-by: Ubuntu <[email protected]>

* minor

Signed-off-by: Ubuntu <[email protected]>

* update all 3 types of RoPE matcher to use inductor pattern matcher

Signed-off-by: Frida Hou <[email protected]>

* address feedback and refine code/doc

Signed-off-by: Frida Hou <[email protected]>

* minor

Signed-off-by: Ubuntu <[email protected]>

* fix e2e for llama4 and ds rope, remove legalize_graph in canonicalize_graph, update ds rope impl to match the exported graph

Signed-off-by: Frida Hou <[email protected]>

* deprecate previous rope matcher

Signed-off-by: Ubuntu <[email protected]>

---------

Signed-off-by: Frida Hou <[email protected]>
Signed-off-by: Ubuntu <[email protected]>
@lucaslie lucaslie self-assigned this Jun 9, 2025
@lucaslie lucaslie requested a review from a team as a code owner June 9, 2025 22:49
@lucaslie lucaslie requested a review from suyoggupta June 9, 2025 22:49
@lucaslie lucaslie requested review from Copilot and suyoggupta and removed request for suyoggupta June 9, 2025 22:49
@lucaslie lucaslie enabled auto-merge (squash) June 9, 2025 22:51
@Copilot (Contributor) left a comment


Pull Request Overview

This PR merges the Week 3 auto-deploy feature branch, unifying RoPE pattern handling, extending the test harness, and adding a checkpoint_device option across user APIs.

  • Replace separate match_explicit_rope/match_complex_rope calls with a single match_rope_pattern that returns match counts
  • Extend run_test to accept and verify check_num_matches and update all tests accordingly
  • Introduce checkpoint_device in CLI args, transformation pipeline, example configs, and build scripts
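All three RoPE matchers target variants of the same underlying rotation. As a rough illustration (plain Python, not the repository's implementation), an explicit cos/sin RoPE rotates each adjacent pair of features by a position-dependent angle:

```python
import math

def apply_rope_explicit(x, position, base=10000.0):
    """Apply rotary position embedding to one token vector using explicit
    cos/sin terms; a simplified stand-in for the subgraph the matcher targets.

    x: flat list of floats with even length, treated as (x0, x1) pairs.
    """
    dim = len(x)
    out = []
    for i in range(0, dim, 2):
        # Each pair is rotated by an angle that depends on the token position
        # and the pair index (lower pairs rotate faster).
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x0, x1 = x[i], x[i + 1]
        out.append(x0 * cos_t - x1 * sin_t)
        out.append(x0 * sin_t + x1 * cos_t)
    return out

# Position 0 leaves the vector unchanged (cos=1, sin=0).
print(apply_rope_explicit([1.0, 2.0, 3.0, 4.0], position=0))
```

Because this chain of mul/sub/add ops appears verbatim in exported graphs, it is a natural target for a graph pattern matcher.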

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Summary per file:

  • tests/unittest/_torch/auto_deploy/unit/singlegpu/transformations/library/test_rope_transformation.py: switch to match_rope_pattern, consolidate test branches, and pass expected match counts
  • tests/unittest/_torch/auto_deploy/unit/singlegpu/transformations/library/test_quantization.py: add None placeholders for the new dynamic_shapes and check_num_matches parameters
  • tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py: split shared freqs into freqs_q/freqs_k for clarity in apply_rotary_pos_emb_complex
  • tests/unittest/_torch/auto_deploy/_utils_test/_graph_test_helpers.py: extend the run_test signature with check_num_matches and branch on it
  • tensorrt_llm/llmapi/llm_args.py: add an optional checkpoint_device argument to the LLM CLI args
  • tensorrt_llm/_torch/auto_deploy/utils/pattern_matcher.py: new FX pattern-matcher utilities for registering and applying inductor patterns
  • tensorrt_llm/_torch/auto_deploy/transformations/transform.py: replace legacy RoPE match calls with match_rope_pattern; respect checkpoint_device on load
  • tensorrt_llm/_torch/auto_deploy/transformations/_graph.py: remove the obsolete legalize_graph invocation during GraphModule cleanup
  • examples/auto_deploy/simple_config.py: introduce a checkpoint_device field in the example config
  • examples/auto_deploy/build_and_run_ad.py: forward checkpoint_device from the config into the deployment builder
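The shape of a pattern-matcher utility that reports match counts (as match_rope_pattern does) can be sketched without any framework. Everything below is a hypothetical toy stand-in, not the tensorrt_llm API: patterns are op-name sequences and a graph is a flat op list.

```python
from typing import List, Tuple

class PatternRegistry:
    """Toy stand-in for graph pattern-matcher utilities: each registered
    pattern is a sequence of op names plus a fused replacement op name."""

    def __init__(self) -> None:
        self._patterns: List[Tuple[List[str], str]] = []

    def register(self, pattern: List[str], replacement: str) -> None:
        self._patterns.append((pattern, replacement))

    def apply(self, ops: List[str]) -> Tuple[List[str], int]:
        """Rewrite every non-overlapping occurrence of a registered pattern;
        return the new op list and the number of matches, mirroring a matcher
        that reports how many sites it fused."""
        num_matches = 0
        out: List[str] = []
        i = 0
        while i < len(ops):
            for pattern, replacement in self._patterns:
                if ops[i:i + len(pattern)] == pattern:
                    out.append(replacement)
                    i += len(pattern)
                    num_matches += 1
                    break
            else:
                out.append(ops[i])
                i += 1
        return out, num_matches

registry = PatternRegistry()
# Hypothetical RoPE subgraph: an explicit cos/sin multiply-add chain fused to one op.
registry.register(["mul_cos", "mul_sin", "sub", "add"], "rope")
ops, n = registry.apply(["embed", "mul_cos", "mul_sin", "sub", "add", "matmul"])
print(ops, n)  # ['embed', 'rope', 'matmul'] 1
```

Returning the match count is what lets tests assert that a transformation fired exactly the expected number of times instead of merely checking the output graph.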
Comments suppressed due to low confidence (2)

tests/unittest/_torch/auto_deploy/_utils_test/_graph_test_helpers.py:35

  • The docstring for run_test doesn’t mention the new check_num_matches parameter—please update the header comment to explain this argument.
def run_test(
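A hedged sketch of what the updated helper and docstring might look like; this is a simplified stand-in for the actual test harness, with the transform modeled as a callable over a toy op list:

```python
from typing import Callable, List, Optional, Tuple

def run_test(
    transform: Callable[[List[str]], Tuple[List[str], int]],
    graph_ops: List[str],
    check_num_matches: Optional[int] = None,
) -> List[str]:
    """Run a graph transformation and verify its result.

    Args:
        transform: callable returning (transformed_ops, num_matches).
        graph_ops: toy op list standing in for a traced graph.
        check_num_matches: if given, assert the transformation matched
            exactly this many pattern sites (the new parameter flagged
            in review).
    """
    transformed, num_matches = transform(graph_ops)
    if check_num_matches is not None:
        assert num_matches == check_num_matches, (
            f"expected {check_num_matches} matches, got {num_matches}"
        )
    return transformed

# Usage: a transform that fuses nothing reports zero matches.
identity = lambda ops: (ops, 0)
print(run_test(identity, ["matmul"], check_num_matches=0))  # ['matmul']
```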

examples/auto_deploy/simple_config.py:29

  • Optional is not imported in this file; add from typing import Optional to avoid a NameError.
checkpoint_device: Optional[str] = None  # Device on which to load the model checkpoint
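The fix the reviewer suggests might look like the fragment below. Only the checkpoint_device field is taken from the diff; the surrounding dataclass is illustrative, not the actual simple_config.py:

```python
from dataclasses import dataclass
from typing import Optional  # the import the review comment says is missing

@dataclass
class SimpleConfig:
    """Illustrative config fragment; only checkpoint_device is from the PR."""

    # Device on which to load the model checkpoint; None keeps the default
    # placement, while e.g. "cpu" forces loading on host memory first.
    checkpoint_device: Optional[str] = None

cfg = SimpleConfig()
print(cfg.checkpoint_device)  # None until the user overrides it, e.g. "cpu"
```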

@lucaslie (Member, Author) commented Jun 9, 2025

/bot run --disable-fail-fast

@tensorrt-cicd (Collaborator)

PR_Github #8165 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #8165 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #5917 completed with status: 'FAILURE'

@lucaslie (Member, Author)

/bot skip --comment "all relevant tests passed in pipeline #8165"

@tensorrt-cicd (Collaborator)

PR_Github #8328 [ skip ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #8328 [ skip ] completed with state SUCCESS
Skipping testing for commit 1c88c0a

@lucaslie lucaslie merged commit 7ddc4d6 into NVIDIA:main Jun 10, 2025
3 checks passed
yunruis pushed a commit to yunruis/TensorRT-LLM that referenced this pull request Jun 12, 2025
4 participants