ModAI changes to export xnnpack delegated non_lowered_server_model #10989
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10989.
No failures as of commit 0ad1f99 with merge base b73f9d5.
This pull request was exported from Phabricator. Differential Revision: D70704201
Summary:
This adds an XNNPACK-delegated model to non_lowered_server_model, which speeds up server evals in ATen mode by delegating to XNNPACK. We run a const_prop_pass before delegation because it eliminates some unnecessary q=>dq (quantize => dequantize) patterns that would otherwise slow the model down.
Inference-time improvements seen for some models:
MLD model: ~900 ms => ~450 ms
OFI model: ~450 ms => ~230 ms
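To illustrate why constant-propagating q=>dq chains before delegation helps, here is a minimal, self-contained sketch in plain Python. This is not the actual ExecuTorch const_prop_pass or XNNPACK code path; the helpers (`quantize`, `dequantize`, `const_prop_weights`) are hypothetical and only model the idea that a quantize/dequantize round-trip on a constant weight can be computed once at export time instead of on every inference call.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Affine int8-style quantization of a list of floats (illustrative only)."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in x]


def dequantize(q, scale, zero_point):
    """Inverse of quantize: map integer values back to floats."""
    return [(v - zero_point) * scale for v in q]


def run_model_naive(weights, scale, zero_point, x):
    # The q=>dq pair on the constant weights runs on EVERY inference call.
    # This is the kind of pattern a const-prop pass can remove before delegation.
    w = dequantize(quantize(weights, scale, zero_point), scale, zero_point)
    return sum(wi * xi for wi, xi in zip(w, x))


def const_prop_weights(weights, scale, zero_point):
    # Fold the constant q=>dq chain once, ahead of time (export-time work).
    return dequantize(quantize(weights, scale, zero_point), scale, zero_point)


# Export-time: fold the constant chain once.
FOLDED_W = const_prop_weights([0.5, -1.25, 2.0], scale=0.05, zero_point=0)


def run_model_folded(x, w=FOLDED_W):
    # Inference-time: only the real compute remains; no per-call q=>dq work.
    return sum(wi * xi for wi, xi in zip(w, x))
```

Both variants produce identical outputs; the folded one simply moves the quantize/dequantize work out of the hot path, which is the same effect the const_prop_pass aims for on the real graph before XNNPACK delegation.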
Reviewed By: navsud, YIWENX14, Gasoonjia
Differential Revision: D70704201