Add tests for qwen + allow uninitialized weights in Llama model #8552
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8552
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 1 Pending as of commit 8b51959 with merge base 2859e47.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 5ef3fef to c58edc5
Force-pushed from c58edc5 to 955b991
try:
    # assign=True: load params/buffers by assignment instead of performing an in-place copy.
    # Because we are using device="meta", tensors do not have memory associated with them
    # and an in-place copy is a no-op. Use assign=True in load_state_dict for this scenario.
    missing, unexpected = self.model_.load_state_dict(
        checkpoint,
        strict=False,
        assign=True,
    )  # self.model_ = Transformer(gptconf)
except RuntimeError as e:
Why is this needed?
So it doesn't error out when loading examples/models/llama/params/demo_rand_params.pth or any checkpoint that is incompatible with the model architecture. We also have no way to not specify a checkpoint; I looked into removing the default value for that arg, but it's going to take some work since it's relied on internally in a lot of places.
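For context, here is a minimal standalone sketch of the loading pattern discussed above, assuming a toy TinyModel stand-in rather than the actual ExecuTorch Transformer: parameters are created on the "meta" device, load_state_dict uses assign=True with strict=False, and a RuntimeError from an incompatible checkpoint falls back to default-initialized (random) weights. The model class, shapes, and fallback are illustrative assumptions, not the PR's code.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):  # hypothetical stand-in for Transformer(gptconf)
    def __init__(self, dim: int = 8):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)


with torch.device("meta"):
    model = TinyModel()  # parameters have no storage yet

# Stand-in for a random/incompatible params file such as demo_rand_params.pth:
# the shape does not match the architecture above.
checkpoint = {"proj.weight": torch.randn(4, 4)}

try:
    # assign=True replaces the meta tensors with the checkpoint tensors instead
    # of copying into them (an in-place copy into a meta tensor is a no-op).
    missing, unexpected = model.load_state_dict(checkpoint, strict=False, assign=True)
    print("missing:", missing, "unexpected:", unexpected)
except RuntimeError as e:
    # Shape mismatches raise even with strict=False; fall back to a freshly
    # constructed model with its default (random) initialization.
    print(f"incompatible checkpoint ({e}); keeping uninitialized weights")
    model = TinyModel()
```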
EXECUTORCH_DEFINED_MODELS = [
    "stories110m",
    "llama2",
    "llama3",
    "llama3_1",
    "llama3_2",
    "static_llama",
    "qwen2_5",
Sorry, I accidentally deleted the original comment about ordering, but I was going to say that I think it's clearer to list all the llama models first.
Force-pushed from 47bed4c to 9b5516b
I'm OK with the changes, but I'm really concerned about llama/model.py and I think we should clean it up. I'll create a separate issue.
Summary
Add a basic CI test for the Qwen model. This requires some changes to llama/model.py to allow uninitialized (random) weights.
Test plan
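As a rough illustration of the kind of check this enables (not the actual test added in this PR), here is a hypothetical smoke test that builds a model with no checkpoint at all, materializes random weights, and runs a forward pass. TinyModel, the dimensions, and the test name are assumptions made for the sketch; the real test drives the llama/model.py entry point instead.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):  # hypothetical stand-in for the llama Transformer
    def __init__(self, vocab: int = 32, dim: int = 8):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, vocab, bias=False)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.proj(self.emb(tokens))


def test_forward_with_uninitialized_weights():
    with torch.device("meta"):
        model = TinyModel()               # no checkpoint is ever loaded
    model = model.to_empty(device="cpu")  # allocate real (uninitialized) storage
    for m in model.modules():
        if hasattr(m, "reset_parameters"):
            m.reset_parameters()          # random but valid weights
    out = model(torch.zeros(1, 4, dtype=torch.long))
    assert out.shape == (1, 4, 32)


test_forward_with_uninitialized_weights()
```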