Revert default to 8650 for llama #5100
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5100
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 17fde41 with merge base e4a2322.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -105,6 +105,7 @@ def get_coreml_partitioner(


def get_qnn_partitioner(
    soc_model,
Suggested change:
-    soc_model,
+    soc_model = SM8650,
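For illustration, a minimal sketch of what the suggested default might look like (the surrounding parameters are assumptions taken from the call site further down, and the real signature lives in extension/llm/export/partitioner_lib.py and may differ):

# Sketch only: in the real code the default may be a QcomChipset enum rather than a string.
def get_qnn_partitioner(
    soc_model="SM8650",          # suggested default so existing callers keep working
    use_kv_cache: bool = False,  # remaining parameters assumed from the llama call site
    pt2e_quantize=None,
    num_sharding: int = 0,
):
    ...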
Add a log here.
Added a log entry here.
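For reference, the kind of log being discussed could look roughly like this (the message wording is illustrative, not taken from the PR):

import logging

logging.getLogger(__name__).info(
    "soc_model not provided; defaulting to SM8650. "
    "Pass the chipset explicitly if you are targeting a different device."
)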
Providing a default could be error-prone. Users must know which target device the model will run on, so to avoid issues from a silent mismatch, it's better to ask users to explicitly provide the chipset info when requesting a qnn_partitioner.
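One way to enforce an explicit chipset, as this comment suggests, is to validate it up front. A sketch under the assumption that soc_model has no default (names and the error message are illustrative):

def get_qnn_partitioner(soc_model, use_kv_cache=False, pt2e_quantize=None, num_sharding=0):
    # Refuse to guess the target device: a silently assumed SM8650 can produce a
    # model that fails to load on other chipsets (the mismatch reported in #4973).
    if soc_model is None:
        raise ValueError(
            "soc_model is required; pass the QcomChipset matching your target device."
        )
    ...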
I'm not sure how different it is compared to using SM8650 directly in partitioner_lib.py. This partitioner_lib.py is only used in export_llama_lib.py, so either we hardcode it inside the code or one level above. Neither of them is configurable by users.
Is partitioner_lib user-facing? My understanding is that when users work on their model, they should get the partitioner that matches the target device in their own script. export_llama_lib.py is dedicated to llama as an example, and you just told me that it won't work for SM8450. So what's the issue with hard-coding SM8650 in export_llama_lib.py?
from executorch.extension.llm.custom_ops import model_sharding

partitioners.append(
    get_qnn_partitioner(
        args.use_kv_cache, args.pt2e_quantize, args.num_sharding
        QcomChipset.SM8650,  # Llama 2 works only on SM8650
It still defaults to SM8650, right?
I think we need to add an arg to the export_llama script.
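A hypothetical sketch of such an argument for the export_llama argument parser (the flag name, default, and list of choices are assumptions, not the actual CLI):

import argparse

# In the real script this would be added to the existing export_llama parser.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--qnn_soc_model",
    type=str,
    default="SM8650",
    choices=["SM8450", "SM8475", "SM8550", "SM8650"],
    help="Qualcomm SoC the lowered model will run on; must match the target device.",
)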
get_qnn_partitioner is only used here per https://github.com/search?q=repo%3Apytorch%2Fexecutorch%20get_qnn_partitioner&type=code. I'm curious how other non-genai models are partitioned.
As I said, this partitioner_lib is only used by llm and not used anywhere else. This is controlled by the compile spec when constructing the qnn partitioner. Should we just revert the change and use the default value?
For other non-genai models, they're using this function instead:
executorch/examples/qualcomm/utils.py, line 185 in d23548b (`soc_model,`)
Are all LLMs always going to use SM8650, so you want it hard-coded inside the partitioner_lib?
@cccclai As titled, this PR is to make it possible to get a qnn partitioner matching the target device, so that users can use it in their own lowering script for other LLMs. Sure, today it's only used by the llama example in our codebase, and the llama example wants it set to SM8650, but I fail to understand why you prefer reverting the changes and hard-coding it back inside partitioner_lib after people have been hitting the chipset mismatch issue in #4973. I will leave the decision up to you, as you have more expertise and are the PoC of the QNN delegate.
I'll be fine with that. And notice that extension/llm/export/partitioner_lib.py is under the llm folder, meaning it's only for llm. I don't think we will have bandwidth to test on devices other than S24 before PTC.
Also, the real qnn partitioner is defined here:
executorch/backends/qualcomm/partition/qnn_partitioner.py, lines 100 to 114 in b8a2cbd
class QnnPartitioner(Partitioner):
    def __init__(
        self,
        compiler_specs: List[CompileSpec],
        skip_node_id_set: set = None,
        skip_node_op_set: set = None,
    ):
        self.compiler_specs_snapshot = copy.deepcopy(compiler_specs)
        self.delegation_spec = DelegationSpec(
            QnnBackend.__name__, self.compiler_specs_snapshot
        )
        self.partition_tags: Dict[str, DelegationSpec] = {}
        self.skip_node_id_set = set() if skip_node_id_set is None else skip_node_id_set
        self.skip_node_op_set = set() if skip_node_op_set is None else skip_node_op_set
The file you're changing is just a util function wrapped around qnn partitioner.
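To illustrate how the soc_model actually reaches the backend, here is a rough usage sketch of building compile specs and constructing the partitioner. The helper names and import paths are approximations and have moved between executorch versions, so treat this as a sketch rather than the exact API:

# Import paths approximate; QcomChipset in particular has lived in different modules.
from executorch.backends.qualcomm.partition.qnn_partitioner import QnnPartitioner
from executorch.backends.qualcomm.serialization.qnn_compile_spec_schema import QcomChipset
from executorch.backends.qualcomm.utils.utils import (
    generate_htp_compiler_spec,
    generate_qnn_executorch_compiler_spec,
)

# The chipset is baked into the compile specs, which QnnPartitioner then snapshots.
compile_specs = generate_qnn_executorch_compiler_spec(
    soc_model=QcomChipset.SM8650,  # must match the device the model will run on
    backend_options=generate_htp_compiler_spec(use_fp16=True),
)
partitioner = QnnPartitioner(compile_specs)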
ba901f1 to 7edf990
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
7edf990 to 17fde41
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Thanks!