
Commit 7d4bafc

yifan_shen3 authored and facebook-github-bot committed
Core ML Has Added Index_Put Support, No Need to Skip Anymore (#2975)
Summary: Skipping the `aten.index_put` op in Core ML delegation was a workaround, at the cost of partitioning the Llama model into 13 pieces. For better performance, we prefer to delegate the whole model to Core ML. Since Core ML has added the [necessary support](apple/coremltools#2190), it is time to revert this workaround.

Pull Request resolved: #2975

Reviewed By: kirklandsign

Differential Revision: D56002979

Pulled By: cccclai

fbshipit-source-id: e7a7c8c43706cb57eba3e6f720b3d713bec5065b
1 parent: d761f99

File tree: 1 file changed (+0, -3 lines)

examples/models/llama2/export_llama_lib.py

Lines changed: 0 additions & 3 deletions
@@ -605,9 +605,6 @@ def _export_llama(modelname, args) -> str:  # noqa: C901
         partitioners.append(
             # pyre-ignore: Undefined attribute [16]: Module `executorch.backends` has no attribute `apple`
             CoreMLPartitioner(
-                skip_ops_for_coreml_delegation=[
-                    "aten.index_put.default",
-                ],
                 compile_specs=compile_specs,
             )
         )
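The before/after effect of the change can be sketched as follows. This is a minimal, self-contained illustration: `CoreMLPartitioner` is stubbed here so the snippet runs without executorch installed; the two keyword arguments mirror the ones in the diff, and `compile_specs` is a placeholder for the real Core ML compile specs.

```python
class CoreMLPartitioner:
    """Stub mirroring the two keyword arguments used in the diff above
    (not the real executorch Core ML partitioner)."""

    def __init__(self, skip_ops_for_coreml_delegation=None, compile_specs=None):
        self.skip_ops_for_coreml_delegation = skip_ops_for_coreml_delegation or []
        self.compile_specs = compile_specs


compile_specs = None  # placeholder for the real compile specs

# Before this commit: aten.index_put.default was excluded from delegation,
# which split the Llama model into 13 delegated pieces.
before = CoreMLPartitioner(
    skip_ops_for_coreml_delegation=["aten.index_put.default"],
    compile_specs=compile_specs,
)

# After this commit: no skip list, so the whole model can be delegated.
after = CoreMLPartitioner(compile_specs=compile_specs)

print(before.skip_ops_for_coreml_delegation)  # ['aten.index_put.default']
print(after.skip_ops_for_coreml_delegation)   # []
```

With the skip list gone, the partitioner no longer carves out `aten.index_put` call sites, which is what allows a single Core ML delegate to cover the whole model.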
