Commit 030fc3f

[LLAVA] Enable 2nd XNNPACK Partition pass for the text model
Differential Revision: D62279641
Pull Request resolved: #4968
1 parent 40720f0 commit 030fc3f

1 file changed (+6 -1 lines changed)

examples/models/llava/export_llava.py

Lines changed: 6 additions & 1 deletion
@@ -211,10 +211,15 @@ def export_all(llava_model: LlavaModel):
             partitioner={
                 "image_encoder": [XnnpackPartitioner()],
                 "text_model": [
+                    # First partition the DQLinear nodes, then partition the rest of the nodes,
+                    # to avoid multiple DQLinear nodes in the same partition,
+                    # to avoid holding multiple unpacked and packed weight buffers in memory,
+                    # to reduce peak memory footprint.
                     XnnpackPartitioner(
                         config_precisions=ConfigPrecisionType.DYNAMIC_QUANT,
                         per_op_mode=True,
-                    )
+                    ),
+                    XnnpackPartitioner(),
                 ],
             },
             compile_config=EdgeCompileConfig(_check_ir_validity=False),
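
The rationale in the code comment added by this change can be illustrated with a toy model. The sketch below is not ExecuTorch code: `Node`, `partition_per_op`, `partition_contiguous`, `max_kind_per_partition`, and `build_chain` are hypothetical names, the graph is a flat list rather than an exported program, and the real `XnnpackPartitioner` logic is more involved. It only shows why running a per-op pass over the DQLinear nodes first, and a general pass second, leaves at most one DQLinear node in any partition.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    kind: str                        # e.g. "dq_linear", "mul", "softmax"
    partition: Optional[int] = None  # None means "not yet claimed"


def partition_per_op(nodes: List[Node], kind: str, next_id: int = 0) -> int:
    """Toy pass 1: give every unclaimed node of `kind` its own partition."""
    for node in nodes:
        if node.partition is None and node.kind == kind:
            node.partition = next_id
            next_id += 1
    return next_id


def partition_contiguous(nodes: List[Node], next_id: int = 0) -> int:
    """Toy pass 2: group each run of adjacent unclaimed nodes into one partition."""
    current = None  # id of the run being built, or None when between runs
    for node in nodes:
        if node.partition is None:
            if current is None:
                current = next_id
                next_id += 1
            node.partition = current
        else:
            current = None  # an already-claimed node ends the run
    return next_id


def max_kind_per_partition(nodes: List[Node], kind: str) -> int:
    """Largest number of `kind` nodes that share a single partition."""
    counts = Counter(n.partition for n in nodes if n.kind == kind)
    return max(counts.values(), default=0)


def build_chain() -> List[Node]:
    """A made-up chain of ops, loosely shaped like part of a text model."""
    return [
        Node("q_proj", "dq_linear"),
        Node("k_proj", "dq_linear"),
        Node("scale", "mul"),
        Node("attn", "softmax"),
        Node("out_proj", "dq_linear"),
        Node("residual", "add"),
    ]


# Two passes, in the order used by the diff: DQLinear per op, then the rest.
two_pass = build_chain()
next_free = partition_per_op(two_pass, "dq_linear")
partition_contiguous(two_pass, next_free)

# One greedy pass over everything, for comparison.
one_pass = build_chain()
partition_contiguous(one_pass)

print("two passes:", max_kind_per_partition(two_pass, "dq_linear"))
print("one pass:  ", max_kind_per_partition(one_pass, "dq_linear"))
```

In this toy model the single greedy pass places all three DQLinear nodes in one partition, while the two-pass order isolates each of them. Per the commit's comment, that isolation is what avoids holding several unpacked and packed weight buffers in memory at once.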
