Commit 263653c

Update on "[ExecuTorch] Make ForcedUnroll usage in bf16 BlasKernel actually work for -Oz builds"
Clang is very resistant to inlining under -Oz. For ForcedUnroll to actually unroll, we need to force-inline the lambda.

Differential Revision: [D62154247](https://our.internmc.facebook.com/intern/diff/D62154247/)

[ghstack-poisoned]
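As a rough illustration of the point in the commit message, here is a minimal sketch of a compile-time unroll helper whose callback is force-inlined. It is not the code from this change: `ForcedUnrollSketch`, `SKETCH_ALWAYS_INLINE`, and `dot4` are hypothetical names introduced for this example, and the macro assumes a GCC- or Clang-compatible compiler (it expands to nothing elsewhere).

```cpp
// Hypothetical macro for this sketch: a force-inline attribute that can be
// placed on a lambda declarator. Empty on compilers without the GNU syntax.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE
#endif

// Compile-time recursion: invokes f(0), f(1), ..., f(N - 1) in that order.
// Each call to f is a separate call site, so the loop body is only truly
// unrolled if the compiler inlines f at every one of them.
template <int N>
struct ForcedUnrollSketch {
  template <typename Func>
  SKETCH_ALWAYS_INLINE void operator()(const Func& f) const {
    ForcedUnrollSketch<N - 1>{}(f);
    f(N - 1);
  }
};

// Base case that ends the recursion.
template <>
struct ForcedUnrollSketch<1> {
  template <typename Func>
  SKETCH_ALWAYS_INLINE void operator()(const Func& f) const {
    f(0);
  }
};

// Hypothetical caller: a 4-element dot product. The attribute on the lambda
// asks the compiler to inline the body even when optimizing for size.
inline float dot4(const float* a, const float* b) {
  float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  ForcedUnrollSketch<4>{}([&](int i) SKETCH_ALWAYS_INLINE {
    acc[i] = a[i] * b[i];
  });
  return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Without the attribute on the lambda, a size-optimizing build may keep the lambda as an out-of-line function and emit N calls to it, which gives the code-size cost of unrolling without the speed benefit.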
2 parents fd4ebd5 + 77213c3 commit 263653c

1 file changed, +15 -0 lines changed
build/cmake_deps.toml

Lines changed: 15 additions & 0 deletions
@@ -116,6 +116,20 @@ deps = [
   "executorch",
 ]
 
+[targets.optimized_native_cpu_ops_oss]
+buck_targets = [
+  "//configurations:optimized_native_cpu_ops_oss",
+]
+filters = [
+  ".cpp$",
+]
+excludes = [
+]
+deps = [
+  "executorch_no_prim_ops",
+  "executorch",
+  "portable_kernels",
+]
 # ---------------------------------- core end ----------------------------------
 # ---------------------------------- extension start ----------------------------------
 [targets.extension_data_loader]
@@ -341,5 +355,6 @@ deps = [
   "portable_kernels",
   "quantized_kernels",
   "xnnpack_backend",
+  "optimized_native_cpu_ops_oss",
 ]
 # ---------------------------------- LLama end ----------------------------------