Commit 27af7f6

also enable for non-aarch64 on "Mostly sync BlasKernel.cpp with ATen ReducedPrecisionGemvFastPathKernel"
The two files were similar but had diverged due to recent changes. Because PyTorch headers are shared, we can keep them mostly the same; the remaining differences are namespace details, lintrunner changes, and a couple of EXECUTORCH NOTEs.

Differential Revision: [D74702689](https://our.internmc.facebook.com/intern/diff/D74702689/)

[ghstack-poisoned]
1 parent 294798b commit 27af7f6

1 file changed: +0 -2 lines changed
kernels/optimized/blas/BlasKernel.h

Lines changed: 0 additions & 2 deletions
@@ -158,7 +158,6 @@ void gemm_transa_(
 }
 }
 
-#ifdef __aarch64__
 namespace internal {
 float bf16_dot_with_fp32_arith(const torch::executor::BFloat16* vec1, const torch::executor::BFloat16* vec2, int64_t len);
 } // namespace internal
@@ -204,7 +203,6 @@ inline void gemm_transa_<torch::executor::BFloat16, torch::executor::BFloat16>(
 }
 });
 }
-#endif
 
 // clang-format on
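
With the `#ifdef __aarch64__` guard removed, the declaration of `internal::bf16_dot_with_fp32_arith` is visible on every architecture. For orientation only, here is a minimal scalar sketch of what a bf16 dot product with fp32 accumulation computes. It is not the ExecuTorch or ATen kernel, which this diff does not show. `Bf16Sketch`, `bf16_to_float`, `float_to_bf16_truncate`, and `bf16_dot_with_fp32_arith_sketch` are hypothetical names introduced for illustration, and the sketch assumes bf16 is stored as the upper 16 bits of an IEEE-754 binary32 value.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for torch::executor::BFloat16 (assumption):
// the raw upper 16 bits of an IEEE-754 binary32 float.
struct Bf16Sketch {
  uint16_t bits;
};

// Widen bf16 to fp32 by placing its 16 bits in the high half of a 32-bit word.
inline float bf16_to_float(Bf16Sketch v) {
  uint32_t wide = static_cast<uint32_t>(v.bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Truncating fp32 -> bf16 conversion (no rounding); used only to build inputs.
inline Bf16Sketch float_to_bf16_truncate(float f) {
  uint32_t wide;
  std::memcpy(&wide, &f, sizeof(wide));
  return Bf16Sketch{static_cast<uint16_t>(wide >> 16)};
}

// Scalar reference: widen each element to fp32, multiply, and accumulate in fp32.
// The signature mirrors the declaration in the diff, with the stand-in type.
float bf16_dot_with_fp32_arith_sketch(
    const Bf16Sketch* vec1,
    const Bf16Sketch* vec2,
    int64_t len) {
  float acc = 0.0f;
  for (int64_t i = 0; i < len; ++i) {
    acc += bf16_to_float(vec1[i]) * bf16_to_float(vec2[i]);
  }
  return acc;
}
```

Accumulating in fp32 rather than bf16 avoids losing low-order bits on every addition, which is the usual motivation for this kind of kernel. An optimized version would typically vectorize the loop, but that detail is outside what this commit shows.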
