Commit f93fd80

kimishpatel authored and malfet committed
Enable embedding quant ops in runner (#423)
Summary: Link against quantized ops lib

Test Plan:
python torchchat.py download stories15M
export PRMT="Once upon a time in a land far away"
python torchchat.py export stories15M --quant '{"linear:a8w4dq" : {"groupsize": 32}, "embedding" : {"bitwidth": 8, "groupsize": 0}}' --output-pte-path ./model.pte
./scripts/install_et.sh
rm -rf build/cmake-out/
cmake -S ./runner-et -B ./runner-et/cmake-out -G Ninja
cmake --build ./runner-et/cmake-out
./runner-et/cmake-out/run ./model.pte -z ./tokenizer.bin -t 0 -i "${PRMT}"

Reviewers:

Subscribers:

Tasks:

Tags:
1 parent 98f64b6 commit f93fd80

1 file changed: +3 -0 lines changed

runner-et/CMakeLists.txt

Lines changed: 3 additions & 0 deletions
@@ -43,17 +43,20 @@ target_link_libraries(
     extension_module
     ${TORCHCHAT_ROOT}/${ET_BUILD_DIR}/src/executorch/${CMAKE_OUT_DIR}/extension/data_loader/libextension_data_loader.a # This one does not get installed by ExecuTorch
     optimized_kernels
+    quantized_kernels
     portable_kernels
     cpublas
     eigen_blas
     # The libraries below need to be whole-archived linked
     optimized_native_cpu_ops_lib
+    quantized_ops_lib
     xnnpack_backend
     XNNPACK
     pthreadpool
     cpuinfo
 )
 target_link_options_shared_lib(optimized_native_cpu_ops_lib)
+target_link_options_shared_lib(quantized_ops_lib)
 target_link_options_shared_lib(xnnpack_backend)
 # Not clear why linking executorch as whole-archive outside android/apple is leading
 # to double registration. Most likely because of linkage issues.
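
Note on the whole-archive requirement: op libraries such as quantized_ops_lib typically register their kernels through static initializers, and a linker that pulls in only referenced objects from a static archive can silently drop them. The fragment below is a minimal sketch of what a helper like target_link_options_shared_lib might do. It is not taken from this commit or from ExecuTorch's sources; the function name force_whole_archive is hypothetical, and the flags assume a GNU-style linker, or Apple's ld on macOS/iOS.

```cmake
# Hypothetical helper (illustration only, not the ExecuTorch implementation).
# Forces every object in a static library target to be linked, so that
# static-initializer-based kernel registration is not discarded.
# Requires CMake >= 3.13 for the LINKER: and SHELL: prefixes.
function(force_whole_archive target_name)
  if(APPLE)
    # Apple's ld uses -force_load <archive> instead of --whole-archive.
    target_link_options(
      ${target_name} INTERFACE
      "SHELL:LINKER:-force_load,$<TARGET_FILE:${target_name}>"
    )
  else()
    # GNU ld / lld: wrap the archive in --whole-archive ... --no-whole-archive.
    target_link_options(
      ${target_name} INTERFACE
      "SHELL:LINKER:--whole-archive $<TARGET_FILE:${target_name}> LINKER:--no-whole-archive"
    )
  endif()
endfunction()

# Usage, mirroring the line added in this commit:
# force_whole_archive(quantized_ops_lib)
```

Because the options are attached as INTERFACE properties, any executable that lists the library in target_link_libraries picks them up automatically, which would explain why the commit both adds quantized_ops_lib to the link list and calls the helper on it.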
