Enable embedding quant ops in runner #423

Merged on Apr 24, 2024 (1 commit)

kimishpatel (Contributor) commented Apr 23, 2024

Summary:
Link the ET runner (runner-et) against the quantized ops library so that embedding quant ops are available in the runner.

Test Plan:
Verified on CentOS:

python torchchat.py download stories15M
export PRMT="Once upon a time in a land far away"
python torchchat.py export stories15M --quant '{"linear:a8w4dq" : {"groupsize": 32}, "embedding" : {"bitwidth": 8, "groupsize": 0}}' --output-pte-path ./model.pte
./scripts/install_et.sh
rm -rf build/cmake-out/
cmake -S ./runner-et -B ./runner-et/cmake-out -G Ninja
cmake --build ./runner-et/cmake-out
./runner-et/cmake-out/run ./model.pte -z ./tokenizer.bin -t 0 -i "${PRMT}"
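
The `--quant` value in the export step is a JSON string, and hand-quoting it in a shell is easy to get wrong. As a minimal sketch, assuming only the command line shown in the test plan, the argument can be generated from a Python dict instead. `build_export_command` is a hypothetical helper for illustration and is not part of torchchat.

```python
import json
import shlex

# Quantization settings copied from the test plan above.
quant_config = {
    "linear:a8w4dq": {"groupsize": 32},
    "embedding": {"bitwidth": 8, "groupsize": 0},
}


def build_export_command(model, quant, output_path):
    """Return the export command as an argv list (hypothetical helper)."""
    return [
        "python", "torchchat.py", "export", model,
        "--quant", json.dumps(quant),  # serialize the dict to a JSON string
        "--output-pte-path", output_path,
    ]


argv = build_export_command("stories15M", quant_config, "./model.pte")

# shlex.join quotes each argument, so the JSON stays a single shell word.
print(shlex.join(argv))
```

Passing `argv` directly to `subprocess.run` would avoid shell quoting altogether; the printed form is only for pasting into a terminal.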

facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 23, 2024
kimishpatel merged commit b9d7ca1 into main on Apr 24, 2024
kimishpatel deleted the enable_emb_quant_in_runner_et branch on April 24, 2024 01:14
malfet pushed a commit that referenced this pull request Jul 17, 2024