Commit 143e557

metascroy authored and malfet committed
fix fp16 issue (#492)
Fixes FP16 + BF16 issue in aoti runner.

```
torchchat % ./cmake-out/aoti_run ./model_bf16.so -z ./.model-artifacts/meta-llama/Llama-2-7b-chat-hf/tokenizer.bin -t 0 -i "Once upon a time"
Failed to load ./.model-artifacts/meta-llama/Llama-2-7b-chat-hf/tokenizer.bin into a Tiktoken tokenizer. Trying sentencepiece tokenizer..
Once upon a time, there was a little girl named Lily. She loved to play outside in the sunshine. One day, she saw a big, red ball in the sky. It was the sun! She thought it was so pretty. Lily wanted to play with the ball, but it was too high up in the sky. She tried to jump and reach it, but she couldn't. Then, she had an idea. She would use a stick to get the ball down. Lily found a long stick and tried to reach the ball. She poked and poked, but the ball didn't come down. She was sad. But then, she saw a bird flying by. The bird had a big, red ball in its beak! Lily was so happy! She thanked the bird and played with her new ball all day long.
achieved tok/s: 65.842124
```
1 parent f56fbbc commit 143e557

1 file changed, +1 -1 lines changed

runner/run.cpp

Lines changed: 1 addition & 1 deletion
@@ -157,7 +157,7 @@ float* forward(Transformer* transformer, int token, int pos) {
   torch::Tensor pos_tensor = torch::from_blob(pos_buffer, {1}, torch::kLong);
   std::vector<torch::Tensor> inputs{token_tensor, pos_tensor};
 
-  torch::Tensor result = transformer->runner->run(inputs)[0];
+  torch::Tensor result = transformer->runner->run(inputs)[0].to(torch::dtype(torch::kFloat32));
   auto logits = result[0].data_ptr();
 
 #else // __ET_MODEL__
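
Why the cast matters: the changed line feeds `result[0].data_ptr()`, and `forward` returns `float*`, so the logits are consumed as 32-bit floats. If the model emits bf16 or fp16, the same bytes would presumably be misread unless the tensor is converted first. The following is a standalone sketch of that effect, not code from the runner. It uses no libtorch, it assumes bfloat16 values stored as raw 16-bit integers, and all helper names (`bf16_to_float`, `float_to_bf16`, `to_float32`, `misread_first_float`) are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Widen one bfloat16 value (raw 16 bits) to float32.
// bfloat16 is the upper 16 bits of an IEEE-754 float32, so widening is a shift.
float bf16_to_float(uint16_t bits) {
  uint32_t wide = static_cast<uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Truncate a float32 to bfloat16 bits (no rounding; enough for illustration).
uint16_t float_to_bf16(float value) {
  uint32_t wide;
  std::memcpy(&wide, &value, sizeof(wide));
  return static_cast<uint16_t>(wide >> 16);
}

// Hypothetical stand-in for the dtype conversion: convert every element
// to float32 before anyone takes a float pointer to the buffer.
std::vector<float> to_float32(const std::vector<uint16_t>& bf16_logits) {
  std::vector<float> out;
  out.reserve(bf16_logits.size());
  for (uint16_t bits : bf16_logits) {
    out.push_back(bf16_to_float(bits));
  }
  return out;
}

// What a consumer sees if it treats the bf16 buffer as float data:
// the first "float" is built from the bytes of two adjacent bf16 values.
// Requires at least two elements in the buffer.
float misread_first_float(const std::vector<uint16_t>& bf16_logits) {
  float out;
  std::memcpy(&out, bf16_logits.data(), sizeof(out));
  return out;
}
```

With logits such as 1.5 and -2.0 stored as bf16, `to_float32` recovers the original values, while `misread_first_float` returns an unrelated number, which is consistent with the runner needing the conversion before sampling.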
