Commit e172c5c

Lunwen He authored and facebook-github-bot committed
add performance number for 1B/3B (#5704)
Summary:
Pull Request resolved: #5704

as title

Reviewed By: mergennachin

Differential Revision: D63483641

fbshipit-source-id: 12e23f6dfa627c8b523925ce30c1c130d2f3e9d4
1 parent 7e9eaa8 commit e172c5c

1 file changed: +8 -0 lines changed

examples/models/llama2/README.md

Lines changed: 8 additions & 0 deletions
@@ -70,6 +70,14 @@ We have verified running Llama 2 7B [mobile applications](#step-6-build-mobile-a
 
 ## Performance
 
+### Llama 3.2 1B and 3B
+Llama 3.2 1B and 3B performance was measured on the OnePlus 12 device. The performance measurement is expressed in terms of tokens per second using an [adb binary-based approach](#step-5-run-benchmark-on) for generating 128 tokens.
+
+|Model | bf16 | SpinQuant
+|--------| ---------------------- | ---------------
+|1B | 19.4 tokens/second | 53.41 tokens/second |
+|3B | 7.76 tokens/second | 22.98 tokens/second |
+
 ### Llama3 8B and Llama3.1 8B
 Llama 3 8B performance was measured on the Samsung Galaxy S22, S24, and OnePlus 12 devices. The performance measurement is expressed in terms of tokens per second using an [adb binary-based approach](#step-5-run-benchmark-on).
 
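
To put the added table in context, the following is a minimal Python sketch, not part of the commit, that derives two quantities from the reported numbers: the SpinQuant-over-bf16 speedup and a rough wall-clock estimate for generating 128 tokens. The function and variable names are hypothetical, and the time estimate assumes the reported tokens/second rate holds constant across all 128 tokens and ignores model load and prompt processing, which the commit does not describe.

```python
# Throughput figures copied from the table added in this commit
# (tokens/second, measured on a OnePlus 12).
TOKENS_PER_SECOND = {
    "1B": {"bf16": 19.4, "SpinQuant": 53.41},
    "3B": {"bf16": 7.76, "SpinQuant": 22.98},
}

# The commit states that the benchmark generates 128 tokens.
GENERATED_TOKENS = 128


def spinquant_speedup(model: str) -> float:
    """Ratio of SpinQuant throughput to bf16 throughput for one model size."""
    rates = TOKENS_PER_SECOND[model]
    return rates["SpinQuant"] / rates["bf16"]


def seconds_to_generate(model: str, precision: str,
                        num_tokens: int = GENERATED_TOKENS) -> float:
    """Estimated generation time, ASSUMING a constant tokens/second rate."""
    return num_tokens / TOKENS_PER_SECOND[model][precision]


for model in TOKENS_PER_SECOND:
    print(
        f"{model}: SpinQuant speedup {spinquant_speedup(model):.2f}x, "
        f"bf16 ~{seconds_to_generate(model, 'bf16'):.1f}s, "
        f"SpinQuant ~{seconds_to_generate(model, 'SpinQuant'):.1f}s "
        f"for {GENERATED_TOKENS} tokens"
    )
```

Under these assumptions, SpinQuant comes out a little under 3x faster than bf16 for both model sizes on this device.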
