Commit a81d4a5

Update demo in README.md (ggml-org#6)
* Update demo video in README.md
* Update demo at README.md
1 parent 64d83e1 commit a81d4a5

1 file changed: +5 -3 lines changed
README.md

Lines changed: 5 additions & 3 deletions
@@ -1,11 +1,13 @@
 # PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
 ---
 
-*Demo* 🔥
+## Demo 🔥
 
-https://github.com/hodlen/PowerInfer/assets/34213478/b782ccc8-0a2a-42b6-a6aa-07b2224a66f7
+https://github.com/SJTU-IPADS/PowerInfer/assets/34213478/d26ae05b-d0cf-40b6-8788-bda3fe447e28
 
-<sub>The demo is running with a single 24G 4090 GPU, the model is Falcon (ReLU)-40B, and the precision is FP16.</sub>
+PowerInfer v.s. llama.cpp on a single RTX 4090(24G) running Falcon(ReLU)-40B-FP16 with a 11x speedup!
+
+<sub>Both PowerInfer and llama.cpp were running on the same hardware and fully utilized VRAM on RTX 4090.</sub>
 
 ---
 ## Abstract
