# Building runner-aoti and runner-et
- Building the runners is straightforward and is covered in the next sections.
+ Building the runners is straightforward and is covered in the next sections. We will showcase the runners using stories15M.
+
+ The runners accept the following CLI arguments:
+
+ ```
+ Options:
+ -t <float> temperature in [0,inf], default 1.0
+ -p <float> p value in top-p (nucleus) sampling in [0,1] default 0.9
+ -s <int> random seed, default time(NULL)
+ -n <int> number of steps to run for, default 256. 0 = max_seq_len
+ -i <string> input prompt
+ -z <string> optional path to custom tokenizer
+ -m <string> mode: generate|chat, default: generate
+ -y <string> (optional) system prompt in chat mode
+ ```

## Building and running runner-aoti
To build runner-aoti, run the following commands *from the torchchat root directory*
@@ -16,19 +30,14 @@ We first download stories15M and export it to AOTI.

```
python torchchat.py download stories15M
- python torchchat.py export --output-dso-path ./model.dso
- ```
-
- We also need a tokenizer.bin file for the stories15M model:
-
- ```
- wget ./tokenizer.bin https://github.com/karpathy/llama2.c/raw/master/tokenizer.bin
+ python torchchat.py export stories15M --output-dso-path ./model.so
```

We can now execute the runner with:

```
- ./runner-aoti/cmake-out/run ./model.dso -z ./tokenizer.bin -i "Once upon a time"
+ wget -O ./tokenizer.bin https://github.com/karpathy/llama2.c/raw/master/tokenizer.bin
+ ./runner-aoti/cmake-out/run ./model.so -z ./tokenizer.bin -i "Once upon a time"
```
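
The CLI options listed earlier can be combined with this invocation. The following is a rough sketch only: the values 0.8, 42, and 128 are arbitrary illustrations rather than recommended settings, the variable names are hypothetical, and the `echo` fallback exists only so the snippet is safe to run before the runner has been built.

```shell
# Paths follow the steps above; the variable names are illustrative.
RUNNER=./runner-aoti/cmake-out/run
MODEL=./model.so
TOKENIZER=./tokenizer.bin

# Assemble the argument list: model path first, then options.
# -t temperature, -p top-p, -s seed, -n steps, -i prompt (values are arbitrary).
set -- "$MODEL" -z "$TOKENIZER" -t 0.8 -p 0.9 -s 42 -n 128 -i "Once upon a time"

if [ -x "$RUNNER" ]; then
  "$RUNNER" "$@"
else
  # Runner not built yet: print the command instead of executing it.
  echo "would run: $RUNNER $*"
fi
```

Passing an explicit `-s` value replaces the default `time(NULL)` seed, which is presumably useful when comparing runs.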
## Building and running runner-et
@@ -43,7 +52,7 @@ cmake -S ./runner-et -B ./runner-et/cmake-out -G Ninja
cmake --build ./runner-et/cmake-out
```

- After running these, the runner-et binary is located at ./runner-et/cmake-out/runner-et .
+ After running these, the runner-et binary is located at ./runner-et/cmake-out/run .

Let us try using it with an example.
We first download stories15M and export it to ExecuTorch.
@@ -53,14 +62,9 @@ python torchchat.py download stories15M
python torchchat.py export stories15M --output-pte-path ./model.pte
```

- We also need a tokenizer.bin file for the stories15M model:
-
- ```
- wget ./tokenizer.bin https://github.com/karpathy/llama2.c/raw/master/tokenizer.bin
- ```
-
We can now execute the runner with:
```
- ./runner-et/cmake-out/runner_et ./model.pte -z ./tokenizer.bin -i "Once upon a time"
+ wget -O ./tokenizer.bin https://github.com/karpathy/llama2.c/raw/master/tokenizer.bin
+ ./runner-et/cmake-out/run ./model.pte -z ./tokenizer.bin -i "Once upon a time"
```
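
Before running, it can help to confirm that the files produced by the steps above actually exist. This is a minimal sketch, assuming the default paths used in this section; the variable names and the `missing` check are illustrative and not part of torchchat. Per the options list, `-n 0` asks the runner to generate up to max_seq_len steps.

```shell
# Paths follow the steps above; adjust them if you exported elsewhere.
RUNNER=./runner-et/cmake-out/run
MODEL=./model.pte
TOKENIZER=./tokenizer.bin

# Collect any prerequisite that is not present on disk.
missing=""
for f in "$RUNNER" "$MODEL" "$TOKENIZER"; do
  [ -e "$f" ] || missing="$missing $f"
done

if [ -z "$missing" ]; then
  # All prerequisites found: run with -n 0 (up to max_seq_len steps).
  "$RUNNER" "$MODEL" -z "$TOKENIZER" -n 0 -i "Once upon a time"
else
  echo "missing prerequisites:$missing"
fi
```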