
Commit bdd2356

Merge branch 'main' into pinbump1111
2 parents a9fa27e + e0ce144 commit bdd2356

7 files changed: +78 -8 lines changed


.ci/scripts/run-docs

Lines changed: 18 additions & 4 deletions
@@ -75,9 +75,6 @@ if [ "$1" == "advanced" ]; then
 fi
 
 if [ "$1" == "evaluation" ]; then
-
-exit 0
-
 echo "::group::Create script to run evaluation"
 python3 torchchat/utils/scripts/updown.py --file torchchat/utils/docs/evaluation.md --replace 'llama3:stories15M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN > ./run-evaluation.sh
 # for good measure, if something happened to updown processor,
@@ -95,7 +92,7 @@ fi
 if [ "$1" == "multimodal" ]; then
 
 # Expecting that this might fail this test as-is, because
-# it's the first on-pr test depending on githib secrets for access with HF token access
+# it's the first on-pr test depending on github secrets for access with HF token access
 
 echo "::group::Create script to run multimodal"
 python3 torchchat/utils/scripts/updown.py --file docs/multimodal.md > ./run-multimodal.sh
@@ -111,3 +108,20 @@ if [ "$1" == "multimodal" ]; then
 bash -x ./run-multimodal.sh
 echo "::endgroup::"
 fi
+
+if [ "$1" == "native" ]; then
+
+echo "::group::Create script to run native-execution"
+python3 torchchat/utils/scripts/updown.py --file docs/native-execution.md > ./run-native.sh
+# for good measure, if something happened to updown processor,
+# and it did not error out, fail with an exit 1
+echo "exit 1" >> ./run-native.sh
+echo "::endgroup::"
+
+echo "::group::Run native-execution"
+echo "*******************************************"
+cat ./run-native.sh
+echo "*******************************************"
+bash -x ./run-native.sh
+echo "::endgroup::"
+fi
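
A note on the failsafe pattern above: appending `echo "exit 1" >> ./run-native.sh` only matters if the generated script never exits on its own. A minimal sketch of the idea, assuming (an assumption, not confirmed by this diff) that the updown processor emits its own `exit 0` on success:

```
# hypothetical contents of ./run-native.sh after generation
echo "commands extracted from docs/native-execution.md would run here"
exit 0   # assumed to be emitted by the updown processor on success
exit 1   # appended failsafe: only reached if the processor silently
         # produced a truncated script with no exit of its own
```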

.github/workflows/run-readme-pr-mps.yml

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ jobs:
     uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
     with:
       runner: macos-m1-14
+      timeout-minutes: 50
       script: |
         conda create -y -n test-readme-mps-macos python=3.10.11 llvm-openmp
         conda activate test-readme-mps-macos

.github/workflows/run-readme-pr.yml

Lines changed: 43 additions & 0 deletions
@@ -227,3 +227,46 @@ jobs:
         echo "::endgroup::"
 
         TORCHCHAT_DEVICE=cpu .ci/scripts/run-docs multimodal
+
+  test-native-any:
+    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
+    with:
+      runner: linux.g5.4xlarge.nvidia.gpu
+      gpu-arch-type: cuda
+      gpu-arch-version: "12.1"
+      timeout: 60
+      script: |
+        echo "::group::Print machine info"
+        uname -a
+        echo "::endgroup::"
+
+        echo "::group::Install newer objcopy that supports --set-section-alignment"
+        yum install -y devtoolset-10-binutils
+        export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
+        echo "::endgroup::"
+
+        .ci/scripts/run-docs native
+
+        echo "::group::Completion"
+        echo "tests complete"
+        echo "*******************************************"
+        echo "::endgroup::"
+
+  test-native-cpu:
+    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
+    with:
+      runner: linux.g5.4xlarge.nvidia.gpu
+      gpu-arch-type: cuda
+      gpu-arch-version: "12.1"
+      timeout: 60
+      script: |
+        echo "::group::Print machine info"
+        uname -a
+        echo "::endgroup::"
+
+        echo "::group::Install newer objcopy that supports --set-section-alignment"
+        yum install -y devtoolset-10-binutils
+        export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
+        echo "::endgroup::"
+
+        TORCHCHAT_DEVICE=cpu .ci/scripts/run-docs native
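
Both new jobs swap in a newer `objcopy` by prepending devtoolset-10 binutils to `PATH`. As an illustrative check (not part of the commit), one could confirm the flag is actually supported before running the docs:

```
# illustrative: verify the objcopy now on PATH knows --set-section-alignment
export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
if objcopy --help | grep -q -- '--set-section-alignment'; then
  echo "objcopy is new enough"
else
  echo "objcopy too old"; exit 1
fi
```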

README.md

Lines changed: 5 additions & 3 deletions
@@ -231,6 +231,8 @@ python3 torchchat.py server llama3.1
 ```
 [skip default]: end
 
+[shell default]: python3 torchchat.py server llama3.1 & server_pid=$!
+
 In another terminal, query the server using `curl`. Depending on the model configuration, this query might take a few minutes to respond.
 
 > [!NOTE]
@@ -244,8 +246,6 @@ Setting `stream` to "true" in the request emits a response in chunks. If `stream
 
 **Example Input + Output**
 
-[skip default]: begin
-
 ```
 curl http://127.0.0.1:5000/v1/chat/completions \
 -H "Content-Type: application/json" \
@@ -265,12 +265,14 @@ curl http://127.0.0.1:5000/v1/chat/completions \
 ]
 }'
 ```
+[skip default]: begin
 ```
 {"response":" I'm a software developer with a passion for building innovative and user-friendly applications. I have experience in developing web and mobile applications using various technologies such as Java, Python, and JavaScript. I'm always looking for new challenges and opportunities to learn and grow as a developer.\n\nIn my free time, I enjoy reading books on computer science and programming, as well as experimenting with new technologies and techniques. I'm also interested in machine learning and artificial intelligence, and I'm always looking for ways to apply these concepts to real-world problems.\n\nI'm excited to be a part of the developer community and to have the opportunity to share my knowledge and experience with others. I'm always happy to help with any questions or problems you may have, and I'm looking forward to learning from you as well.\n\nThank you for visiting my profile! I hope you find my information helpful and interesting. If you have any questions or would like to discuss any topics, please feel free to reach out to me. I"}
 ```
 
 [skip default]: end
 
+[shell default]: kill ${server_pid}
 
 </details>
 
@@ -664,6 +666,6 @@ awesome libraries and tools you've built around local LLM inference.
 
 torchchat is released under the [BSD 3 license](LICENSE). (Additional
 code in this distribution is covered by the MIT and Apache Open Source
-licenses.) However you may have other legal obligations that govern
+licenses.) However, you may have other legal obligations that govern
 your use of content, such as the terms of service for third-party
 models.
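
The two `[shell default]` lines added above are directives for torchchat's docs-test harness; together they implement the standard background-server lifecycle, sketched here from the diff's own commands:

```
python3 torchchat.py server llama3.1 &   # start the server in the background
server_pid=$!                            # capture its process ID
# ... query http://127.0.0.1:5000/v1/chat/completions with curl ...
kill ${server_pid}                       # shut the server down afterwards
```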

docs/ADVANCED-USERS.md

Lines changed: 2 additions & 0 deletions
@@ -251,6 +251,8 @@ To improve performance, you can compile the model with `--compile`
 trading off the time to first token processed with time per token. To
 improve performance further, you may also compile the prefill with
 `--compile-prefill`. This will increase further compilation times though.
+For CPU, you can use `--max-autotune` to further improve the performance
+with `--compile` and `--compile-prefill`. See the [`max-autotune` on CPU tutorial](https://pytorch.org/tutorials/prototype/max_autotune_on_CPU_tutorial.html).
 
 Parallel prefill is not yet supported by exported models, and may be
 supported in a future release.
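
As an illustrative sketch of how the flags above combine (model name and prompt are placeholders, not from the diff):

```
# illustrative: compiled generation on CPU with max-autotune
python3 torchchat.py generate llama3.1 \
  --prompt "Hello, my name is" \
  --device cpu --compile --compile-prefill --max-autotune
```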

docs/model_customization.md

Lines changed: 3 additions & 0 deletions
@@ -34,6 +34,9 @@ prefill with `--compile_prefill`.
 
 To learn more about compilation, check out: https://pytorch.org/get-started/pytorch-2.0/
 
+For CPU, you can use `--max-autotune` to further improve the performance with `--compile` and `--compile-prefill`.
+
+See the [`max-autotune` on CPU tutorial](https://pytorch.org/tutorials/prototype/max_autotune_on_CPU_tutorial.html).
 
 ## Model Precision
 
torchchat/utils/docs/evaluation.md

Lines changed: 6 additions & 1 deletion
@@ -4,8 +4,13 @@
 
 # Evaluation Features
 
+<!--
+
 [shell default]: ./install/install_requirements.sh
 
+[shell default]: TORCHCHAT_ROOT=${PWD} ./torchchat/utils/scripts/install_et.sh
+
+-->
 
 Torchchat provides evaluation functionality for your language model on
 a variety of tasks using the
@@ -14,7 +19,7 @@ library.
 
 ## Usage
 
-The evaluation mode of `torchchat.py` script can be used to evaluate your language model on various tasks available in the `lm_eval` library such as "wikitext". You can specify the task(s) you want to evaluate using the `--tasks` option, and limit the evaluation using the `--limit` option. If no task is specified, it will default to evaluating on "wikitext".
+The evaluation mode of the `torchchat.py` script can be used to evaluate your language model on various tasks available in the `lm_eval` library, such as "wikitext". You can specify the task(s) you want to evaluate using the `--tasks` option, and limit the evaluation using the `--limit` option. If no task is specified, the task will default to evaluating on "wikitext".
 
 **Examples**
 
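As an illustrative sketch of the usage described in that paragraph (the model name and limit are placeholders, not the doc's own examples):

```
# illustrative: evaluate a model on wikitext, limiting to 10 samples
python3 torchchat.py eval stories15M --tasks wikitext --limit 10
```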