Update README.md #366


Merged · 2 commits · Apr 22, 2024
README.md: 46 changes (23 additions & 23 deletions)
@@ -16,7 +16,7 @@ Torchchat is an easy-to-use library for running large language models (LLMs) on

## Quick Start
### Initialize the Environment
-The following steps requires you have [Python 3.10](https://www.python.org/downloads/release/python-3100/) installed
+The following steps require that you have [Python 3.10](https://www.python.org/downloads/release/python-3100/) installed.

```
# get the code
@@ -31,20 +31,20 @@ source .venv/bin/activate
./install_requirements.sh

# ensure everything installed correctly
-python torchchat.py --help
+python3 torchchat.py --help

```
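If that last step fails, it may be worth confirming which interpreter `python3` resolves to (a minimal check; torchchat expects Python 3.10 as noted above):

```
# confirm the interpreter version before troubleshooting further
python3 --version
```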

### Generating Text

```
-python torchchat.py generate stories15M
+python3 torchchat.py generate stories15M
```
That’s all there is to it!
Read on to learn how to use the full power of torchchat.

## Customization
-For the full details on all commands and parameters run `python torchchat.py --help`
+For the full details on all commands and parameters run `python3 torchchat.py --help`

### Download
For supported models, torchchat can download model weights. Most models use HuggingFace as the distribution channel, so you will need to create a HuggingFace account.
@@ -54,46 +54,46 @@ To install `huggingface-cli`, run `pip install huggingface-cli`. After installing, log in to
HuggingFace.

```
-python torchchat.py download llama3
+python3 torchchat.py download llama3
```
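For reference, a minimal sketch of the authentication flow the download step depends on (assuming the `huggingface-cli` binary that ships with the `huggingface_hub` package, and an access token created in your HuggingFace account settings):

```
# install the Hugging Face CLI and authenticate; the login prompt
# expects an access token generated at huggingface.co/settings/tokens
pip install huggingface_hub
huggingface-cli login
```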

### Chat
Designed for interactive and conversational use.
In chat mode, the LLM engages in a back-and-forth dialogue with the user. It responds to queries, participates in discussions, provides explanations, and can adapt to the flow of conversation.

-For more information run `python torchchat.py chat --help`
+For more information run `python3 torchchat.py chat --help`

**Examples**
```
-python torchchat.py chat llama3 --tiktoken
+python3 torchchat.py chat llama3 --tiktoken
```

### Generate
Aimed at producing content based on specific prompts or instructions.
In generate mode, the LLM focuses on creating text based on a detailed prompt or instruction. This mode is often used for generating written content like articles, stories, reports, or even creative writing like poetry.

-For more information run `python torchchat.py generate --help`
+For more information run `python3 torchchat.py generate --help`

**Examples**
```
-python torchchat.py generate llama3 --dtype=fp16 --tiktoken
+python3 torchchat.py generate llama3 --dtype=fp16 --tiktoken
```
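Prompted generation uses the same `--prompt` flag that appears in the Desktop Execution examples below, for example:

```
# generate a continuation for an explicit prompt
python3 torchchat.py generate llama3 --tiktoken --prompt "Once upon a time"
```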

### Export
Compiles a model and saves it to run later.

-For more information run `python torchchat.py export --help`
+For more information run `python3 torchchat.py export --help`

**Examples**

AOT Inductor:
```
-python torchchat.py export stories15M --output-dso-path stories15M.so
+python3 torchchat.py export stories15M --output-dso-path stories15M.so
```

ExecuTorch:
```
-python torchchat.py export stories15M --output-pte-path stories15M.pte
+python3 torchchat.py export stories15M --output-pte-path stories15M.pte
```

### Browser
@@ -102,7 +102,7 @@ Run a chatbot in your browser that’s supported by the model you specify in the
**Examples**

```
-python torchchat.py browser stories15M --temperature 0 --num-samples 10
+python3 torchchat.py browser stories15M --temperature 0 --num-samples 10
```

*Running on http://127.0.0.1:5000* should be printed to the terminal. Click the link or open [http://127.0.0.1:5000](http://127.0.0.1:5000) in your browser to start interacting with it.
@@ -112,19 +112,19 @@ Enter some text in the input box, then hit the enter key or click the “SEND”
### Eval
Uses the lm_eval library to evaluate model accuracy on a variety of tasks. Defaults to wikitext and can be controlled manually using the tasks and limit args.

-For more information run `python torchchat.py eval --help`
+For more information run `python3 torchchat.py eval --help`

**Examples**

Eager mode:
```
-python torchchat.py eval stories15M -d fp32 --limit 5
+python3 torchchat.py eval stories15M -d fp32 --limit 5
```

To test the perplexity of a lowered or quantized model, pass it in the same way you would to generate:

```
-python torchchat.py eval stories15M --pte-path stories15M.pte --limit 5
+python3 torchchat.py eval stories15M --pte-path stories15M.pte --limit 5
```
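A sketch of overriding the default task selection (the `--tasks` spelling is assumed from the description above; `python3 torchchat.py eval --help` lists the exact flags):

```
# evaluate on an explicit task list instead of the wikitext default
# (--tasks is assumed here; verify the flag name with eval --help)
python3 torchchat.py eval stories15M --tasks wikitext --limit 5
```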

## Models
@@ -153,17 +153,17 @@ See the [documentation on GGUF](docs/GGUF.md) to learn how to use GGUF files.
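As an illustrative sketch of what GGUF usage looks like (the `--gguf-path` flag and file name here are assumptions; docs/GGUF.md is authoritative):

```
# load a local GGUF checkpoint for generation (illustrative; see docs/GGUF.md)
python3 torchchat.py generate --gguf-path model.gguf --prompt "Hello my name is"
```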

```
# Llama 3 8B Instruct
-python torchchat.py chat llama3 --tiktoken
+python3 torchchat.py chat llama3 --tiktoken
```

```
# Stories 15M
-python torchchat.py chat stories15M
+python3 torchchat.py chat stories15M
```

```
# CodeLlama 7B for Python
-python torchchat.py chat codellama
+python3 torchchat.py chat codellama
```

## Desktop Execution
@@ -175,10 +175,10 @@ AOT compiles models into machine code before execution, enhancing performance an
The following example uses the Stories15M model.
```
# Compile
-python torchchat.py export stories15M --output-dso-path stories15M.so
+python3 torchchat.py export stories15M --output-dso-path stories15M.so

# Execute
-python torchchat.py generate --dso-path stories15M.so --prompt "Hello my name is"
+python3 torchchat.py generate --dso-path stories15M.so --prompt "Hello my name is"
```

NOTE: The exported model will be large. We suggest you quantize the model (explained further down) before deploying it on device.
@@ -190,10 +190,10 @@ ExecuTorch enables you to optimize your model for execution on a mobile or embed
The following example uses the Stories15M model.
```
# Compile
-python torchchat.py export stories15M --output-pte-path stories15M.pte
+python3 torchchat.py export stories15M --output-pte-path stories15M.pte

# Execute
-python torchchat.py generate --device cpu --pte-path stories15M.pte --prompt "Hello my name is"
+python3 torchchat.py generate --device cpu --pte-path stories15M.pte --prompt "Hello my name is"
```

See below under Mobile Execution if you want to deploy and execute a model in your iOS or Android app.