
Commit 834695f

Minor: Readme fixed grammar, spelling, and misc updates (#1071)
1 parent f7d0509 commit 834695f

1 file changed: +30 -33 lines changed

README.md

Lines changed: 30 additions & 33 deletions
@@ -9,7 +9,7 @@ Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
 
 **Warnings**
 
-- `Q4_2` and `Q4_3` are still in development. Do not expect any kind of backward compatibility until they are finalize
+- `Q4_2` and `Q4_3` are still in development. Do not expect any kind of backward compatibility until they are finalized
 
 **Hot topics:**
 
@@ -19,7 +19,7 @@ Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
 
 ## Description
 
-The main goal is to run the model using 4-bit quantization on a MacBook
+The main goal of llama.cpp is to run the llama model using 4-bit quantization on a MacBook.
 
 - Plain C/C++ implementation without dependencies
 - Apple silicon first-class citizen - optimized via ARM NEON and Accelerate framework
@@ -156,7 +156,7 @@ https://user-images.githubusercontent.com/1991296/224442907-7693d4be-acaa-4e01-8
 
 ## Usage
 
-Here are the step for the LLaMA-7B model.
+Here are the steps for the LLaMA-7B model.
 
 ### Get the Code
 
@@ -214,8 +214,7 @@ When running the larger models, make sure you have enough disk space to store al
 
 ### Memory/Disk Requirements
 
-As the models are currently fully loaded into memory, you will need adequate disk space to save them
-and sufficient RAM to load them. At the moment, memory and disk requirements are the same.
+As the models are currently fully loaded into memory, you will need adequate disk space to save them and sufficient RAM to load them. At the moment, memory and disk requirements are the same.
 
 | model | original size | quantized size (4-bit) |
 |-------|---------------|------------------------|
@@ -227,18 +226,18 @@ and sufficient RAM to load them. At the moment, memory and disk requirements are
 ### Interactive mode
 
 If you want a more ChatGPT-like experience, you can run in interactive mode by passing `-i` as a parameter.
-In this mode, you can always interrupt generation by pressing Ctrl+C and enter one or more lines of text which will be converted into tokens and appended to the current context. You can also specify a *reverse prompt* with the parameter `-r "reverse prompt string"`. This will result in user input being prompted whenever the exact tokens of the reverse prompt string are encountered in the generation. A typical use is to use a prompt which makes LLaMa emulate a chat between multiple users, say Alice and Bob, and pass `-r "Alice:"`.
+In this mode, you can always interrupt generation by pressing Ctrl+C and entering one or more lines of text, which will be converted into tokens and appended to the current context. You can also specify a *reverse prompt* with the parameter `-r "reverse prompt string"`. This will result in user input being prompted whenever the exact tokens of the reverse prompt string are encountered in the generation. A typical use is to use a prompt that makes LLaMa emulate a chat between multiple users, say Alice and Bob, and pass `-r "Alice:"`.
 
-Here is an example few-shot interaction, invoked with the command
+Here is an example of a few-shot interaction, invoked with the command
 
 ```bash
-# default arguments using 7B model
+# default arguments using a 7B model
 ./examples/chat.sh
 
-# advanced chat with 13B model
+# advanced chat with a 13B model
 ./examples/chat-13B.sh
 
-# custom arguments using 13B model
+# custom arguments using a 13B model
 ./main -m ./models/13B/ggml-model-q4_0.bin -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
 ```

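The Alice/Bob reverse-prompt flow described in this hunk can be exercised directly. A minimal sketch, where the model path follows the README's conventions and the `-p` prompt text is illustrative rather than taken from the repository:

```bash
# Interactive mode (-i) with a reverse prompt (-r): generation pauses for user
# input whenever the model emits the exact tokens "Alice:".
# The prompt text below is illustrative, not a file from the repository.
./main -m ./models/7B/ggml-model-q4_0.bin -i -r "Alice:" \
    -p "Transcript of a dialog between Alice and Bob. Alice:"
```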
@@ -277,7 +276,7 @@ cadaver, cauliflower, cabbage (vegetable), catalpa (tree) and Cailleach.
 ### Using [GPT4All](https://github.com/nomic-ai/gpt4all)
 
 - Obtain the `gpt4all-lora-quantized.bin` model
-- It is distributed in the old `ggml` format which is now obsoleted
+- It is distributed in the old `ggml` format, which is now obsoleted
 - You have to convert it to the new format using [./convert-gpt4all-to-ggml.py](./convert-gpt4all-to-ggml.py). You may also need to
 convert the model from the old format to the new format with [./migrate-ggml-2023-03-30-pr613.py](./migrate-ggml-2023-03-30-pr613.py):
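The two-step conversion described above might be invoked as follows. This is a sketch only: the paths, argument order, and tokenizer argument are assumptions inferred from the script names, not verified against the scripts themselves:

```bash
# Assumed invocations; paths and argument order are illustrative.
# 1) convert the GPT4All weights to ggml, using the original LLaMA tokenizer
python3 ./convert-gpt4all-to-ggml.py ./models/gpt4all-7B/gpt4all-lora-quantized.bin ./models/tokenizer.model
# 2) migrate the resulting file to the newer ggml format
python3 ./migrate-ggml-2023-03-30-pr613.py ./models/gpt4all-7B/gpt4all-lora-quantized.bin ./models/gpt4all-7B/gpt4all-lora-quantized-new.bin
```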
@@ -291,7 +290,7 @@ convert the model from the old format to the new format with [./migrate-ggml-202
 
 ### Obtaining and verifying the Facebook LLaMA original model and Stanford Alpaca model data
 
-- **Under no circumstances share IPFS, magnet links, or any other links to model downloads anywhere in this respository, including in issues, discussions or pull requests. They will be immediately deleted.**
+- **Under no circumstances should IPFS, magnet links, or any other links to model downloads be shared anywhere in this repository, including in issues, discussions, or pull requests. They will be immediately deleted.**
 - The LLaMA models are officially distributed by Facebook and will **never** be provided through this repository.
 - Refer to [Facebook's LLaMA repository](https://github.com/facebookresearch/llama/pull/73/files) if you need to request access to the model data.
 - Please verify the [sha256 checksums](SHA256SUMS) of all downloaded model files to confirm that you have the correct model data files before creating an issue relating to your model files.
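The checksum verification mentioned in the last bullet is a one-liner; the README's macOS form appears in the next hunk, and the GNU coreutils equivalent on Linux would be:

```bash
# Run from the repository root, next to the SHA256SUMS file.
# --ignore-missing skips entries for model files you have not downloaded.
sha256sum --ignore-missing -c SHA256SUMS
```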
@@ -303,29 +302,27 @@ convert the model from the old format to the new format with [./migrate-ggml-202
 
 `shasum -a 256 --ignore-missing -c SHA256SUMS` on macOS
 
-- If your issue is with model generation quality then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
-  - LLaMA:
-    - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
-    - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
-  - GPT-3
-    - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
-  - GPT-3.5 / InstructGPT / ChatGPT:
-    - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
-    - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
+- If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
+  - LLaMA:
+    - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
+    - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
+  - GPT-3
+    - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
+  - GPT-3.5 / InstructGPT / ChatGPT:
+    - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
+    - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
 
-### Perplexity (Measuring model quality)
+### Perplexity (measuring model quality)
 
-You can use the `perplexity` example to measure perplexity over the given prompt. For more background,
-see https://huggingface.co/docs/transformers/perplexity. However, in general, lower perplexity is better for LLMs.
+You can use the `perplexity` example to measure perplexity over the given prompt. For more background, see [https://huggingface.co/docs/transformers/perplexity](https://huggingface.co/docs/transformers/perplexity). However, in general, lower perplexity is better for LLMs.
 
 #### Latest measurements
 
-The latest perplexity scores for the various model sizes and quantizations are being tracked in [discussion #406](https://github.com/ggerganov/llama.cpp/discussions/406). `llama.cpp` is measuring very well
-compared to the baseline implementations. Quantization has a small negative impact to quality, but, as you can see, running
+The latest perplexity scores for the various model sizes and quantizations are being tracked in [discussion #406](https://github.com/ggerganov/llama.cpp/discussions/406). `llama.cpp` is measuring very well compared to the baseline implementations. Quantization has a small negative impact on quality, but, as you can see, running
 13B at q4_0 beats the 7B f16 model by a significant amount.
 
-All measurements are done against wikitext2 test dataset (https://paperswithcode.com/dataset/wikitext-2), with default options (512 length context).
-Note that the changing the context length will have a significant impact on perplexity (longer context = better perplexity).
+All measurements are done against the wikitext2 test dataset (https://paperswithcode.com/dataset/wikitext-2), with default options (512 length context).
+Note that changing the context length will have a significant impact on perplexity (longer context = better perplexity).
 ```
 Perplexity - model options
 5.5985 - 13B, q4_0
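For reference, the `perplexity` example discussed in this hunk is typically run against the wikitext2 test file. A minimal sketch, where the model path and the `wiki.test.raw` filename are assumptions following the conventions above:

```bash
# Measure perplexity of a quantized 13B model over the wikitext2 test split.
# wiki.test.raw is assumed to be downloaded and extracted locally;
# the 512-token context mentioned above is the default.
./perplexity -m ./models/13B/ggml-model-q4_0.bin -f wiki.test.raw
```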
@@ -367,7 +364,7 @@ https://user-images.githubusercontent.com/271616/225014776-1d567049-ad71-4ef2-b0
 
 #### Prerequisites
 * Docker must be installed and running on your system.
-* Create a folder to store big models & intermediate files (in ex. im using /llama/models)
+* Create a folder to store big models & intermediate files (ex. /llama/models)
 
 #### Images
 We have two Docker images available for this project:
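Creating the example folder from the second bullet is a single command; the path is the bullet's own example:

```bash
# Create the folder used for models and intermediate files in the Docker examples.
mkdir -p /llama/models
```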
@@ -381,17 +378,17 @@ The easiest way to download the models, convert them to ggml and optimize them i
 
 Replace `/path/to/models` below with the actual path where you downloaded the models.
 
-```bash
+```bash
 docker run -v /path/to/models:/models ghcr.io/ggerganov/llama.cpp:full --all-in-one "/models/" 7B
 ```
 
-On complete, you are ready to play!
+On completion, you are ready to play!
 
 ```bash
 docker run -v /path/to/models:/models ghcr.io/ggerganov/llama.cpp:full --run -m /models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512
 ```
 
-or with light image:
+or with a light image:
 
 ```bash
 docker run -v /path/to/models:/models ghcr.io/ggerganov/llama.cpp:light -m /models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512
@@ -412,7 +409,7 @@ docker run -v /path/to/models:/models ghcr.io/ggerganov/llama.cpp:light -m /mode
 - Always consider cross-compatibility with other operating systems and architectures
 - Avoid fancy looking modern STL constructs, use basic `for` loops, avoid templates, keep it simple
 - There are no strict rules for the code style, but try to follow the patterns in the code (indentation, spaces, etc.). Vertical alignment makes things more readable and easier to batch edit
-- Clean-up any trailing whitespaces, use 4 spaces indentation, brackets on same line, `void * ptr`, `int & a`
+- Clean-up any trailing whitespaces, use 4 spaces for indentation, brackets on the same line, `void * ptr`, `int & a`
 - See [good first issues](https://github.com/ggerganov/llama.cpp/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) for tasks suitable for first contributions
 
 ### Docs
