
Commit 60d83a0

update main readme (#8333)
1 parent: 87e25a1

File tree: 2 files changed, +25 −20 lines


README.md

Lines changed: 24 additions & 19 deletions
```diff
@@ -391,28 +391,21 @@ The `grammars/` folder contains a handful of sample grammars. To write your own,
 
 For authoring more complex JSON grammars, you can also check out https://grammar.intrinsiclabs.ai/, a browser app that lets you write TypeScript interfaces which it compiles to GBNF grammars that you can save for local use. Note that the app is built and maintained by members of the community, please file any issues or FRs on [its repo](http://github.com/intrinsiclabsai/gbnfgen) and not this one.
 
-### Obtaining and using the Facebook LLaMA 2 model
+## Build
 
-- Refer to [Facebook's LLaMA download page](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) if you want to access the model data.
-- Alternatively, if you want to save time and space, you can download already converted and quantized models from [TheBloke](https://huggingface.co/TheBloke), including:
-  - [LLaMA 2 7B base](https://huggingface.co/TheBloke/Llama-2-7B-GGUF)
-  - [LLaMA 2 13B base](https://huggingface.co/TheBloke/Llama-2-13B-GGUF)
-  - [LLaMA 2 70B base](https://huggingface.co/TheBloke/Llama-2-70B-GGUF)
-  - [LLaMA 2 7B chat](https://huggingface.co/TheBloke/Llama-2-7B-chat-GGUF)
-  - [LLaMA 2 13B chat](https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF)
-  - [LLaMA 2 70B chat](https://huggingface.co/TheBloke/Llama-2-70B-chat-GGUF)
+Please refer to [Build llama.cpp locally](./docs/build.md)
 
-### Seminal papers and background on the models
+## Supported backends
 
-If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
-- LLaMA:
-  - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
-  - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
-- GPT-3
-  - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
-- GPT-3.5 / InstructGPT / ChatGPT:
-  - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
-  - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
+| Backend | Target devices |
+| --- | --- |
+| [Metal](./docs/build.md#metal-build) | Apple Silicon |
+| [BLAS](./docs/build.md#blas-build) | All |
+| [BLIS](./docs/backend/BLIS.md) | All |
+| [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
+| [CUDA](./docs/build.md#cuda) | Nvidia GPU |
+| [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
+| [Vulkan](./docs/build.md#vulkan) | GPU |
 
 ## Tools
 
```
```diff
@@ -460,3 +453,15 @@ To learn more how to measure perplexity using llama.cpp, [read this documentatio
 - [Build on Android](./docs/android.md)
 - [Performance troubleshooting](./docs/token_generation_performance_tips.md)
 - [GGML tips & tricks](https://github.com/ggerganov/llama.cpp/wiki/GGML-Tips-&-Tricks)
+
+**Seminal papers and background on the models**
+
+If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
+- LLaMA:
+  - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
+  - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
+- GPT-3
+  - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
+- GPT-3.5 / InstructGPT / ChatGPT:
+  - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
+  - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
```
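The grammar-authoring paragraph retained near the top of the README describes GBNF grammars for constraining model output. A hedged sketch of how a sample grammar from `grammars/` is typically passed to the CLI, assuming the `llama-cli` binary name and `--grammar-file` flag of this era (`model.gguf` is a placeholder path, not from the commit):

```shell
# Constrain generation to valid JSON using a sample grammar from grammars/
# (binary name, flags, and model path are assumptions; verify with --help):
./llama-cli -m model.gguf \
  -p "Describe the weather as JSON:" \
  --grammar-file grammars/json.gbnf
```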

docs/build.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -85,7 +85,7 @@ Building the program with BLAS support may lead to some performance improvements
 
 ### Accelerate Framework:
 
-This is only available on Mac PCs and it's enabled by default. You can just build using the normal instructions.
+This is only available on Mac PCs and it's enabled by default. You can just build using the normal instructions.
 
 ### OpenBLAS:
 
```
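The Accelerate note in this hunk says macOS builds pick that framework up by default, while other BLAS vendors need explicit configure flags. A sketch under assumed flag names (the `docs/build.md` file this diff touches is the authoritative source):

```shell
# macOS: Accelerate is used automatically, no extra flags needed.
cmake -B build && cmake --build build --config Release

# Linux with OpenBLAS (assumed GGML_BLAS flag names; see docs/build.md):
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
cmake --build build --config Release
```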

0 commit comments
