Commit 827e454

ngxson and Neo Zhang authored and committed
update main readme (ggml-org#8333)
1 parent 9fe6eb4 commit 827e454

2 files changed: 25 additions, 20 deletions


README.md

Lines changed: 24 additions & 19 deletions
@@ -425,28 +425,21 @@ The `grammars/` folder contains a handful of sample grammars. To write your own,
 
 For authoring more complex JSON grammars, you can also check out https://grammar.intrinsiclabs.ai/, a browser app that lets you write TypeScript interfaces which it compiles to GBNF grammars that you can save for local use. Note that the app is built and maintained by members of the community, please file any issues or FRs on [its repo](http://github.com/intrinsiclabsai/gbnfgen) and not this one.
 
-### Obtaining and using the Facebook LLaMA 2 model
+## Build
 
-- Refer to [Facebook's LLaMA download page](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) if you want to access the model data.
-- Alternatively, if you want to save time and space, you can download already converted and quantized models from [TheBloke](https://huggingface.co/TheBloke), including:
-  - [LLaMA 2 7B base](https://huggingface.co/TheBloke/Llama-2-7B-GGUF)
-  - [LLaMA 2 13B base](https://huggingface.co/TheBloke/Llama-2-13B-GGUF)
-  - [LLaMA 2 70B base](https://huggingface.co/TheBloke/Llama-2-70B-GGUF)
-  - [LLaMA 2 7B chat](https://huggingface.co/TheBloke/Llama-2-7B-chat-GGUF)
-  - [LLaMA 2 13B chat](https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF)
-  - [LLaMA 2 70B chat](https://huggingface.co/TheBloke/Llama-2-70B-chat-GGUF)
+Please refer to [Build llama.cpp locally](./docs/build.md)
 
-### Seminal papers and background on the models
+## Supported backends
 
-If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
-- LLaMA:
-  - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
-  - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
-- GPT-3
-  - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
-- GPT-3.5 / InstructGPT / ChatGPT:
-  - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
-  - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
+| Backend | Target devices |
+| --- | --- |
+| [Metal](./docs/build.md#metal-build) | Apple Silicon |
+| [BLAS](./docs/build.md#blas-build) | All |
+| [BLIS](./docs/backend/BLIS.md) | All |
+| [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
+| [CUDA](./docs/build.md#cuda) | Nvidia GPU |
+| [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
+| [Vulkan](./docs/build.md#vulkan) | GPU |
 
 ## Tools
 
@@ -492,3 +485,15 @@ To learn more how to measure perplexity using llama.cpp, [read this documentatio
 - [Build on Android](./docs/android.md)
 - [Performance troubleshooting](./docs/token_generation_performance_tips.md)
 - [GGML tips & tricks](https://github.com/ggerganov/llama.cpp/wiki/GGML-Tips-&-Tricks)
+
+**Seminal papers and background on the models**
+
+If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
+- LLaMA:
+  - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
+  - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
+- GPT-3
+  - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
+- GPT-3.5 / InstructGPT / ChatGPT:
+  - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
+  - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
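
The grammar note kept as context in the README diff above points to https://grammar.intrinsiclabs.ai/, which compiles TypeScript interfaces into GBNF grammars for constraining llama.cpp output. As a rough, hypothetical sketch of the kind of input that workflow assumes (the `BugReport` interface and its field names below are made up for illustration; they are not part of this commit or the gbnfgen app), an interface and a value it accepts might look like this:

```typescript
// Hypothetical example: a TypeScript interface of the kind the gbnfgen
// browser app turns into a GBNF grammar. A grammar derived from this shape
// is meant to restrict the model to emitting JSON objects of this form.
interface BugReport {
  title: string;
  severity: "low" | "medium" | "high"; // literal union -> fixed set of allowed strings
  reproducible: boolean;
  steps: string[]; // array of free-form strings
}

// A JSON value that type-checks against the interface and that a grammar
// generated from it should accept.
const example: BugReport = {
  title: "Crash when context exceeds 4096 tokens",
  severity: "high",
  reproducible: true,
  steps: ["load model", "send long prompt", "observe abort"],
};

console.log(JSON.stringify(example, null, 2));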

docs/build.md

Lines changed: 1 addition & 1 deletion
@@ -85,7 +85,7 @@ Building the program with BLAS support may lead to some performance improvements
 
 ### Accelerate Framework:
 
-This is only available on Mac PCs and it's enabled by default. You can just build using the normal instructions.
+This is only available on Mac PCs and it's enabled by default. You can just build using the normal instructions.
 
 ### OpenBLAS:
 