README.md (24 additions, 19 deletions)
@@ -425,28 +425,21 @@ The `grammars/` folder contains a handful of sample grammars. To write your own,
For authoring more complex JSON grammars, you can also check out https://grammar.intrinsiclabs.ai/, a browser app that lets you write TypeScript interfaces which it compiles to GBNF grammars that you can save for local use. Note that the app is built and maintained by members of the community; please file any issues or feature requests on [its repo](https://github.com/intrinsiclabsai/gbnfgen) and not this one.

-### Obtaining and using the Facebook LLaMA 2 model
-
-- Refer to [Facebook's LLaMA download page](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) if you want to access the model data.
-- Alternatively, if you want to save time and space, you can download already converted and quantized models from [TheBloke](https://huggingface.co/TheBloke), including:
-
-### Seminal papers and background on the models
-
-If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
-- LLaMA:
-    - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
-    - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
-- GPT-3
-    - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
-- GPT-3.5 / InstructGPT / ChatGPT:
-    - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
-    - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
+## Build
+
+Please refer to [Build llama.cpp locally](./docs/build.md)
+
+## Supported backends
+
+| Backend | Target devices |
+| --- | --- |
+| [Metal](./docs/build.md#metal-build) | Apple Silicon |
+| [BLAS](./docs/build.md#blas-build) | All |
+| [BLIS](./docs/backend/BLIS.md) | All |
+| [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
+| [CUDA](./docs/build.md#cuda) | Nvidia GPU |
+| [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
+| [Vulkan](./docs/build.md#vulkan) | GPU |

## Tools
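The `## Build` section added above points to `docs/build.md` rather than carrying build instructions inline. For orientation, a minimal from-source CMake build is sketched here; the repository URL and the backend option name in the comments are assumptions and should be verified against the linked build docs, which remain the authoritative reference.

```sh
# Minimal sketch: build llama.cpp from source with CMake (CPU-only by default).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Configure, then compile an optimized build.
cmake -B build
cmake --build build --config Release

# Backends from the table above are enabled through CMake options documented in
# docs/build.md, e.g. an (assumed) CUDA toggle:
#   cmake -B build -DGGML_CUDA=ON
```

The resulting binaries typically land under `build/bin/`.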
@@ -492,3 +485,15 @@ To learn more how to measure perplexity using llama.cpp, [read this documentatio
+If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
+- LLaMA:
+    - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
+    - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
+- GPT-3
+    - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
+- GPT-3.5 / InstructGPT / ChatGPT:
+    - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
+    - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)