- PHP (API bindings and features built on top of llama.cpp): [distantmagic/resonance](https://github.com/distantmagic/resonance) [(more info)](https://github.com/ggerganov/llama.cpp/pull/6326)
@@ -391,28 +392,21 @@ The `grammars/` folder contains a handful of sample grammars. To write your own,
For authoring more complex JSON grammars, you can also check out https://grammar.intrinsiclabs.ai/, a browser app that lets you write TypeScript interfaces which it compiles to GBNF grammars that you can save for local use. Note that the app is built and maintained by members of the community; please file any issues or feature requests on [its repo](http://github.com/intrinsiclabsai/gbnfgen) and not this one.
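As a rough sketch of what a hand-written grammar and its use look like (the yes/no grammar, file name, and model path below are illustrative, and binary/flag names may differ between llama.cpp versions):

```bash
# Illustrative only: a tiny GBNF grammar that restricts output to "yes" or "no".
cat > yesno.gbnf <<'EOF'
root ::= "yes" | "no"
EOF

# Constrain generation with the grammar (older builds use ./main instead of
# llama-cli; check --help for the grammar flags available in your build).
./llama-cli -m ./models/your-model.gguf --grammar-file yesno.gbnf -p "Is water wet? Answer:"
```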
-### Obtaining and using the Facebook LLaMA 2 model
-
-- Refer to [Facebook's LLaMA download page](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) if you want to access the model data.
-
-- Alternatively, if you want to save time and space, you can download already converted and quantized models from [TheBloke](https://huggingface.co/TheBloke), including:
-
-### Seminal papers and background on the models
-
-If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
-- LLaMA:
-  - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
-  - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
-- GPT-3
-  - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
-- GPT-3.5 / InstructGPT / ChatGPT:
-  - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
-  - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
+## Build
+
+Please refer to [Build llama.cpp locally](./docs/build.md)
+
+## Supported backends
+
+| Backend | Target devices |
+| --- | --- |
+| [Metal](./docs/build.md#metal-build) | Apple Silicon |
+| [BLAS](./docs/build.md#blas-build) | All |
+| [BLIS](./docs/backend/BLIS.md) | All |
+| [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
+| [CUDA](./docs/build.md#cuda) | Nvidia GPU |
+| [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
+| [Vulkan](./docs/build.md#vulkan) | GPU |
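For orientation, a plain CPU-only build with CMake looks roughly like the sketch below; [Build llama.cpp locally](./docs/build.md) remains the authoritative guide, including the exact options for the backends in the table above.

```bash
# Minimal CPU-only build sketch. Backend-specific builds (Metal, CUDA, Vulkan,
# etc.) add extra CMake flags whose exact names depend on the llama.cpp version;
# see docs/build.md for the current ones.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
```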
## Tools
@@ -460,3 +454,15 @@ To learn more how to measure perplexity using llama.cpp, [read this documentatio
If your issue is with model generation quality, then please at least scan the following links and papers to understand the limitations of LLaMA models. This is especially important when choosing an appropriate model size and appreciating both the significant and subtle differences between LLaMA models and ChatGPT:
+- LLaMA:
+  - [Introducing LLaMA: A foundational, 65-billion-parameter large language model](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
+  - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
+- GPT-3
+  - [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
+- GPT-3.5 / InstructGPT / ChatGPT:
+  - [Aligning language models to follow instructions](https://openai.com/research/instruction-following)
+  - [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)