Adding support for llama2.c models #2559


Merged: 32 commits, Aug 11, 2023
Commits
f451983
first crack at lamma2.c model conversion
Jul 25, 2023
78f8e4d
add the new example directory in gitignore
Jul 25, 2023
a901996
WIP: super not working attempt atm. will update as I learn more ggml :D
Jul 25, 2023
912fc59
Updated makefile to compile rough tests
Jul 28, 2023
485e62b
Adding a doc that shows mappings that are coded in between llama.c <-…
Jul 28, 2023
cc5c67b
adding the rough attempt to convert the model
Jul 28, 2023
b3aa107
saving the file with all the variables found in llama.c model
Jul 28, 2023
af9caca
updating makefile to compile finalized version
Jul 28, 2023
817cc20
updating gitignore to ignore additional binaries
Jul 28, 2023
5a87675
output vector is not part of llama.c model file
Jul 28, 2023
aebccdb
fixing bug that didnt unroll the 1d karpathy arrays
Jul 31, 2023
f1c03f4
more bug fixn
Jul 31, 2023
df659f6
cleaning up code a little bit with removing extra printfs needed duri…
Aug 2, 2023
ff9fae5
updating makefile so test scripts are not compiled
Aug 8, 2023
2a0138e
updating readme for instructions for compilation and use
Aug 8, 2023
9a09e64
minor spacing update
Aug 8, 2023
3c0c155
Merge branch 'ggerganov:master' into master
byte-6174 Aug 8, 2023
223ddb7
updating makefile so my initial tests are not compiled
Aug 8, 2023
088eb86
updating gitignore
Aug 8, 2023
08e9433
cleaning up some earlier files used for experiments
Aug 8, 2023
5520876
cleaning up Makefile empty space before mearge
Aug 8, 2023
d14c066
cleaning up to remove spaces and satisfy failed checks
Aug 9, 2023
7b1f062
adding add_subdirectory in examples dir CMakeLists.txt
Aug 9, 2023
7d0404c
adding newline in readme
Aug 9, 2023
afb8f6e
removing 1 whitespace
Aug 9, 2023
40a51ec
adding CMakeLists.txt file in the conversion script directory
Aug 9, 2023
a3fa0ab
for got to add newline
Aug 9, 2023
db5d7ab
Adding more information in the README to use conversion tool.
Aug 10, 2023
aab15de
commandline argument changes for clarity.
Aug 10, 2023
d2b95e7
refactor vocab loading into its own method
jrudolph Aug 10, 2023
aa26201
also support loading from llama2.c vocabulary
jrudolph Aug 10, 2023
52801c0
Merge pull request #1 from jrudolph/convert-llama2-vocab
byte-6174 Aug 10, 2023
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,6 +1,7 @@
*.o
*.a
*.so
*.bin
.DS_Store
.build/
.cache/
@@ -39,6 +40,7 @@ models-mnt
/perplexity
/embedding
/train-text-from-scratch
/convert-llama2c-to-ggml
/simple
/benchmark-matmult
/vdot
7 changes: 5 additions & 2 deletions Makefile
@@ -1,5 +1,5 @@
# Define the default target now so that it is always the first target
-BUILD_TARGETS = main quantize quantize-stats perplexity embedding vdot train-text-from-scratch simple server embd-input-test
+BUILD_TARGETS = main quantize quantize-stats perplexity embedding vdot train-text-from-scratch convert-llama2c-to-ggml simple server embd-input-test

# Binaries only useful for tests
TEST_TARGETS = tests/test-double-float tests/test-grad0 tests/test-opt tests/test-quantize-fns tests/test-quantize-perf tests/test-sampling tests/test-tokenizer-0
@@ -350,7 +350,7 @@ libllama.so: llama.o ggml.o $(OBJS)
$(CXX) $(CXXFLAGS) -shared -fPIC -o $@ $^ $(LDFLAGS)

clean:
-	rm -vf *.o *.so *.dll main quantize quantize-stats perplexity embedding benchmark-matmult save-load-state server simple vdot train-text-from-scratch embd-input-test build-info.h $(TEST_TARGETS)
+	rm -vf *.o *.so *.dll main quantize quantize-stats perplexity embedding benchmark-matmult save-load-state server simple vdot train-text-from-scratch convert-llama2c-to-ggml embd-input-test build-info.h $(TEST_TARGETS)

#
# Examples
@@ -393,6 +393,9 @@ embd-input-test: $(LIB_PRE)embdinput$(DSO_EXT) examples/embd-input/embd-input-te
train-text-from-scratch: examples/train-text-from-scratch/train-text-from-scratch.cpp build-info.h ggml.o llama.o $(OBJS)
$(CXX) $(CXXFLAGS) $(filter-out %.h,$^) -o $@ $(LDFLAGS)

convert-llama2c-to-ggml: examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp build-info.h ggml.o llama.o $(OBJS)
$(CXX) $(CXXFLAGS) $(filter-out %.h,$^) -o $@ $(LDFLAGS)

build-info.h: $(wildcard .git/index) scripts/build-info.sh
@sh scripts/build-info.sh > $@.tmp
@if ! cmp -s $@.tmp $@; then \
1 change: 1 addition & 0 deletions examples/CMakeLists.txt
@@ -42,6 +42,7 @@ else()
add_subdirectory(benchmark)
add_subdirectory(baby-llama)
add_subdirectory(train-text-from-scratch)
add_subdirectory(convert-llama2c-to-ggml)
add_subdirectory(simple)
add_subdirectory(embd-input)
if (LLAMA_METAL)
5 changes: 5 additions & 0 deletions examples/convert-llama2c-to-ggml/CMakeLists.txt
@@ -0,0 +1,5 @@
set(TARGET convert-llama2c-to-ggml)
add_executable(${TARGET} convert-llama2c-to-ggml.cpp)
install(TARGETS ${TARGET} RUNTIME)
target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
target_compile_features(${TARGET} PRIVATE cxx_std_11)
26 changes: 26 additions & 0 deletions examples/convert-llama2c-to-ggml/README.md
@@ -0,0 +1,26 @@
## Convert llama2.c model to ggml

This example reads weights from the [llama2.c](https://github.com/karpathy/llama2.c) project and saves them in a ggml-compatible format. By default, the vocab available in `models/ggml-vocab.bin` is used.

To convert a model, first download a checkpoint from the [llama2.c](https://github.com/karpathy/llama2.c) repository, then build this project:

`$ make -j`

After a successful build, the following usage options are available:
```
usage: ./convert-llama2c-to-ggml [options]

options:
-h, --help show this help message and exit
--copy-vocab-from-model FNAME model path from which to copy vocab (default 'models/ggml-vocab.bin')
--llama2c-model FNAME [REQUIRED] model path from which to load Karpathy's llama2.c model
  --llama2c-output-model FNAME   model path to save the converted llama2.c model (default 'ak_llama_model.bin')
```

An example command is as follows:

`$ ./convert-llama2c-to-ggml --copy-vocab-from-model <ggml-vocab.bin> --llama2c-model <llama2.c model path> --llama2c-output-model <ggml output model path>`
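The same invocation can be scripted. A minimal sketch follows; all file paths are placeholders (`stories15M.bin` is assumed to be a llama2.c checkpoint you downloaded), and the converter binary must already be built:

```python
import subprocess

# Placeholder paths -- substitute your own files.
vocab_path = "models/ggml-vocab.bin"   # vocab shipped with llama.cpp
llama2c_model = "stories15M.bin"       # assumption: a llama2.c checkpoint
output_model = "ak_llama_model.bin"    # matches the converter's default

# Assemble the command line using the flags documented above.
cmd = [
    "./convert-llama2c-to-ggml",
    "--copy-vocab-from-model", vocab_path,
    "--llama2c-model", llama2c_model,
    "--llama2c-output-model", output_model,
]
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment once the binary is built
```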

Now you can run the converted model with a command like:

`$ ./main -m <ggml output model path> -p "One day, Lily met a Shoggoth" -n 500 -c 256 -eps 1e-5`
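For reference, the converter's input is Karpathy's raw checkpoint format, which (as of this PR) begins with seven little-endian int32 config fields followed by float32 weights. A minimal sketch of reading that header is below; the field names and layout are taken from llama2.c's `Config` struct and should be treated as an assumption to verify against the llama2.c version you use:

```python
import struct

# Field order as in llama2.c's Config struct (assumed layout).
CONFIG_FIELDS = ("dim", "hidden_dim", "n_layers", "n_heads",
                 "n_kv_heads", "vocab_size", "seq_len")

def read_llama2c_config(path):
    """Read the 7 x int32 header at the start of a llama2.c checkpoint."""
    with open(path, "rb") as f:
        values = struct.unpack("<7i", f.read(7 * 4))
    return dict(zip(CONFIG_FIELDS, values))

# Demo with a synthetic header (values roughly matching stories15M).
with open("/tmp/fake_llama2c.bin", "wb") as f:
    f.write(struct.pack("<7i", 288, 768, 6, 6, 6, 32000, 256))

print(read_llama2c_config("/tmp/fake_llama2c.bin"))
```

Note that `seq_len` here is why the README's example run passes `-c 256`: the toy llama2.c models are trained with a short context window.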