Commit c8b7620

mscheong01 authored and ggerganov committed

examples : add "retrieval" (ggml-org#6193)

* add `retrieval` example
* add README
* minor fixes
* cast filepos on print
* remove use of variable sized array
* store similarities in separate vector
* print error on insufficient batch size
* fix error message printing
* assign n_batch value to n_ubatch
* fix param definitions
* define retrieval-only parameters in retrieval.cpp
* fix `--context-file` option to be provided multiple times for multiple files
* use vector for `query_emb`
* add usage description in README
* fix merge conflict
* fix usage printing
* remove seed setting
* fix lint
* increase file read buffer size
* retrieval : minor

Co-authored-by: Georgi Gerganov <[email protected]>

1 parent 025e2a9 commit c8b7620

File tree

10 files changed: +435 -2 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -77,6 +77,7 @@ models-mnt
 /batched-bench
 /export-lora
 /finetune
+/retrieval
 /speculative
 /parallel
 /train-text-from-scratch
```

Makefile

Lines changed: 5 additions & 1 deletion

```diff
@@ -2,7 +2,7 @@
 BUILD_TARGETS = \
 	main quantize quantize-stats perplexity imatrix embedding vdot q8dot train-text-from-scratch convert-llama2c-to-ggml \
 	simple batched batched-bench save-load-state server gguf gguf-split llama-bench libllava.a llava-cli baby-llama beam-search \
-	speculative infill tokenize benchmark-matmult parallel finetune export-lora lookahead lookup passkey gritlm tests/test-c.o
+	retrieval speculative infill tokenize benchmark-matmult parallel finetune export-lora lookahead lookup passkey gritlm tests/test-c.o

 # Binaries only useful for tests
 TEST_TARGETS = \
@@ -810,6 +810,10 @@ export-lora: examples/export-lora/export-lora.cpp ggml.o common/common.h $(OBJS)
 	$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
 	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

+retrieval: examples/retrieval/retrieval.cpp ggml.o llama.o $(COMMON_DEPS) $(OBJS)
+	$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
+	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
+
 speculative: examples/speculative/speculative.cpp ggml.o llama.o $(COMMON_DEPS) grammar-parser.o $(OBJS)
 	$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
 	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
```

common/common.cpp

Lines changed: 1 addition & 1 deletion

```diff
@@ -157,7 +157,7 @@ bool gpt_params_parse(int argc, char ** argv, gpt_params & params) {
     return result;
 }

-static bool gpt_params_find_arg(int argc, char ** argv, const std::string & arg, gpt_params & params, int & i, bool & invalid_param) {
+bool gpt_params_find_arg(int argc, char ** argv, const std::string & arg, gpt_params & params, int & i, bool & invalid_param) {
     llama_sampling_params & sparams = params.sparams;

     if (arg == "-s" || arg == "--seed") {
```

common/common.h

Lines changed: 2 additions & 0 deletions

```diff
@@ -171,6 +171,8 @@ bool gpt_params_parse(int argc, char ** argv, gpt_params & params);

 void gpt_print_usage(int argc, char ** argv, const gpt_params & params);

+bool gpt_params_find_arg(int argc, char ** argv, const std::string & arg, gpt_params & params, int & i, bool & invalid_param);
+
 std::string get_system_info(const gpt_params & params);

 std::string gpt_random_prompt(std::mt19937 & rng);
```
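Dropping `static` and declaring `gpt_params_find_arg` in the header lets an example binary handle its own flags first and fall back to the shared parser for everything else (the commit message notes that retrieval-only parameters are defined in `retrieval.cpp` itself). A minimal self-contained sketch of that pattern follows; `gpt_params`, the `gpt_params_find_arg` body, and `retrieval_params` here are simplified stand-ins, not the real llama.cpp definitions:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the real gpt_params in common/common.h.
struct gpt_params {
    int seed = -1;
};

// Simplified stand-in for the shared parser exported by this commit:
// returns true if it recognized `arg`, advancing `i` past any consumed value.
static bool gpt_params_find_arg(int argc, char ** argv, const std::string & arg,
                                gpt_params & params, int & i, bool & invalid_param) {
    if (arg == "-s" || arg == "--seed") {
        if (++i >= argc) { invalid_param = true; return true; }
        params.seed = std::stoi(argv[i]);
        return true;
    }
    return false; // not a common flag
}

// Example-only parameters, defined in the example source itself.
struct retrieval_params {
    std::vector<std::string> context_files;
    int chunk_size = 64;
};

// Try example-specific flags first, then fall back to the common parser.
static bool parse_args(int argc, char ** argv, gpt_params & gparams, retrieval_params & rparams) {
    bool invalid_param = false;
    for (int i = 1; i < argc; i++) {
        std::string arg = argv[i];
        if (arg == "--context-file") {
            if (++i >= argc) return false;
            rparams.context_files.push_back(argv[i]); // may be given multiple times
        } else if (arg == "--chunk-size") {
            if (++i >= argc) return false;
            rparams.chunk_size = std::stoi(argv[i]);
        } else if (!gpt_params_find_arg(argc, argv, arg, gparams, i, invalid_param)) {
            return false; // unknown option
        }
        if (invalid_param) return false;
    }
    return true;
}
```

The design choice is to keep `common/` free of per-example options while still reusing its parsing of the shared flags.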

common/log.h

Lines changed: 1 addition & 0 deletions

```diff
@@ -566,6 +566,7 @@ inline void log_print_usage()
     printf(" --log-new              Create a separate new log file on start. "
            "Each log file will have unique name: \"<name>.<ID>.log\"\n");
     printf(" --log-append           Don't truncate the old log file.\n");
+    printf("\n");
 }

 #define log_dump_cmdline(argc, argv) log_dump_cmdline_impl(argc, argv)
```

examples/CMakeLists.txt

Lines changed: 1 addition & 0 deletions

```diff
@@ -34,6 +34,7 @@ else()
     add_subdirectory(perplexity)
     add_subdirectory(quantize)
     add_subdirectory(quantize-stats)
+    add_subdirectory(retrieval)
    add_subdirectory(save-load-state)
     add_subdirectory(simple)
     add_subdirectory(passkey)
```

examples/retrieval/CMakeLists.txt

Lines changed: 5 additions & 0 deletions

```diff
@@ -0,0 +1,5 @@
+set(TARGET retrieval)
+add_executable(${TARGET} retrieval.cpp)
+install(TARGETS ${TARGET} RUNTIME)
+target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
+target_compile_features(${TARGET} PRIVATE cxx_std_11)
```

examples/retrieval/README.md

Lines changed: 69 additions & 0 deletions

````diff
@@ -0,0 +1,69 @@
+# llama.cpp/examples/retrieval
+
+Demonstration of a simple retrieval technique based on cosine similarity
+
+More info:
+https://github.com/ggerganov/llama.cpp/pull/6193
+
+### How to use
+
+`retrieval.cpp` has parameters of its own:
+- `--context-file`: file to be embedded - state this option multiple times to embed multiple files
+- `--chunk-size`: minimum size of each text chunk to be embedded
+- `--chunk-separator`: STRING to divide chunks by. newline by default
+
+The `retrieval` example can be tested as follows:
+
+```bash
+make -j && ./retrieval --model ./models/bge-base-en-v1.5-f16.gguf --top-k 3 --context-file README.md --context-file License --chunk-size 100 --chunk-separator .
+```
+
````
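The `--chunk-size`/`--chunk-separator` behavior described in the README can be sketched as follows: split the text on the separator, then merge pieces until each chunk reaches the minimum size. This is an illustrative stand-in (`chunk_text` is a hypothetical name), not the actual code in `retrieval.cpp`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split `text` on `separator`, then greedily merge the pieces until each
// chunk is at least `chunk_size` characters (the last chunk may be shorter).
static std::vector<std::string> chunk_text(const std::string & text,
                                           const std::string & separator,
                                           size_t chunk_size) {
    std::vector<std::string> chunks;
    std::string current;
    size_t pos = 0;
    while (pos < text.size()) {
        size_t next = text.find(separator, pos);
        if (next == std::string::npos) next = text.size();
        current += text.substr(pos, next - pos);
        pos = next + separator.size();
        if (current.size() >= chunk_size) {
            chunks.push_back(current);
            current.clear();
        } else if (pos < text.size()) {
            current += separator; // keep the separator inside an unfinished chunk
        }
    }
    if (!current.empty()) chunks.push_back(current);
    return chunks;
}
```

With `chunk_text("aaaa.bbbb.cc", ".", 5)` this yields two chunks, `"aaaa.bbbb"` and `"cc"`, matching the "minimum size" wording above.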
````diff
+This chunks and embeds all given files and starts a loop requesting query inputs:
+
+```
+Enter query:
+```
+
+On each query input, the top k chunks are shown along with the file name, the chunk position within the file, and the original text:
+
+```
+Enter query: describe the mit license
+batch_decode: n_tokens = 6, n_seq = 1
+Top 3 similar chunks:
+filename: README.md
+filepos: 119
+similarity: 0.762334
+textdata:
+png)
+
+[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
+
+[Roadmap](https://github.
+--------------------
+filename: License
+filepos: 0
+similarity: 0.725146
+textdata:
+MIT License
+
+Copyright (c) 2023 Georgi Gerganov
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+--------------------
+filename: README.md
+filepos: 9178
+similarity: 0.621722
+textdata:
+com/cztomsik/ava) (MIT)
+- [ptsochantaris/emeltal](https://github.com/ptsochantaris/emeltal)
+- [pythops/tenere](https://github.
+--------------------
+```
````
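The ranking in the output above is the cosine-similarity retrieval the README names: score every chunk embedding against the query embedding, sort, and keep the top k. A simplified self-contained sketch (not the example's actual code; real embeddings come from the model):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Cosine similarity between two embedding vectors of equal length.
static float cosine_similarity(const std::vector<float> & a, const std::vector<float> & b) {
    float dot = 0.0f, na = 0.0f, nb = 0.0f;
    for (size_t i = 0; i < a.size(); i++) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    // Small epsilon guards against division by zero for all-zero vectors.
    return dot / (std::sqrt(na) * std::sqrt(nb) + 1e-6f);
}

// Score every chunk embedding against the query embedding and return the
// indices of the top-k most similar chunks, best first.
static std::vector<size_t> top_k_chunks(const std::vector<float> & query,
                                        const std::vector<std::vector<float>> & chunks,
                                        size_t k) {
    std::vector<std::pair<float, size_t>> scored;
    for (size_t i = 0; i < chunks.size(); i++) {
        scored.emplace_back(cosine_similarity(query, chunks[i]), i);
    }
    std::sort(scored.begin(), scored.end(),
              [](const std::pair<float, size_t> & x, const std::pair<float, size_t> & y) {
                  return x.first > y.first;
              });
    std::vector<size_t> idx;
    for (size_t i = 0; i < std::min(k, scored.size()); i++) {
        idx.push_back(scored[i].second);
    }
    return idx;
}
```

Sorting all scores is O(n log n); for large corpora a partial sort or a heap of size k would do, but for an example-sized corpus the simple sort is clearest.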
