Commit fab647e

server : add cache reuse card link to help (#13230)
* server : add cache reuse card link to help
* args : use short url
1 parent dcf8860 commit fab647e

2 files changed: +5 -2 lines changed

common/arg.cpp

Lines changed: 4 additions & 1 deletion
@@ -2783,7 +2783,10 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
         ).set_examples({LLAMA_EXAMPLE_SERVER}).set_env("LLAMA_ARG_THREADS_HTTP"));
     add_opt(common_arg(
         {"--cache-reuse"}, "N",
-        string_format("min chunk size to attempt reusing from the cache via KV shifting (default: %d)", params.n_cache_reuse),
+        string_format(
+            "min chunk size to attempt reusing from the cache via KV shifting (default: %d)\n"
+            "[(card)](https://ggml.ai/f0.png)", params.n_cache_reuse
+        ),
         [](common_params & params, int value) {
             params.n_cache_reuse = value;
         }
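
The new help text in the hunk above relies on two adjacent string literals being joined by the compiler into a single printf-style format string before `string_format` sees it. A minimal sketch of that behavior follows. `format_cache_reuse_help` is a hypothetical stand-in written with `std::snprintf`; it is not the project's `string_format`, whose implementation is not shown in this commit.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the commit): builds the same help text
// as the changed --cache-reuse option, using std::snprintf directly.
static std::string format_cache_reuse_help(int n_cache_reuse) {
    // Adjacent string literals are concatenated at compile time, so this is
    // one format string with an embedded newline before the card link.
    static const char fmt[] =
        "min chunk size to attempt reusing from the cache via KV shifting (default: %d)\n"
        "[(card)](https://ggml.ai/f0.png)";

    // First pass: measure the formatted length (excluding the terminator).
    const int len = std::snprintf(nullptr, 0, fmt, n_cache_reuse);
    if (len < 0) {
        return std::string(); // formatting error: return an empty string
    }

    // Second pass: write into a buffer of exactly that length.
    std::string out(static_cast<size_t>(len), '\0');
    std::snprintf(&out[0], static_cast<size_t>(len) + 1, fmt, n_cache_reuse);
    return out;
}
```

Calling `format_cache_reuse_help(0)` yields the description on the first line and the `[(card)](https://ggml.ai/f0.png)` link on the second, which matches the two-line layout the README row expresses with `<br/>`.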

examples/server/README.md

Lines changed: 1 addition & 1 deletion
@@ -154,7 +154,7 @@ The project is under active development, and we are [looking for feedback and co
 | `--ssl-cert-file FNAME` | path to file a PEM-encoded SSL certificate<br/>(env: LLAMA_ARG_SSL_CERT_FILE) |
 | `-to, --timeout N` | server read/write timeout in seconds (default: 600)<br/>(env: LLAMA_ARG_TIMEOUT) |
 | `--threads-http N` | number of threads used to process HTTP requests (default: -1)<br/>(env: LLAMA_ARG_THREADS_HTTP) |
-| `--cache-reuse N` | min chunk size to attempt reusing from the cache via KV shifting (default: 0)<br/>(env: LLAMA_ARG_CACHE_REUSE) |
+| `--cache-reuse N` | min chunk size to attempt reusing from the cache via KV shifting (default: 0)<br/>[(card)](https://ggml.ai/f0.png)<br/>(env: LLAMA_ARG_CACHE_REUSE) |
 | `--metrics` | enable prometheus compatible metrics endpoint (default: disabled)<br/>(env: LLAMA_ARG_ENDPOINT_METRICS) |
 | `--slots` | enable slots monitoring endpoint (default: disabled)<br/>(env: LLAMA_ARG_ENDPOINT_SLOTS) |
 | `--props` | enable changing global properties via POST /props (default: disabled)<br/>(env: LLAMA_ARG_ENDPOINT_PROPS) |
