
server : add cache reuse card link to help #13230


Merged (2 commits) on May 2, 2025


ggerganov
Member

@ggerganov ggerganov commented May 1, 2025

Add a link to a visual card in the help for the --cache-reuse argument:

$ llama-server --help

--cache-reuse N                         min chunk size to attempt reusing from the cache via KV shifting
                                        (default: 0)
                                        [(card)](https://ggml.ai/f0.png)
                                        (env: LLAMA_ARG_CACHE_REUSE)

This link will open the following image:

[image: cache reuse card]

It could be useful to click on it from the terminal to get some extra information about how the feature works. Maybe we can improve the help by adding more cards like this.
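To make the help text a bit more concrete, here is a rough Python sketch of what "min chunk size to attempt reusing from the cache" could mean. It is an illustration only, not the llama.cpp code: the function name `plan_cache_reuse`, the token lists, and the returned tuple layout are all hypothetical, and it assumes that a value of 0 (the default) disables the feature. It only plans which cached chunks match the new prompt; the actual KV shifting is not modeled.

```python
def plan_cache_reuse(cached, prompt, min_chunk):
    """Illustrative sketch (hypothetical helper, not the llama.cpp implementation).

    Returns (n_prefix, chunks), where n_prefix is the length of the common
    prefix and chunks is a list of (cache_pos, prompt_pos, length) tuples
    describing cached runs that could be shifted into place and reused.
    """
    # The common prefix is reusable as-is, with no shifting needed.
    n_prefix = 0
    while (n_prefix < len(cached) and n_prefix < len(prompt)
           and cached[n_prefix] == prompt[n_prefix]):
        n_prefix += 1

    chunks = []
    if min_chunk <= 0:
        # Assumption: 0 means "do not attempt chunk reuse".
        return n_prefix, chunks

    head_c = head_p = n_prefix  # scan positions in the cache and the prompt
    while head_c < len(cached) and head_p < len(prompt):
        # Count how many tokens match starting at the two heads.
        n_match = 0
        while (head_c + n_match < len(cached)
               and head_p + n_match < len(prompt)
               and cached[head_c + n_match] == prompt[head_p + n_match]):
            n_match += 1

        if n_match >= min_chunk:
            # Long enough: this cached run would be shifted to prompt_pos.
            chunks.append((head_c, head_p, n_match))
            head_c += n_match
            head_p += n_match
        else:
            # Too short to be worth reusing: skip one cached token.
            head_c += 1

    return n_prefix, chunks
```

For example, with a cache of `[1, 2, 3, 4, 5, 6, 7, 8]`, a new prompt of `[1, 2, 5, 6, 7, 8, 9]`, and `min_chunk=3`, the sketch reports a common prefix of 2 tokens plus one reusable chunk of 4 tokens that moves from cache position 4 to prompt position 2. With `min_chunk=5` the same run is considered too short and nothing beyond the prefix is reused.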

@ericcurtin
Collaborator

Cool, I've learned what cache reuse is now 😄

@ngxson
Collaborator

ngxson commented May 1, 2025

I'm a bit worried that the image will take up a large amount of space when rendered in the server/README file. I think leaving just a link is fine.

Also, could we host the image (or set up a proxy) on the ggml.ai domain to make the URL shorter?

@ggerganov
Member Author

ggerganov commented May 1, 2025

> I'm a bit worried that the image will take up a large amount of space when rendered in the server/README file. I think leaving just a link is fine.

The readme only has the link - it does not render the image currently. I agree rendering the images would be too heavy.

image

> Also, could we host the image (or set up a proxy) on the ggml.ai domain to make the URL shorter?

Good idea: https://ggml.ai/f0.png

@ggerganov ggerganov merged commit fab647e into master May 2, 2025
48 checks passed
@ggerganov ggerganov deleted the gg/args-cache-reuse-card branch May 2, 2025 06:48