
Recognize IBM Granite 3.3 FIM tokens. Makes llama-server /infill usable. #12988


Merged (1 commit) on Apr 17, 2025

Conversation

@Noeda (Contributor) commented on Apr 17, 2025

The model in question, freshly released by IBM: https://huggingface.co/ibm-granite/granite-3.3-8b-base

Granite 3.3's FIM tokens are very similar to Qwen's; they just use an underscore instead of a dash: <fim_middle>, for example, instead of <fim-middle>.

Opening up tokenizer_config.json in ibm-granite/granite-3.3-8b-base (https://huggingface.co/ibm-granite/granite-3.3-8b-base/blob/main/tokenizer_config.json) shows:

    "<fim_prefix>",
    "<fim_middle>",
    "<fim_suffix>",
    "<fim_pad>",
    ...
    "<reponame>",

Tested with a Granite 3.3 model I converted to .gguf. I noticed the llama.cpp code-completion Vim extension didn't work with llama-server, so I checked whether the FIM tokens were missing, added them, tested them, and filed this PR.

I could not find an equivalent for the file-separator token, but I mapped the five tokens I found that had clear llama.cpp equivalents.


Testing:

Checked tokenization (i.e., that llama.cpp tokenizes each of them to a single token):

$ ./build/bin/llama-tokenize --model ~/text-generation-webui/models/granite-3.3-base-q8.gguf --prompt "<fim_prefix><fim_middle><fim_suffix><fim_pad><reponame>"
... omitted verbose output ...
     1 -> '<fim_prefix>'
     2 -> '<fim_middle>'
     3 -> '<fim_suffix>'
     4 -> '<fim_pad>'
    18 -> '<reponame>'
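For reference, an infill prompt built from these tokens would look roughly like this (a sketch assuming the common prefix-suffix-middle ordering; the exact template llama-server assembles for /infill may differ):

```python
# Sketch only: assumes PSM (prefix-suffix-middle) ordering, where the
# model generates the "middle" after seeing the <fim_middle> token.
def granite_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = granite_fim_prompt("int main() {\n    ", "\n    return 0;\n}")
```

With the tokens recognized as special, each marker tokenizes to a single token rather than being split into pieces.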

Also saw: [screenshot from 2025-04-16 18:25, omitted]

And empirically tried it while coding with the extension:

[video: code_example_granite.mp4]

(I thought it would offer printf() instead... C++ bias? 😉) I love that extension.

@ExtReMLapin (Contributor) commented:

It would be nice to have a list somewhere of models with FIM support.

@ggerganov ggerganov merged commit 971f245 into ggml-org:master Apr 17, 2025
48 of 51 checks passed
colout pushed a commit to colout/llama.cpp that referenced this pull request Apr 21, 2025