
convert : fix context length for nomic-embed-text-v2-moe #13216

Merged: 1 commit merged into master on May 2, 2025

Conversation

cebtenzzre (Collaborator)

As noted by ggerganov, nomic-embed-text-v2-moe is correctly documented to be trained with up to 512 tokens of context, so the hardcoded value of 2048 used in the convert script is not accurate.

With this change, nomic-embed-text-v1 and v1.5 still convert with context_length=2048, and nomic-embed-text-v2-moe now converts with context_length=512.
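A minimal sketch of the kind of override described above, written in the style of llama.cpp's convert_hf_to_gguf.py. The class layout (`set_gguf_parameters`, `self.hparams`, `self.gguf_writer.add_context_length`) follows that script's conventions, but the MoE-detection key `moe_every_n_layers` and the exact branching are assumptions for illustration, not the merged patch:

```python
# Illustrative sketch only: assumes it lives inside convert_hf_to_gguf.py,
# where BertModel and the Model/GGUFWriter plumbing already exist.
class NomicBertModel(BertModel):
    def set_gguf_parameters(self):
        super().set_gguf_parameters()

        # v1 and v1.5 are served with a 2048-token context (via RoPE scaling),
        # while v2-moe is documented as trained with at most 512 tokens.
        if self.hparams.get("moe_every_n_layers", 0) > 0:  # hypothetical MoE marker
            context_length = 512
        else:
            context_length = 2048

        self.gguf_writer.add_context_length(context_length)
```

With a check along these lines, v1 and v1.5 keep the 2048-token context they are served with, while the MoE variant gets its documented 512-token training length.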

@cebtenzzre requested a review from ggerganov on Apr 30, 2025
@github-actions bot added the python (python script changes) label on Apr 30, 2025
@cebtenzzre merged commit 7d21234 into master on May 2, 2025
7 checks passed
@cebtenzzre deleted the jared/fix-nomic-embed-v2-nctx branch on May 2, 2025
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 2, 2025
* GraniteMoEShared:
fix: Fix the input to the shared experts
fix: Cleaner (maybe more correct?) splitting for gate/up
feat: First WIP cut at model arch in cpp
fix: Split MoE fused tensors for shared experts in conversion
feat: hparam and arch plumbing for granitemoeshared
feat: Add GGUF conversion for granitemoeshared
llama-model : support Qwen2 embedding models and pooling_mode_lasttoken (ggml-org#13245)
convert : use correct context length for nomic-embed-text-v2 (ggml-org#13216)
Labels: python (python script changes)
2 participants