Creating doc automatically for supported models. #1929
Conversation
Looking very nice!
```python
        "name": "Llama",
        "url": "https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct",
    }
    BAICHUAN = {
```
Note to self: this is not supported in transformers
```python
        "url": "https://huggingface.co/CohereForAI/c4ai-command-r-plus",
    }
    DRBX = {
        "type": "drbx",
```
Suggested change:

```diff
-        "type": "drbx",
+        "type": "dbrx",
```
```python
__all__.append(Mamba)


class ModelType(enum.Enum):
    MAMBA = {
        "type": "ssm",
```
To be consistent with transformers:
https://huggingface.co/state-spaces/mamba-130m-hf/blob/main/config.json

Suggested change:

```diff
-        "type": "ssm",
+        "type": "mamba",
```
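As a hedged illustration of the suggestion, reading the `model_type` field from a checkpoint's config.json (the way transformers resolves model classes) could look like the sketch below. The config dict is abbreviated from the linked mamba-130m-hf file, and the dispatch branch is a hypothetical example, not TGI's actual code path.

```python
import json

# Abbreviated from the linked mamba-130m-hf config.json; the dispatch on
# `model_type` below is a hypothetical sketch, not TGI's real get_model().
config_json = '{"model_type": "mamba", "d_model": 768, "vocab_size": 50280}'
config_dict = json.loads(config_json)

model_type = config_dict.get("model_type")
if model_type == "mamba":
    print("dispatching to the Mamba implementation")
```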
Indeed.
This is the original one: https://huggingface.co/state-spaces/mamba-2.8b-slimpj
```python
        "name": "Mamba",
        "url": "https://huggingface.co/state-spaces/mamba-2.8b-slimpj",
    }
    GALACTICA = {
```
I think this is opt as well, based on the config file?
Probably not. I'm not familiar with the codebase, but we tend to remove a lot of ifs from modeling code, meaning that in TGI some model types which are the same in transformers are different here (like gpt_bigcode, which has models declared as gpt2 but whose modeling is actually quite different).
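To make that point concrete, here is a hypothetical illustration of a serving stack mapping one transformers `model_type` string to different internal implementations based on other config fields. The `pick_implementation` name and the `multi_query` check are assumptions for illustration, not TGI's actual dispatch table.

```python
# Hypothetical illustration: some gpt_bigcode-family checkpoints declare
# model_type "gpt2" in config.json but use multi-query attention, so a server
# may route them to a different implementation than plain gpt2. Names and
# the exact check here are illustrative, not TGI's real dispatch code.
def pick_implementation(config_dict):
    model_type = config_dict["model_type"]
    if model_type == "gpt2" and config_dict.get("multi_query", False):
        # declared as gpt2, but the modeling is quite different
        return "gpt_bigcode"
    return model_type

print(pick_implementation({"model_type": "gpt2", "multi_query": True}))
```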
```python
        "name": "Galactica",
        "url": "https://huggingface.co/facebook/galactica-120b",
    }
    SANTACODER = {
```
Note to self: not supported in transformers
```python
    MAMBA = {
        "type": "ssm",
        "name": "Mamba",
        "url": "https://huggingface.co/state-spaces/mamba-2.8b-slimpj",
```
Maybe `example_repo` or something like that?
It's just to output a link in the doc. It could be the paper page or anything. Here I'd like to keep things minimal.
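Assuming the field stays `url`, turning the enum entries into doc links could be sketched as below. The single-member `ModelType` stub mirrors the metadata shape from the diff, but the generation loop itself is a hypothetical sketch, not the PR's actual doc-generation code.

```python
import enum

# Illustrative stub mirroring the PR's metadata shape; the markdown
# generation below is a hypothetical sketch, not the PR's real code.
class ModelType(enum.Enum):
    MAMBA = {
        "type": "ssm",
        "name": "Mamba",
        "url": "https://huggingface.co/state-spaces/mamba-2.8b-slimpj",
    }

# One markdown bullet per supported model, linking to its repo (or paper page).
doc_lines = [f"- [{m.value['name']}]({m.value['url']})" for m in ModelType]
print("\n".join(doc_lines))
```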
We may want to update the mamba test to the following:

```diff
diff --git a/integration-tests/models/test_mamba.py b/integration-tests/models/test_mamba.py
index bf3701b..6559f8a 100644
--- a/integration-tests/models/test_mamba.py
+++ b/integration-tests/models/test_mamba.py
@@ -3,7 +3,7 @@ import pytest
 @pytest.fixture(scope="module")
 def fused_kernel_mamba_handle(launcher):
-    with launcher("state-spaces/mamba-130m", num_shard=1) as handle:
+    with launcher("state-spaces/mamba-130m-hf", num_shard=1) as handle:
         yield handle
diff --git a/server/text_generation_server/models/__init__.py b/server/text_generation_server/models/__init__.py
index 8878ad1..ad75e3f 100644
--- a/server/text_generation_server/models/__init__.py
+++ b/server/text_generation_server/models/__init__.py
@@ -246,15 +246,6 @@ def get_model(
     if speculate > 0:
         logger.info(f"Using speculation {method} with {speculate} input ids.")
-    if model_type is None:
-        # TODO: fix how we determine model type for Mamba
-        if "ssm_cfg" in config_dict:
-            # *only happens in Mamba case
-            model_type = "ssm"
-        else:
-            raise RuntimeError(
-                f"Could not determine model type for {model_id} revision {revision}"
-            )
     quantization_config = config_dict.get("quantization_config", None)
     if quantization_config is not None and quantize is None:
         method = quantization_config.get("quant_method", None)
```
We never remove that old logic. Some current deployments might depend on it.
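For reference, the fallback in question can be reduced to a small sketch. The function wrapper and its name are mine for illustration; the logic mirrors the lines shown as removed in the diff above.

```python
# Sketch of the fallback being discussed: older state-spaces Mamba
# checkpoints carry no `model_type` in config.json, only an `ssm_cfg`
# section, so the type is inferred from that key. The wrapper function is
# illustrative; the branch logic mirrors the diff above.
def infer_model_type(config_dict, model_type=None):
    if model_type is None:
        if "ssm_cfg" in config_dict:
            # only happens for original state-spaces Mamba checkpoints
            model_type = "ssm"
        else:
            raise RuntimeError("Could not determine model type")
    return model_type

print(infer_model_type({"ssm_cfg": {}, "d_model": 2560}))
```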
I tried enabling the hf format for mamba; every tensor changed around, so we would have to redo the modeling code. Not fun (and not important).
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
Fixes # (issue)
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.