
[WIP] InferenceClient.post is deprecated, but Sentence Ranking tasks are not implemented #3109

Draft
wants to merge 1 commit into
base: main
from

Conversation

@Copilot Copilot AI commented May 23, 2025

Thanks for assigning this issue to me. I'm starting to work on it and will keep this PR's description up to date as I form a plan and make progress.

Original issue description:

Describe the bug

InferenceClient still lacks dedicated methods for many of the tasks that can be hosted on Inference Endpoints, yet calling .post as a workaround emits a deprecation warning.

Reproduction

from huggingface_hub import InferenceClient, get_inference_endpoint
import json

# Get endpoint and create client
MODEL_NAME = "YOUR_MODEL_NAME_OR_ENDPOINT_NAME"
NAMESPACE = "YOUR_NAMESPACE"
endpoint = get_inference_endpoint(MODEL_NAME, namespace=NAMESPACE)
client = InferenceClient(endpoint.url, timeout=10)

# Test data
query = "What is the capital of France?"
document = "Paris is the capital of France."
sentence_ranking_style_inputs = [[query, document]]
text_classification_style_inputs = [{"text": query, "text_pair": document}]

# 1. Using post method
response_bytes = client.post(json={"inputs": sentence_ranking_style_inputs})
print(json.loads(response_bytes))
# Problem: the post method emits a deprecation warning

# 2. Using text_classification task
try:
    # text_classification offers no way to pass the input format that reranking needs
    result = client.text_classification(text_classification_style_inputs)
    print(result)
except Exception as e:
    print(f"text_classification error: {e}")
# Problem: text_classification doesn't properly support text pairs format needed for reranking/cross-encoding

# 3. Using sentence_similarity task
try:
    # sentence_similarity offers no direct way to pass the input format that reranking needs
    result = client.sentence_similarity(sentence_ranking_style_inputs)
    print(result)
except Exception as e:
    print(f"sentence_similarity error: {e}")
# Problem: No direct support for sentence ranking despite endpoint supporting this task

Ideally, InferenceClient would support the sentence_ranking task directly.
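Until such a method exists, one possible stopgap is to send the same payload to the endpoint URL without going through InferenceClient.post. The sketch below uses only the standard library and is not part of huggingface_hub: the helper names build_rerank_request and rerank are hypothetical, the Bearer-token header is an assumption about how the endpoint is secured, and the response is returned as parsed JSON without assuming its shape.

```python
import json
import urllib.request
from typing import Any, Callable, List, Optional


def build_rerank_request(
    url: str, query: str, documents: List[str], token: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST request with [[query, document], ...] pairs, as in the reproduction."""
    payload = {"inputs": [[query, doc] for doc in documents]}
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the endpoint accepts a standard Bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def rerank(
    url: str,
    query: str,
    documents: List[str],
    token: Optional[str] = None,
    opener: Callable[..., Any] = urllib.request.urlopen,
    timeout: float = 10,
) -> Any:
    """Send the request and return the decoded JSON body unchanged."""
    request = build_rerank_request(url, query, documents, token)
    # `opener` is injectable so the function can be exercised without a network.
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With the variables from the reproduction, a call would look like rerank(endpoint.url, query, [document]), passing token=... if the endpoint requires authentication.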

Logs

System info

- huggingface_hub version: 0.30.2
- Platform: macOS-15.4-arm64-arm-64bit
- Python version: 3.12.10
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: redacted
- Has saved token ?: False
- Configured git credential helpers: redacted
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.7.0
- Jinja2: 3.1.6
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: 11.2.1
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.11.4
- aiohttp: 3.11.18
- hf_xet: N/A
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: redacted
- HF_ASSETS_CACHE: redacted
- HF_TOKEN_PATH: redacted
- HF_STORED_TOKENS_PATH: redacted
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10

Fixes #3055.


