
Commit c83cc3e: Putting the provider arg more front'n'center (and other tweaks) (#1114)

1 parent f6e1749

2 files changed: +25, −23 lines

README.md

14 additions & 17 deletions

````diff
@@ -27,7 +27,7 @@ await uploadFile({
   }
 });
 
-// Use HF Inference API
+// Use HF Inference API, or external Inference Providers!
 
 await inference.chatCompletion({
   model: "meta-llama/Llama-3.1-8B-Instruct",
@@ -39,6 +39,7 @@ await inference.chatCompletion({
   ],
   max_tokens: 512,
   temperature: 0.5,
+  provider: "sambanova", // or together, fal-ai, replicate, …
 });
 
 await inference.textToImage({
@@ -146,16 +147,16 @@ for await (const chunk of inference.chatCompletionStream({
 
 /// Using a third-party provider:
 await inference.chatCompletion({
-  model: "meta-llama/Llama-3.1-8B-Instruct",
-  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
-  max_tokens: 512,
-  provider: "sambanova"
+  model: "meta-llama/Llama-3.1-8B-Instruct",
+  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
+  max_tokens: 512,
+  provider: "sambanova", // or together, fal-ai, replicate, …
 })
 
 await inference.textToImage({
-  model: "black-forest-labs/FLUX.1-dev",
-  inputs: "a picture of a green bird",
-  provider: "together"
+  model: "black-forest-labs/FLUX.1-dev",
+  inputs: "a picture of a green bird",
+  provider: "fal-ai",
 })
 
 
@@ -169,14 +170,10 @@ await inference.translation({
   },
 });
 
-await inference.textToImage({
-  model: 'black-forest-labs/FLUX.1-dev',
-  inputs: 'a picture of a green bird',
-})
-
+// pass multimodal files or URLs as inputs
 await inference.imageToText({
+  model: 'nlpconnect/vit-gpt2-image-captioning',
   data: await (await fetch('https://picsum.photos/300/300')).blob(),
-  model: 'nlpconnect/vit-gpt2-image-captioning',
 })
 
 // Using your own dedicated inference endpoint: https://hf.co/docs/inference-endpoints/
@@ -188,9 +185,9 @@ const llamaEndpoint = inference.endpoint(
   "https://api-inference.huggingface.co/models/meta-llama/Llama-3.1-8B-Instruct"
 );
 const out = await llamaEndpoint.chatCompletion({
-  model: "meta-llama/Llama-3.1-8B-Instruct",
-  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
-  max_tokens: 512,
+  model: "meta-llama/Llama-3.1-8B-Instruct",
+  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
+  max_tokens: 512,
 });
 console.log(out.choices[0].message);
 ```
````
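The commit's headline change is that `provider` now appears alongside the other call arguments, with the serverless HF Inference API as the behavior when it is omitted. A small sketch of that call shape, kept offline so it runs without a token; the `InferenceProvider` union, the `"hf-inference"` default name, and the `withDefaultProvider` helper are illustrative assumptions, not the library's actual API:

```typescript
// Provider names taken from the diff; the union type and the default
// name "hf-inference" are assumptions for illustration only.
type InferenceProvider = "hf-inference" | "sambanova" | "together" | "fal-ai" | "replicate";

interface ChatCompletionArgs {
  model: string;
  messages: { role: string; content: string }[];
  max_tokens?: number;
  temperature?: number;
  provider?: InferenceProvider;
}

// Hypothetical helper mirroring the documented behavior: omitting
// `provider` falls back to the HF Inference API.
function withDefaultProvider(args: ChatCompletionArgs): ChatCompletionArgs {
  return { ...args, provider: args.provider ?? "hf-inference" };
}

const args = withDefaultProvider({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [{ role: "user", content: "Hello, nice to meet you!" }],
  max_tokens: 512,
  temperature: 0.5,
});
console.log(args.provider); // "hf-inference"
```

Passing `provider: "sambanova"` (or any of the other names) instead leaves that explicit choice untouched.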

packages/inference/README.md

11 additions & 6 deletions

````diff
@@ -42,15 +42,15 @@ const hf = new HfInference('your access token')
 
 Your access token should be kept private. If you need to protect it in front-end applications, we suggest setting up a proxy server that stores the access token.
 
-### Requesting third-party inference providers
+### Third-party inference providers
 
-You can request inference from third-party providers with the inference client.
+You can send inference requests to third-party providers with the inference client.
 
 Currently, we support the following providers: [Fal.ai](https://fal.ai), [Replicate](https://replicate.com), [Together](https://together.xyz) and [Sambanova](https://sambanova.ai).
 
-To make request to a third-party provider, you have to pass the `provider` parameter to the inference function. Make sure your request is authenticated with an access token.
+To send requests to a third-party provider, you have to pass the `provider` parameter to the inference function. Make sure your request is authenticated with an access token.
 ```ts
-const accessToken = "hf_..."; // Either a HF access token, or an API key from the 3rd party provider (Replicate in this example)
+const accessToken = "hf_..."; // Either a HF access token, or an API key from the third-party provider (Replicate in this example)
 
 const client = new HfInference(accessToken);
 await client.textToImage({
@@ -63,14 +63,19 @@ await client.textToImage({
 When authenticated with a Hugging Face access token, the request is routed through https://huggingface.co.
 When authenticated with a third-party provider key, the request is made directly against that provider's inference API.
 
-Only a subset of models are supported when requesting 3rd party providers. You can check the list of supported models per pipeline tasks here:
+Only a subset of models are supported when requesting third-party providers. You can check the list of supported models per pipeline tasks here:
 - [Fal.ai supported models](./src/providers/fal-ai.ts)
 - [Replicate supported models](./src/providers/replicate.ts)
 - [Sambanova supported models](./src/providers/sambanova.ts)
 - [Together supported models](./src/providers/together.ts)
 - [HF Inference API (serverless)](https://huggingface.co/models?inference=warm&sort=trending)
 
-#### Tree-shaking
+**Important note:** To be compatible, the third-party API must adhere to the "standard" shape API we expect on HF model pages for each pipeline task type.
+This is not an issue for LLMs as everyone converged on the OpenAI API anyways, but can be more tricky for other tasks like "text-to-image" or "automatic-speech-recognition" where there exists no standard API. Let us know if any help is needed or if we can make things easier for you!
+
+👋**Want to add another provider?** Get in touch if you'd like to add support for another Inference provider, and/or request it on https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49
+
+### Tree-shaking
 
 You can import the functions you need directly from the module instead of using the `HfInference` class.
 
````