[Inference Providers] Async calls for `text-to-video` with fal.ai #1292

hanouticelina · 2025-03-17T17:35:31Z

What does this PR do?

This PR adds asynchronous polling to fal.ai text-to-video generation, which makes it possible to run inference with models that take more than 2 minutes to generate results. The other motivation is to align the Python and JS clients; the Python change has already been merged into main: huggingface/huggingface_hub#2927

Main Changes

Replaced the static baseUrl property with a makeBaseUrl() function across all providers. This is needed to customize the base URL per task: we want to use FAL_AI_API_BASE_URL_QUEUE for text-to-video only. I'm not convinced it's the simplest or best way to do that.
Added pollFalResponse() for text-to-video (similar to what is done with BFL for text-to-image).
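
To make the first change concrete, here is a minimal sketch of a per-task base URL, assuming a simplified provider config. The `ProviderConfig` interface, the `InferenceTask` union, the two fal.ai URL values and the `example.com` provider are illustrative assumptions, not the library's actual definitions; only the names `makeBaseUrl` and `FAL_AI_API_BASE_URL_QUEUE` come from the description above.

```typescript
// Hypothetical, trimmed-down task type for illustration only.
type InferenceTask = "text-to-video" | "text-to-image" | "automatic-speech-recognition";

// Assumed values; check the provider file for the real constants.
const FAL_AI_API_BASE_URL = "https://fal.run";
const FAL_AI_API_BASE_URL_QUEUE = "https://queue.fal.run";

// Hypothetical config shape: a function replaces the static `baseUrl: string`.
interface ProviderConfig {
	makeBaseUrl: (task?: InferenceTask) => string;
}

// fal.ai picks the queue endpoint for text-to-video only.
const falAiConfig: ProviderConfig = {
	makeBaseUrl: (task) => (task === "text-to-video" ? FAL_AI_API_BASE_URL_QUEUE : FAL_AI_API_BASE_URL),
};

// A provider with a single endpoint simply ignores the task argument.
const singleEndpointConfig: ProviderConfig = {
	makeBaseUrl: () => "https://api.example.com",
};
```

The cost of this design is that every provider has to expose a function even when its URL never changes, which is the trade-off questioned above.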

Any refactoring suggestions are welcome! I'm willing to spend some additional time to make provider-specific updates easier to implement and better align our two clients 🙂

btw, I did not update the VCR tests, as we discussed that it would be best to remove the VCR tests for text-to-video. Maybe we should remove them here?
EDIT: removed the text-to-video tests in f8a6386.

I've tested it locally with tencent/HunyuanVideo, for which generation takes more than 2 minutes, and it works fine:

fal-ai-output.mp4

SBrandeis

Looks good 🤩

SBrandeis · 2025-03-18T10:44:34Z

packages/inference/src/tasks/cv/textToVideo.ts

+async function pollFalResponse(res: FalAiOutput, args: TextToVideoArgs, options?: Options): Promise<Blob> {
+	const requestId = res.request_id;
+	if (!requestId) {
+		throw new InferenceOutputError("No request ID found in the response");
+	}
+	let status = res.status;
+	const { url, info } = await makeRequestOptions(args, { ...options, task: "text-to-video" });
+	const baseUrl = url?.split("?")[0] || "";
+	const query = url?.includes("_subdomain=queue") ? "?_subdomain=queue" : "";
+
+	const statusUrl = `${baseUrl}/requests/${requestId}/status${query}`;
+	const resultUrl = `${baseUrl}/requests/${requestId}${query}`;
+
+	while (status !== "COMPLETED") {
+		await delay(1000);
+		const statusResponse = await fetch(statusUrl, { headers: info.headers });
+
+		if (!statusResponse.ok) {
+			throw new Error(`HTTP error! status: ${statusResponse.status}`);
+		}
+		status = (await statusResponse.json()).status;
+	}
+
+	const resultResponse = await fetch(resultUrl, { headers: info.headers });
+	const result = await resultResponse.json();
+	const isValidOutput =
+		typeof result === "object" &&
+		!!result &&
+		"video" in result &&
+		typeof result.video === "object" &&
+		!!result.video &&
+		"url" in result.video &&
+		typeof result.video.url === "string" &&
+		isUrl(result.video.url);
+	if (!isValidOutput) {
+		throw new InferenceOutputError("Expected { video: { url: string } }");
+	}
+	const urlResponse = await fetch(result.video.url);
+	return await urlResponse.blob();
+}
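
The URL handling at the top of this function can be isolated as a small pure helper, which makes it easier to check by hand. This is only a sketch that mirrors the hunk as quoted here (a later commit in this PR adjusted the URL construction); `buildFalQueueUrls`, `FalQueueUrls` and the `proxy.example` URL are hypothetical names used for illustration.

```typescript
// Hypothetical return shape for the two endpoints used while polling.
interface FalQueueUrls {
	statusUrl: string;
	resultUrl: string;
}

// Derives the status and result URLs from the URL the initial request was sent to.
// The `_subdomain=queue` query parameter, when present, is carried over to both URLs.
function buildFalQueueUrls(requestUrl: string, requestId: string): FalQueueUrls {
	const baseUrl = requestUrl.split("?")[0] ?? "";
	const query = requestUrl.includes("_subdomain=queue") ? "?_subdomain=queue" : "";
	return {
		statusUrl: `${baseUrl}/requests/${requestId}/status${query}`,
		resultUrl: `${baseUrl}/requests/${requestId}${query}`,
	};
}

// Example with a made-up routed URL and request ID.
const exampleUrls = buildFalQueueUrls("https://proxy.example/fal-ai/some-model?_subdomain=queue", "abc123");
```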


I think it would make sense to move this util to the src/providers/fal-ai.ts file

ah yes, we should probably do the same for:

huggingface.js/packages/inference/src/tasks/cv/textToImage.ts

Line 132 in 0a0960c

async function pollBflResponse(url: string, outputType?: "url" | "blob"): Promise<Blob> {

Addressed in cf2d1ac. I had to move FalAiOutput to src/providers/fal-ai.ts to avoid an import cycle. Maybe we should move every provider-specific output type to its respective provider file, wdyt?

maybe we should move every provider-specific output type to its respective provider file, wdyt?

Agree with this, and in particular with moving each output type together with the logic that validates and parses the response, same as the get_response implementations in Python. Out of scope for this PR though.
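
As an illustration of keeping an output type next to its validation logic, the inline `isValidOutput` check from the hunk above could be written as a type guard in the provider file. This is a sketch under assumptions: `FalAiVideoOutput`, `isFalAiVideoOutput` and `looksLikeHttpUrl` are hypothetical names, and the regex is only a stand-in for the library's `isUrl` helper.

```typescript
// Hypothetical provider-local output type for fal.ai video results.
interface FalAiVideoOutput {
	video: { url: string };
}

// Stand-in for the library's `isUrl` helper (assumption: only http(s) URLs are accepted).
function looksLikeHttpUrl(value: string): boolean {
	return /^https?:\/\/\S+$/.test(value);
}

// Type guard: after it returns true, TypeScript treats `value` as FalAiVideoOutput.
function isFalAiVideoOutput(value: unknown): value is FalAiVideoOutput {
	if (typeof value !== "object" || value === null || !("video" in value)) {
		return false;
	}
	const video = (value as { video: unknown }).video;
	if (typeof video !== "object" || video === null || !("url" in video)) {
		return false;
	}
	const url = (video as { url: unknown }).url;
	return typeof url === "string" && looksLikeHttpUrl(url);
}
```

With a guard like this, the task file would only need to call it and throw on failure, without casts such as `res as FalAiOutput`.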


Wauplin

🔥

SBrandeis · 2025-03-24T13:47:49Z

packages/inference/src/tasks/cv/textToVideo.ts

-		}
-		const urlResponse = await fetch((res as FalAiOutput).video.url);
-		return await urlResponse.blob();
+		const { url, info } = await makeRequestOptions(args, { ...options, task: "text-to-video" });


That works, but it feels a bit weird to have to call makeRequestOptions again here

I think we can have a simpler / better API that does not require to mentally map why we call makeRequestOptions with those parameters.

I'm wondering whether request should return an optional Promise that we would just await here? Example usage:

const res = await request<FalAiQueueOutput | ReplicateOutput | NovitaOutput>(payload, {
	...options,
	task: "text-to-video",
});
if (res.poll) {
	const blob: Blob = await res.poll;
}

Anyways - I'm OK with addressing that in a subsequent PR
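
For readers who want to see the shape of that suggestion end to end, here is a self-contained sketch. It is an assumption-laden illustration: `requestWithPoll`, `Pollable` and `QueueAck` are hypothetical, no network call is made, and a string stands in for the `Blob` result.

```typescript
// Hypothetical result wrapper: the initial payload plus an optional promise for the final result.
interface Pollable<TInitial, TFinal> {
	initial: TInitial;
	poll?: Promise<TFinal>;
}

// Hypothetical shape of a queue acknowledgement.
interface QueueAck {
	request_id: string;
	status: string;
}

// Stand-in for `request`: only queue-based tasks get a `poll` promise.
async function requestWithPoll(task: string): Promise<Pollable<QueueAck, string>> {
	const initial: QueueAck = { request_id: "req-1", status: "IN_QUEUE" };
	if (task !== "text-to-video") {
		return { initial };
	}
	// A real implementation would poll the provider here; this resolves immediately.
	return { initial, poll: Promise.resolve(`result-for-${initial.request_id}`) };
}

// The caller just awaits `poll` when it exists, with no second makeRequestOptions call.
async function textToVideoSketch(): Promise<string | undefined> {
	const res = await requestWithPoll("text-to-video");
	return res.poll ? await res.poll : undefined;
}
```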

Yes, totally agree, it feels a bit wrong to call makeRequestOptions again right after calling request.
I'm currently working on a refactoring PR, I'll rework this part as well.
EDIT: I think it's better to have smaller and easier PRs, so for this one in particular, let's open a dedicated PR 😄

SBrandeis

Looks good! Thank you!!

kefranabg · 2025-03-26T10:22:18Z

That means we can allow any text-to-video model that supports fal-ai to use the inference widget on the hub right? 🙂

SBrandeis · 2025-03-26T10:33:52Z

Yes @kefranabg !

@SBrandeis

…makeRequestOptions calls (#1314)

Related to @SBrandeis's comment here #1292 (comment). This PR addresses the original concern about redundant `makeRequestOptions` calls introduced in #1292.

The solution implemented here updates the `request` function to return both the response and a _request context_ when needed, allowing provider-specific polling code to reuse this context without redundant calls to `makeRequestOptions`. This differs from the initial suggestion in the comment because each provider implements polling differently, with different parameters and response formats. Making a generic `.poll` property would require mixing provider-specific logic into the core request function (we don't want that, right? 😄). In the end, we want to keep provider-specific logic isolated in the respective provider files (PR coming today to push that further!).
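
A rough sketch of the request-context approach described in that commit message, under stated assumptions: `RequestContext`, `requestSketch`, `pollStatusSketch` and the injected `GetJson` transport are hypothetical, the URL is made up, and the loop omits the delay a real poller would insert between attempts.

```typescript
// Hypothetical context: what the core request already computed (URL and headers).
interface RequestContext {
	url: string;
	headers: Record<string, string>;
}

interface QueueAck {
	request_id: string;
	status: string;
}

// Hypothetical transport, injected so the sketch runs without a network.
type GetJson = (url: string, headers: Record<string, string>) => Promise<{ status: string }>;

// Stand-in for the core `request`: returns the payload together with its context.
async function requestSketch(): Promise<{ data: QueueAck; requestContext: RequestContext }> {
	const requestContext: RequestContext = {
		url: "https://queue.example/some-model",
		headers: { Authorization: "Bearer <token>" },
	};
	return { data: { request_id: "req-1", status: "IN_QUEUE" }, requestContext };
}

// Provider-specific polling stays outside `request` and only consumes the context.
async function pollStatusSketch(
	ack: QueueAck,
	ctx: RequestContext,
	getJson: GetJson,
	maxAttempts = 10
): Promise<string> {
	const statusUrl = `${ctx.url}/requests/${ack.request_id}/status`;
	let status = ack.status;
	// Bounded loop; a real poller would also wait between attempts.
	for (let attempt = 0; attempt < maxAttempts && status !== "COMPLETED"; attempt++) {
		status = (await getJson(statusUrl, ctx.headers)).status;
	}
	if (status !== "COMPLETED") {
		throw new Error(`Request ${ack.request_id} did not complete after ${maxAttempts} attempts`);
	}
	return status;
}
```

The core function stays generic because it only hands back data it already had, while each provider decides how to turn that context into status and result URLs.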

add async calls for fal-ai

19b1de5

hanouticelina requested review from julien-c, SBrandeis and coyotte508 as code owners March 17, 2025 17:35

hanouticelina requested a review from Wauplin March 17, 2025 17:35

hanouticelina added 2 commits March 17, 2025 18:53

update fal output

246e764

fix

4eca289

julien-c reviewed Mar 18, 2025

packages/inference/src/providers/fal-ai.ts (outdated)

hanouticelina and others added 4 commits March 18, 2025 10:32

remove comment

1975dc7

Co-authored-by: Julien Chaumond <[email protected]>

fix lint

5f77388

Merge branch 'async-calls-falai' of github.com:huggingface/huggingfac…

6534c9c

…e.js into async-calls-falai

Merge branch 'main' into async-calls-falai

0d193dd

SBrandeis reviewed Mar 18, 2025

hanouticelina and others added 5 commits March 18, 2025 12:05

Update packages/inference/src/tasks/cv/textToVideo.ts

80dc091

Co-authored-by: Simon Brandeis <[email protected]>

Update packages/inference/src/providers/fal-ai.ts

e4a7568

Co-authored-by: Simon Brandeis <[email protected]>

fixes

cf2d1ac

fix

77458a4

Merge branch 'main' into async-calls-falai

8b0f09b

Wauplin reviewed Mar 18, 2025

packages/inference/src/providers/fal-ai.ts (outdated)

packages/inference/src/tasks/cv/textToVideo.ts (outdated)

hanouticelina added 6 commits March 18, 2025 16:30

use 0.5s for the interval polling

188175c

Merge branch 'async-calls-falai' of github.com:huggingface/huggingfac…

30ba4cb

…e.js into async-calls-falai

Merge branch 'main' into async-calls-falai

1ee3029

remove text-to-video tests

f8a6386

Merge branch 'main' of github.com:huggingface/huggingface.js into asy…

4d30eea

…nc-calls-falai

Merge branch 'async-calls-falai' of github.com:huggingface/huggingfac…

b97f6cf

…e.js into async-calls-falai

hanouticelina requested review from Wauplin and SBrandeis March 18, 2025 16:21

hanouticelina mentioned this pull request Mar 18, 2025

[Inference Providers] update polling interval for fal-ai huggingface/huggingface_hub#2937

Merged

Wauplin approved these changes Mar 18, 2025

hanouticelina mentioned this pull request Mar 20, 2025

[Inference Providers] Fix status and response URLs when polling text-to-video results with fal-ai huggingface/huggingface_hub#2943

Merged

hanouticelina added 2 commits March 20, 2025 17:08

fix status and result urls construction

36c56ed

Merge branch 'main' of github.com:huggingface/huggingface.js into asy…

8ec3e55

…nc-calls-falai

SBrandeis reviewed Mar 24, 2025

hanouticelina added 2 commits March 24, 2025 15:01

review suggestions

3c346af

Merge branch 'main' into async-calls-falai

1d0f6e2

hanouticelina requested a review from SBrandeis March 24, 2025 14:09

SBrandeis approved these changes Mar 24, 2025

hanouticelina merged commit c0f38b0 into main Mar 24, 2025
5 checks passed

hanouticelina deleted the async-calls-falai branch March 24, 2025 14:18

hanouticelina mentioned this pull request Mar 25, 2025

[Inference] request() returns a request context to avoid redundant makeRequestOptions calls #1314

Merged
