
Commit 8b29df8

docs: fix ambiguity in asr tasks page. (#724)
Co-authored-by: Omar Sanseviero <[email protected]>
1 parent e38b705 commit 8b29df8

File tree

1 file changed: +14 -14 lines changed
  • packages/tasks/src/tasks/automatic-speech-recognition


packages/tasks/src/tasks/automatic-speech-recognition/about.md

Lines changed: 14 additions & 14 deletions
@@ -18,7 +18,7 @@ The use of Multilingual ASR has become popular, the idea of maintaining just a s
 
 ## Inference
 
-The Hub contains over [17,000 ASR models](https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&sort=downloads) that you can test right away in your browser using the model page widgets. You can also use any model as a service using the Inference API. Here is a simple code snippet to do exactly this:
+The Hub contains over [17,000 ASR models](https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&sort=downloads) that you can test right away in your browser using the model page widgets. You can also use any model as a service using the Serverless Inference API. We also support libraries such as [transformers](https://huggingface.co/models?library=transformers&pipeline_tag=automatic-speech-recognition&sort=downloads), [speechbrain](https://huggingface.co/models?library=speechbrain&pipeline_tag=automatic-speech-recognition&sort=downloads), [NeMo](https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&library=nemo&sort=downloads) and [espnet](https://huggingface.co/models?library=espnet&pipeline_tag=automatic-speech-recognition&sort=downloads) via the Serverless Inference API. Here's a simple code snippet to run inference:
 
 ```python
 import json
@@ -36,7 +36,19 @@ def query(filename):
 data = query("sample1.flac")
 ```
 
-You can also use libraries such as [transformers](https://huggingface.co/models?library=transformers&pipeline_tag=automatic-speech-recognition&sort=downloads), [speechbrain](https://huggingface.co/models?library=speechbrain&pipeline_tag=automatic-speech-recognition&sort=downloads), [NeMo](https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&library=nemo&sort=downloads) and [espnet](https://huggingface.co/models?library=espnet&pipeline_tag=automatic-speech-recognition&sort=downloads) if you want one-click managed Inference without any hassle.
+You can also use [huggingface.js](https://github.com/huggingface/huggingface.js), the JavaScript client, to transcribe audio with the Inference API.
+
+```javascript
+import { HfInference } from "@huggingface/inference";
+
+const inference = new HfInference(HF_TOKEN);
+await inference.automaticSpeechRecognition({
+  data: await (await fetch("sample.flac")).blob(),
+  model: "openai/whisper-large-v3",
+});
+```
+
+For transformers-compatible models like Whisper, Wav2Vec2, HuBERT, etc., you can also run inference in Python using transformers as follows:
 
 ```python
 # pip install --upgrade transformers
@@ -49,18 +61,6 @@ pipe("sample.flac")
 # {'text': "GOING ALONG SLUSHY COUNTRY ROADS AND SPEAKING TO DAMP AUDIENCES IN DRAUGHTY SCHOOL ROOMS DAY AFTER DAY FOR A FORTNIGHT HE'LL HAVE TO PUT IN AN APPEARANCE AT SOME PLACE OF WORSHIP ON SUNDAY MORNING AND HE CAN COME TO US IMMEDIATELY AFTERWARDS"}
 ```
 
-You can use [huggingface.js](https://github.com/huggingface/huggingface.js) to transcribe text with javascript using models on Hugging Face Hub.
-
-```javascript
-import { HfInference } from "@huggingface/inference";
-
-const inference = new HfInference(HF_TOKEN);
-await inference.automaticSpeechRecognition({
-  data: await (await fetch("sample.flac")).blob(),
-  model: "openai/whisper-large-v3",
-});
-```
-
 ## Solving ASR for your own data
 
 We have some great news! You can fine-tune (transfer learning) a foundational speech model on a specific language without tonnes of data. Pretrained models such as Whisper, Wav2Vec2-MMS and HuBERT exist. [OpenAI's Whisper model](https://huggingface.co/openai/whisper-large-v3) is a large multilingual model trained on 100+ languages and with 4 Million hours of speech.
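
The requests-based snippet that the paragraph at new line 21 introduces falls mostly outside the hunks shown above. A minimal sketch of that kind of Serverless Inference API call in Python, assuming the public `api-inference.huggingface.co/models/<model-id>` endpoint; the token, model id, and audio filename are placeholders, and this is not the literal snippet from the changed file:

```python
import json

import requests

# Placeholders: supply your own access token and any ASR model id from the Hub.
API_TOKEN = "hf_xxx"
API_URL = "https://api-inference.huggingface.co/models/openai/whisper-large-v3"
headers = {"Authorization": f"Bearer {API_TOKEN}"}


def query(filename):
    # POST the raw audio bytes; the API responds with JSON containing a "text" field.
    with open(filename, "rb") as f:
        data = f.read()
    response = requests.post(API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))


data = query("sample1.flac")
# e.g. {'text': '... transcription ...'}
```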

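The transformers snippet referenced at new line 51 is likewise only partially visible (the install comment, the `pipe("sample.flac")` call in the hunk header, and the output line). A generic sketch of that pipeline pattern, with the model id and filename as placeholders:

```python
# pip install --upgrade transformers
from transformers import pipeline

# The "automatic-speech-recognition" pipeline accepts Whisper, Wav2Vec2, HuBERT, etc.
pipe = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")

pipe("sample.flac")
# {'text': '... transcription ...'}
```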