MINDS-14 Speech Recognition

Speech recognition comprises the methodologies and technologies that enable computers to recognize spoken language and convert it into text. It is also known as automatic speech recognition (ASR). ASR helps humans interact with computers by enabling the machine to understand what was said, including speech patterns, speaking styles, dialects, accents, and phrases.

The dataset used is MInDS-14, introduced in the paper by Gerz et al. It is a dataset for multilingual and cross-lingual intent detection from spoken data. It covers 14 intents identified in a commercial e-banking system, and each intent is associated with spoken examples across 14 distinct language varieties. We limited the scope to 5 of these language varieties: "zh-CN", "ru-RU", "fr-FR", "en-US", and "de-DE".

Model Metrics

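Word Error Rate (WER): A common metric for the performance of an automatic speech recognition system. It is the number of word substitutions, deletions, and insertions needed to turn the transcription into the reference text, divided by the number of words in the reference. A lower WER is better.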
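As a rough illustration of the metric, the sketch below computes WER with a word-level edit distance. It is a minimal standalone version for intuition only and is not taken from the notebooks; the function name `word_error_rate` and the example sentences are assumptions, and no text normalization (casing, punctuation) is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (minus the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one deleted word and one inserted word
# against a four-word reference.
score = word_error_rate("check my account balance",
                        "check account balance please")
print(score)
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.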

Project Structure

Data Preparation

Covers combining the array of sentences into a single sentence for each "clean_article" and "clean_summary".
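The combining step described above can be sketched as follows. This is only an assumed shape of the data (each field holding a list of sentence strings); the helper name `join_sentences` and the sample record are hypothetical and do not come from the notebook.

```python
def join_sentences(sentences):
    """Join a list of sentence strings into one string, skipping empty items."""
    return " ".join(s.strip() for s in sentences if s and s.strip())


# Hypothetical record; the field names follow the description above.
record = {
    "clean_article": ["I lost my card. ", "", " Please freeze it."],
    "clean_summary": ["Freeze lost card."],
}

# Apply the same joining step to each field.
combined = {field: join_sentences(parts) for field, parts in record.items()}
print(combined["clean_article"])
```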

Exploratory Data Analysis

Conducts a thorough analysis of the English transcriptions to understand their structure, distribution, and key features.

Fine Tuning

Fine-tuning Whisper models for speech recognition.

Inference Pipeline

An end-to-end pipeline from ASR to intent classification.
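The two-stage flow can be sketched as a simple composition: audio goes through an ASR step, and the resulting text goes through an intent classifier. The code below is a structural sketch only. `run_pipeline`, `fake_transcribe`, `fake_classify`, and the intent labels are all hypothetical stand-ins; in the actual notebook the stages would presumably be the fine-tuned Whisper model and the trained intent classifier.

```python
from typing import Any, Callable, Dict


def run_pipeline(
    audio: Any,
    transcribe: Callable[[Any], str],
    classify: Callable[[str], str],
) -> Dict[str, str]:
    """Run ASR first, then classify the intent of the transcription."""
    text = transcribe(audio)
    intent = classify(text)
    return {"transcription": text, "intent": intent}


# Stand-in stages so the sketch runs without any model weights.
def fake_transcribe(audio: Any) -> str:
    return "i want to check my balance"


def fake_classify(text: str) -> str:
    # Hypothetical keyword rule and labels, for illustration only.
    return "balance" if "balance" in text else "unknown"


result = run_pipeline(b"", fake_transcribe, fake_classify)
print(result)
```

Passing the stages in as callables keeps the pipeline independent of any particular model, so either stage can be swapped without touching the other.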

Result

Whisper Medium


andreanstev/MINDS-14_Speech_Recognition

Folders and files

Latest commit

History

Repository files navigation

MINDS-14 Speech Recognition

Model Metrics

Project Structure

Result

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages