server : add Speech Recognition & Synthesis to UI #8679

ElYaiko · 2024-07-25T01:49:40Z

This PR adds speech recognition and synthesis to the UI (a simple voice mode).

Features added:
Talk button: Initiates speech-to-text.
Send after talk option: Sends the message after STT.
Voice option: Text-to-speech voice used for the bot.
Play/pause message: Play/pause message with selected TTS voice.
Play message after completion
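
As a rough illustration of how the features above could map onto browser APIs, here is a minimal sketch that assumes the browser's Web Speech API (`SpeechRecognition` / `webkitSpeechRecognition`, `speechSynthesis`, `SpeechSynthesisUtterance`). The names `createVoiceMode`, `onTranscript`, `sendAfterTalk`, and `onSend` are hypothetical and are not taken from this PR's diff. The `env` parameter stands in for `window`, so the logic can be exercised outside a browser.

```javascript
// Sketch only: createVoiceMode and its options are hypothetical names,
// not the identifiers used in the PR.
function createVoiceMode(env, { onTranscript, sendAfterTalk = false, onSend = () => {} }) {
  // Some browsers expose the recognizer only under a webkit prefix.
  const Recognition = env.SpeechRecognition || env.webkitSpeechRecognition;
  const synth = env.speechSynthesis;

  return {
    sttSupported: Boolean(Recognition),
    ttsSupported: Boolean(synth),

    // "Talk" button: start one speech-to-text session.
    talk() {
      if (!Recognition) return false;
      const rec = new Recognition();
      rec.interimResults = false;
      rec.onresult = (event) => {
        const text = event.results[0][0].transcript;
        onTranscript(text);
        // "Send after talk": forward the transcript right away.
        if (sendAfterTalk) onSend(text);
      };
      rec.start();
      return true;
    },

    // "Voice" option and play: speak with the voice named voiceName, if present.
    speak(text, voiceName) {
      if (!synth) return false;
      const utterance = new env.SpeechSynthesisUtterance(text);
      const voice = synth.getVoices().find((v) => v.name === voiceName);
      if (voice) utterance.voice = voice;
      synth.speak(utterance);
      return true;
    },

    // Play/pause toggle for the message currently being spoken.
    togglePause() {
      if (!synth) return;
      if (synth.paused) synth.resume();
      else synth.pause();
    },
  };
}
```

In a page, `env` would typically be `window`. The `sttSupported` and `ttsSupported` flags let the UI hide the Talk or Play controls where the browser lacks the corresponding API.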

Tested browsers:

Chrome
Firefox
Safari

Tested OS:

Windows
macOS
Linux (Requires additional packages for TTS: Guide)
Android
iOS

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

ggerganov

Nice! Just need to fix the trailing whitespace (see the CI).

ngxson

LGTM. Small detail: I'd prefer to add a short message that says "TTS and speech recognition are not provided by llama.cpp", to make it clear to the user that the quality depends on their browser, not on llama.cpp or the model itself.

ElYaiko · 2024-07-25T20:53:35Z

@ngxson What do you think?

ngxson · 2024-07-25T21:31:30Z

Yes, LGTM. We can merge once the CI passes.

* server : add Speech Recognition & Synthesis to UI
* server : add Speech Recognition & Synthesis to UI (fixes)

jboero · 2024-08-15T09:41:26Z

Wow, I just saw this update. Kudos on merging Whisper and TTS, this is brilliant. Well done.

server : add Speech Recognition & Synthesis to UI

10895d3

github-actions bot added examples server labels Jul 25, 2024

ggerganov approved these changes Jul 25, 2024


ngxson approved these changes Jul 25, 2024


server : add Speech Recognition & Synthesis to UI (fixes)

c5f12a1

ElYaiko requested a review from ngxson July 25, 2024 20:54

ngxson merged commit 01aec4a into ggml-org:master Jul 25, 2024
11 checks passed

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 27, 2024

server : add Speech Recognition & Synthesis to UI (ggml-org#8679)

3395a68

* server : add Speech Recognition & Synthesis to UI
* server : add Speech Recognition & Synthesis to UI (fixes)
