
whisper.cpp large-v3-turbo


Config

models: 
  "whisper":
    checkEndpoint: /v1/audio/transcriptions/
    cmd: |
      /path/to/llama-server/whisper-server-30cf30c
        --host 127.0.0.1 --port ${PORT}
        -m ggml-large-v3-turbo-q8_0.bin
        # required to be compatible w/ OpenAI's API
        --request-path /v1/audio/transcriptions --inference-path ""
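
llama-swap launches whisper-server on demand when a request for the "whisper" model arrives. A minimal way to run it against this config is sketched below, assuming the config is saved as config.yaml and that llama-swap's --config and --listen flags match your build (check llama-swap --help), listening on the port used in the test below:

llama-swap --config config.yaml --listen 0.0.0.0:8080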

Testing

$ curl 10.0.1.50:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@jfk.wav" \
  -F temperature="0.0" \
  -F temperature_inc="0.2" \
  -F response_format="json" \
  -F model="whisper"

Note

The -F model="whisper" field is required for llama-swap to load the right configuration; the value must match the model key ("whisper") defined in the config above.

Compiling whisper.cpp

#!/bin/sh
set -e  # stop if any step fails so a stale binary is not copied

# First-time setup: git clone https://github.com/ggml-org/whisper.cpp

# pull latest code
cd "$HOME/whisper.cpp"
git pull

# For reference, the build was configured with:
# CUDACXX=/usr/local/cuda-12.6/bin/nvcc cmake -B build -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=1

cmake --build build --config Release -j 16

# Copy new version with hash in its filename
VERSION=$(git rev-parse --short HEAD)
NEW_FILE="whisper-server-$VERSION"
echo "New version: $NEW_FILE"
cp ./build/bin/whisper-server "/mnt/nvme/llama-server/$NEW_FILE"
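
The config above loads ggml-large-v3-turbo-q8_0.bin, which the build does not fetch. One way to get it is whisper.cpp's bundled download script; a sketch, assuming the large-v3-turbo-q8_0 name is listed in models/download-ggml-model.sh (check the script for the names your checkout supports):

cd "$HOME/whisper.cpp"
sh ./models/download-ggml-model.sh large-v3-turbo-q8_0
# place the model where the relative -m path in the config resolves,
# e.g. next to the server binary if llama-swap runs from that directory
mv ./models/ggml-large-v3-turbo-q8_0.bin /mnt/nvme/llama-server/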