v0.3.0

@OlivierDehaene released this 16 Feb 16:33
c720555

Features

  • server: support t5 models
  • router: add max_total_tokens and empty_input validation
  • launcher: add the possibility to disable custom CUDA kernels
  • server: add automatic safetensors conversion
  • router: add prometheus scrape endpoint
  • server, router: add distributed tracing

Fixes

  • launcher: copy current env vars to subprocesses
  • docker: add note around shared memory