v0.4.2

@OlivierDehaene released this 30 Mar 15:10
84722f3

Features

  • benchmark: TUI-based benchmarking tool
  • router: Clear cache on error
  • server: Add mypy-protobuf
  • server: reduce mlp and attn in one op for flash neox
  • image: AWS SageMaker-compatible image

Fix

  • server: avoid try/except to determine the kind of AutoModel
  • server: fix flash neox rotary embedding