v0.4.1

@OlivierDehaene OlivierDehaene released this 26 Mar 14:38
ab5fd8c

Features

  • server: new, faster GPTNeoX implementation based on flash attention

Fix

  • server: fix input-length discrepancy between Rust and Python tokenizers