v1.0.0

OlivierDehaene released this 23 Feb 16:43

41b692d

Highlights

Support for Nomic models
Support for Flash Attention for Jina models
Metal backend for Apple Silicon (M*) users
/tokenize route to directly access the internal TEI tokenizer
/embed_all route to allow client level pooling
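
As a rough illustration of what "client level pooling" with /embed_all could look like, here is a minimal Python sketch. It assumes a TEI server reachable at http://localhost:8080, a JSON request body of the form {"inputs": "..."}, and a response containing one list of per-token vectors for each input. Those three details, along with the helper names embed_all and mean_pool, are assumptions made for this example and are not stated in these release notes. Mean pooling is only one possible choice; the point of the route is that the client decides.

```python
import json
import urllib.request


def embed_all(text, base_url="http://localhost:8080"):
    """Hypothetical helper: POST `text` to the /embed_all route.

    The base URL, the {"inputs": ...} payload and the response shape
    (one list of token vectors per input) are assumptions for this sketch.
    """
    request = urllib.request.Request(
        f"{base_url}/embed_all",
        data=json.dumps({"inputs": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def mean_pool(token_embeddings):
    """Average a list of equal-length token vectors into a single vector."""
    if not token_embeddings:
        raise ValueError("cannot pool an empty list of token embeddings")
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    for vector in token_embeddings:
        if len(vector) != dim:
            raise ValueError("token vectors must all have the same length")
        for i, value in enumerate(vector):
            sums[i] += value
    count = len(token_embeddings)
    return [total / count for total in sums]


if __name__ == "__main__":
    # Requires a running server; the first element is assumed to hold
    # the token vectors for the single input sent above.
    tokens = embed_all("What is Deep Learning?")[0]
    print(len(mean_pool(tokens)))
```

Swapping mean_pool for another strategy (for example, taking only the first token vector) requires no server-side change, which is the use case this route appears to target.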

What's Changed

fix: limit the number of buckets for prom metrics by @OlivierDehaene in #114
feat: support flash attention for Jina by @OlivierDehaene in #119
feat: add support for Metal by @OlivierDehaene in #120
fix: fix turing for Jina and limit concurrency in docker build by @OlivierDehaene in #121
fix(router): fix panics on partial_cmp and empty req.texts by @OlivierDehaene in #138
feat(router): add /tokenize route by @OlivierDehaene in #139
feat(backend): support classification for bert by @OlivierDehaene in #155
feat: add embed_raw route to get all embeddings without pooling by @OlivierDehaene in #154
added docs for OTLP_ENDPOINT around the defaults and format sent by @MarcusDunn in #157
fix: use mimalloc to solve memory "leak" by @OlivierDehaene in #161
fix: remove modif of tokenizer by @OlivierDehaene in #163
fix: add cors_allow_origin to cli by @OlivierDehaene in #162
fix: use st max_seq_length by @OlivierDehaene in #167
feat: support nomic models by @OlivierDehaene in #166

New Contributors

@MarcusDunn made their first contribution in #157

Full Changelog: v0.6.0...v1.0.0

Contributors

OlivierDehaene and MarcusDunn
