Releases: huggingface/text-embeddings-inference
v1.2.1
TEI is now Apache 2.0!
What's Changed
- Document how to send batched inputs by @osanseviero in #222
- feat: add auto-truncate arg by @OlivierDehaene in #224
- feat: add PredictPair to proto by @OlivierDehaene in #225
- fix: fix auto_truncate for openai by @OlivierDehaene in #228
- Change license to Apache 2.0 by @OlivierDehaene in #231
- feat: Amazon SageMaker compatible images by @JGalego in #103
- fix(CI): fix build all by @OlivierDehaene in #236
- fix: fix cuda-all image by @OlivierDehaene in #239
- Add SageMaker CPU images and validate by @philschmid in #240
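The batched-input and auto-truncate changes above both affect the shape of the embed request body. A minimal sketch of what such a payload might look like, assuming `inputs` and `truncate` field names inferred from the PR titles (not the authoritative API):

```python
import json

# Hypothetical /embed payload combining the two features above:
payload = {
    # a batch of inputs in one request (#222)
    "inputs": ["What is Deep Learning?", "What is TEI?"],
    # let the server truncate oversized inputs instead of erroring (#224)
    "truncate": True,
}
body = json.dumps(payload)
print(body)
```

The server would embed each element of the batch and return one vector per input.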
New Contributors
- @osanseviero made their first contribution in #222
- @JGalego made their first contribution in #103
- @philschmid made their first contribution in #240
Full Changelog: v1.2.0...v1.2.1
v1.2.0
What's Changed
- add cuda all image to facilitate deployment by @OlivierDehaene in #186
- add splade pooling to Bert by @OlivierDehaene in #187
- support vertex api endpoint by @drbh in #184
- readme examples by @plaggy in #180
- add_pooling_layer for bert classification by @OlivierDehaene in #190
- add /embed_sparse route by @OlivierDehaene in #191
- Applying `Cargo.toml` optimization options by @somehowchris in #201
- Add Dockerfile-arm64 to allow docker builds on Apple M1/M2 architecture by @iandoe in #209
- configurable payload limit by @OlivierDehaene in #210
- add api_key for request authorization by @OlivierDehaene in #211
- add all methods to vertex API by @OlivierDehaene in #192
- add `/decode` route by @OlivierDehaene in #212
- Input Types Compatibility with OpenAI's API (#112) by @OlivierDehaene in #214
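Two of the changes above combine naturally on the client side: the new `api_key` authorization and the `/decode` route. A sketch of how a client might build such a request, assuming a bearer-token `Authorization` header and an `ids` field (both inferred from the PR titles, not confirmed by the release notes):

```python
import json

# Key configured on the server via the new api_key option (#211);
# the bearer-header convention here is an assumption.
API_KEY = "my-secret-key"
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Hypothetical /decode payload: token ids to turn back into text (#212)
decode_body = json.dumps({"ids": [101, 7592, 102]})
print(headers["Authorization"], decode_body)
```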
New Contributors
- @drbh made their first contribution in #184
- @plaggy made their first contribution in #180
- @somehowchris made their first contribution in #201
- @iandoe made their first contribution in #209
Full Changelog: v1.1.0...v1.2.0
v1.1.0
Highlights
- Splade pooling
What's Changed
- Update Dockerfile to install curl by @jpbalarini in #117
- fix loading of bert classification models by @OlivierDehaene in #173
- splade pooling by @OlivierDehaene in #174
New Contributors
- @jpbalarini made their first contribution in #117
Full Changelog: v1.0.0...v1.1.0
v1.0.0
Highlights
- Support for Nomic models
- Support for Flash Attention for Jina models
- Metal backend for M* users
- `/tokenize` route to directly access the internal TEI tokenizer
- `/embed_all` route to allow client-level pooling
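Client-level pooling, as enabled by the raw-embeddings route above, typically means averaging the per-token vectors yourself. A minimal sketch; the nested-list response shape is an assumption, not the documented format:

```python
# Mean pooling over per-token embeddings, as returned (per the highlights)
# by the route that skips server-side pooling.
def mean_pool(token_embeddings):
    """Average a list of per-token vectors into one sentence vector."""
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / n for i in range(dim)]

tokens = [[1.0, 2.0], [3.0, 4.0]]
print(mean_pool(tokens))  # [2.0, 3.0]
```

Doing the pooling client-side lets you swap in max pooling or CLS-token selection without changing the server.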
What's Changed
- fix: limit the number of buckets for prom metrics by @OlivierDehaene in #114
- feat: support flash attention for Jina by @OlivierDehaene in #119
- feat: add support for Metal by @OlivierDehaene in #120
- fix: fix turing for Jina and limit concurrency in docker build by @OlivierDehaene in #121
- fix(router): fix panics on partial_cmp and empty req.texts by @OlivierDehaene in #138
- feat(router): add /tokenize route by @OlivierDehaene in #139
- feat(backend): support classification for bert by @OlivierDehaene in #155
- feat: add embed_raw route to get all embeddings without pooling by @OlivierDehaene in #154
- added docs for `OTLP_ENDPOINT` around the defaults and format sent by @MarcusDunn in #157
- fix: use mimalloc to solve memory "leak" by @OlivierDehaene in #161
- fix: remove modif of tokenizer by @OlivierDehaene in #163
- fix: add cors_allow_origin to cli by @OlivierDehaene in #162
- fix: use st max_seq_length by @OlivierDehaene in #167
- feat: support nomic models by @OlivierDehaene in #166
New Contributors
- @MarcusDunn made their first contribution in #157
Full Changelog: v0.6.0...v1.0.0
v0.6.0
What's Changed
- Doc build only if doc files were changed by @mishig25 in #85
- fix: fix inappropriate title of API docs page by @ucyang in #88
- fix: hf hub redirects by @OlivierDehaene in #89
- feat: add grpc router by @OlivierDehaene in #90
- fix: fix padding support in batch tokens by @OlivierDehaene in #93
- fix: fix tokenizers with both whitespace and metaspace by @OlivierDehaene in #96
- fix: enable http feature in http-builder by @zhangfand in #98
- feat: add integration tests by @OlivierDehaene in #101
New Contributors
- @mishig25 made their first contribution in #85
- @ucyang made their first contribution in #88
- @zhangfand made their first contribution in #98
Full Changelog: v0.5.0...v0.6.0
v0.5.0
What's Changed
- feat: accept batches in predict by @OlivierDehaene in #78
- feat: rerank route by @OlivierDehaene in #84
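The new rerank route scores a query against a set of candidate texts. A sketch of what such a request body might look like, assuming `query` and `texts` field names inferred from the PR title rather than documented API:

```python
import json

# Hypothetical /rerank payload (#84): one query, several candidates;
# the server would return a relevance score per candidate.
payload = {
    "query": "What is Deep Learning?",
    "texts": ["Deep learning is a subset of ML.", "Cheese is made from milk."],
}
body = json.dumps(payload)
print(body)
```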
Full Changelog: v0.4.0...v0.5.0
v0.4.0
What's Changed
- feat: USE_FLASH_ATTENTION env var by @OlivierDehaene in #57
- docs: The initial version of the TEI docs for the hf.co/docs/ by @MKhalusova in #60
- feat: support roberta by @kozistr in #62
- fix: GH workflows update: added --not_python_module flag by @MKhalusova in #66
- docs: Images links updated by @MKhalusova in #72
- feat: add `normalize` option by @OlivierDehaene in #70
- ci: Migrate CI to new Runners by @glegendre01 in #74
- feat: add support for classification models by @OlivierDehaene in #76
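The `normalize` option listed above presumably scales each embedding to unit L2 norm, so that dot products become cosine similarities. A sketch of the operation for illustration, not the actual server implementation:

```python
import math

# L2 normalization: divide each component by the vector's Euclidean norm.
def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

print(l2_normalize([3.0, 4.0]))  # [0.6, 0.8]
```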
New Contributors
- @MKhalusova made their first contribution in #60
- @kozistr made their first contribution in #62
- @glegendre01 made their first contribution in #74
Full Changelog: v0.3.0...v0.4.0
v0.3.0
v0.2.2
What's Changed
- fix: max_input_length should take into account position_offset (aec5efd)
Full Changelog: v0.2.1...v0.2.2
v0.2.1
What's Changed
- fix: only use position offset for xlm-roberta (8c507c3)
Full Changelog: v0.2.0...v0.2.1