v1.1.0
Notable changes
What's Changed
- Fix f180 by @Narsil in #951
- Fix Falcon weight mapping for H2O.ai checkpoints by @Vinno97 in #953
- Fixing top_k tokens when k ends up < 0 by @Narsil in #966
- small fix on idefics by @VictorSanh in #954
- chore(client): Support Pydantic 2 by @JelleZijlstra in #900
- docs: typo in streaming.js by @revolunet in #971
- Disabling exllama on old compute. by @Narsil in #986
- sync text-generation version from 0.3.0 to 0.6.0 with pyproject.toml by @yzbx in #950
- Fix exllama wrongfully loading by @maximelaboisson in #990
- add transformers gptq support by @flozi00 in #963
- Fix call vs forward. by @Narsil in #993
- fit for baichuan models by @XiaoBin1992 in #981
- Fix missing arguments in Galactica's from_pb by @Vinno97 in #1022
- Fixing t5 loading. by @Narsil in #1042
- Add AWQ quantization inference support (#1019) by @Narsil in #1054
- Fix GQA llama + AWQ by @Narsil in #1061
- support local model config file by @zhangsibo1129 in #1058
- fix discard_names bug in safetensors conversion by @zhangsibo1129 in #1052
- Install curl to be able to perform more advanced healthchecks by @oOraph in #1033
- Fix position ids logic instantiation of idefics vision part by @VictorSanh in #1064
- Fix top_n_tokens returning non-log probs for some models by @Vinno97 in #1023
- Support eetq weight only quantization by @Narsil in #1068
- Remove the stripping of the prefix space (and any other mangling that tokenizers might do). by @Narsil in #1065
- Complete FastLinear.load parameters in OPTDecoder initialization by @zhangsibo1129 in #1060
- feat: add mistral model by @OlivierDehaene in #1071
New Contributors
- @VictorSanh made their first contribution in #954
- @JelleZijlstra made their first contribution in #900
- @revolunet made their first contribution in #971
- @yzbx made their first contribution in #950
- @maximelaboisson made their first contribution in #990
- @XiaoBin1992 made their first contribution in #981
- @sywangyi made their first contribution in #1034
- @zhangsibo1129 made their first contribution in #1058
Full Changelog: v1.0.3...v1.1.0