PyThaiNLP v5.1.0 Released!
We have released PyThaiNLP v5.1.0! This version adds new features, such as Thai Discourse Treebank (TDTB) support and conversion from Thai solar dates to Thai lunar dates, and fixes several bugs.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.1
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.1 Change Log: #900
What is new?
New features
- Add Thai Discourse Treebank postag #910
- Add Thai Universal Dependency Treebank postag #916
- Add Thai G2P v2 Grapheme-to-Phoneme model #923
- Add support for list of strings as input to sent_tokenize() #927
- Add pythainlp.tools.safe_print to handle UnicodeEncodeError on console #969
- Add conversion from Thai solar date to Thai lunar date #998
- Add Thai pangram text #1045
- Add pythainlp.llm #1043
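To illustrate the console problem that pythainlp.tools.safe_print (#969) addresses, here is a minimal sketch of one way to avoid a UnicodeEncodeError when printing Thai text to a console with a limited encoding. The helper name print_with_fallback and the strategy of replacing unencodable characters with "?" are assumptions made for this example; they are not taken from the PyThaiNLP implementation.

```python
import io
import sys


def print_with_fallback(text, stream=None):
    """Print text; if the stream cannot encode it, replace the offending characters.

    Hypothetical helper for illustration only, not the PyThaiNLP API.
    """
    stream = sys.stdout if stream is None else stream
    try:
        print(text, file=stream)
    except UnicodeEncodeError:
        # Assumed fallback: re-encode with errors="replace" so that
        # unencodable characters become "?" instead of raising.
        encoding = getattr(stream, "encoding", None) or "ascii"
        safe_text = text.encode(encoding, errors="replace").decode(encoding)
        print(safe_text, file=stream)


# Simulate a console that only understands ASCII.
ascii_console = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
print_with_fallback("สวัสดี", stream=ascii_console)  # does not raise
ascii_console.flush()
```

In real code, calling pythainlp.tools.safe_print directly is likely the simpler choice; the sketch only shows why such a helper is useful.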
Bug fixes
- Fix collate() to consider tone marks in ordering #926
- Fix maiyamok() expanding the wrong word #962
- Fix nlpo3.load_dict() never printing an error message on failure #979
Removed
- Remove clause_tokenize #1024
Deprecation and other API changes
- 5.1
  - pythainlp.util.is_native_thai: use pythainlp.morpheme.is_native_thai instead
- 5.2
  - pythainlp.cls: use pythainlp.classify instead
  - pythainlp.corpus.thai_synonym: use pythainlp.corpus.thai_synonyms instead
  - pythainlp.util.maiyamok: use pythainlp.util.expand_maiyamok instead
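For code that has to run against both older and newer PyThaiNLP releases, the renames listed above can be handled with a fallback import. The following is a minimal sketch: import_first is a hypothetical helper written for this example (it is not part of PyThaiNLP), and only the module and attribute names come from the list above.

```python
import importlib


def import_first(*candidates):
    """Return the first importable attribute from (module_name, attr_name) pairs.

    Hypothetical helper for illustration; returns None if nothing resolves.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError):
            # Module missing or name not present in this version: try the next one.
            continue
    return None


# Prefer the new names, fall back to the deprecated ones.
expand_maiyamok = import_first(
    ("pythainlp.util", "expand_maiyamok"),
    ("pythainlp.util", "maiyamok"),
)
is_native_thai = import_first(
    ("pythainlp.morpheme", "is_native_thai"),
    ("pythainlp.util", "is_native_thai"),
)

if expand_maiyamok is None or is_native_thai is None:
    print("PyThaiNLP is not installed or the names could not be resolved")
```

New code that only targets v5.1.0 or later can import the replacement names directly and skip the fallback.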
Improvements
- Add more Thai political parties to the Thai dictionary 2252dee
- Fix inconsistency in newmm-safe engine by copilot #1063
- Update warn_deprecation to get deprecated and removal versions #1028
- Remove unnecessary enumerate in expand_maiyamok #1029
- Add SPDX FileType #1032
- Fix bug in Longest Matching tokenizer to preprocess spaces consistently #1062
- Add codemeta.json file to root directory #1053
Full Changelog: v5.0.0...v5.1.0
Contributors
Thanks to all the contributors. (Image made with contributors-img)
We build Thai NLP.
PyThaiNLP