PyThaiNLP v5.1.0 Released!

@wannaphong wannaphong released this 25 Feb 12:13
· 58 commits to dev since this release
d88d971

We have released PyThaiNLP v5.1.0! This version adds new features, such as Thai Discourse Treebank (TDTB) support and conversion from Thai solar dates to Thai lunar dates, and fixes several bugs.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.1 Change Log: #900

What is new?

New features

  • Add Thai Discourse Treebank postag #910
  • Add Thai Universal Dependency Treebank postag #916
  • Add Thai G2P v2 Grapheme-to-Phoneme model #923
  • Add support for list of strings as input to sent_tokenize() #927
  • Add pythainlp.tools.safe_print to handle UnicodeEncodeError on console #969
  • Add conversion from Thai solar date to Thai lunar date #998
  • Add Thai pangram text #1045
  • Add pythainlp.llm #1043
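
To illustrate the kind of problem the safe_print item above addresses, here is a minimal sketch of printing text to a console that cannot encode Thai characters. It is not the code of pythainlp.tools.safe_print; the helper name print_with_fallback and the simulated ASCII-only console are assumptions made for this example.

```python
import io
import sys


def print_with_fallback(text: str, stream=None) -> None:
    """Print text; if the stream cannot encode it, substitute "?".

    Hypothetical helper for illustration only, not the
    pythainlp.tools.safe_print implementation.
    """
    stream = stream if stream is not None else sys.stdout
    try:
        print(text, file=stream)
    except UnicodeEncodeError:
        encoding = getattr(stream, "encoding", None) or "ascii"
        # Unencodable characters become "?" instead of raising.
        fallback = text.encode(encoding, errors="replace").decode(encoding)
        print(fallback, file=stream)


# Simulate an ASCII-only console with a strict text wrapper.
ascii_console = io.TextIOWrapper(
    io.BytesIO(), encoding="ascii", errors="strict", newline="\n"
)
print_with_fallback("สวัสดี hello", stream=ascii_console)
ascii_console.flush()
```

With a strict ASCII stream, a plain print() of the same string would raise UnicodeEncodeError; the fallback trades fidelity for not crashing.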

Bug fixes

  • Fix collate() to consider tonemark in ordering #926
  • Fix maiyamok() expanding the wrong word #962
  • Fix nlpo3.load_dict() never printing an error message on failure #979

Remove

  • Remove clause_tokenize #1024

Deprecation and other API changes

  • 5.1
    • pythainlp.util.is_native_thai: use pythainlp.morpheme.is_native_thai instead
  • 5.2
    • pythainlp.cls: use pythainlp.classify instead
    • pythainlp.corpus.thai_synonym: use pythainlp.corpus.thai_synonyms instead
    • pythainlp.util.maiyamok: use pythainlp.util.expand_maiyamok instead
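
Renames like those above are commonly handled by keeping the old name as a thin alias that warns and forwards to the new one. The sketch below shows that general pattern using only the standard library. It is not PyThaiNLP's own mechanism; deprecated_alias, expand_repeat, old_expand, and the "X.Y" version string are hypothetical names chosen for this example.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name: str, removal_version: str):
    """Return a wrapper that warns, then forwards to new_func.

    Hypothetical helper for illustration; PyThaiNLP's own
    deprecation handling may differ.
    """

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated and scheduled for removal in "
            f"{removal_version}; use {new_func.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return new_func(*args, **kwargs)

    return wrapper


def expand_repeat(tokens):
    """Stand-in for a renamed function (not the real expand_maiyamok)."""
    return list(tokens)


# The old name stays callable but emits a DeprecationWarning.
old_expand = deprecated_alias(expand_repeat, "old_expand", "X.Y")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_expand(["a", "b"])
```

Callers who see such a warning only need to switch the import or call to the new name; behavior is otherwise unchanged in this pattern.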

Improve

  • Add more Thai political parties to the Thai dictionary 2252dee
  • Fix inconsistency in newmm-safe engine by copilot #1063
  • Update warn_deprecation to get deprecated and removal versions #1028
  • Remove unnecessary enumerate in expand_maiyamok #1029
  • Add SPDX FileType #1032
  • Fix bug in Longest Matching tokenizer to preprocess spaces consistently #1062
  • Add codemeta.json file to root directory #1053

Full Changelog: v5.0.0...v5.1.0

Contributors

Thanks to all the contributors.

We build Thai NLP.

PyThaiNLP