Open-Source Taggers for (Czech) POS Tagging and NE Recognition
Speaker:
Jana Straková
Abstract:
We present two recently released open-source tagging tools. NameTag is a tool
for named entity recognition (NER) which achieves state-of-the-art performance
in Czech. MorphoDiTa (Morphological Dictionary and Tagger) performs
morphological analysis, morphological generation, tagging and tokenization
with state-of-the-art results for Czech and a throughput of around 10-200K
words per second. Both tools are free software under the LGPL license and
are distributed along with trained linguistic models, which are free for
non-commercial use under the CC BY-NC-SA license. We will also briefly discuss
the recent release of the Czech Named Entity Corpus 2.0, which was used as
training material for the named entity recognition tool.