Open-Source Taggers for (Czech) POS Tagging and NE Recognition

Speaker:
Jana Straková
Abstract:
We present two recently released open-source tagging tools. NameTag is free software for named entity recognition (NER) that achieves state-of-the-art performance in Czech. MorphoDiTa (Morphological Dictionary and Tagger) performs morphological analysis, morphological generation, tagging and tokenization, with state-of-the-art results for Czech and a throughput of around 10-200K words per second. Both tools are free software under the LGPL license and are distributed along with trained linguistic models, which are free for non-commercial use under the CC BY-NC-SA license. We will also briefly discuss the recent release of the Czech Named Entity Corpus 2.0, which was used as training material for the named entity recognition tool.
Length:
00:40:25
Date:
10/03/2014
