#3 Pre-processing

Speaker:
Ondřej Bojar
Abstract:
The third MT Talk is devoted to basic pre-processing steps: the issues of normalization and tokenization. The talk is complemented by a warm-up coding exercise: a Unicode lowercaser and deaccenter.
Length:
00:08:23
Date:
21/01/2015
Attachments: (video, slides, etc.)
12 MB (152 downloads)
30 MB (150 downloads)
50 MB (143 downloads)
24 MB (149 downloads)
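
The warm-up exercise named in the abstract, a Unicode lowercaser and deaccenter, could be approached roughly as in the sketch below. This is not the talk's reference solution: the function name `lowercase_and_deaccent` is hypothetical, and the approach (lowercase, decompose to NFD, drop combining marks) is one common technique assumed here for illustration.

```python
import unicodedata


def lowercase_and_deaccent(text: str) -> str:
    """Lowercase text and strip diacritics (hypothetical helper, not from the talk)."""
    # NFD splits each accented letter into a base letter plus combining marks,
    # e.g. "ř" becomes "r" followed by U+030C (combining caron).
    decomposed = unicodedata.normalize("NFD", text.lower())
    # Keep only characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left into the usual NFC form.
    return unicodedata.normalize("NFC", stripped)


print(lowercase_and_deaccent("Ondřej Bojar"))  # ondrej bojar
```

Letters that have no canonical decomposition, such as "ł" or "ø", pass through this sketch unchanged (only lowercased), so a fuller deaccenter would likely need an explicit mapping table for them.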