Named Entity Disambiguation With Wikipedia As a Knowledge Base
Ivo Lašek
We focus on the problem of named entity disambiguation at a
very detailed level. Each entity is assigned the concrete identifier of the Wikipedia article that describes it. We will compare
various context representations for proper disambiguation: a bag-of-words
representation, a sentence structure representation and an entity
co-occurrence representation. We will show how we preprocess freely
available dumps from Wikipedia in order to form indexes representing the
context of named entities. We will also briefly cover Czech
DBpedia and how it can be used to enrich textual content with
structured data.
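
To make the bag-of-words context representation concrete, the following is a minimal illustrative sketch and not the implementation evaluated in this work. It assumes each candidate Wikipedia article is represented by a short context string, and it picks the candidate whose word counts are most similar (by cosine similarity) to the words around the mention. The candidate texts, the `CANDIDATES` index and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy index: Wikipedia article identifier -> context text.
# A real index would be built from preprocessed Wikipedia dumps.
CANDIDATES = {
    "Prague": "capital city of the Czech Republic on the Vltava river",
    "Prague,_Oklahoma": "small city in Lincoln County Oklahoma United States",
}


def disambiguate(mention_context, candidates):
    """Return the article identifier whose context best matches the mention."""
    context_vector = bag_of_words(mention_context)
    scores = {
        article_id: cosine(context_vector, bag_of_words(text))
        for article_id, text in candidates.items()
    }
    return max(scores, key=scores.get)
```

For example, `disambiguate("A tornado hit Prague in Lincoln County, Oklahoma.", CANDIDATES)` selects the Oklahoma article, because only that candidate shares context words with the sentence. The sentence structure and entity co-occurrence representations would replace `bag_of_words` with richer features while keeping the same candidate-ranking step.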