Video Recordings

Growing Trees: Non-Linear Incremental Parsing during Writing

Speaker:

Cerstin Mahlow

Abstract:

Incremental parsing in Computer Science is defined as a process that occurs during the creation of a program: the structure of the program code is parsed, and the parse is then updated according to changes made to the code during editing. Incremental parsing in Natural Language Processing is defined as parsing a sentence or a text in a time-linear fashion, similar to how humans speak and hear. However, when we read and write language, we do not process it in a linear fashion but can go back and forth. During writing, we can add to previously written sentences, but we can also delete parts of them.

In this talk, I present the idea of non-linear incremental parsing in NLP, i.e., incremental parsing of natural language text in a way similar to how incremental parsing is applied to code. As previously shown, the processing of code relies on formal characteristics: program code forms unambiguous, highly structured documents. Natural language text is structured as well, both on the document level and on the text level. The latter structure, however, is implicit and often ambiguous – we can easily come up with different syntax trees when analysing complex sentences in isolation. The parse tree of a document in preparation is a prerequisite for the development of editing functions that make use of appropriate units of this document.

I will discuss issues of when and how to update parse trees during writing. I will also argue for the development of new methods for comparing parse trees, which could be used as another view on the text allowing the author to evaluate different versions of a sentence from a syntactic and stylistic point of view.

Length:

01:01:39

Date:

04/04/2016