Signalling discourse relations

Markus Egg (Humboldt-Universität zu Berlin)
Discourse relations are marked by connectives and by non-connective signals such as specific syntactic constructions (e.g., inversion or clefts), sense relations between lexemes, or punctuation. However, unlike connectives, non-connective signals can co-occur with a discourse relation without contributing to its signalling. In fact, they can do so even when they seem to present evidence against the actual discourse relation (Hoek et al. 2019, Zeldes and Liu 2020), as in the case of antonymy in an ELABORATION relation. The annotation of discourse signals in most previous work follows a principle of ‘signal relevance’, annotating only the signals deemed relevant for the interpretation of the relation. In the Penn Discourse Treebank (Webber et al. 2018), those signals comprise connectives as well as non-connective lexical and syntactic signals; in the RST Signalling Corpus (Das et al. 2015), the range of such signals is much wider (see Poláková et al. 2017 for a comparison). Crible (2022) criticises this strategy and advocates annotating all the signals present in a relation.

This discussion raises theoretical and practical issues, starting with the general question of what exactly makes a signal relevant for a relation. From a more practical point of view, one needs to investigate whether it is really necessary to annotate the full range of signals: if the ‘relevant’ signals alone could predict the eventual correlations between signals and relations, it would not be necessary to invest additional effort in annotating the full range. At the same time, it would be useful to have an estimate of the effort needed to extend annotations of relevant signals to include all the signals. To shed more light on these issues, we conducted a study annotating the full range of signals in a subcorpus of 1,000 discourse relations of the RST Signalling Corpus. We found that while only one third of the signals had been annotated in the corpus, this had no strong impact on the distribution of signals across discourse relations.