Lecture | LUCL Sociolinguistics Series 2022/2023

Annotation reliability as a preliminary for corpus research

Friday 17 February 2023
Sociolinguistics & Discourse Studies Series
Cleveringaplaats 1
2311 BD Leiden


In corpus research, language data are frequently annotated by analysts, but measures of reliability are rarely reported. When annotations concern interpretative features such as implicatures, this lack of reported reliability poses problems for subsequent steps in the analysis. In this talk, three connected issues are discussed in light of an experiment on the classification of coherence relations in conditionals. First, different classifications produce incompatible results when applied to language data. Second, discourse studies observe a discrepancy between theory and data, i.e., existing classifications are “too detached” from actual discourse. Third, while language users construct various cognitive relations between clauses, they do so without relying on overt linguistic features, which poses problems for composing annotation schemes. Based on the results of the experiment, I discuss the implications for corpus research on implicatures.

