Universiteit Leiden



Exemplar semantics through parallel corpora: Something about indefinite pronouns

  • Barend Beekhuizen
Monday 6 February 2017
Matthias de Vrieshof
Matthias de Vrieshof 4
2311 BZ Leiden


In this talk, I'll present recent work on using parallel, translated corpora to understand crosslinguistic variation in word meaning and to connect that variation to cognition. Crosslinguistic comparison can inform us about the structure of the design space of meaning: if many languages label two situations with the same term, this is likely because the situations are conceptually very similar. Speakers hence easily 'group' them together in acquisition and adult production, and distinguishing them has little communicative value.

Bowerman used this insight in several publications to explain errors in the way children learn word meanings. This connection initially inspired me to use geometric spaces based on crosslinguistic patterns of variation as the input space for computational category learning models. When trained on these spaces, these models 'fail' in the same way children do: they make the same word-meaning errors.

However, this research relies on the availability of data containing expressions for a particular situation across a number of languages. Such data can be obtained from secondary sources or through elicitation with native speakers, both of which are labour-intensive. In line with recent work in semantic typology, I propose to use parallel, translated texts as a practical way of obtaining such data. Not only do such texts (here, subtitles) form a practical alternative to the other two methods, they also resolve some conceptual issues and instantiate a view of meaning that Croft dubbed 'Exemplar Semantics'. I will illustrate the method with a replication of Haspelmath's seminal work on indefinite pronouns.
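
The central idea, that two situations count as conceptually close when many languages label them with the same term, can be illustrated with a toy sketch. This is not the method or data used in the talk: the languages (`lang1` to `lang4`), situations (`s1` to `s3`), term labels, and the function name `colabel_similarity` are all hypothetical placeholders. The sketch only computes, for each pair of situations, the proportion of languages that use one term for both.

```python
from itertools import combinations

# Hypothetical toy data (not from the talk): for each language,
# the term a translation uses in each aligned situation.
labels = {
    "lang1": {"s1": "a", "s2": "a", "s3": "b"},
    "lang2": {"s1": "x", "s2": "x", "s3": "x"},
    "lang3": {"s1": "p", "s2": "q", "s3": "q"},
    "lang4": {"s1": "m", "s2": "m", "s3": "n"},
}


def colabel_similarity(labels):
    """Return, per pair of situations, the share of languages using one term for both."""
    situations = sorted({s for terms in labels.values() for s in terms})
    sim = {}
    for s, t in combinations(situations, 2):
        # Only languages that have a term for both situations are counted.
        shared = [lang for lang, terms in labels.items() if s in terms and t in terms]
        same = sum(1 for lang in shared if labels[lang][s] == labels[lang][t])
        sim[(s, t)] = same / len(shared) if shared else 0.0
    return sim


similarity = colabel_similarity(labels)
for pair, value in sorted(similarity.items()):
    print(pair, round(value, 2))
```

In this toy data, `s1` and `s2` share a term in three of the four languages, so they come out as the most similar pair. A matrix of such proportions is one simple way to derive a geometric space from parallel texts, for example by treating one minus the similarity as a distance.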
