Two advances in semantic maps: text-based typology and directed graphs
- Barend Beekhuizen
- Monday 23 April 2018
2311 BD Leiden
In this talk, I will present two ideas pertaining to semantic maps as used in linguistic typology. Semantic maps, introduced by Anderson (1982) and made popular by such works as Kemmer (1993), Haspelmath (1997), and van der Auwera & Plungian (1998), are graph structures that constrain the set of possible semantic extensions of grams (words, morphemes, constructions) across languages by requiring a gram to cover a connected subgraph of the map.
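The connectivity requirement can be illustrated with a minimal sketch. The map below is a hypothetical four-function toy graph (the labels `f1` to `f4` and the helper name `covers_connected_region` are assumptions made for illustration, not taken from any published map); the check is a plain breadth-first search restricted to the functions a gram expresses.

```python
from collections import deque

# Hypothetical toy semantic map: nodes are functions, edges link
# functions that are adjacent on the map (here a simple chain).
TOY_MAP = {
    "f1": {"f2"},
    "f2": {"f1", "f3"},
    "f3": {"f2", "f4"},
    "f4": {"f3"},
}


def covers_connected_region(graph, functions):
    """Return True if `functions` induces a connected subgraph of `graph`.

    An empty set is treated as trivially connected (an assumption).
    """
    functions = set(functions)
    if not functions:
        return True
    start = next(iter(functions))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Only follow edges that stay inside the gram's set of functions.
        for neighbour in graph[node] & functions:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == functions


if __name__ == "__main__":
    print(covers_connected_region(TOY_MAP, {"f1", "f2"}))  # contiguous on the map
    print(covers_connected_region(TOY_MAP, {"f1", "f3"}))  # skips f2
```

Under this toy map, a gram expressing `f1` and `f3` but not `f2` would count as a violation of the map, whereas one expressing `f1` and `f2` would not.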
The availability of crosslinguistic data constitutes a major bottleneck for the approach. In the first part of my talk, I discuss a novel pipeline that uses translated texts (subtitles or Bible translations) as data for, among other things, learning semantic maps from parallel corpora, and I compare the results to other attempts at leveraging such corpora.
In the second part, I address two limitations of traditional semantic maps, namely (1) that they predict the existence of a great number of unattested ways of carving up the world (i.e., they 'overgenerate'), and (2) that they do not discriminate between crosslinguistically common and rare categories. I will present an algorithm that learns semantic maps but is less affected by the first limitation and not subject to the second. Haspelmath's (1997) analysis of indefinite pronouns forms the starting point of the validation of both methods, but I will present explorations of other semantic domains as well.
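To make the overgeneration point concrete, the following sketch counts how many categories an undirected map licenses, that is, how many non-empty connected sets of functions it contains. It illustrates the limitation only and is not the algorithm presented in the talk; the five-function map, its labels, and the helper names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy map: a chain a-b-c with two branches c-d and c-e.
TOY_MAP = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d", "e"},
    "d": {"c"},
    "e": {"c"},
}


def is_connected(graph, nodes):
    """Depth-first check that `nodes` (non-empty) induces a connected subgraph."""
    nodes = set(nodes)
    stack = [next(iter(nodes))]
    seen = set(stack)
    while stack:
        node = stack.pop()
        for neighbour in graph[node] & nodes:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def licensed_categories(graph):
    """Enumerate every non-empty connected set of functions on the map."""
    functions = sorted(graph)
    return [
        set(subset)
        for size in range(1, len(functions) + 1)
        for subset in combinations(functions, size)
        if is_connected(graph, subset)
    ]


if __name__ == "__main__":
    categories = licensed_categories(TOY_MAP)
    print(len(categories), "categories licensed by the toy map")
```

Every one of these sets is predicted to be a possible category, and the map by itself assigns them all the same status, regardless of how often (or whether) each is attested.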