Universiteit Leiden


LACG: Emmanuel Keuleers

  • Emmanuel Keuleers
8 March 2018
Language and Cognition Group meetings
Lipsius Building
Cleveringaplaats 1
2311 BD Leiden

Frequency, diversity, and prevalence in crowds and corpora: Towards a transaction perspective

Measures of word frequency and diversity derived from text corpora have consistently been shown to be the best predictors of isolated word identification latency. Recently, Keuleers et al. (2015) and Brysbaert et al. (2016) have shown that prevalence, a crowd-based measure that can be expressed as the percentage of a population that knows a word, is also a very strong independent predictor of word processing times. I present additional evidence that prevalence outperforms corpus-based measures of frequency and diversity in predicting lexical decision and word naming latencies in different languages. More importantly, I show how different measures of frequency and diversity can be mathematically related to each other, and that ignoring these correspondences leads to spurious comparisons. I conclude that measures of frequency and diversity are best understood in a transaction framework.