Lecture

LUCDH Lunchtime Speaker Series: Testing linguistic theories with deep learning: a case study on meaning predictability

Matthijs Westera

Date: Wednesday 1 March 2023
Time: 13:15 - 14:15
Explanation: Note that we start later than usual!
Location: On Campus: Digital Lab P.J. Veth 1.07 / Online: Kaltura Live Rooms
Registration: Please register via lucdh@hum.leidenuniv.nl

Join us for our next LUCDH lunchtime talk presented by Matthijs Westera on
Wednesday, 1 March 2023 at the later time of 13:15 – 14:15 (CET).

Large neural networks trained on enormous amounts of linguistic data are a potentially unprecedented source of linguistic information. But how can we access this information and use it to test our theories? I will present joint work reported in Aina et al. (2021) and its relation to my ongoing work. In that study we tested the well-known hypothesis that meaning components that are more predictable tend to be communicated with shorter or fewer words. While this hypothesis is in itself uncontroversial, the new method we developed for testing it will also let us test more specific hypotheses, for instance about particular words or linguistic constructions, as well as more sophisticated (future) theories about interactions between predictability and other aspects of meaning and discourse.

In this talk, I will explain this new method and clarify its promise. It relies on existing deep learning models trained on the task of coreference resolution: given a preceding discourse, to which entity does a given referring expression ('she', 'the doctor', 'my grandma') likely refer? By fine-tuning such a model on a masked version of the task, in which we censored a certain percentage of referring expressions (i.e., replaced them with an uninformative 'MASK' token), we were able to extract information about how likely a given entity is to be mentioned next in the discourse. As I hope to show, fine-tuning existing deep learning models on a modified version of the original training task, in order to extract theoretically relevant information, is an exciting new research instrument.

Aina, L., X. Liao, G. Boleda & M. Westera (2021). Does referent predictability affect the choice of referential form? A computational approach using masked coreference resolution. https://aclanthology.org/2021.conll-1.36/

Location: On-campus in the Digital Lab P.J. Veth 1.07 or Online via Kaltura Live Rooms.

To register: please email lucdh@hum.leidenuniv.nl

We very much hope that you can join colleagues on campus in the Digital Lab in P.J. Veth 1.07. However, we will also be live-streaming on Kaltura, so please let us know whether you will be attending in person or would like Kaltura Live Room login details.