Universiteit Leiden


Dissertation

Emergence of Linguistic Universals in Neural Agents via Artificial Language Learning and Communication

Human language is constantly evolving, its structure shaped by language users at both the individual and population levels. Focusing on the interplay between processes of language acquisition and communicative need in shaping human languages, this thesis introduces a novel computational framework for simulating language emergence: NeLLCom (Neural-agent Language Learning and Communication).

Author
Y. Lian
Date
12 December 2025
Links
Thesis in Leiden Repository

Unlike most recent emergent-communication models, our agents are first trained on a pre-defined artificial language, mirroring the methods used in experiments with human participants. Specifically, NeLLCom agents first learn these artificial languages individually through supervised learning, and then interact in pairs or groups through language games, adapting to these interactions through reinforcement learning. In this thesis, we focus on two widely attested language phenomena: (i) the trade-off between word order flexibility and case marking, and (ii) differential case marking. In both cases, we strictly follow the artificial language designs previously studied in human experiments. Our results show that neural agents can successfully replicate both phenomena. Moreover, the new framework allows us to extend prior results with humans to a larger scale: we show that the word-order/case-marking trade-off also emerges at the group level. The framework can be used to conduct controlled experiments that complement experimental research with humans on language evolution, working toward the end goal of explaining why human languages look the way they do.
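To make the two-phase training procedure concrete, below is a minimal sketch of how such a setup could be implemented in PyTorch. All names, architectures, and hyperparameters here (Speaker, Listener, vocabulary and meaning-space sizes) are illustrative assumptions, not NeLLCom's actual code: a speaker maps a meaning to an utterance and a listener maps the utterance back to a meaning; both are first trained with supervised learning on a fixed artificial language, then fine-tuned jointly with a REINFORCE-style communication game.

```python
# Sketch of NeLLCom-style two-phase training: supervised language learning,
# then reinforcement learning in a speaker-listener game. All names and
# dimensions are illustrative assumptions, not the thesis's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MEANINGS, HID, MSG_LEN = 12, 8, 32, 4

class Speaker(nn.Module):
    """Maps a meaning index to a sequence of symbol logits."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(MEANINGS, HID)
        self.out = nn.Linear(HID, MSG_LEN * VOCAB)
    def forward(self, meaning):
        h = self.embed(meaning)
        return self.out(h).view(-1, MSG_LEN, VOCAB)  # (batch, len, vocab)

class Listener(nn.Module):
    """Maps a symbol sequence back to a distribution over meanings."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HID)
        self.out = nn.Linear(HID, MEANINGS)
    def forward(self, msg):
        h = self.embed(msg).mean(dim=1)  # bag-of-symbols for simplicity
        return self.out(h)

def supervised_phase(speaker, listener, language, epochs=200):
    """Phase 1: each agent individually learns a pre-defined artificial
    language, i.e. fixed (meaning, utterance) pairs, via cross-entropy."""
    opt = torch.optim.Adam([*speaker.parameters(), *listener.parameters()], lr=1e-2)
    meanings = torch.tensor([m for m, _ in language])
    msgs = torch.stack([u for _, u in language])
    for _ in range(epochs):
        opt.zero_grad()
        sp_logits = speaker(meanings)  # speaker: meaning -> utterance
        loss = F.cross_entropy(sp_logits.reshape(-1, VOCAB), msgs.reshape(-1))
        loss = loss + F.cross_entropy(listener(msgs), meanings)  # listener: utterance -> meaning
        loss.backward(); opt.step()

def communication_phase(speaker, listener, rounds=500):
    """Phase 2: the pair plays a meaning-guessing game; the speaker samples an
    utterance, the listener guesses the meaning, and both are rewarded for
    communicative success via REINFORCE (policy-gradient) updates."""
    opt = torch.optim.Adam([*speaker.parameters(), *listener.parameters()], lr=1e-3)
    for _ in range(rounds):
        meanings = torch.randint(0, MEANINGS, (16,))
        sp_dist = torch.distributions.Categorical(logits=speaker(meanings))
        msg = sp_dist.sample()  # sampled utterance, shape (16, MSG_LEN)
        li_dist = torch.distributions.Categorical(logits=listener(msg))
        guess = li_dist.sample()
        reward = (guess == meanings).float()  # 1 if communication succeeded
        log_p = sp_dist.log_prob(msg).sum(dim=1) + li_dist.log_prob(guess)
        loss = -(reward - reward.mean()) * log_p  # baseline-subtracted REINFORCE
        opt.zero_grad(); loss.mean().backward(); opt.step()
```

In this sketch, the pre-defined language passed to supervised_phase would simply be a fixed list of (meaning, utterance-tensor) pairs encoding, say, a word-order or case-marking convention; the supervised phase parallels the training stage of a human artificial-language-learning experiment, while the game phase parallels the subsequent communicative use, with group-level simulations obtained by pairing agents from a larger population.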
