Data Science Research Programme
The Faculty of Humanities
At the Faculty of Humanities we study languages, cultures and societies worldwide.
Leiden is a unique international centre for the advanced study of languages, cultures, arts, and societies worldwide, in their historical contexts from prehistory to the present. We aim to contribute to knowledge, the sustainable well-being of societies, and the understanding of the cognitive, historical, cultural, artistic, and social aspects of human life. In research and teaching, we focus on the mobility of people, language, culture, ideas, art, and institutions in a globalizing world, and their interconnectivity through the ages.
Data Science Research Projects
Exploring new methods in comparing sign language corpora
Project Manolis Fragkiadakis
The goal of this project is to improve some of the most widely used tools for the analysis of signed languages. This includes expanding the functionality of SignBank, a lexical database for sign language corpora, to enable cross-corpus compatibility. The project will also explore how automated image analysis can support semi-automated lemma generation. The functionality will be developed using corpora of four African sign languages that were compiled at Leiden University.
Currently, the project focuses on developing a tool that uses dimensionality reduction techniques to analyze and interpret the lexical and phonological variation between different sign languages. The application of deep learning techniques to extract phonological features from sign language videos is also being explored. The project is a collaboration with the Leiden University Centre for Linguistics and the Leiden University Centre for Digital Humanities.
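To make the idea of dimensionality reduction concrete, the sketch below projects a handful of signs, each described by binary phonological features, onto two principal components so that similar signs land close together. It is only an illustration under stated assumptions: the feature matrix, the labels `lang_A` and `lang_B`, and the helper `pca_project` are invented for this example, and principal component analysis is just one possible technique, not necessarily the one used in the project's tool.

```python
import numpy as np


def pca_project(features: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project the rows of `features` onto their top principal components."""
    # Center each feature column so the components describe variation
    # around the mean rather than the mean itself.
    centered = features - features.mean(axis=0)
    # SVD of the centered matrix: the rows of `vt` are the principal axes,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Invented toy data (hypothetical): one row per sign, one column per
# binary phonological feature. Real corpora would have many more of both.
toy_features = np.array(
    [
        [1, 0, 1, 0],
        [1, 0, 1, 1],
        [0, 1, 0, 1],
        [0, 1, 0, 0],
    ],
    dtype=float,
)
toy_labels = ["lang_A", "lang_A", "lang_B", "lang_B"]

coords = pca_project(toy_features, n_components=2)
for label, (x, y) in zip(toy_labels, coords):
    print(f"{label}: ({x:+.2f}, {y:+.2f})")
```

In this toy setup the first component separates the two invented groups, which is the kind of pattern one would then inspect and interpret linguistically. The sign of each component is arbitrary, so only relative positions are meaningful.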
Detecting cross-linguistic syntactic differences automatically
The main goal of comparative syntactic research is to discover the syntactic principles that all natural languages have in common, but so far it has been impossible to compare large sets of syntactic constructions across large sets of languages systematically and automatically. The online availability of parallel text corpora, together with software tools to align, enrich, search and analyse them, has the potential to make large-scale, systematic, automatic cross-linguistic syntactic comparison possible for the first time.