Data Science Research Programme
The Faculty of Science
The Faculty of Science is a world-class faculty where staff and students work together in a dynamic international environment. It is a faculty where personal and academic development are top priorities. Our people are driven by curiosity to expand fundamental knowledge and to look beyond the borders of their own discipline; their aim is to benefit science and to make a contribution to addressing the major societal challenges of the future.
The research carried out at the Faculty of Science is very diverse, ranging from mathematics, information science, astronomy, physics, chemistry and bio-pharmaceutical sciences to biology and environmental sciences. The research activities are organised in eight institutes. These institutes offer eight bachelor’s and fifteen master’s programmes. The Faculty has grown strongly in recent years and now has more than 2,300 staff and over 5,000 students. We are located at the heart of Leiden’s Bio Science Park, one of Europe’s biggest science parks, where university and business life come together.
Data Science Research Projects
A new era for nature conservation using hyperspectral and lidar data; Oostvaardersplassen as a case study
Nuno César de Sá
This project aims to develop advanced data analysis methods for monitoring and improving our understanding of biodiversity dynamics in nature reserves such as the Oostvaardersplassen. Earth observation methodologies have improved considerably over the past decade. As a result, applications to nature management are coming within reach, but these demand new ecoinformatics tools for nature conservation, e.g., for tracking animals based on hyperspectral data, and for linking spatial and temporal patterns of animal movement to vegetation characteristics.
Minimal structure modeling
Existing work in probabilistic language modeling can mostly be divided into two categories: (i) Purely sequential, string-level approaches ensure fluency at the local level without any notion of grammaticality and seek improvements through the use of massive training corpora. (ii) Fully structural, tree-based approaches model text as the realization of latent tree structures that encode complex grammatical dependencies. This project explores a third way, in which only the structural relations required to produce grammatical sentences in a specific language and task are modeled.
Knowledge Discovery and Data Mining from patient experience repositories
This PhD project, funded by the Dutch SIDN fonds, is part of the Patient Forum Miner (PFM) research programme. Patients often share experiences on internet forums. These experiences often contain valuable information for patients, medical specialists and researchers, but this information is hidden among an abundance of messages offering emotional support. The aim of the PFM programme is to extract the information that is of real value and to formulate hypotheses that can serve as input for further clinical research.
Exploratory Data Analysis for Multimodal Data
To assess the use of a public park surrounding a nursing home, a multitude of data will be gathered in the form of survey and sensor data. Exploratory data mining techniques will be developed to explore this multimodal information and to find associations across and within the different information sources.
Socially Embedded AI Systems
This interdisciplinary project lies at the intersection of AI, Cognitive Psychology, and Linguistics. It explores several adaptive machine learning methods that can give insight into the interaction between human and machine. The ultimate goal is open and natural communication between humans and AI, which should result in mutual trust and in possibilities for cooperation and coordination between the two. To this end, we attempt to create a natural setting that allows machine learning algorithms to learn complex human and social characteristics.
Data Science for State-of-the-Art Blood Banking (BloodStart)
This project is a collaboration between Sanquin and Leiden University, and will deliver enhanced data-driven models and evidence-based donor management strategies that will maximize the effectiveness of resources and minimize donor loss.
Visual Relation extraction Based on Deep Cross-media Transfer Network
This project will build a Deep Cross-media Transfer Network for visual relation extraction, with the aim of relieving the problem of insufficient training data for a visual task.
Modeling interactions to unravel biomarkers for disease progression and treatment response
Large biobanking studies of healthy volunteers and patients are increasingly analyzed using high-throughput molecular profiling (“omics”) technologies such as genomics, transcriptomics and metabolomics to obtain insights into the molecular alterations underlying disease.
A major challenge in analyzing such large clinical datasets, which are associated with multiple high-dimensional datasets, is the integration of multiple omics technologies with clinical data, which are typically measured longitudinally, in a statistically and biologically meaningful way.