Research project

Stacked Domain Learning for multi-domain data: a new ensemble method

The aim of this project is to develop accurate but interpretable ensemble learning methods for high-dimensional multi-domain data.

Contact: Wouter van Loon

Researchers increasingly encounter multi-domain data. In health research, for example, multi-domain data arise when data are collected from multiple sources (e.g. medical imaging, genomics, questionnaires), or when different feature sets are derived from a single source (e.g. different MRI modalities). Combining data from multiple domains could lead to a better understanding of disease and improved early diagnosis, but it is not yet known how these domains can best be combined.

The project currently focuses on the domain selection problem: how can we identify the domains that are most important for prediction? A newly developed ensemble learning method has been shown to offer a large increase in both speed and accuracy compared to existing methods.