Universiteit Leiden



Massively collaborative machine learning

Promotor: J. N. Kok, Co-promotor: A. J. Knobbe

J.N. van Rijn
19 December 2016
Thesis in Leiden Repository

Many scientists focus on building models. We process nearly all the information we perceive into a model. There are many techniques that enable computers to build models as well. The field of research that develops such techniques is called Machine Learning. Much research is devoted to developing computer programs capable of building models (algorithms). Many such algorithms exist, and they often come with various options that subtly influence performance (parameters). Furthermore, it has been mathematically proven that no single algorithm works well on every dataset. This complicates the task of selecting the right algorithm for a given task. The field of meta-learning aims to resolve these problems: its purpose is to determine what kinds of algorithms work well on which datasets. To this end, we developed OpenML, an online database on which researchers can share experimental results with each other, potentially scaling up the size of meta-learning studies. With earlier experimental results freely accessible and reusable by others, it is no longer necessary to conduct time-consuming experiments. Rather, researchers can answer such experimental questions with a simple database look-up. This thesis addresses how OpenML can be used to answer fundamental meta-learning questions.