Interactive models: Matthijs van Leeuwen receives NWO TOP grant

18 January 2019

Matthijs van Leeuwen of the Leiden Institute of Advanced Computer Science develops methods to make computer models interactive. With interactive models, experts can combine information from raw data with their own knowledge to make predictions more accurate. ‘In this way we hope to build models that are more accurate and understandable.’ For his project, Van Leeuwen was awarded an NWO TOP grant of €249,000.

Self-learning models

‘Scientists have traditionally used models to describe complex phenomena’, Van Leeuwen says. ‘A model is nothing more than a simplified, but precisely defined representation of reality.’ Besides simple models that describe, for example, a falling object, there are also much more complex models, such as those used for weather forecasts, which are still not always reliable. Today, machine learning enables computer scientists to develop complex models that are self-learning. These models are designed to become increasingly accurate through repetition. Famous success stories include Tesla's self-driving car and the computer that ultimately beat the world's best players at the complex game Go.

No flu?

‘However, there are also disadvantages to data-driven modelling’, says Van Leeuwen. ‘Data is often not a complete substitute for expert knowledge. For example, self-driving cars do not always handle unknown situations properly, which can lead to accidents.’ Another example is Google Flu Trends, a former web service with which Google tried to predict flu epidemics based on search terms in Google Search. The service appeared to have learned incorrect associations between search terms and flu, which over time led to inaccurate predictions. To prevent this, something needs to change.

Data and experts

At the moment, most models only use data, but these data by no means comprise all existing (and required) knowledge. That’s why Van Leeuwen wants to develop methods that allow users to combine the information from the data with their own knowledge. ‘To this end, we are going to develop theory and algorithms that make it possible to automatically search for patterns in large quantities of data, but that also take people’s own knowledge into account’, he explains. ‘We do this by developing interactive methods with which the user can control the data analysis. By combining data and knowledge, we hope to construct models that are more accurate and provide more insight into the underlying processes that generate the data. In the example of Google Flu, the data analyst could indicate that certain search terms, such as “ice skating”, are not directly related to flu, even though, just like “flu”, we search for them more often in winter than in summer.’

Predicting outcomes of interventions

More accurate models are important because data and artificial intelligence are applied all around us. Van Leeuwen: ‘Unfortunately, expectations are often higher than what we can achieve with the existing methods.’ Van Leeuwen and his colleagues will mainly focus on applications in healthcare and aviation. For instance, they hope to find better models to predict the outcome of surgery in the event of a stroke. ‘It would be nice to gain new insights into this serious condition with the help of data.’