Universiteit Leiden



This Week's Discoveries | 18 April 2017

18 April 2017
Niels Bohrweg 2
2333 CA Leiden
De Sitterzaal

First Lecture

Gaussian Processes for Big Data Regression

Hao Wang (LIACS) is a PhD student in the Algorithms and Software Technology group of LIACS.

Gaussian Process Regression (or Kriging) is applied in many fields as a non-linear regression model, and as a surrogate model in evolutionary computation. However, the time and space complexity of Kriging, which are cubic and quadratic in the number of data points, respectively, become a major bottleneck as more and more data becomes available. In this talk, we will present a general methodology for complexity reduction, called Cluster Kriging, in which the big data set is partitioned into smaller clusters and a local Gaussian process model is constructed for each cluster. Moreover, we show how to combine the local models into a global model in an optimal way. We shall also briefly illustrate the proof of the optimal combination approach that we developed. This novel method has been applied to an industrial data set in an NWO-funded research project, where it performs competitively with other state-of-the-art big data regression methods in terms of both accuracy and computational overhead.
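The partition-fit-combine idea described above can be illustrated with a minimal sketch. It is not the speaker's implementation: the k-means partitioning, the fixed RBF kernel hyperparameters, the helper names (`kmeans`, `LocalGP`, `fit_cluster_kriging`, `cluster_kriging_predict`) and, in particular, the inverse-variance weighting used to combine the local predictions are all assumptions made for illustration. Inverse-variance weighting minimizes the variance of a weighted average only if the local predictors are treated as independent and unbiased; the combination rule presented in the talk may differ. With k clusters of roughly equal size, each local fit costs on the order of (n/k)^3 instead of n^3.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel with unit prior variance.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def kmeans(x, k, n_iter=20, seed=0):
    # Plain Lloyd iterations; returns a cluster label per row of x.
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels

class LocalGP:
    """Gaussian process fitted on one cluster (fixed hyperparameters, an assumption)."""

    def __init__(self, x, y, noise=1e-2, length_scale=1.0):
        self.x, self.length_scale = x, length_scale
        self.mean = y.mean()
        k = rbf_kernel(x, x, length_scale) + noise * np.eye(len(x))
        self.chol = np.linalg.cholesky(k)  # cubic in the cluster size only
        z = np.linalg.solve(self.chol, y - self.mean)
        self.alpha = np.linalg.solve(self.chol.T, z)

    def predict(self, xs):
        ks = rbf_kernel(xs, self.x, self.length_scale)
        mu = self.mean + ks @ self.alpha
        v = np.linalg.solve(self.chol, ks.T)
        var = 1.0 - (v**2).sum(axis=0)  # posterior variance of the latent function
        return mu, np.maximum(var, 1e-12)

def fit_cluster_kriging(x, y, n_clusters=4):
    labels = kmeans(x, n_clusters)
    return [LocalGP(x[labels == j], y[labels == j]) for j in np.unique(labels)]

def cluster_kriging_predict(models, xs):
    # Inverse-variance weighting of the local predictions (assumed combination rule).
    mus, variances = zip(*(m.predict(xs) for m in models))
    precision = 1.0 / np.array(variances)
    weights = precision / precision.sum(axis=0)
    return (weights * np.array(mus)).sum(axis=0), 1.0 / precision.sum(axis=0)

# Toy usage on synthetic data (not the industrial data set mentioned above).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=(200, 1))
y = np.sin(x[:, 0])
models = fit_cluster_kriging(x, y, n_clusters=4)
x_test = np.linspace(0.5, 9.5, 50)[:, None]
pred, var = cluster_kriging_predict(models, x_test)
```

Near a test point, the local model trained on nearby data reports a small predictive variance and therefore dominates the weighted average, while distant models fall back to their prior and contribute little.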

Second Lecture, Lorentz Center highlight

Data Science with Human in the Loop: Harnessing User Semantics at Scale

Lora Aroyo (VU Amsterdam) is a full professor of Computer Science at Vrije Universiteit Amsterdam, where she leads the Web & Media Group. Her research focuses on semantic technologies for modeling users and context to provide personalized access to online multimedia collections, e.g. cultural heritage collections, multimedia archives and interactive TV. She has been prominently involved in national and international Digital Humanities initiatives. She was a scientific coordinator of the NoTube project, which dealt with the integration of Web and TV data with the help of semantics, and of a number of nationally funded projects, such as CHIP and Agora, which dealt with modelling events and event narratives. Lora is actively involved in the Semantic Web community as a program chair for the European and the International Semantic Web Conferences. She is also actively involved in the Personalization and User Modeling community as vice-president of User Modeling Inc. She is a three-time recipient of IBM Faculty Awards for her work on Crowd Truth: crowdsourcing for ground truth data collection for adapting the IBM Watson system to the medical domain. Web: http://lora-aroyo.org, Twitter: @laroyo, SlideShare: http://www.slideshare.net/laroyo. Lora is one of the organisers of the Lorentz Center workshop “Language, Knowledge and People in Perspective”, which is being held from 18 April through 21 April 2017.

Software systems are becoming ever more intelligent and more useful, but the way we interact with these machines too often reveals that they don’t actually understand people. Knowledge Representation and the Semantic Web focus on the scientific challenges involved in providing human knowledge in machine-readable form. However, we observe that various types of human knowledge cannot yet be captured by machines, especially when dealing with wide ranges of real-world tasks and contexts. The key scientific challenge is to provide an approach to capturing human knowledge that is scalable and adequate to real-world needs. Human Computation has begun to study scientifically how human intelligence at scale can be used to methodically improve machine-based knowledge and data management. My research focuses on understanding human computation in order to improve how machine-based systems acquire, capture and harness human knowledge, and thus become even more intelligent. In this talk I will show how the CrowdTruth framework (http://crowdtruth.org) facilitates data collection, processing and analytics of human computation knowledge.