Universiteit Leiden



Fairness and Transparency, towards responsible data science

  • Ricardo Baeza-Yates
  • Ute Schmid
  • Ismael Ràfols
  • Mireille Hildebrandt
Tuesday 5 March 2019
Including lunch
Langegracht 70
2312NV Leiden




The abundance of data offers many opportunities for technological innovations and for improved decision making. At the same time, these opportunities also create new challenges and dilemmas related to the way in which data-driven technologies are embedded in our society. For instance, to what extent do we want to make important decisions based on algorithms and data? Do we have a sufficient understanding of the outcomes of complex data-driven analyses? How does data-driven decision making influence human behaviour? What are the pros and cons of data-driven decision making as compared to more informal ways of making decisions?

These are key questions in the area of responsible data science. The Leiden University Data Science Research Programme and Centre for Science and Technology Studies organize this symposium in which questions such as these will be explored from different disciplinary viewpoints, ranging from the social sciences and law to the medical sciences and computer science. The aim is to develop a common understanding of the main challenges in responsible data science and to learn from the experiences in different scientific domains.

The symposium is open to everyone, so bring your colleagues!

This meeting is recognized as part of the SIKS Activity Program.


12.00 Doors open: walking lunch
12.30 Opening
  Keynote lecture: Ricardo Baeza-Yates: Bias on the Web
  Invited talks:
    Mireille Hildebrandt: Saving Machine Learning from Crappy Applications
    Ute Schmid: Interactive Learning with Mutual Explanations
    Ismael Ràfols: Contextualisation and participation for responsible use of quantitative evidence in policy
  Lightning talks: Annelieke van den Berg, Shannon Kroes
17.00 Posters and drinks: posters by the PhD candidates of the Data Science Research Programme


Ricardo Baeza-Yates

Bias on the Web

Our keynote speaker is the world-famous computer scientist Ricardo Baeza-Yates. His areas of expertise are information retrieval, web search and data mining, data science, and algorithms. He has been a Professor at Northeastern University, Silicon Valley campus, since August 2017. Since June 2016 he has also been CTO of NTENT, a semantic search technology company based in California. Previously he was, among other roles, VP of Research at Yahoo Labs, based in Sunnyvale, California. He was named an ACM Fellow in 2009 and an IEEE Fellow in 2011.

Baeza-Yates: "Our inherent human tendency of favoring one thing or opinion over another is reflected in every aspect of our lives, creating both latent and overt biases toward everything we see, hear, and do. Any remedy for bias must start with awareness that bias exists; for example, most mature societies raise awareness of social bias through affirmative-action programs, and, while awareness alone does not completely alleviate the problem, it helps guide us toward a solution. Bias on the Web reflects both societal and internal biases within ourselves, emerging in subtler ways."


Mireille Hildebrandt

Saving Machine Learning from Crappy Applications

Mireille Hildebrandt is a tenured Research Professor on 'Interfacing Law and Technology' at Vrije Universiteit Brussel, a position she combines with her work in the research group on Law Science Technology and Society studies (LSTS) at the Faculty of Law and Criminology. She also holds the part-time Chair of Smart Environments, Data Protection and the Rule of Law at the Institute for Computing and Information Sciences (iCIS) of the Science Faculty at Radboud University Nijmegen.

Her research interests concern the implications of mindless artificial agency for the core tenets of constitutional democracies. Hildebrandt: "The core data protection principle is that of purpose limitation. This is often - wrongly - seen as an obstruction to innovation. In my talk I will discuss (1) the First Law of Informatics that states that in research data should only be used for the purpose for which they have been collected, and (2) the need to require pre-registration of research design as core requirements for the methodological integrity of ML. Both have been developed by computer scientists, not by lawyers or ethicists. I will argue that a legal framework is needed to ensure that ML developers are free to develop the best possible ML practices, based on a sound and contestable research design. Finally my point will be that this will contribute more to FAT ML than a narrow focus on bias."

Ute Schmid

Interactive Learning with Mutual Explanations - How Humans and Machine Learning Systems can Profit From Each Other

Ute Schmid holds a professorship of Applied Computer Science/Cognitive Systems at the University of Bamberg. Her research interests are mainly in the domain of comprehensible machine learning and high-level learning on structural data, especially inductive programming, knowledge level learning from planning, learning structural prototypes, analogical problem solving and learning. 

Schmid: "Classifier learning is relevant in many application domains such as autonomous driving, medical diagnosis, connected industry, or education. There is a growing recognition that such machine learned models need to be transparent and comprehensible. One approach to address this problem - especially in the context of deep learning for image classification - is to highlight those parts of an image which have been relevant for the classifier decision. Under the label of explainable Artificial Intelligence (XAI) not only different approaches for visualisation are explored but there are also classic (TREPAN) and current (LIME) proposals of how to generate symbolic rules to explain a classifier decision.

I will argue that presenting either visualisations or rules to a user will often not suffice as a helpful explanation. Instead, I will propose a variety of textual, visual, and example-based explanations. Furthermore, I will discuss that explanations are not "one size fits all" but that it depends on the user, the problem, and the current situation which explanation is most helpful. Finally, I will present a new method which allows the machine learning system to exploit not only class corrections but also explanations from the user to correct and adapt learned models in interactive learning scenarios."

Ismael Ràfols

Contextualisation and participation for responsible use of quantitative evidence in policy: The case of indicators for science and innovation policy

Ismael Ràfols is a research fellow at Ingenio (CSIC-UPV, València), visiting professor at the Centre for Science and Technology Studies of Leiden University and associate faculty at SPRU (Univ. Sussex, Brighton). He explores how to develop more plural and inclusive uses of S&T indicators for informing evaluation, funding and research strategies in science policy. He currently works on ‘responsible metrics’ (Leiden Manifesto), with a focus on inclusiveness, and on priority setting in science using research portfolio analysis.

Ràfols: "The use of indicators in research policy and evaluation is widely perceived as problematic. In order to prevent misuses and to foster a more responsible use of indicators for science management, various initiatives such as the San Francisco Declaration on Research Assessment (DORA) and the Leiden Manifesto have been proposed. In this presentation, I will introduce these initiatives – and will then argue that rather than piecemeal improvements for damage prevention, we have to reconsider in general the place and role of quantitative evidence in policy. I will argue that expert advice should not separate knowledge formation from decision-making under conditions of uncertainty and lack of value consensus. According to this view, most data science is currently too focused on technical issues, too reductionist and isolated from the contexts and values of its use.

I propose three moves for improving design and use of indicators for policy. First, to continue ongoing trends towards pluralising the data sources, processing and visualisation techniques, and to expand the research communities contributing to the making of quantitative evidence. Second, to develop forms of quantitative evidence that can be contextualised with the participation of stakeholders. Third, to open up the policy framings implicit in measurement, and use quantitative analyses to reveal more balanced perspectives of existing and alternative policy options. I will conclude by arguing that these shifts are necessary to achieve robustness associated with epistemic diversity and to preserve political pluralism when quantitative evidence is used in policy making."
