Ethical standards for data science

Computers are becoming so smart that in the future they will perhaps take over the role of judges. In the meantime, experts at Leiden University are examining the question of which standards responsible data science should meet.

The time when judges will be replaced by computers is coming a step closer every day, thanks to rapid developments in data science. ‘In the future, a computer will be able to extract elements from thousands of comparable cases, link them together and make the best decision on the basis of those elements,’ says Jaap van den Herik, Professor of Computer Science and Law at Leiden University. ‘I think computers will already be producing simple judgements in about 15 years. And by 2080 computers will be better at making ethical decisions than judges.’

The key to that future is ‘deep learning’, where a computer programme itself detects patterns in large numbers of cases. Using this technology, a computer has already learnt how to defeat the human world champion at the game Go. However, much more progress still needs to be made before judges have to relinquish their gavels to computers, says Van den Herik.

More information needed

‘The number of possibilities in Go is extremely large, but finite. By contrast, the number of possibilities in legal judgements is infinite.’ There can also be no dispute about the outcome of a game of Go, whereas the outcome of a legal case can be contested, not least because cultural differences lead to different outcomes. ‘To learn how to make properly substantiated decisions, the computer must have access to a wide range of data relating to the context of earlier cases,’ says the professor. ‘Data that may not be stored, such as race, religion and sexual orientation, often play a part. The lack of that context makes it difficult for a computer programme to analyse cases correctly.’ Another problem is that sometimes very few past cases are accessible. The more data the computer programme has to learn from, the better its judgements will be.

Responsible data science

While the role of computers in society is increasing all the time, public confidence in computers appears to have dropped. This is partly due to concerns about security and privacy. To reverse this trend, says Van den Herik, more research is needed on the conditions for responsible data science.

‘The data that you use for this must be accessible and processable by a computer. They must be securely encrypted and must guarantee people’s privacy. Here in Leiden we’re setting the standard for research data with the FAIR principle.’ (see also ‘Leiden: Silicon Valley of FAIR data’)

In the professor’s view, students working with data science should learn at an early stage to incorporate ethical considerations in their methods. One of the questions he is investigating with his research group is how responsible data science relates to legal practice. ‘How do you measure all those conditions at the present time? And how will you do that in ten years’ time? It’s certain that by then the issue won’t only be responsibility but also liability. With computers becoming smarter every day, it’s high time to start having serious discussions about this.’

The International Data Responsibility Group (IDRG)

