A treasure trove of legal data
Data science offers great opportunities for legal research, according to Simone van der Hof and Bart Custers (eLaw). But at the same time, we have to keep an eye on the unwanted side effects of big data - such as ethnic profiling.
Traditionally, lawyers tend to work with data on paper. However, the use of big data methods is increasing in this academic discipline as well. Within the Faculty of Law, Simone van der Hof (Professor of Law and Information Society) and Bart Custers (head of research at eLaw) are leading the way in this respect. On 23 September 2016, they are organising a meeting for their colleagues, in collaboration with the Leiden Centre of Data Science (LCDS): the Research Lab Legal Data Science.
What is the main purpose of the Research Lab on 23 September?
Simone van der Hof: ‘In our field, data science provides many opportunities: we have a treasure trove of legal data that can be unlocked using these new methods. However, the use of such methods is still in its infancy. Our goal is to show our colleagues what is possible in this area, and to hear their questions and ideas about these new methods.’
What are the possibilities of data science in the legal domain?
Bart Custers: ‘Lawyers traditionally work with a lot of data, even though they might not perceive it as such. We have libraries full of books, reports and case law. Using text mining, for example, we can make all this data more accessible: instead of reading all these texts ourselves, we can let a computer discover patterns in them. It is now possible, for instance, to search a database not only by keyword, but also by context. That means a computer can learn to recognize whether a court ruling turned out positive or negative for someone.’
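To make the idea of a computer learning whether a ruling was positive or negative a little more concrete, here is a minimal sketch in Python. It is not the method used at eLaw: it assumes a simple naive Bayes text classifier, and the example snippets, the labels and the names `TOY_RULINGS`, `train_naive_bayes` and `classify` are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, hand-written snippets; real rulings are far longer and messier.
TOY_RULINGS = [
    ("the court grants the claim", "favourable"),
    ("the appeal is upheld and damages are awarded", "favourable"),
    ("the court dismisses the claim", "unfavourable"),
    ("the appeal is rejected and the claim is denied", "unfavourable"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count how often each word occurs under each outcome label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Prior: share of training rulings that carry this label.
        score = math.log(label_counts[label] / total_docs)
        # Add-one smoothing so unseen words do not zero out the score.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(text):
            score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TOY_RULINGS)
print(classify(model, "the court dismisses the appeal"))
print(classify(model, "the court grants damages"))
```

On these four toy snippets, the first query is labelled unfavourable and the second favourable. A realistic system would need thousands of annotated rulings and far more careful language processing, which is exactly why the textual nature of legal data is a challenge.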
And how can this be applied?
BC: ‘For instance, we will be able to predict the outcome of a lawsuit much better, since computers can rapidly process and analyze the results of tens of thousands of similar cases. You can imagine it will be possible to make a risk assessment before hiring an expensive lawyer. The computer could then tell you in no time: better not start this lawsuit, because the chance that you will win the case is very small.’
SvdH: ‘In addition, the use of data science methods can bring a lot to science itself. In the field of comparative law, for example: we now do everything by hand, but who knows what we may discover if we use an algorithm to analyse the data? Possibly, some completely unexpected patterns will appear.’
That all sounds great. Are there any limitations to this type of research, at this point?
SvdH: ‘What makes it a bit challenging is that lawyers still speak a very different language from computer scientists, while at the same time it is particularly important to work together. Our Research Lab is also an attempt to bridge that gap.’
BC: ‘Another difficulty is that lawyers mainly work with textual data. For computers, texts are much more complex to analyze than numbers and Excel files, which are used in many other academic disciplines. But fortunately, the field of big data analytics is quickly evolving.’
Being legal researchers, you are also concerned with the ethical component of big data, I presume?
SvdH: ‘Exactly. On the one hand, we look at the possibilities of these new research methods in the legal domain, and on the other hand, we look at big data from a legal perspective: how can it be used without undesirable side effects?’
What kind of side effects should we think of?
BC: ‘Sometimes, for instance, patterns emerge from the data that we should be very careful with. Such as links between ethnicity and crime: sometimes there is a statistical correlation, but that does not necessarily mean there is a causal link. What do we do with that kind of data? That is an important issue. Lately, for example, ethnic profiling was in the news a lot: is it desirable that the police more frequently stop people who statistically belong to a specific risk group?’
You might say that by using big data and making risk profiles, the police can work more efficiently.
BC: ‘We should be very careful with that as well. Suppose the police supervise some areas more intensively than others. It can be expected that they will find more crime in those neighborhoods where they are most present, while in other neighborhoods a lot of crime may be going on too. That kind of tunnel vision may distort the data. The problem is that a biased way of collecting data will confirm such a bias even further.’
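The self-reinforcing effect described here can be illustrated with a small, deliberately simplified simulation in Python. Everything in it is an assumption made for the sake of illustration: two districts with exactly the same true amount of crime, recorded crime that is proportional to police presence, and a hypothetical rule (`shift`) by which a fraction of patrols moves towards whichever district shows more recorded crime.

```python
def simulate_patrol_feedback(rounds=5, share_a=0.6, shift=0.1, true_crime=100.0):
    """Return district A's share of patrols at the start and after each round.

    Both districts have the same true crime level; only police presence differs.
    """
    history = [share_a]
    for _ in range(rounds):
        # Recorded crime reflects where the police look, not where crime is.
        recorded_a = share_a * true_crime
        recorded_b = (1.0 - share_a) * true_crime
        # Move a fraction of the other district's patrols to the apparent hotspot.
        if recorded_a > recorded_b:
            share_a += shift * (1.0 - share_a)
        elif recorded_b > recorded_a:
            share_a -= shift * share_a
        history.append(share_a)
    return history


for round_number, share in enumerate(simulate_patrol_feedback()):
    print(f"round {round_number}: district A gets {share:.1%} of patrols")
```

In this sketch a small initial imbalance keeps growing round after round, even though the underlying crime levels never differ. If the patrols start out evenly divided, nothing changes at all, which shows that the growth comes from the way the data is collected rather than from the districts themselves.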
Meanwhile, citizens have no idea of what kind of data is being collected about them and how these profiles are created…
BC: ‘Indeed, that lack of transparency is a problem. As a citizen, you may end up in a certain risk group and be completely unaware of it. In the meantime, it may have a big impact on your life: perhaps your insurance is more expensive than your neighbour’s, or maybe you can’t even get a mortgage because of your profile.’
SvdH: ‘Citizens are registered in hundreds of different databases. What happens with all that data? That is really a black box. At eLaw we also focus on those kinds of issues: what impact do these new ways of dealing with data have on society? And what safeguards are needed in order to protect the rights and privacy of individuals?’
Could you give an example of such a safeguard?
SvdH: ‘We must ensure, at all times, that the human dimension is maintained when making decisions based on data. That is to say, whether at a government institution or at an insurance company, there should always be a human link in the process: an employee who looks at the bigger picture and who thinks about what effects certain decisions may have. In the systems we use, there should always be space for humans to think critically, rather than just let the computer decide.’
Legal researchers seem to have an important role in these issues.
SvdH: ‘Yes. Law is often seen as an obstacle - “those lawyers again…” - but if you look at the bigger picture, you can see how important it is to think about what rules we have to build into the system in order to protect our free and open society. Big data provides great opportunities and applications, but we have to make sure that the world will not become a place where we no longer feel at home.’
Simone van der Hof is Professor of Law and Digital Technologies and head of eLaw, the Center for Law and Digital Technologies at the Faculty of Law. Among other projects, Van der Hof led the NWO project 'Empowering and protecting children and adolescents against cyberbullying'. She is also programme director of the Advanced Master Law and Digital Technologies.
Bart Custers is associate professor and head of research at eLaw. He is project leader of two EU projects: EUDECO, on reuse of data in the context of big data, and e-SIDES, on the ethical and legal implications of big data. He has published the book 'Discrimination and Privacy in the Information Society' about the effects of data mining and profiling in the context of big data.
This article is part of a series of interviews with researchers from the Leiden Centre of Data Science (LCDS). LCDS is a network of researchers from different scientific disciplines, who use innovative methods to deal with large amounts of data. Collaboration between these researchers leads to new solutions to problems in science and society.