Universiteit Leiden


‘Behaviour comes to us in big data’

Jurist Gineke Wiggers wants to predict the expected impact of legal articles. Carel Stolker, Rector of the University and, like Wiggers, a legal specialist, is enthusiastic about the research. ‘A big data project like this will help us establish the effect of our work on society.’

What is this research about?

Gineke Wiggers (GW): ‘I want to predict which publications will have the greatest impact on the work of legal specialists. I want to do this with usage data about publications from a legal search engine. At the moment, the relevance is determined using data such as “how often does the search term occur?,” “where in the text does the term appear?” and “how old is the result?” But these are textual characteristics. I’m therefore going to look at which usage characteristic relates most strongly to later citations, and how you can use this information to improve the ranking. For instance, does it relate to the absolute number of readers, in other words the height of the peak, or to the breadth: the period over which the article is read?’

Carel Stolker (CS): ‘So let’s say I use a search engine such as Navigator or Legal Intelligence and search for the “period of limitation” of Article 3:310 of the Civil Code, will the search engine automatically rank the results for me?’

GW: ‘That’s right. I’ve been given anonymised data by Legal Intelligence. Over half of the profession uses this. Legal Intelligence records which articles are read and when, and which articles are cited, where and how. This will enable me – I hope – to develop a model to improve this ranking. Then you can bring those particular articles to the attention of users at an earlier stage, which means they are quicker to find the information they need.’
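To make the idea of comparing usage characteristics concrete, here is a minimal sketch of how one might check whether peak height or breadth of readership tracks later citations more closely. It is not Wiggers’s model: the reading counts and citation numbers are invented for illustration, the function names are hypothetical, and Spearman rank correlation is simply one plausible measure of ‘relates most strongly’.

```python
from statistics import mean

def peak_height(reads):
    """Largest number of reads in any single period."""
    return max(reads)

def breadth(reads):
    """Number of periods in which the article was read at least once."""
    return sum(1 for r in reads if r > 0)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation (Pearson correlation of the ranks).

    Assumes neither input is constant; otherwise the denominator is zero.
    """
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Invented toy data: reads per month for four hypothetical articles,
# and the number of later citations each one received.
usage = {
    "A": [50, 5, 0, 0, 0, 0],
    "B": [10, 10, 9, 8, 8, 7],
    "C": [30, 20, 5, 1, 0, 0],
    "D": [4, 3, 3, 2, 2, 0],
}
citations = {"A": 1, "B": 9, "C": 4, "D": 6}

articles = sorted(usage)
cites = [citations[a] for a in articles]
peak_corr = spearman([peak_height(usage[a]) for a in articles], cites)
breadth_corr = spearman([breadth(usage[a]) for a in articles], cites)

print(f"peak height vs citations: {peak_corr:.2f}")
print(f"breadth vs citations:     {breadth_corr:.2f}")
```

In this made-up example breadth lines up with citations and peak height does not, but that outcome is built into the toy numbers; which characteristic matters in the legal domain is exactly what the research has to establish.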

That’s not very common research for a jurist...

GW: ‘I joined the Data Science Research Programme at Leiden University on 1 May. It also has an educational component, so I’m now following a course on statistical learning. I’m learning to define boundaries, choose the right model and determine its reliability. The aim is to build my own classifier, the model that determines the ranking of the publications.’



Gineke Wiggers is developing a model for a legal search engine that will predict the expected impact of publications.

What could be the effect of such a model on the discipline?

CS: ‘Enormous. One of the things that I find so appealing about law is that academic research and legal practice are so interconnected. Colleagues at our faculty are also judges pro tempore, legislators or lawyers, or they work in business. So everything that we publish is read by those who work in practice. But for those people who only work in practice, it is important to know what they can trust: what is good and what is weak scientific research. We now know that grosso modo from the reputation of journals, but that can suddenly change. Just think about the alternative forms of publications that are emerging.’

But can you judge that from the number of times that an article is read?

CS, smiles: ‘That is an important part of Gineke’s assignment. She has to ensure that the model yields reliable results.’

GW: ‘Research in other disciplines has already shown that there is a link between the use of publications and the number of citations, or what the academic verdict is on a publication. My research will establish whether that link also exists in the legal domain, with the significant interconnection between academia and practice. A subsequent question is whether that link is significant enough to improve the ranking in search results.’

CS: ‘What you’re really talking about is a kind of relative usability or valorisation of the search results. A system that helps people in academia and practice to find their way through the forest of publications.’

What could be good future applications of data science in the study of law?

CS: ‘Data science is going to help us jurists establish the influence we now have on society. We have built cathedrals of legislation over the centuries. This legislation should influence people’s behaviour: what we want them to do or not to do, or what they should do differently. But what we as jurists have never actually done is to verify this properly. They do that more in the US. This branch of law, which veers more towards the social sciences, is called socio-legal studies. What, for instance, do all those articles about neighbour law do to people who actually live next door to each other? We may now be better able to obtain this knowledge, because behaviour comes to us in big data. This makes law complete.’

Gineke Wiggers has a place on the Data Science Research Programme at the Leiden Centre of Data Science. How did this research programme arise?

CS: ‘Each year we as the Executive Board spend two days away with faculty boards and scientific directors. Two years ago, a few deans came to me during these days and said, “We need to do more with data science. If we all make a PhD place available, can the University also do that centrally for each faculty?” We sat up deep into the night thinking about how to approach this. It ended up being the Data Science Research Programme, with more than 14 research assistants already!’

What is the value of the programme to the University?

CS: ‘It’s fantastic that colleagues from all the faculties are involved. Researchers from all sorts of disciplines, like Gineke and her professors. I expect that a kind of new, let’s call it a mini research school, will arise between these researchers. They will be able to learn from each other and improve their research. But above all I hope that they will be able to sow their ideas about the importance of big data in their own faculties, by talking about it to their colleagues. This will enable us in Leiden and The Hague to join the national and international vanguard in this field, over the full breadth of our seven faculties.’

Data Science Research Programme start symposium

Within the Data Science Research Programme (DSRP) a number of researchers started a PhD programme in 2017. Each PhD candidate will learn how to use data science in his or her own discipline. The first projects of the DSRP will be presented at the starting symposium in the Academy Building on Thursday 7 September. Two Leiden Spinoza Prize winners will reflect on data science. If you are a member of the University’s academic staff, you are welcome to come and be inspired to develop data science projects in your own specific field.