Probability words: everybody interprets them differently

What exactly does it mean when your doctor says you have a ‘good chance’ of survival? Leiden researchers discovered that there is a big difference in how people interpret such probability phrases. And that can be a problem, warns lead researcher Sanne Willems in her blog post.

‘How often is often?’ is the title of the widely read Dutch blog post. And that is exactly what the study by statistician Sanne Willems was about: determining how people interpret probability and frequency words, such as often, possibly and probably. Do we actually interpret these words in the same way?

The distribution (density) of the interpretations of the probability and frequency phrases. The curves are smoothed versions of histograms, which is why they extend beyond the boundaries of 0 and 100 per cent.
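Why a smoothed histogram can spill past 0 and 100 per cent is easiest to see in a small sketch. The article does not say which smoothing method was used; the example below assumes a Gaussian kernel density estimate, and the slider answers, the bandwidth and the function names are all hypothetical, chosen only for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each answer contributes a small bell curve centred on itself.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def mass_outside(samples, bandwidth, lower=0.0, upper=100.0):
    """Share of the smoothed curve's area that lies outside [lower, upper]."""

    def normal_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    below = sum(normal_cdf((lower - s) / bandwidth) for s in samples)
    above = sum(1.0 - normal_cdf((upper - s) / bandwidth) for s in samples)
    return (below + above) / len(samples)


# Hypothetical slider answers (in per cent) for a word such as "never".
# Every answer lies between 0 and 100, as a slider would enforce.
answers = [0, 0, 0, 1, 2, 5, 10, 10]

density = gaussian_kde(answers, bandwidth=3.0)

# The smoothed curve is still positive below 0, although no answer is.
print(density(-2.0) > 0)

# Fraction of the curve's area that falls outside the 0-100 range.
print(mass_outside(answers, bandwidth=3.0))
```

Because each answer is replaced by a bell curve, answers at or near 0 push part of that curve below 0. The overshoot is a side effect of the smoothing, not a sign that anyone answered with a negative percentage.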

First large-scale study in the Netherlands

Willems: ‘Research has shown that in other languages, such as English, the interpretation of probability words varies widely. But there has never been a large-scale study of this in Dutch.’ Willems therefore joined forces with science communication professor Ionica Smeets from Leiden and statistics professor Casper Albers from Groningen.

Research with online questionnaire

For their research, the team created an online questionnaire, in which participants had to express a word in terms of probability by using a slider. ‘We presented participants with sentences, such as “The team sometimes wins the match”. Then we asked them what the probability was of the team winning the match. Our research shows that the interpretation of probability words also differs greatly in Dutch.’

Much disagreement about ‘sometimes’ 

The 881 participants do not exactly agree on the Dutch word soms, which means sometimes, as can be seen in the graph. The interpretation varies from a 10 per cent chance to even above 50 per cent, a difference of more than 40 percentage points. ‘What also stood out was that people still gave extreme words like impossible or never a 10 per cent chance. I find it interesting to think why that is. Maybe because they think: never say never, there is always a chance.’

The distribution of probability words for statisticians (red) and non-statisticians (blue). The purple surfaces show the overlap between the two groups.

Statisticians vary just as much

The team also investigated whether statisticians assess the probability words differently than non-statisticians. It turned out that there was no clear difference between these two groups. ‘That shows that we statisticians also have to be careful when we use such probability words among ourselves.’

Words are not enough

This research clearly shows that probability words are an ambiguous way of expressing probabilities. ‘It then depends on the context whether that is a problem or not. If someone misjudges the probability of you coming to a party, that doesn't have to be a big deal. But in the medical field it can be problematic: think of a doctor who communicates a survival rate or the chance of serious side effects of a treatment.’

That is why Willems calls for more awareness. ‘Words alone are not enough; the chance of miscommunication is then too big.’

Advice: learn to think statistically

But what would be a better way? ‘The scientific literature is not entirely clear about that. You can use images, or percentages to express probabilities. I think that data visualisations, such as graphs and infographics, can certainly help. But a lot can go wrong with them. That is why I am now researching misleading graphs.’

Willems suggests that there is also room for improvement on the other side. ‘I think it is also important that we pay more attention to statistical thinking in education, so that we can better prepare people to deal with probabilities. Statistics seems very complicated, but the basics are easy to explain.’

From a Dutch newspaper to Reddit

The research by Willems, Albers and Smeets is popular. More than 6000 people read the blog post on the website of the Dutch statistics association VVSOR, the online Dutch language journal Neerlandistiek reposted the blog, and it even got 2800 upvotes on the online forum Reddit. Columnist Floor Rusman from the Dutch newspaper NRC used the research to argue in favour of a mini-course on probability theory, instead of the umpteenth corona press conference. A more detailed article on the research will be published in the Dutch magazine and style guide Onze Taal at the end of the year.

Sanne Willems obtained her PhD in 2020 at the Mathematical Institute in Leiden. She recently joined the Methodology & Statistics department at the Faculty of Social and Behavioural Sciences, where she researches data scaling and the communication of statistics. She finds it important to communicate about her research: ‘Many people find statistics boring and difficult. I therefore want to show everyone that it is actually not that difficult and that you can understand many statistical concepts without mathematical details.’

Text: Bryce Benda

