More powerful data centre will accelerate research
Language evolution, targeted drugs or archaeological interpretation. Researchers are making increasing use of supercomputers that can rapidly process large quantities of data. This is one reason why the University data centre will be extended and updated. ‘Data mining means we can get a better picture of the past.’
The rapid rise of artificial intelligence in academia means an increasing demand for more powerful installations that can perform complex calculations and store huge amounts of data. That is why the Executive Board recently gave the green light to extending and updating the University data centre. The work is necessary for two reasons: after more than 15 years of service the current data centre is ready for an upgrade, and the data centre in the LMUCY building will be demolished in 2024. If all goes well, the extension and refit will be completed by the start of 2022.
About the refit
The ISSC manages the data centre. Its director, Joop van der Born, explains what will change. The two current data centres consist of 130 ‘racks’: deep cabinets containing server installations and data storage space. After the refit, the data centre will have 180 ultramodern racks with plenty of extra capacity to meet the increasing need for processing power and storage. There will also be an extra installation pavilion for the energy supply and cooling system. Another advantage is that the new data centre will be ‘as quiet as a mouse’, says Van der Born. At present, installations outside the racks hum as they cool the hot servers, but in future each rack will have its own cooling system, which will substantially reduce the racket. The data centre will also be as sustainable as possible: residual heat will be reused to heat the entire building, for instance.
Smart computer systems
Holger Hoos, Professor of Machine Learning, is one of the scientists for whom a larger and more powerful data centre is crucial. He improves the algorithms of computer systems so that the systems themselves come up with the best solution to a problem. Software can therefore improve itself, which is of enormous help in the development of efficient drugs, complex climate models or the technology for self-driving cars. Hoos consequently works with a wide range of researchers: Leiden astronomers, for example, with whom he is trying to simulate dynamic processes in the universe.
Hoos stresses that a powerful data centre is not only necessary for his own field. ‘It is important for the entire University because we will be better able to compete globally in all those projects where it is essential to process data and information quickly. This was already the case for drug development, computer science, astronomy, chemistry and physics, but increasingly also applies to the social sciences, humanities, archaeology and so on. Computing power is becoming cheaper and we can see universities around the world setting up larger data centres. We have to do the same in Leiden.’
Experiments in clean-room conditions
Some of Hoos’s experiments are performed on the regular machines at the University data centre. ‘For other experiments we need very specialised machines operating under “clean room” conditions. As my group is in close proximity to the current data centre, we were able to place these machines, which are exclusively used for our work, there. Yet other experiments require a data centre on a national or commercial scale, such as Google or Amazon.’
Shining a light on language development
Humanities researchers are also making increasing use of the data centre’s digital muscle. Leiden linguist Jelena Prokic uses large datasets to investigate how languages change. Her insights are important for the development of language theory and for an understanding of knowledge acquisition, which in turn says more about the development of societies.
Computing with ALICE
In 2019, the University expanded its infrastructure for computing and data-intensive research with the new ALICE computing cluster (Academic Leiden Interdisciplinary Computing Environment). Prokic will use ALICE in several projects, including the multidisciplinary MacBERTh project. Here a team of scientists will investigate how the meaning and use of English and Dutch words and phrases have changed over time. Deep neural methods – algorithms inspired by the structure and function of the brain – will be used to search through thousands of ancient texts. Prokic: ‘In the past decade these neural networks have made spectacular advances in artificial intelligence and we can now apply them to the humanities.’
Archaeologist Alex Brandsen also makes grateful use of ALICE, the diva of the data centre. Until recently, archaeologists could only use online search engines to quickly find information in the titles and summaries of fieldwork reports. This meant that a lot of interesting information was left out. Data mining methods enable Brandsen to sift through online fieldwork reports quickly and in great detail. ‘I use ALICE to train complex language models that are really good at automatically detecting and identifying archaeological concepts such as certain periods or finds.’ Such models are trained by ‘reading’ millions of sentences. Brandsen: ‘For a normal PC this takes a really long time, up to a month. On ALICE it takes around two to three days to train such a model.’
Better picture of the past
Brandsen then enters the archaeological concepts that have been found into an intelligent search engine, which more accurately finds the right information and often generates new insights. One example is ‘burials’ in the Netherlands in the Early Middle Ages. ‘The consensus is that in the Early Middle Ages, people in the Netherlands were mainly buried in the ground. But with our system we can see that there were also many cremations. This needs to be investigated further, but this kind of new information can give us a better, more complete picture of the past.’
Text: Linda van Putten