Leiden University Centre for Digital Humanities
Small grant projects
The LUCDH awards Small Research grants to foster the development of new digital research. We are sponsoring scholars to create databases, X-ray fragile letters, conduct text analysis, map historical information, and more.
In this project, we are developing an advanced search engine for Dutch archaeological excavation reports, called AGNES. Archives in the Netherlands currently contain over 70,000 excavation reports, and this number is growing by 4,000 a year. This so-called ‘grey literature’ contains vast amounts of knowledge stemming from millions of euros in funding, but it is barely being utilised at the moment, mainly because researchers have trouble finding the information they need. The reason for the limited access is that the documents can only be searched by their metadata; the full text is not yet indexed for keyword search. Questions like “find all cremation remains from the early Middle Ages around Amsterdam” are currently impossible to answer without reading all the documents.
A unique feature of our approach is that we develop a method for semantic search in the archaeological domain. The indexing method will use machine learning to automatically recognise and distinguish between archaeological concepts in the text, such as time periods, artefacts, materials, contexts and (animal) species. This will make searching considerably easier and quicker.
To be able to do this, we need text that has been labelled by humans, called training data. Labelling data is a very time-intensive process, and we are very happy that the LUCDH is providing us with the funds to create this training data.
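As an illustration of what human-labelled training data for concept recognition can look like, the sketch below converts annotated spans into per-token BIO tags, a common input format for sequence-labelling models. The label names (`CONTEXT`, `PERIOD`), the function `spans_to_bio` and the annotation layout are assumptions made for this example; the annotation scheme actually used for AGNES may differ.

```python
# Minimal sketch: turn human span annotations into BIO tags.
# Label names and data layout are hypothetical, not the project's scheme.

def spans_to_bio(tokens, spans):
    """Convert (start, end, label) token spans (end exclusive) into BIO tags."""
    tags = ["O"] * len(tokens)          # "O" = outside any annotated concept
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the concept
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the concept
    return tags

# Example phrase taken from the sample query in the text above.
tokens = ["cremation", "remains", "from", "the", "early", "Middle", "Ages"]
spans = [(0, 2, "CONTEXT"), (4, 7, "PERIOD")]

for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

Each token ends up with exactly one tag, so a model can be trained to predict the tag sequence for unseen report text.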
Manuscript collection WARD 16, held at The National Archives (Kew, UK), comprises 436 unopened letters and packages from the sixteenth and seventeenth centuries. Based on the addresses on these letter bundles, insofar as these are still legible, it seems each letter contains legal pleadings (the textual by-product of legal proceedings, such as bills of complaint, answers, interrogatories, and depositions) sent to the Court of Wards and Liveries and the Court of Requests in London. It remains a mystery why these documents have never been opened. Opening these manuscripts might solve this mystery, but doing so is irreversible: once opened, a letter cannot be returned to its unopened state unharmed. Since the letter as artifact is worthy of attention and analysis in and of itself, opening these closed manuscripts is to be avoided. The Apocalypto team at the Dentistry Department at Queen Mary University of London has developed a technique that makes it possible to reveal the contents of a closed manuscript without damaging or opening it. As such, the letter can be studied as an artifact while its hidden contents are revealed.
The LUCDH’s small grant makes it possible to strike up a conversation between The National Archives, the Apocalypto team, and Leiden University, with the aim of x-raying a sample of manuscripts, and as such, making the unreadable readable.
This research project aims to investigate ways of representing rivers in ‘literary worlds’ of the Italian Renaissance. The aim of the project is to develop one or more samples for the digital visualization of these worlds, in which extraliterary, ‘real world’ geography is combined with fictional, imaginative geography. The project serves as a technical preparation for the Veni proposal (to be submitted in August 2019) for a research project that aims to investigate the pivotal role of rivers in constructing cultural and literary identity in early modern Italian epic poetry.
This project aims to map the genealogical networks of the chungin (middle people) of the Chosŏn dynasty, in an attempt to examine the origin and development of the chungin group in Chosŏn Korea.
In the social hierarchy of Chosŏn Korea, the chungin were a group ranked between the yangban (aristocracy) and the commoners. The social mobility of the chungin is one of the central issues in the social history of Korea. Previous scholarship reveals that the chungin group developed intensively around the middle of the 17th century. Although there has been no conclusive research on the origin of the group, some scholars claim that the chungin branched off from downgraded yangban aristocrats. However, due to the absence of a tool enabling a macroscopic investigation, scholars have only been able to test the hypothesis through case studies of individual families.
The first goal of the project is to test the hypothesis that the chungin branched off from downgraded aristocrats. Extracting genealogical information from large datasets such as the bangmok (Rosters of Imperial Examinations) allows us to map the genealogical networks of different social groups in Chosŏn. In order to investigate the relation between the aristocracy and the middle people, the network interconnects the lineage and marriage data of both the yangban and the chungin groups.
Additionally, the project contributes to filling the gap in the digital infrastructure for chungin studies, a field which has received limited attention in Korean Studies. The final outcome of the project is designed as an interactive platform of visualized genealogical networks. Users can manipulate the networks by querying by time period, region, family or type of relationship (lineage or marriage). In this way, the project takes the initiative to benefit studies on minority social groups in the Chosŏn dynasty by opening up a new digital avenue.
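The kind of querying described above (by time, region and type of relationship) can be sketched as a filter over a list of relationship records. Everything in this sketch is an assumption made for illustration: the `Relation` record, the `query` function and the placeholder data are invented and do not reflect the project's actual data model or the contents of the bangmok.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    source: str   # person or family identifier
    target: str
    kind: str     # "lineage" or "marriage"
    year: int
    region: str

# Placeholder records, invented for illustration only.
RELATIONS = [
    Relation("person_A", "person_B", "lineage", 1640, "region_1"),
    Relation("person_B", "person_C", "marriage", 1665, "region_1"),
    Relation("person_D", "person_E", "lineage", 1702, "region_2"),
]

def query(relations, kind=None, start=None, end=None, region=None):
    """Return the relations that satisfy every filter that was supplied."""
    return [
        r for r in relations
        if (kind is None or r.kind == kind)
        and (start is None or r.year >= start)
        and (end is None or r.year <= end)
        and (region is None or r.region == region)
    ]

# Lineage links recorded in the 17th century.
for r in query(RELATIONS, kind="lineage", start=1600, end=1700):
    print(r.source, "->", r.target, r.year)
```

Leaving a filter as `None` means "do not restrict on this field", so the same function serves for narrow and broad queries.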
Carina van den Hoven
In November 2017, the Egyptian Ministry of Antiquities granted Carina van den Hoven (NINO/LIAS) the concession for Theban Tomb 45 in Egypt. This beautifully decorated tomb dates to ca. 1400 BCE and is situated in the Theban necropolis, a UNESCO World Heritage Site close to modern Luxor. Theban Tomb 45 is a fascinating case of tomb reuse. The tomb was built and partly decorated with painted scenes and texts around 1400 BCE for a man named Djehuty and his family. Several hundred years later, the tomb was reused by a man called Djehutyemheb and his family. Even though the practice of tomb reuse calls to mind images of usurpation, tomb robbery and destruction, Theban Tomb 45 was reused in a non-destructive manner, with consideration for the memory of the original tomb owner. Instead of vandalising an earlier tomb and whitewashing its walls in order to replace the original decoration with his own, the second tomb owner left most of the existing decoration in its original state. He added his own decoration only to wall sections that had been left undecorated by the first tomb owner. He also retouched a number of the original paintings. For example, he altered the garments, wigs and furniture depicted on the tomb walls, in order to update them to contemporary style.
(For photographs, see the project website: www.StichtingAEL.nl)
Dr. van den Hoven has obtained funding from the Gerda Henkel Stiftung for the conservation of the wall paintings as well as for heritage preservation and site management activities in Theban Tomb 45. With the grant from the Leiden University Centre for Digital Humanities, she undertakes a pilot project in order to develop a proof of concept on the digital documentation and material analysis of the painted decoration of this tomb, with a special focus on the repaintings that were carried out by the second tomb owner.
An important aim of the fieldwork project in Theban Tomb 45 is to contribute significantly to the development of new standards for the documentation, analysis, publication and accessibility of ancient material culture. With the help of non-invasive digital imaging technologies we aim to create a scale 1:1 digital record of the tomb, which documents not only its architecture and decoration, but which also functions as a digital tool to interactively investigate the monument even after leaving the field.
Another important aim of the project is to test the usefulness of photo processing software applications and digital imaging technologies in the material analysis of ancient wall paintings. Part of the painted decoration of Theban Tomb 45 has faded over time and is now barely visible to the naked eye. In rock art studies, the software application ImageJ (and in particular its plugin DStretch) has been used successfully to enhance images of pictographs that are invisible to the naked eye. We will test whether such tools can likewise enhance the legibility of faded wall paintings and help identify and analyse wall paintings that were retouched or repainted in antiquity.
The investigation of the materiality of the (re)painted tomb decoration of Theban Tomb 45 allows us to open up new research opportunities on the question of how the ancient Egyptians engaged with and perceived their own cultural heritage.
The Leiden Williram, an eleventh-century text that is decades older than the hebban olla vogala poem and the sole substantial Old Dutch text that has come down to us in an original manuscript, has thus far only been accessible to a handful of scholars. Although the text can be consulted in the 1971 edition of Sanders or via the PDF version of the digitized manuscript on the Leiden University library website (manuscript BPL 130), the linguistic character of the Leiden Williram and its philological context require specialized philological expertise in order to gauge and appreciate its Dutch nature. This is because the Leiden Williram is a Dutch copy of a German commentary on the Song of Songs, authored by Williram von Ebersberg. The Dutch copyist, probably working in the scriptorium of Egmond, respected most of the German grapho-phonemic conventions, but adapted the language at the morphological and lexical level to his own Dutch dialect, thereby giving the Leiden Williram a hybrid linguistic identity (cf. Sanders 1974). Given this linguistic opaqueness, it may come as no surprise that the significance of the Leiden Williram for Dutch cultural heritage is largely unknown to a wider Dutch public.
The aim is to develop a proof of concept for a digital edition of the Leiden Williram in which the Old Dutch lexis and Old Dutch morphology of the text can be highlighted at the click of a button, thereby disclosing the value of the text to a 21st-century audience and making the Old Dutch nature of the text available to scholars outside linguistics and an interested lay public alike. This innovative digital curation of the text will be executed by providing a TEI-compliant (Text Encoding Initiative) XML edition, based on the 1971 edition and the digitized manuscript. The text edition can then be published online on a dedicated website, hosted on Leiden University servers under a Creative Commons license.
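One way such highlighting could work is sketched below: words in a TEI fragment carry an `ana` attribute pointing to a category, and a small function marks the words belonging to the category a reader has switched on. The category identifiers (`#old-dutch-lexis`, `#old-dutch-morphology`), the `render` function and the placeholder tokens are assumptions for this example; they are not the project's actual encoding, and no real text from the manuscript is reproduced.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Placeholder fragment: <ab> and <w> are TEI elements and "ana" is a TEI
# attribute, but the category identifiers and tokens are invented here.
SAMPLE = """<ab xmlns="http://www.tei-c.org/ns/1.0">
  <w ana="#old-dutch-lexis">token1</w>
  <w>token2</w>
  <w ana="#old-dutch-morphology">token3</w>
</ab>"""

def render(xml_text, highlight):
    """Return the words as plain text, bracketing those tagged with `highlight`."""
    root = ET.fromstring(xml_text)
    words = []
    for w in root.iter(f"{{{TEI_NS}}}w"):
        categories = (w.get("ana") or "").split()   # "ana" may list several
        text = w.text or ""
        words.append(f"[{text}]" if highlight in categories else text)
    return " ".join(words)

print(render(SAMPLE, "#old-dutch-lexis"))
print(render(SAMPLE, "#old-dutch-morphology"))
```

In a web edition the brackets would be replaced by a visual style that a button toggles on and off; the encoding itself stays the same.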
Thesauri are lexicographic resources that organize words and word senses. One of their main uses is to look up available alternative phrasings, but thesauri offer a number of uses beyond that: they are veritable treasure troves for cultural, linguistic, anthropological, and literary-critical research. This is especially the case when such thesauri are arranged topically, that is, with all their groups of synonymous words ordered according to meaning. The current forms in which the majority of these thesauri are available make it difficult for scholars to use them to the fullest. By making existing historical thesauri available in a form more suitable for use and reuse, bringing them to the Semantic Web as Linked Data, this project aims to facilitate a wide variety of research that focusses on the vocabulary and culture of current and past times.
The proof of concept this project intends to deliver comes in the form of a web-based platform for historical thesauri that have been expressed in a Semantic Web form. With it, Stolk aims to demonstrate the added value of such a form for these thesauri. This novel platform should make it easier for users to access these thesauri, query them, filter them, and expand on them; one need not be a computer scientist to use it. The platform will have a user interface designed specifically for thesauri, able to display and browse them by recognizing the Linked Data terminology used for lexical concepts and the lexical senses attributed to them. Furthermore, the platform needs to offer an intuitive way to formulate new queries based on historical thesauri content.
Which lexical items were available to Shakespeare? Which ones seem to have been restricted to poetry (possibly because they were considered archaic), and do such restrictions appear to have had an impact on the word choices Shakespeare made for his plays? Which words did the poet and playwright prefer over others that expressed the same notion? The added value of both the visualisation and querying mechanisms of the platform will be demonstrated by answering just such questions.
The use of numismatic sources is incorporated in Claes’s research project “Dialogues of Power”. This project aims to analyse the legitimising dialogue between Roman emperors and their Germanic legions during the so-called “crisis of the third century”. By doing so, this project will shed new light on how the loyalty of soldiers towards their emperors was established and which communication strategies emperors used to (re-)gain the political and military stability that ultimately helped to reunify the Roman Empire after the crisis.
Because imperial coinage is a direct vehicle for imperial communication, it could disseminate various messages, both visual and verbal. In the third century, silver coins still formed the major part of soldiers’ payments. This means that the imperial centre had an excellent tool to send messages addressing the military. For instance, messages of victories or trust in the army on these silver coins could flatter soldiers in order to win their loyalty.
The analysis of coin messages will be based on coin hoards deposited between AD 180 and 285 in and near the military zone of the Lower Rhine limes, especially at the military camps in Germania Inferior. Although hoards often represent a random sample taken from circulation, hoard-based samples represent an untouched record from antiquity, providing information about the period in which the coins were used and deposited. Moreover, coins from hoards give us information that single coin finds do not. The date of the youngest coin in a hoard gives us a terminus post quem for the date at which coins stopped being added to the hoard, and the composition of a hoard can give us an idea of the date of deposition of the coins, which may differ from the closing date. Most coins composing a hoard were probably withdrawn from circulation within a short space of time before the date at which the saving process ended. All in all, hoards can give us an idea of when certain coin messages were disseminated. Additionally, hoards are found abundantly at the Empire’s northern border, and more importantly, the coins’ withdrawal from circulation was not related to the messages on them.
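The terminus post quem reasoning can be made concrete with a small sketch: a hoard cannot have been closed before its youngest coin was struck, so the closing date is no earlier than the latest of the coins' earliest possible minting years. The function name `terminus_post_quem` and the coin dates below are assumptions invented for illustration and are not taken from any real hoard.

```python
# Hypothetical hoard: (earliest, latest) possible minting year (AD) per coin.
# These values are invented for illustration, not taken from a real hoard.
HOARD = [(193, 211), (222, 235), (238, 244), (260, 268)]

def terminus_post_quem(coins):
    """Earliest year in which the hoard can have been closed.

    The hoard cannot have been closed before its youngest coin was struck,
    so we take the latest of the coins' earliest possible minting years.
    """
    if not coins:
        raise ValueError("an empty hoard has no terminus post quem")
    return max(earliest for earliest, _latest in coins)

print(terminus_post_quem(HOARD))
```

This yields only a lower bound for the closing date; as noted above, the deposition itself may have taken place later.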
I plan to develop a proof of concept for the long-run effect of government policy on ethnic-based inequalities. My specific research question is: Does Dutch colonial redistributive policy affect ethnic-based economic inequalities in contemporary Indonesia? My proposed research shares a micro-level approach with standard analyses of inequality but differs from the existing literature in the following ways. First, it takes a long-run perspective, going back to the days of the Dutch colonial administration and, if feasible, even further into the past. Second, it explicitly estimates the causal effects of redistributive policy on inequalities between ethnic groups. Third, the focus is not only on the one-time effect but also on its persistence and durability. Fourth, it will generate and use a new, digitized dataset built to modern standards from old administrative records, in addition to present-day socio-economic and geographic information system (GIS) data, for the analysis.
In this proposed proof-of-concept development stage, I ask the following simple questions: Are household-level and/or village-level colonial administrative data from the late 19th and early 20th centuries available and accessible? If so, is it feasible to identify and reconstruct the variables of interest using text-mining and GIS procedures? Is it also possible to identify relevant colonial redistributive policies and their actual implementation at household or village level? In other words, this project is akin to a feasibility study to assess whether the available data permit digital-humanities pre-processing and subsequent statistical causal inference at a given unit of analysis (household, village or sub-district level). If the result is positive, I plan to scale up the project to large-scale text-mining and GIS exercises that produce high-quality datasets, which I and other interested researchers can use to study the persistent effects of colonial policies on ethnic inequalities and other socio-economic indicators.
Nicole van Os and Deniz Tat
This project aims at the digitization and preparation for further analysis of approximately 550 letters written between 1954 and 1974 by a non-Muslim, non-Turkish single woman living in Istanbul to her niece in the Netherlands. These letters, which are written in a mixture of languages including French, English, Turkish and some Greek and Italian, are of interest to scholars of two separate disciplines: social historians and linguists. They are of interest to social historians because they provide a unique insight into the multilingual environment of the Istanbulites belonging to minority groups during a period of political turmoil which made the lives of members of these groups much harder. They also provide us with detailed information about the (social) life of a single woman, working as a teacher, in a period in which the position of women changed.
The letters are also of interest to linguists since the frequent code-switches between typologically distinct languages have structural, psycholinguistic and sociolinguistic implications. Frequent classical, or intra-sentential, switches in these letters provide us with an opportunity to understand how a multilingual speaker exploits her ability to alternate between, for example, Turkish, a scrambling SOV language with no grammatical gender, and English, an SVO language with relatively strict word order, or French, a language with grammatical gender. In this respect, it is a unique opportunity to shed light on how word order, information structure, structural and inherent case, and phi features (e.g. number, person and gender) are determined in code-switching.
After the initial digitization of the handwritten letters, we will investigate the possibility of automatically detecting socio-historically relevant items in the corpus on the one hand and code-switching on the other. Moreover, we plan to develop a way to detect patterns by cross-referencing these two sets of data to determine how code-switching relates to the different domains of life referred to in these letters.
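Assuming each token of a transcribed letter has already been given a language tag (by hand or by a language-identification tool), detecting code-switches reduces to finding the positions where the tag changes. The sketch below shows that step only. The function `switch_points`, the tag names and the example sentence are invented for illustration and do not come from the letters.

```python
def switch_points(tagged_tokens):
    """Return (index, from_lang, to_lang) wherever the language tag changes.

    Tokens tagged None (e.g. names, punctuation) are skipped so that they
    neither trigger nor hide a switch.
    """
    switches = []
    previous = None
    for index, (_token, lang) in enumerate(tagged_tokens):
        if lang is None:
            continue
        if previous is not None and lang != previous:
            switches.append((index, previous, lang))
        previous = lang
    return switches

# Invented example: an English sentence with one Turkish word ("ekmek", bread).
sentence = [("I", "en"), ("bought", "en"), ("ekmek", "tr"),
            (",", None), ("today", "en")]

for index, src, dst in switch_points(sentence):
    print(f"switch at token {index}: {src} -> {dst}")
```

The resulting switch positions could then be cross-referenced with topic or domain annotations of the same tokens.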
The National Archives at Kew, London, have made available online the slave registers that the British colonial government mandated in Sri Lanka/Ceylon between 1818 and 1832. The same database exists for 16 other British slave colonies, including Barbados, Jamaica, and Trinidad. In the case of Sri Lanka, this data has never been collected or analyzed in a systematic way owing to the large number of entries. The proposed research entails, in the first instance, transcribing the 828 images in the slave registers on Jaffna (http://search.ancestry.com/search/db.aspx?dbid=1129) onto Excel sheets and creating an easily consultable database. With this research I hope to create a database of over 10,000 slaves that will contain such information as the name and gender of the owner, the place of residence, the name of the slave (slaves in Jaffna, unlike slaves brought from outside the island, kept their names), the gender, age and children of the slave, the date of manumission, and other details (death, for example).
This data, once collected, will be a unique source with which to reconstruct society in Jaffna, add complexity to our understanding of the structure of the caste system, and produce insights into land ownership and labour patterns according to produce (tobacco, palmyrah, etc.). It will also allow us on occasion to trace the lives of particular individuals, slaves or proprietors, in the registers whom I have encountered in my other archival source material, court cases or petitions. This project will feed into my ongoing book project entitled ‘Slave in a Palanquin. Slavery and Resistance in an Indian Ocean Island’.
This project investigates the role of industrial infrastructure in Northeast China in shaping the events and outcome of the Chinese Civil War (1946-1950). It asks:
- How did the Chinese Nationalist Army lose the Chinese Civil War (1946-1950)?
- Did variation in industrial infrastructure, through its effect on military campaigns, cause the Chinese Nationalist Army to lose the Chinese Civil War?
In Northeast China, the industrial infrastructure (transportation network, electric grid, mining and manufacturing facilities, and urban built environment) was developed under Japanese control (1931-1945). After the fall of the Japanese empire in August 1945, the regional industrial infrastructure attracted the attention of postwar powers. The Soviet Union and the United States competed with each other to take over the region from Japan. The Soviet Red Army, which won the competition, removed large portions of the industrial infrastructure as it retreated in March 1946. The violent rivalry between the Chinese Communists and the Chinese Nationalists erupted here as they sought to seize control over the region. In spite of its overwhelming advantages in troop size and firepower, the Chinese Nationalist Army unexpectedly lost to the Chinese Communists in 1948.
Historians generally agree that a correlation exists between the region’s industrial infrastructure and the Chinese Civil War, but rarely agree on the degree or nature of this correlation. This project seeks to illuminate the correlation between two variables: the damaged industrial infrastructure and the logistical needs of military campaigns. It uses GIS to map the spatial distribution of industrial infrastructure and military campaigns, and to analyze the spatial correlation between them. In doing so, it aims to obtain a nuanced picture of the material conditions that shaped the unfolding of this military event.
Carmen Parafita Couto
It is well known that bi-/multilinguals in some bi-/multilingual communities combine languages in the same sentence when they communicate with one another, a practice known as code-switching. Most researchers agree that code-switching is not a random mix of languages, but that it is rule-governed. Recent work by members of our research team using different language pairs has established that, in general, multilingual speakers choose the morphosyntactic structure (word order and grammatical morphemes, making up the ‘matrix language’) of just one of their languages and insert words or phrases from their other language(s) into the selected frame (Parafita Couto et al. 2014; Parafita Couto & Gullberg, 2017). Although this pattern has been established in different bilingual communities, to our knowledge no previous study has examined in detail the factors which determine the choice of the matrix language.
In this pilot study, we investigate the extent to which this choice is based on speaker characteristics as compared with community norms, while holding the language pair constant. As the first stage of a larger project involving cross-community study of comparable communities in other geographical locations, we focus on two Spanish/English bilingual communities, Miami (USA) and Gibraltar (Europe). The contexts of Miami and Gibraltar provide a clear contrast both in terms of geographical location and history. Data from both locations has already been collected in the form of natural speech conversation, and the Miami data has already been transcribed and coded (http://bangortalk.org.uk/). We will transcribe and code the Gibraltar data in a similar way to the Miami data in order to facilitate comparison between the two datasets. This will enable us to identify what the code-switching patterns in the two communities have in common and how they are different.
By focussing on only Spanish/English in different communities, we will be able to show to what extent the occurrence of a particular switching strategy may be traced to the influence of syntax/grammar and extralinguistic factors (e.g. social network, attitudes, etc.).