Universiteit Leiden

Research project

Tracking the Tocharians from Europe to China: a linguistic reconstruction

This project intends to provide an integrated linguistic assessment of the hypothesised migration route of the Tocharians.

2016–2022
Michaël Peyrot

Tocharian is known through manuscripts from Northwest China dating from 500–1000 AD. It is an Indo-European language, related to Latin, Greek, English and Dutch, among others. The speakers of Tocharian must therefore have made a long trek from Europe to China. On the basis of contacts with other languages we aim to establish their migration route.

Tocharian is an extinct branch of the Indo-European language family, which includes English, Latin, Greek, Persian and Sanskrit, among others. Tocharian was discovered in manuscripts from the Tarim Basin in Northwest China, which date from ca. 500–1000 CE. The Indo-European languages stretch from Ireland to the Bay of Bengal in one uninterrupted belt, but Tocharian, found in an isolated region of China, is a notorious exception to this geographic distribution. This project intends to find out how Tocharian ended up there.

The common ancestor of the Indo-European languages, Proto-Indo-European, can be hypothetically reconstructed and is often located in the east of present-day Ukraine. Consequently, speakers of Tocharian must have made a long trek eastward before they settled in the Tarim Basin. Archaeological and genetic evidence suggests that they first moved east to southern Siberia in the 3rd millennium BCE and then south to the Tarim Basin, where they may have arrived as early as the 2nd millennium BCE. The arrival of the Tocharians in the Tarim Basin is possibly linked to ancient corpses found there: the so-called Tarim Mummies.

This project intends to provide an integrated linguistic assessment of the hypothesised migration route of the Tocharians. If the Tocharians had already moved east to southern Siberia in the 3rd millennium BCE, this requires an early departure from the Proto-Indo-European homeland. We will clarify the starting point of the migration of the Tocharians through a systematic analysis of the affinities of Tocharian with the other Indo-European languages.

Languages preserve precious information about their prehistory through the effects of language contact. Through close scrutiny and periodisation of the different layers of contact between Tocharian and its prehistoric neighbours, we aim to reconstruct the migration route of the Tocharians from the Proto-Indo-European homeland all the way to China.

Meier, Kristin & Michaël Peyrot. 2017. “The word for ‘honey’ in Chinese, Tocharian and Sino-Vietnamese”. Zeitschrift der Deutschen Morgenländischen Gesellschaft 167: 7–22.
Damgaard, Peter de Barros et al. 2018. “The first horse herders and the impact of early Bronze Age steppe expansions into Asia”. Science 360(6396): eaar7711.
Kroonen, Guus J., Gojko Barjamovic & Michaël Peyrot. 2018. “Linguistic supplement to Damgaard et al. 2018: Early Indo-European languages, Anatolian, Tocharian and Indo-Iranian”. Zenodo.
Damgaard, Peter de Barros et al. 2018. “137 ancient human genomes from across the Eurasian steppes”. Nature 557: 369–374.
Peyrot, Michaël. 2018. “Linguistic history of the steppe”. Section 2 (pp. 10–14) of the Supplementary Materials to “137 ancient human genomes from across the Eurasian steppes”. Nature 557: 369–374.
Peyrot, Michaël. 2018. “Interrogative stems in Hittite and Tocharian”. Indogermanische Forschungen 123: 65–90.
Peyrot, Michaël. 2018. “Tocharian B etswe ‘mule’ and Eastern East Iranian”. In: Farnah. Indo-Iranian and Indo-European Studies in Honor of Sasha Lubotsky. Ann Arbor: Beech Stave, 270–283.
Peyrot, Michaël. 2018. “Tocharian agricultural terminology: Between inheritance and language contact”. In: Guus Kroonen et al. (eds.), Talking Neolithic: Proceedings of the workshop on Indo-European origins held at the Max Planck Institute for Evolutionary Anthropology, Leipzig, December 2–3, 2013. Washington DC, 242–277.
Peyrot, Michaël. 2018. “On the East Iranian genitive plural ending”. Indo-Iranian Journal 61: 118–130.
Dragoni, Federico. 2018. “The Oldest Attested Pāzand in the Bundahišn Text of the Munich Manuscript M51: An Orthographic and Phonological Analysis”. Studia Iranica 47: 165–199.
Peyrot, Michaël. 2019. “Indo-Uralic, Indo-Anatolian, Indo-Tocharian”. In: Alwin Kloekhorst et al. (eds.), The precursors of Proto-Indo-European. The Indo-Anatolian and Indo-Uralic hypotheses. Leiden, 186–202.
Bernard, Chams B. 2019. “On the etymology of Persian yak ‘one’”. Wékwos 4: 41–55.
Dragoni, Federico, Niels Schoubben & Michaël Peyrot. 2020. “The Formal Kharoṣṭhī script from the Northern Tarim Basin in Northwest China may write an Iranian language”. Acta Orientalia Academiae Scientiarum Hungaricae 73: 335–373.
Dragoni, Federico. 2020. “The Tumshuqese Year of the Goat and the Fremdzeichen x₆”. Journal Asiatique 308: 215–223.
Bernard, Chams B. 2020. “A newly discovered Persian variety: the case of ‘Zoroastrian Persian’”. Orientalia Suecana 69: 57–67.
Bernard, Chams B. 2020. “Some plant and animal names in Gavruni”. In: Romain Garnier (ed.), Loanwords and Substrata. Innsbruck, 27–61.

Tocharian has been in contact with various Iranian languages. Some of these are known, such as Bactrian, Sogdian and Khotanese, while others are not directly attested, so that Tocharian is an important, if not the only, source for these varieties. While the Middle Iranian languages Bactrian and Sogdian appear to have had a clear but smaller impact on Tocharian, an otherwise unattested Old Iranian variety exerted a more profound influence. This Old Iranian variety seems to have been one of the Iranian steppe dialects that largely make up “Scythian”.

In this subproject, we examine the contacts between Tocharian and Old and Middle Iranian, with a special focus on the otherwise unattested Old Iranian variety. A major goal is to establish criteria that reliably distinguish between the different Iranian sources.

In the Tarim Basin, the oldest known linguistic neighbours of Tocharian are Khotanese and the closely related but only fragmentarily known Tumshuqese. Surprisingly, only a few instances of linguistic contact between Tocharian and Khotanese have been proposed, and many of these have not been widely accepted.

In this subproject, we examine the contacts between Khotanese and Tocharian, and investigate selected topics of Khotanese historical grammar.

Although Chinese influence was not always strong in the Tarim Basin, the stronger Chinese presence, especially in the Hàn and Táng periods, has clearly left its traces in the Tocharian language. Because of the uncertainties in the phonetic interpretation of older Chinese and the marked typological difference between Tocharian and Chinese phonology, contacts between Tocharian and Chinese are difficult to establish and to assess.

In this subproject, we examine contacts between Tocharian and older Chinese.

Some vocabulary shared by the Indo-Iranian languages cannot be reconstructed for Proto-Indo-Iranian because of irregular phonological correspondences and morphology. Alexander Lubotsky has attributed this vocabulary to an otherwise lost language of the ancient Bactria-Margiana Archaeological Complex (BMAC) in present-day Turkmenistan and Afghanistan. In some cases, Tocharian seems to have preserved lexical items from this prehistoric source as well.

In this subproject, we examine contacts between Tocharian and the prehistoric BMAC language.

The Tocharian Trek

We originally planned to investigate a number of further relevant topics within this project, but these are now part of the related project The Tocharian Trek. In this ERC project, Principal Investigator Michaël Peyrot and collaborators Louise Friis, Stefan Norbruis, Niels Schoubben and Abel Warries investigate, among other topics, the phylogenetic position of Tocharian and contacts between Tocharian and Uralic, Turkic and Niya Prakrit.
