Welcome to the support page for
Language trees with sampled ancestors support
a hybrid model for the origin of Indo-European languages
Heggarty et al. (2023) in Science
doi: 10.1126/science.abg0818
Published 28 July 2023. BibTeX file here.
Full list of all 33 authors and personal webpages.
1. Download the Article Free
To download the article free, please go to the IE-CoR database and use the permitted link at the start of the home page. Thanks!
The full article is 10 pages, available only online as a PDF, and includes at the start the 1-page summary from the print edition of Science.
Please also explore iecor.clld.org, our language database, and our Supplementary Materials, where we directly address the many controversies about Indo-European origins and phylogenetic analyses.
2. Media Interest and Popular Science Articles
This paper has been reported on in Science news, New Scientist, El País (in Spanish and in English), Frankfurter Allgemeine, the Telegraph of India, the Globe & Mail (Canada), Gazeta Wyborcza (Poland) and the History First blog, among others.
For a fuller list of coverage, including on social media (Twitter, Facebook, etc.), see Altmetric.
Science correspondents have the challenging task of covering a vast range of different sciences, and of explaining complex issues as straightforwardly as possible for a non-specialist public. Specialists might not always agree with how press stories express everything, even when ‘citing’.
3. Supplementary Material
Two supplementary text files are freely available at Science, as pdfs:
We also make available the full files of our language data, and our phylogenetic analyses, as detailed in 6 and 7 below.
4. Explore the IE‑CoR Language Database Online
On IE‑CoR you can view the data for one or more individual languages, or individual meanings, or explore all or any one of the 5013 different cognate sets (i.e. sets of related words that go back to a single ancestral form). Please explore iecor.clld.org, by far the best and fastest way to understand the language data that we analysed and that underlie our phylogenetic results. IE‑CoR is our database of Indo-European Cognate Relationships in ‘core’ vocabulary: 170 reference meanings in which words are generally more stable than in most meanings.
The screenshot below shows (by colours) the patterning of shared word roots in the primary term for the sample meaning FIRE across the 109 modern languages (round dots) and 52 ancient or historical languages (diamonds) in our database.
Note at the bottom of the screenshot the link to the IE‑CoR Definition, tailored for each meaning. To avoid a major methodological problem with past cognate databases (see Heggarty (2021)), IE‑CoR follows its own precise definition of a specific target sense for each meaning (here, of the word ‘fire’ in English, the metalanguage used to label its reference meanings).
For each meaning there is a map of how different cognate sets pattern across the various major branches of the Indo-European family. For an instant idea, compare these illustrative examples:
5. Tips on Using, Citing and Linking to the IE‑CoR Database
IE‑CoR permits a simple search syntax to create custom URLs that go directly to one or more filtered subsets of the data in any given table or view.
For example, to compare any two languages, such as Early Vedic and Younger Avestan, the URL is:
https://iecor.clld.org/values?sSearch_3=Avestan:+Younger,Vedic:+Early
This is equivalent to selecting the languages manually from the drop-down box under the languages column on the lexemes page. Similar custom URLs can be created for other subsets of the data, on other pages too.
The syntax for such URLs is:
Hence https://iecor.clld.org/values?sSearch_3=Avestan:+Younger,Vedic:+Early
As further illustrative examples:
It is also possible to show custom combinations of meanings and languages together, i.e. words from specified languages only, in one or more specified meanings. This can be done only from the Lexemes page (i.e. /values), by adding search terms to filter both the languages column [ &sSearch_3= ] and the meanings column [ &sSearch_1= ].
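As a minimal sketch of how such URLs could be assembled programmatically, the Python helper below joins names with commas, encodes spaces as +, and attaches them to the sSearch_3 (languages) and sSearch_1 (meanings) parameters of the Lexemes page. The function name lexemes_url is hypothetical, and both the comma-separated form for meanings and the label fire are assumptions about how the meanings column is filtered. Only the two-language URL is taken from the example given above.

```python
from urllib.parse import quote_plus

# Lexemes page of IE-CoR, as used in the example URL above.
BASE_URL = "https://iecor.clld.org/values"


def lexemes_url(languages=(), meanings=()):
    """Build a filtered Lexemes URL (hypothetical helper, not part of IE-CoR).

    languages: names as shown in the languages column, e.g. "Vedic: Early".
    meanings:  names for the meanings column (exact labels are an assumption).
    """
    params = []
    if languages:
        # Keep ':' and ',' literal and turn spaces into '+', as in the example URL.
        params.append("sSearch_3=" + quote_plus(",".join(languages), safe=":,"))
    if meanings:
        params.append("sSearch_1=" + quote_plus(",".join(meanings), safe=":,"))
    return BASE_URL + ("?" + "&".join(params) if params else "")


print(lexemes_url(["Avestan: Younger", "Vedic: Early"]))
# prints https://iecor.clld.org/values?sSearch_3=Avestan:+Younger,Vedic:+Early
```

Passing a second list, for instance lexemes_url(["Avestan: Younger", "Vedic: Early"], ["fire"]), appends &sSearch_1=fire to the same URL. Whether that label matches the database's own name for the meaning should be checked on the Lexemes page itself.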
Here is the syntax for some illustrative examples for a mixed selection of two Germanic languages, five Romance, and Ancient Greek, for the single meaning FIRE, and then for a specified set of six meanings:
Attentive readers may have noticed in reference 37 of our paper that IE‑CoR permits a simple search syntax to create custom URLs that go directly to one or more filtered subsets of the data in any given table or view.
For computational processing rather than browsing, you can also download the raw data tables that underlie IE‑CoR.
To explore or reproduce our phylogenetic analyses, follow the:
All input data, analysis and results files can be downloaded together as zips from
https://doi.org/10.5281/zenodo.8147476.
Alternatively, they can be browsed and downloaded individually from
https://share.eva.mpg.de/index.php/s/E4Am2bbBA3qLngC.
The Guide also gives links to where all software used can be freely downloaded.
Much of the background to phylogenetic analyses of Indo-European languages and to ancient DNA, and the contentious issues in both, is covered in Sections 1 and 2 of the Supplementary Materials.
Further background will be provided here soon …
Indo-European origins, and the application of Bayesian phylogenetic methods to cognate data from language families, are both highly contentious, with a whole series of controversies in each. Likewise, there is much debate and some controversy in interpreting the ancient DNA record with respect to various branches of Indo-European: which past population movements inferred from ancient DNA might be the best candidates for explaining the spreads of which branches of Indo-European?
Necessarily we take up these issues in our paper, and especially in the supplement where much more space is available to discuss them. We refer here repeatedly to the corresponding sections in the supplement, and to sensitivity analyses (SAs) that we performed specifically to investigate them. For references cited here, for now please see the references section of the supplementary analysis (direct links will be provided here soon).
On the Indo-European question, controversies surround arguments in favour of the Steppe hypothesis, each of which has however been challenged. Among these are ...
On Bayesian phylogenetic methods, controversies surround in particular ...
On which ancestries in ancient DNA may correspond to which branches of Indo-European languages, there have also been conflicting proposals and arguments, especially on the following issues.
Coming soon …
Coming soon: links to personal homepages …
Paul Heggarty
Cormac Anderson
Matthew Scarborough
Benedict King
Remco Bouckaert
Lechosław Jocz
Martin Joachim Kümmel
Thomas Jügel
Britta Irslinger
Roland Pooth
Henrik Liljegren
Richard F. Strand
Geoffrey Haig
Martin Macák
Ronald I. Kim
Erik Anonby
Tijmen Pronk
Oleg Belyaev
Tonya Kim Dewey-Findell
Matthew Boutilier
Cassandra Freiberg
Robert Tegethoff
Matilde Serangeli
Nikos Liosis
Krzysztof Stroński
Kim Schulte
Ganesh Kumar Gupta
Wolfgang Haak
Johannes Krause
Quentin D. Atkinson
Simon J. Greenhill
Denise Kühnert
Russell D. Gray
5b. III Meeting of the Network for the Study of Andean Languages
Former Dept of Linguistics, Max Planck Institute for Evolutionary Anthropology, Leipzig
30th November - 1st December 2011
main conveners and organisers: Paul Heggarty
main funding: former Dept of Linguistics at MPI‑EVA
6. The Incas and Their Origins
Former Dept of Linguistics, Max Planck Institute for Evolutionary Anthropology, Leipzig
13th-15th June 2014
main conveners and organisers: Paul Heggarty — David Beresford-Jones — Adrian Pearce
main funding: former Dept of Linguistics at MPI‑EVA
7. Rethinking the Andes-Amazonia Divide
Former Dept of Linguistics, Max Planck Institute for Evolutionary Anthropology, Leipzig
16th-17th June 2014
main conveners and organisers: Paul Heggarty — David Beresford-Jones — Adrian Pearce
main funding: former Dept of Linguistics at MPI‑EVA
8. A Cross-Disciplinary Prehistory of Andean Civilisation
Dept of Linguistic and Cultural Evolution, former Max Planck Institute for the Science of Human History, Jena
12th-14th December 2015
main conveners and organisers: Paul Heggarty — Chiara Barbieri — David Beresford-Jones — Adrian Pearce
9. Populating the Diverse Landscapes of South America
Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
18th-21st October 2017
main conveners and organisers: Fabrício Santos — Paul Heggarty
main funding: CAPES (Brazilian Education Ministry) — DLCE at the former MPI‑SHH