
The Mis-Mapping of the Indo-European Homeland

Submitted on October 25, 2012, 3:38 am
As mentioned in the previous GeoCurrents post, the animated map that accompanies the Science article of Bouckaert et al. contains numerous errors in the location and timing of language differentiation “events”. While such mistakes are problematic in and of themselves, they also lead to the incorrect mapping of the ultimate Indo-European differentiation event, that is, the highest-order split of Proto-Indo-European into its first branches. Locating this initial division, in turn, effectively specifies the location of the Indo-European homeland, the key issue that the article purports to resolve. In other words, mis-mapping permeates the whole project, from the incorrectly mapped individual languages (and in some cases dialects) to the incorrectly mapped intermediate proto-languages to the incorrectly mapped ultimate proto-language, Proto-Indo-European. This post focuses on two additional problems that skew the analysis toward locating the Indo-European urheimat in Anatolia as opposed to the steppe zone.

As we have seen, Bouckaert et al. assume a diffusion-only model of language spread, without considering non-random migration routes that are best modeled as advection. However, since they fine-tune their model by assuming some non-uniformity in the diffusion process (e.g. by distinguishing “land” from “water”), they avoid locating the origin of population spread in the geographical center of the selected sample of Indo-European tongues, as the authors stress in the Supplementary Materials. As they point out, the geographical centroid of the sampled languages, marked on their maps by a green star (p. 958), is located roughly in the northern Crimean peninsula, at the edge of the steppe zone. Bouckaert et al. furthermore avoid another potential simplification: selecting the center of mass of the language sample as the Indo-European homeland. Because they oversample European languages relative to those of Asia, their center of mass would most likely fall somewhere around the Balkans, not Anatolia. If anything, this simplistic approach to locating the urheimat would generate “evidence” in support of the Balkan-Carpathian theory of Indo-European origins, proposed by the eminent Russian linguist and historian Igor D’iakonov in his seminal 1985 paper “On the Original Home of the Speakers of Indo-European” and supported most recently by Alexei Kassian of the Russian Academy of Sciences.
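The effect of oversampling on a center of mass can be seen in a toy calculation. The sketch below is only an illustration under loud assumptions: the coordinates are invented, not taken from Bouckaert et al.'s sample; longitude and latitude are treated as flat plane coordinates; and the "geographical center" is approximated crudely as the midpoint of the bounding box. The function names are hypothetical.

```python
from statistics import mean

# Invented (longitude, latitude) points, for illustration only.
# Six "European" points versus two "Asian" points mimics an oversampled west.
european = [(0.0, 50.0), (5.0, 50.0), (10.0, 50.0),
            (15.0, 50.0), (20.0, 50.0), (25.0, 50.0)]
asian = [(60.0, 30.0), (80.0, 30.0)]
sample = european + asian


def bounding_box_midpoint(points):
    """Midpoint of the area covered; ignores how many points fall where."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return ((min(lons) + max(lons)) / 2, (min(lats) + max(lats)) / 2)


def center_of_mass(points):
    """Plain average of the points; every sampled language counts equally."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return (mean(lons), mean(lats))


print(bounding_box_midpoint(sample))  # (40.0, 40.0)
print(center_of_mass(sample))         # (26.875, 45.0)
```

With three times as many western points as eastern ones, the average sits well to the west and north of the midpoint of the covered area, which is the kind of pull toward the densely sampled region described above.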

While Bouckaert et al. manage to avoid these two oversimplifications (selecting the geographical center as the origin of completely isotropic diffusion, or choosing the sample’s center of mass), two factors nonetheless appear to sway their solution towards Anatolia: the location of the highest-order split in the IE language family and the southern location of extant Indo-Iranian languages.

The first issue concerns the Anatolian branch of Indo-European, whose latest known location is in what is now the Asian part of Turkey. Given Bouckaert et al.’s assumptions, this means that the split into Anatolian and non-Anatolian (or Indo-European proper, IEP for short) branches necessarily happened in Anatolia. However, this hypothesis is problematic for two reasons. First, at the time that this highest-order differentiation happened, there was no reason to expect that Anatolian would end up as a relatively small cluster of languages which would all become extinct long before Bouckaert and his colleagues were born, while the IEP would grow into a vastly bushier family tree, many of whose descendants would survive into the third millennium CE. In other words, there is no reason to assume that the Anatolian daughter of PIE remained in situ, whereas the IEP moved on somewhere else. It could have been the other way around, with the IEP daughter staying where the PIE mother was spoken and the Anatolian daughter moving away. Such a process is exactly what the steppe theory hypothesizes. Second, the fact that all Anatolian languages were last attested in Asia Minor does not mean that they were always spoken there. For example, while Gothic survived the longest in Eastern Europe, its homeland is in what is now Germany or southern Scandinavia.

The same point can be illustrated with numerous additional examples from extant languages and families. In several cases, not a single language of a given family is still spoken in the original homeland. For example, much evidence indicates that the Austronesian language family originated in mainland coastal China, yet today no members of the family can be found on the mainland, where they were displaced long ago by advancing Sinitic languages. The closest location to this hypothesized urheimat where Austronesian languages are still spoken is in Taiwan, which also exhibits the highest degree of diversity for Austronesian languages among all the areas where these languages are currently found.

Similarly, the homeland of Turkic languages is generally thought to have been in Mongolia, an area that is now totally outside the family’s distribution. The Turkic family also provides clear evidence that the homeland need not be located in the area where the first language to have split off the family tree is currently attested: the most distinctive Turkic language is Chuvash, but it is spoken not in northern Mongolia, but rather in the distant Middle Volga region. Significantly, a diffusion-only model of the kind developed by Bouckaert et al. would never designate a linguistic homeland in an area outside of the range of currently spoken and historically attested extinct languages within the group, even though we know this to have been the case in regard to several major language families.
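Why a driftless model keeps the inferred homeland inside the attested range can be shown with a deliberately simplified one-dimensional sketch. This is not Bouckaert et al.'s actual model: it assumes independent lineages of equal age, under which the best single estimate of a driftless random walk's starting point is the average of the end points. All numbers and names below are hypothetical.

```python
def diffusion_only_origin(positions):
    """Origin estimate under a driftless random walk with independent,
    equally old lineages: the average of the observed end points."""
    return sum(positions) / len(positions)


# Hypothetical one-dimensional "map": the true homeland is at 0.
true_origin = 0.0

# Advection: the whole population shifts by +10 (a directed migration),
# then the lineages spread out a little around the new location.
drift = 10.0
spread = [-2.0, -1.0, 0.0, 1.0, 2.0]
positions = [true_origin + drift + s for s in spread]

estimate = diffusion_only_origin(positions)
lowest, highest = min(positions), max(positions)

print(estimate)                               # 10.0
print(lowest <= estimate <= highest)          # True: estimate is inside the range
print(lowest <= true_origin <= highest)       # False: true homeland is outside it
```

An average can never fall outside the span of the values being averaged, so in this toy setting a directed migration away from the homeland is invisible to the estimate, which is the situation of Turkic in Mongolia or Austronesian on the Chinese mainland.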

The same pattern obtains within the Indo-European family: for example, the Celtic languages originated in Central Europe (and are mapped as such by Bouckaert et al.), yet no Celtic language is spoken on the continent, with the exception of Breton, which backtracked there from the British Isles in historical times.

In general, whether Anatolia, the Ukrainian steppes, the Balkans, or some other location turns out to be the ultimate Indo-European homeland, it is almost certain that the language spoken there now has not developed in its current location throughout time. For example, Ukrainian did not evolve from its earliest ancestor wholly in Ukraine, even if we assume the steppe theory of PIE: wherever PIE originated, its relevant descendants are known to have migrated into Eastern Central Europe (Northwest Indo-European), then backtracked eastwards into Eastern Europe (Proto-Balto-Slavic), continuing into the area between the Vistula (Wisla) and Dnieper rivers (Proto-Slavic), to the Middle Dnieper region (East Slavic), and finally to present-day Ukraine. It is likely from the same Slavic homeland between the Vistula (Wisla) and the Dnieper that the South Slavic languages (Serbian, Bulgarian, Macedonian) moved to the Balkans, while Romanian ultimately traces back to the Romance homeland in Italy. In fact, no language currently attested in the Balkans appears to have developed entirely in the region, without sojourning somewhere else at some earlier time, even if we assume the Balkan theory of PIE. It is thus not clear why Anatolian languages, such as Hittite or Luwian, must be exceptional in this respect by never moving away from their postulated homeland.

The second problem concerns the relatively southern location of extant Indo-Iranian languages, as well as that of their intermediate proto-language as depicted in Bouckaert et al.’s animated map. The “movie” shows the Indo-Iranian population front advancing eastwards from present-day Turkey into Iran and onwards to northern India. Given their assumptions of “diffusion only” and an Anatolian homeland, this is the only way Indo-Iranian expansion can be mapped. However, it is well-established that the Indo-Iranian proto-language was located much farther to the north, in the North Caspian steppe zone. Indo-Iranian speakers subsequently migrated eastward into Central Asia before splitting into Iranian and Indic branches. Therefore, either the “diffusion only” or the Anatolian homeland assumption, or both, must be abandoned.

The evidence for the steppe sojourn of Indo-Iranian speakers comes from the numerous loanwords from Indo-Iranian languages into Uralic, which must have been borrowed over a prolonged period of time. For example, Early Proto-Uralic is said to have borrowed *juxi/jôwxi ‘to drink’ from the Early Proto-Indo-Iranian *gughew (which traces back to PIE *ghew-); the borrowing of the Late Proto-Uralic *śeta ‘100’ from Late Proto-Indo-Iranian *ćatam (tracing back to PIE *kmtóm) must have happened at a later period (see Häkkinen 2012, p. 8). Such a prolonged period of linguistic contact suggests a long-term co-existence of the two language families in neighboring areas. While the location of the Uralic homeland is itself a controversial issue, the consensus is to place it in the Volga-Ural region, although its ancestor probably came from Southern Siberia, north of the Sayan Mountains. Since the Indo-Iranian branch must have developed in the vicinity of Proto-Uralic, it can be placed in the North Caspian steppes, but not on the Iranian Plateau. In other words, the earliest location of the Indo-Iranian branch was much farther to the north than its descendant languages, even the earliest ones attested in written records, which contradicts the idea of a direct diffusional route from Anatolia to the Iranian Plateau and thence to South Asia.

The fact that virtually all known migrations to, and invasions of, South Asia came via the northern steppe corridor further supports this theory. Agents of the British Raj were preoccupied with India’s northwestern frontier for good reason, as they understood this historical-geographical dynamic quite well. Historically attested mass movements into the Indian subcontinent through this route are numerous. One of the more interesting examples is that of the Indo-European speaking Yuezhi, who arrived roughly 2,000 years ago. Many scholars believe that the Yuezhi were descendants of the Tocharians of the Tarim Basin, whose language was an early IE offshoot. Although the Yuezhi did not spread their language in South Asia, they were instrumental in building the powerful Kushan Empire that long served as a bridge between India, Central Asia, and the Middle East. Subsequently, several waves of Turkic speakers descended from Central Asia to the plains of northern India; like the Yuezhi, their linguistic role was relatively minor, but their political and cultural impacts were hugely significant. Any model of population movement that rules out such instances of advection will miss these crucial patterns, and hence will result in fundamentally false accounts of human history.



D’iakonov, Igor M. (1985) “On the Original Home of the Speakers of Indo-European”. Journal of Indo-European Studies. Volume 13.

Häkkinen, Jaakko (2012) “Problems in the method and interpretations of the computational phylogenetics based on linguistic data. An example of wishful thinking: Bouckaert et al. 2012”, available online.








  • German Dziebel

    “Similarly, the homeland of Turkic languages is generally thought to have been in Mongolia, an area that is now totally outside the family’s distribution.”

    Asya, what’s the reasoning behind placing the Turkic homeland in Mongolia instead of somewhere close to where Chuvash is spoken? Uralic languages are thought to have dispersed from around Kazan’ but pre-proto-Uralic is pushed eastward into South Siberia. I would guess the Turkic homeland is placed in Mongolia because it’s generally believed that Turkic is part of Altaic but I can imagine Turkic languages mirroring the Uralic dispersal trajectory, with Povolzhie being an important diversification point manifested in Chuvash.

    • Asya Pereltsvaig

      What would be the story of Yakut/Sakha (i.e. how did they end up where they are) if the Turkic homeland is in the middle Volga area?

      • German Dziebel

        That’s not really a counterargument, Asya (provided you intended it as an argument and not a simple question), especially if we’re talking about a nomadic population. The Samoyedic and Ugric branches of Uralic expanded across the Urals from their Volga River homeland. Something similar could’ve happened to Yakut. Why not? I was just curious about the positive reasoning behind placing the homeland of Turkic languages near Mongolia, not the reasoning whereby Yakuts couldn’t have come from anywhere else but Mongolia.

        • Martin W. Lewis

          Many thanks for the comments, German, which get to some important issues. First, I would emphasize that we do not suppose that the Turkic languages belong to the Altaic family, as it now seems to some linguists that Altaic is a Sprachbund, not a genuine family. The further that one looks back into the histories of the Turkic, Mongolian, and Tungusic languages, the less similar they seem to be, or so they argue. Second, the deepest split within the Turkic family is between the Oghuz and Oghur branches, with Chuvash representing the only surviving member of the Oghur lineage. The histories of the other Oghuric languages (Khazar, Turkic Avar, and Hunnic perhaps) point to an eastern origin, followed by a westward migration. But the evidence is spotty, and a lot of conjecture is involved.

          • Asya Pereltsvaig

            I agree with Martin here on both points.

        • Asya Pereltsvaig

          It was actually meant as a genuine question… It is indeed possible that the Yakuts migrated from the upper (or middle) Volga region. Still, I find it less believable exactly because Ugric and Samoyedic peoples were there… Are there any influences from Ugric and Samoyedic languages on Yakut, I wonder? (again, just a genuine question)

        • Jaska

          Sorry I’m late with this, but there are Bolghar Turkic loanwords in both Samoyed and Mongolic (and even in Tungusic and Chinese [or at least word mentions], if I remember right), so the earliest evidence about both main Turkic branches comes from Asia. There is a very comprehensive book about locating the homelands of the language families of the Altaic type: Janhunen, Juha 1996: Manchuria: an ethnic history.

          • Asya Pereltsvaig

            This is fascinating, thanks, Jaska!

          • TʀoᴘʏʟıuM

            This ties up with a detail of the Altaic debate though. IIUC, the loans in question are considered Bolghar simply on account of /r/, /l/ rather than /z/, /ʃ/ (unlike the large stratum of Bolghar loans in Hungarian, which displays numerous commonalities with Chuvash), and pro-Altaicists consider this discrepancy between the attested distribution of Bolghar and the distribution of loanwords that look like Bolghar another argument for reconstructing original *ŕ, *ĺ rather than *z, *š in Proto-Turkic.

            Although the rhotacism/zetacism and lambdacism/sigmaism division lines largely coincide with the pro-Altaic/anti-Altaic one, it should be noted that the questions are not necessarily connected (nor are they necessarily connected to each other).

          • Jaska

            According to Janhunen, the following points are against the relatedness of the Altaic language families:
            1. No shared basic vocabulary, only cultural items.
            2. No shared old words between Turkic and Tungusic, only between Mongolic and either Turkic or Tungusic (+ some Turk –> Mong –> Tung).
            3. Rhotacism/lambdacism has been explained in the Turkic framework as a split of *s to *s and *z (> Chuv. r), and a split of *sh to *sh and *zh (> Chuv. l). Apparently the opposite directions would meet some problems, although I haven’t seen the original argumentation by Shherbak.

          • TʀoᴘʏʟıuM

            Yes, I can’t say I’m sold on Altaic at all. Regardless, I’ve seen good arguments in both directions on the liquid/sibilant issue, and the proposal of *ŕ, *ĺ (together with its implications for the early history of Turkic) should not be dismissed as just a reconstruction cheat for a pro-Altaic agenda. I recall e.g. Helimski noting that in the Old Turkic script, the letters for “z” and “š” are analyzable as the letters for /r/ and /l/ plus a diacritic.

          • Asya Pereltsvaig

            Thanks to both of you for the fascinating discussion!