Indo-European Origins

Can language spread be modeled using computational techniques designed to trace the diffusion of viruses? As recently announced in the New York Times, a team of biologists claims to have solved one of the major riddles of human prehistory, the origins of the Indo-European language family, by applying methodologies from epidemiology. In actuality, this research, published in Science, does nothing of the kind. As this series of articles shows, the assumptions on which it rests are demonstrably false, the data that it uses are woefully incomplete and biased, and the model that it employs generates error at every turn, undermining the knowledge generated by more than two centuries of research in historical linguistics and threatening our understanding of the human past.

Our video presentation of this series’ main points can be found at:

The Vexatious History of Indo-European Studies, Part III

(Note to readers: This is the third of at least five posts derived from a draft chapter of our forthcoming book on the Indo-European controversy. This particular chapter examines the intellectual history of Indo-European studies, focusing on the most contentious ideas and ideologically motivated arguments. Its ultimate aim is to help explain why the Anatolian theory of Indo-European origins, which is rejected by almost all specialists in their field, would nonetheless appeal deeply to journalists, editors, funding agencies, and scholars in other disciplines. Again, references are not included in this draft.)

 Renewed Confusion of Race and Language

While early 20th century racial scholars were reducing the scope of the White (or “Caucasian”) race in Europe, stressing the separation of its so-called Nordic, Alpine, and Mediterranean stocks, a countervailing tendency operated in Africa. Although this movement was not directly linked to debates about Indo-European origins, it did feed into a renewed conflation of race and language in the postwar period that influenced popular conceptions of the so-called Indo-European peoples. It also provoked a sequence of scholarly reactions that would eventually begin to sever the race-language connection.

The main tendency in early 20th century African physical anthropology was to inflate the geographical bounds of the Caucasians at the expense of Black Africans. Some writers have traced this maneuver to the defeat of the Italian Army by the Kingdom of Abyssinia at the Battle of Adwa in 1896; since Blacks were widely thought to be incapable of defeating a modern European military force, the conclusion followed that the victorious Ethiopians must actually be sun-darkened Whites. As the facial features of Ethiopians tend to be more like those of North Africans than those of sub-Saharan Africans, this idea received some support from physical anthropology. By the mid-20th century, however, cartographers were expanding the Caucasian label deep into the heart of the continent, encompassing peoples of wholly African appearance. A map in the 1946 Atlas of World Affairs, produced with support from the U.S. military, treated not just Ethiopia and Somalia as demographically dominated by “Caucasian (or white)” people, but also northern Kenya, South Sudan, Uganda, and the northeastern corner of D. R. Congo.

As the peoples of Uganda and South Sudan are hardly “White” by any physical indicators, one must ask how they could have been so classified. The answer, in essence, is language. The scholars responsible for this maneuver knew that it was problematic. Yet as C. G. Seligman explained in his influential book Races of Africa (1930):

 Language—helpful as it may be—is no safe guide to race. Yet the study of the races of Africa has been so largely determined by the interest in speech … that names based on linguistic criteria are constantly applied to large groups of mankind and, indeed, if intelligently used, often fit quite well. … [I]n this volume linguistic criteria will play a considerable part in the somewhat mixed classification adopted. (9-10)

The key construct employed by Seligman and his peers was the “Hamitic Hypothesis,” which takes us back again to Noah’s son Ham. As Hebrew, Arabic, and other closely related languages were defined as Semitic (i.e., linked to the progeny of Shem), more distantly related languages in the same family, such as Ancient Egyptian, Somali, and Galla (Oromo), were linked to Ham and hence deemed Hamitic. As Europeans gained knowledge of interior Africa, scholars increasingly linked all advances in African culture to conquests or incursions by the generally dark-skinned yet putatively Caucasian Hamites; as Seligman put it, “the civilizations of Africa are the civilizations of the Hamites” (96). European writers often seized on dubious physical or linguistic markers among elite African populations as a sign of Hamitic descent. Thus the taller and more sharply featured Tutsi aristocrats of Rwanda/Burundi were viewed as largely Caucasian Hamites, unlike the Hutu commoners. In this case, the two communities spoke the same Bantu language, but it was reasoned that the Tutsis must have spoken a Hamitic language before they overcame the more numerous Hutus. As linguistic information was gathered from eastern Africa, Nilotic-speaking peoples—including many of the pastoralists of the region—were often subsumed into the same putative Hamitic family (although Seligman classified the southern Nilotes such as the Masai as “half-Hamites” [157] while regarding those of what is now South Sudan as “hamiticized Negroes”[169].) At the extreme, as in the portrayal in the 1946 Atlas of World Affairs, it would seem that all eastern Africans speaking non-Bantu languages, such as the Zande of northern D.R. Congo, were assigned to a Caucasian or at least a half-Caucasian racial position, regardless of their physical attributes. Yet Seligman himself thought that even the Bantus have some “Hamitic blood” (178), and he thus limited the “True Negroes” to the coastal zone of West Africa.

Africa was not the only part of the world in which race was widely confused with language. In the post-WWII intellectual environment, the extreme claims of pre-war racial scientists were no longer credible, but race remained a key concept for understanding human diversity. For general pedagogical purposes, the most expedient solution was simply to map races along language lines. As a result, peoples speaking Indo-European languages in Europe and India were often racially separated from peoples speaking Uralic, Turkic, and Dravidian languages. In the 1946 Atlas of World Affairs mentioned above, Turks, Hungarians, and (most) Finns are mapped as “Mongolian (or yellow).” In the popular World Book Atlas, Hungarians and eastern Finns are classified as mixed Caucasian and Mongolian, whereas most Turks are depicted as purely Mongolian, as are the Hungarians living in the Carpathian Mountains of Romania. A more extreme conflation of race and language is found in a map edited by the noted Welsh geographer and anthropologist H. J. Fleure, published in Bartholomew’s Advanced Atlas of Modern Geography of 1962. Here Finns and Estonians are mapped as “Asiatic or Mongolian” because of their “yellow skin colour.” On the same map, “Dravidian” is also advanced as a skin-color group (as part of an “Australo-Dravidian” race of “Melanodermic” people). Here even the map projection, deemed “Nordic,” is seemingly racialized.

This chaotic conception of racial diversity in the postwar period provoked a minor reaction. One scholar in particular, the American physical anthropologist Carleton Coon, sought to place racial understanding on a more scientific basis by stripping out involuted taxonomies and firmly rejecting the mixing of racial and linguistic categories. Coon noted the absurdity of classifying the Finns as “yellow” (albeit while failing to see that there is nothing “yellow” about the skin of East Asians) and scoffed at the idea that Europeans are divided into discrete races. Relying on a variety of physical indicators and guided by evolutionary theory, Coon divided humankind into the Caucasoid, Mongoloid, Congoid (sub-Saharan African), Australoid, and Capeoid (far southwestern African) stocks, which he regarded as distinctive enough to constitute separate subspecies. Coon’s conception remained racially hierarchical, but he no longer placed Caucasians, let alone Aryans, at the apogee. In an illustration tellingly captioned “The Alpha and Omega of Homo sapiens,” Coon contrasted an Australian Aborigine, supposedly possessing a cranial capacity of a mere 1000 cubic centimeters, with a Chinese scholar enjoying “a brain nearly twice that size” (p. XXXII).

Just as Coon was developing his evolutionary approach to racial taxonomy, the entire concept of physical race came under devastating attack. The key figure here was the anthropologist Ashley Montagu, who cartographically demonstrated that the main diagnostic traits for race (including skin color, cranial index, nose shape, stature, and so on) have their own distributional patterns, failing to exhibit the spatial co-variation that would be required to support the notion of distinct races. By the late 1970s, few scholars considered race as anything but a social construct, and a pernicious one at that. The pendulum swing was so extreme that any talk of physically based divisions among humankind came to be seen as unacceptable, leading some scholars of genetic diversity to despair. As modern analysis shows, numerous genetic markers do indicate significant physical differentiation among such groups as western Eurasians and eastern Eurasians.

Marija Gimbutas and the Feminist Revision of Indo-European Studies

As racial anthropology was being reformulated and then abandoned, Indo-European studies were undergoing their own transformation. The key figure here was the Lithuanian-American archeologist Marija Gimbutas, who turned the Aryan hypothesis on its head, portraying the original Indo-Europeans not as history’s heroes but rather as its villains. In 1956, Gimbutas linked the kurgan burial mounds in the Pontic Steppes north of the Black Sea to speakers of the proto-Indo-European language. She associated this so-called Kurgan culture with pastoral, patriarchal warrior bands. In later excavations of Neolithic villages in southeastern Europe, she described a culture that seemed to be the opposite on all scores: sedentary, peaceful, and gender egalitarian. Gimbutas elaborated this thesis in the 1970s in a series of books on the deities of what she called Old Europe, essentially the Balkan Peninsula before the coming of the Kurgans. These female-centered, goddess-worshiping societies, Gimbutas claimed, were highly cultured, almost fully egalitarian, and peaceful, lacking fortifications and offensive weapons. Their irenic civilization, she further argued, was demolished by the Kurgan invasions, which spread not just the Indo-European language family but also warfare, hierarchy, and male domination.

Gimbutas’s basic archeological work was solid, and most Indo-Europeanists have accepted some version of her Kurgan hypothesis that places the origin of the language family among the pastoral (or semi-pastoral) peoples of the Pontic Steppes. But her characterizations of both the “Kurgans” and the “Old Europeans” went too far for most specialists, who saw vast leaps from scanty remains to huge generalizations. And some of her lay followers went farther still. In Riane Eisler’s 1988 treatise, The Chalice and the Blade, the Kurgan conquests are seen as ushering in a global age of male domination, social hierarchy, and mass violence. The implication was that a gentle, egalitarian social order is the human birthright, and could yet be reclaimed if only we undo the social damage imparted by the early Indo-Europeans. The Chalice and the Blade was a bestseller, helping propel the wave of goddess worship that swept certain feminist circles in the late 20th century. It was lauded by prominent intellectuals, including Joseph Campbell, the doyen of mythology study. The famed anthropologist Ashley Montagu, noted above for his dismantling of the biological concept of race, hailed The Chalice and the Blade as the “most important book since Darwin’s Origin of Species.” And in odd corners of current popular culture, “Kurgans” still play the role of malevolent sub-humans; in the popular Highlander film series, for example, a character named “the Kurgan” comes from a tribe of the same name, “infamous for their cruelty, and … known to ‘toss children into pits full of starved dogs, and watch them fight for [the] meat’ for amusement.”[1] The same idea reappears in the video game Blackmoor Archives.

Still, Eisler’s comprehensive vision failed from the outset. As male domination characterized almost all historically known human societies, it can hardly be attributed to a single ancient people located in one particular part of the Earth. In today’s world, rates of male-on-female violence reportedly reach a peak in Melanesia, a realm of small-scale societies about as far removed from the “Kurgans” as could be imagined. Despite its appeal to the left, Eisler’s thesis was overwhelmingly Eurocentric, substituting Europe (actually, a corner of Europe) for the world as a whole. But even many of the less extreme assertions of Gimbutas herself have been undermined by scholarly analysis. The peoples of Old Europe were not altogether peaceful and female-centered, just as the speakers of proto-Indo-European and their immediate descendants were almost certainly not insistently androcentric and violent.

Work in world history also casts doubt on the Gimbutas vision. It is easy to imagine militaristic nomads from the Eurasian steppes as much more male-dominated than their sedentary neighbors, but comparative analysis suggests otherwise. Through the early modern and modern periods, women among the traditionally pastoral Kazakhs and Kirghiz of Central Asia have generally enjoyed more autonomy and power than those living in the village and urban societies of the (Sart) Uzbeks[2] and Tajiks. In medieval Mongolia, female empowerment was pronounced; as Mongol men were often absent at war, it is hardly surprising that women took on major managerial and political roles in the homeland. It is also noteworthy that the Scythians, ancient Indo-European-speaking pastoralists of the Pontic Steppes, not uncommonly buried their females in military gear. Perhaps Herodotus was on to something when he wrote of Amazon warrior women among the tribes of the area. Whether such conditions of relative female empowerment existed among the proto-Indo-European-speakers is anyone’s guess, but it is clear that we cannot simply assume overwhelming male domination based on pastoralism and military prowess.

Orientalism and the Renewed Assault on Indo-European Philology

In the works of the pre-WWII Aryan school and those of the late 20th century feminist revisionists alike, the deep Indo-European past primarily served ideological ends. Certainly the goals of the two camps were opposed; where the former romanticized violence and domination, the latter sought to bolster peace and equality. But whatever their motivations, writers in both groups allowed their desires and prejudgments to guide their conclusions. In this regard, the early Orientalist philologists stood on much more solid ground. Max Müller and his fellows certainly had their biases and blind spots, as we all do, but their commitment to empirical scholarship allowed them to partially transcend their prejudices.

Yet at the same time that the early Indo-European past was being reimagined by Gimbutas and her followers, the reputation of the early Indo-European philologists was again being savaged, as the field itself was again brought to the forefront of scholarly discourse. The key text here was Edward Said’s 1978 book Orientalism, which condemned the entire project of philological scholarship for serving European imperialism by facilitating intellectual domination over the non-Western world. To be sure, Said subjected Jones to relatively light criticism and mostly ignored Müller, but both were ultimately damned. Said accused Jones of trying to “subdue the infinite variety of the Orient” by attempting to codify the main texts of the region (p. 78). For Said, there was no escaping the taint; even “great Orientalist works of genuine scholarship,” he argued, “came out of the same impulse” as “Gobineau’s racial ideas” (p. 8).

From a historical point of view, there was something deeply ironic about this broad-brush attack on the Indo-European philologists. For the early Orientalists who wrote on India were demonized by the arch-racialists of their own day precisely because they sought to erase rather than inscribe the “ontological and epistemological distinction between ‘the Orient’ and … ‘the Occident’” (p. 2), the very distinction in which Said located the essence of Orientalism. To be sure, one can find passages in Jones, Müller, and their peers that strike the modern reader as inadequately sensitive or even bigoted, but so too one can find such sentiments in all writers of the period. In the end, to tar all Orientalists as complicit in the imperial project is to descend into a form of anti-intellectualism, rejecting out-of-hand an invaluable legacy of thought.

Indo-European Revisionism in South Asia

Meanwhile, the legacy of Müller and his peers has come under increasing attack from another quarter altogether, that of Indian nationalism. This school is epitomized in D. N. Tripathi’s edited collection of 2005 entitled A Discourse on Indo-European Languages and Cultures. The various contributors to this volume understandably object to the old narrative of the Aryan invasion of the sub-continent, a story that emerged in the 19th century from a combination of philological inquiry and racial science. According to this account, superior Aryans invaded South Asia in the Bronze Age, conquering and ruling over the indigenous dark-skinned people and then creating the caste system to ensure that the two groups remained distinct and unequal. Support for this theory was supposedly found in the Rigveda, one of humankind’s oldest texts. Yet as Trautmann shows, this neat and simplistic narrative of Aryan invasion had actually been opposed by most of the leading European Sanskritologists of the 19th century. It has also been rejected by modern mainstream scholars, who deny stark racial divisions and tend to posit plodding infiltrations of Indo-European speakers into the Indian subcontinent, along with a gradual and complex development of caste ideology. And regardless of the seemingly clear division of South Asia into an Indo-European north and Dravidian south, it has long been recognized that the entire region shares numerous linguistic features, making it a Sprachbund or linguistic convergence zone.

The current school of Indo-European revisionism in India, however, goes much further in denouncing the old Aryan hypothesis. Some of these writers deny any foreign impact on ancient South Asian civilization, as if in fear that acknowledgement would sunder the unity of India and compromise the nationalist agenda. As Tripathi specifies in his introduction, the main point of the volume is to show that the Indo-European language family originated in South Asia with the Indus Valley civilization and then subsequently spread westward. Sanskrit, he contends, “is the most suited choice as the proto-Indo-European language,” adding that the “antiquity of the Vedas is far more than what Max Müller and others have tried to fix” (p. 13). Other chapters redeploy from Europe to India the exhausted trope of the intrinsic Aryan inclination to migrate. Ajay Mitra Shastri, for example, argues that, “the frequent migrations of enterprising peoples from India westward are responsible for the commonness and great similarity in the vocabulary of the speakers of Indian, West Asia, and European languages.” Yet Shastri is moderate compared to T. P. Verma, who claims not only that Sanskrit was the original language of all humankind, but that it was a direct gift from above. As he boldly argues, “Vedas are verbal transformations of God” (p. 116), essentially taking us back to an early 19th century conception of human prehistory. A more extreme version of this thesis is found in the Wikipedia “Talk” page on Max Müller, where the philologist is accused of being a “bigot who was trying to destroy a civilization” merely because he dared to examine religious texts through the lens of secular scholarship.

This Indocentric school of Indo-European studies has generated significant opposition among more traditional scholars, both in the West and in India. According to Edwin Bryant, tensions grew so pronounced that it became “increasingly difficult for scholars of South Asia to have a cordial exchange on the matter without being branded a ‘Hindu nationalist,’ ‘western neo-colonialist,’ ‘Marxist secularist,’ or some other simplistic and derogatory stereotype.” In an attempt to break down such barriers, a joint volume entitled The Indo-Aryan Controversy was published in 2005, containing insightful arguments from both camps, with several authors emphasizing the influence of the non-Indo-European languages of South Asia on the region’s Indo-European tongues. In the end, however, the “out of India” theory favored by Tripathi and his colleagues cannot withstand the scrutiny that it receives in this volume. As Michael Witzel demonstrates, no linguistic evidence supports an Indian origin of the Indo-European languages, whereas a vast amount of evidence can be found against it. As he concludes, “To maintain an Indian homeland of IE … requires multiple special pleading of a sort and magnitude that no biologist, astronomer, or physicist would tolerate” (p. 375).

Although many Indian scholars have been trying to put the Aryan invasion myth to rest once and for all, the idea nonetheless retains potency in other corners of southern Asia. In the far south of India, many so-called Dravidianists accept the Aryan invasion thesis at face value, but give it a negative spin to oppose Brahmin interests, favor Tamil over Sanskrit and Hindi, and more generally advocate Tamil nationalism. In northern India, Pakistan, Afghanistan, and especially Iran, a pro-Aryanist movement still attracts support, as evidenced by a minor YouTube video genre that celebrates the racial nature of the local population. More than 300,000 views, for example, have been garnered by a video entitled “Aryan Race in Iran, Afghanistan, Tajikistan, Pakistan, India”; its creator (PersianCyrus) claims that:

The real Aryans live in Iran, Afghanistan Tajikistan, Pakistan and India. With the attack of the mongols and turks most of the people there got “turkified” or “mongolzied”. However some of those survived!

In such a manner, anti-Arab and anti-Turkish prejudice in Iran is given a pseudo-scholarly gloss.


[1] I am indebted to GeoCurrents reader William Barnard for bringing this character to my attention. The quotation is from the Wikipedia article on the fictional character known as “the Kurgan.”

[2] The term “Uzbek” has been used to refer to two separate groups. Originally it referred to a largely pastoral group speaking a Turkic language closely related to Kazakh, a group that created the Uzbek Khanate of the Early Modern Period. In the early 20th century, Soviet ethnographers reassigned the term to the sedentary peoples of the region who speak a heavily Persian-influenced Turkic language. Previously, these people, along with their Tajik neighbors, had generally been called “Sarts.”


The Vexatious History of Indo-European Studies, Part II

(Note to readers: this is the second portion of a chapter of our forthcoming book on the Indo-European controversy; more will follow. This chapter outlines the main ideological ramifications of the debates concerning Indo-European origins and dispersion. It is not an account of the development of Indo-European linguistics. It is rather concerned with the use, and especially the misuse, of linguistic ideas by scholars in other fields and by assorted ideologues. References and footnotes are unfortunately not included here.)


“Race Science” and the Challenge of Philology

As “race science” gained strength in late 19th century Europe, it faced a major obstacle in Indo-European philology. European racial theorists maintained a stark separation between the so-called Caucasian[1] peoples of Europe and environs and the darker-skinned inhabitants of South Asia, yet the philologists argued that Europeans and northern Indians stemmed from the same stock. Some of the early efforts to mesh the new racial ideas with linguistic findings were rather strained. The popular American writer Charles Morris, for example, argued in 1888 that races are divided on the basis of both language and physical type, which generally but not always coincide; he further contended that “the Aryan is one of these linguistic races” (p. 5) that had lost its original physical essence. The general tendency was to emphasize ever more strongly this supposed loss of “purity,” and thus for physical type to trump linguistic commonality. As Isaac Taylor, the Anglican Canon of York, noted a few years later, “The old assumption of the philologists, that the relationship of language implies a relationship of race, has been decisively disproved and rejected by the anthropologists” (p. 5). By the end of the century, the increasingly victorious racialists regarded the philologists as their main opponents. Taylor concluded his influential The Origin of the Aryans by noting that “the whilom tyranny of the Sanskritists is happily overpast” (p. 332); he also charged philology with having “retarded … the progress of science” (p. 6).

Paradoxically, race scientists relied on the findings of the Indo-European philologists while denouncing them and turning their key discovery on its head. Writers propounding the racialized Aryan thesis emphasized the massive expansion of the Indo-European people in ancient times, a fact demonstrated by historical linguistics, seeing in it prime evidence of Aryan superiority. The preeminence of the ancient Aryans, such writers believed, was evident in the intrinsic restlessness that led them to explore new lands and subdue indigenous inhabitants. As early as the 1850s, Arthur de Gobineau argued that the civilizations not only of India but also of Egypt and China (and perhaps even Mexico and Peru) had been founded by Aryans, whom he extolled as the world’s natural aristocrats. Gobineau and his successors claimed that the original Aryans lost their racial essence as they spread from their homeland and interbred with lesser peoples. The resulting mixture supposedly led to degeneration and the loss of vigor. As the century progressed, more extreme racists argued that “mixed races” cannot maintain themselves, as one of the genetic stocks that went into their creation would necessarily prevail. Isaac Taylor went so far as to argue that the children of parents from “diverse” races are usually infertile, much like the offspring of horses and donkeys (p. 198). As a result, most race scientists concluded that Aryan blood had been swamped out long ago in India, although the more moderate ones allowed that a measure of Aryan nobility could still be found among the Brahmins, owing to their steadfast rejection of cross-caste marriage.

As the Indo-European commonalities discovered by the philologists were reduced to a distant episode of heroic conquest followed by miscegenation, degeneration, and the local extinction of the racial line, race theorists sought to relocate the original Aryan homeland. This search for a European urheimat became intertwined with a simultaneous development in racial thinking: an emerging fixation on head-shape as the key to racial identity and origins. Armed with the seemingly scientific tools of head calipers and cranial indices, anthropologists divided Europeans into several distinct physical types, viewed either as sub-races of the Caucasian stock or as discrete races in their own right. Although disagreements persisted, most racial scientists came to identify the Aryans with the narrow-headed (dolichocephalic), fair-skinned, light-haired people of the north, rather than the broader-headed (brachycephalic) “Alpines” of central Europe or the darker-complexioned, shorter “Mediterranean” peoples of the south. (German theorists of the Nazi era added yet more European races, such as the stocky blond “Falisch” race supposedly found in parts of western Germany.) In this reading, the original Celts, Slavs, Greeks, and Italics had been Aryans, but by intermarrying with others they had lost their racial essence, retaining only the linguistic marker. Only the Nordic peoples (often identified with current and past speakers of the Germanic languages[2]) could count as true Aryans, a notion closely identified with the German[3] linguist and archeologist Gustaf Kossinna. If northern Europeans represented the genuine Aryan line, uncontaminated with the blood of the subjugated peoples, then it stood to reason that the Aryans had been the indigenous inhabitants of northern Europe. Various theories were consequently advanced to locate the Indo-European cradle somewhere near the shores of the Baltic Sea. The linguistic evidence remained ambiguous, however, leading to prolonged debates about the precise location of the homeland.

The many inconsistencies and contradictions that riddled this emerging synthesis were either bypassed or accommodated through special pleading. Western European writers who denigrated the Slavs while celebrating the Germans overlooked the fact that northern Poles and northern Russians tend to have narrower heads and fairer complexions than southern Germans. The non-Indo-European Finnic peoples with their Uralic languages presented a greater problem; Estonians in particular tend to be rather narrow headed and extremely fair. One expedient was to classify the Uralic language family as a distant cousin of the Indo-European family, assuming that the speakers of the two original proto-languages sprang from the same racial stock. The widespread notion that the Uralic tongues belonged to a Ural-Altaic family that also included Mongolian, however, challenged this idea, leading to profound discomfiture. One result was awkward descriptions of the Finns, with one writer describing them as “linguistic Mongolians” who are nonetheless “intermediate between the blond and the Mongolian [physical] types, although much nearer the former” (Morris 22).

As the racial interpretation of prehistory gained predominance in the late 19th century, Max Müller attempted to stem the tide, objecting strenuously to the misappropriation of his work. In his Biographies of Words and the Home of the Aryas, published when he was 64, Müller forwarded a surprisingly modern conception of linguistic history. Although he had long stressed the kinship of northern Indians and Europeans, he now denied that he had ever conceptualized it in terms of race. Instead he denounced any identification of language groups with racial stocks, contending that “an ethnologist who speaks of Aryan race, Aryan blood, Aryan eyes and hair, is as great a sinner as a linguist who speaks of a dolichocephalic dictionary or a brachycephalic grammar.” Müller further sought to discredit the romantic celebration of the proto-Indo-Europeans, mocking the “taken for granted idea” that “in the beginning … there was an immense Aryan population somewhere, and that large swarms issued from a central bee-hive which contained untold millions of human beings.” Müller went so far as to cast doubt on the core notion of a single Proto-Indo-European language, arguing instead that the language family could have emerged out of a welter of related dialects. He further contended that speakers of these dialects might have spread their tongues not by way of massive invasions but rather through the gradual infiltration of relatively small numbers of people out of their Asian homeland. But Müller reserved his most profound contempt for those who associated an Aryan race with northern Europeans:

But where is there an atom of evidence for saying that the nearer to Scandinavia a people lived, the purer would be its Aryan race and speech, while in Greece and Armenia, Persia and India, we should find mixture and decay? Is not this not only different from the truth, but the very opposite of it?

It is thus for good reason that Trautmann contends that Müller was the “Public Enemy Number One” of the racial scientists (172).


The Triumph and Decline of “Racial Science” and the Aryan Ideal

After the turn of the century, racialist writers tended to distance themselves ever further from the Indo-European idea. The influential polemicist Houston Stewart Chamberlain, one of Hitler’s favorites, hesitated to use the term “Aryan” for his favored race due to its association with the Indo-European language family, preferring instead “Teutonic.” Chamberlain “granted that there was once a common ancestral Indo-European race,” but assumed that its essential traits had long ago vanished everywhere except among the Teutonic folk of northern Europe. Oddly, he wanted to restrict the term “Aryan” in the modern world to individuals who embodied the supposed traits of their distant forebears. Chamberlain’s 1899 The Foundations of the 19th Century went through twenty-four editions and sold more than 250,000 copies by the late 1930s. But despite its public success, its flaws were so overwhelming that it failed to impress even some of the world’s most ardent imperialists. In this regard, Theodore Roosevelt’s trenchant review is worth quoting at some length:

 [The Foundations of the 19th Century] ranks with Buckle’s “History of Civilization,” and still more with Gobineau’s “Inégalité des Races Humaines,” for its brilliancy and suggestiveness and also for its startling inaccuracies and lack of judgment. … Mr. Chamberlain’s hatreds cover a wide gamut. They include Jews, Darwinists, the Roman Catholic Church, the people of southern Europe, Peruvians, Semites, and an odd variety of literary men and historians. But in his anxiety to claim everything good for Aryans and Teutons he finally reduces himself to the position of insisting that wherever he sees a man whom he admires he must postulate for him Aryan, and, better still, Teutonic blood. He likes David, so he promptly makes him an Aryan Amorite[4].

Despite Roosevelt’s skeptical views, “Aryanism” in its various guises emerged as a potent force in the United States, where it often took on a particularly American cast. An important text here is Joseph Pomeroy Widney’s 1907 Race Life of the Aryan Peoples. Widney was an influential thinker, founder of the Los Angeles Medical Society and the second president of the University of Southern California. A man of his times, he disparaged philology while arguing that “the history of the world is largely only the history of the Aryan man.” Widney often compared the original Indo-European expansion to the settlement of the United States by Europeans. Like many of his predecessors, he found their racial essence in pioneering restlessness: “For there is unrest in the Aryan blood, an unrest which is ever urging it out and on.” Widney’s signal contribution, if one could call it that, was synthesizing racism with environmental determinism. At the time, geographers stressed the contrast between the salubrious temperate climates and the deleterious tropics, and here Widney eagerly followed suit. The Aryans of India, he argued, succumbed not only to race mixing but also to the enervating heat, whereas those of Russia were undone by frost along with Mongolian admixture. As he unambiguously put it, “Aryans retain racial vitality only in temperate climates.”

Another well-known American racial theorist, Madison Grant, also pictured the prehistoric Aryan adventure through the lens of the westward expansion of the United States. Even more than Chamberlain, Grant rejected the terms “Aryan” and “Indo-European,” contending that the race so denoted had long since vanished almost everywhere. But among the “Nordics,” who alone preserved the racial essence, he found the same spirit of adventure that produced all the world’s great sailors, explorers, and pioneers. “Practically every 49er” in the California Gold Rush, he told his readers, “was a Nordic.” The influence of Grant’s 1916 book The Passing of the Great Race was deeply felt in U.S. intellectual circles. The extent of Grant’s racism is evident in the fact that as secretary of the New York Zoological Society he helped arrange to have a Congolese Pygmy[5] exhibited in a cage in the Monkey House of the Bronx Zoo and labeled as a “missing link” between apes and “the white race.”

It is difficult to exaggerate the sway of racial science in North America and northern Europe in the early twentieth century. This was not merely the pet theory of bigots and chauvinists, but a widely accepted doctrine that cut across political lines. It was embraced by some of the most knowledgeable, sophisticated, and progressive thinkers of the time. Even the Fabian socialist playwright George Bernard Shaw found much to admire in Chamberlain’s hymns of racial hatred. Of particular significance, however, was V. Gordon Childe, perhaps the foremost pre-historian of the era. An Australian by birth who was long affiliated with the University of Edinburgh, Childe was an accomplished philologist as well as a preeminent archeologist. He was also a lifelong Marxist, committed to a variety of leftist causes. To be sure, Childe was wary of the extremism of “Houston Stewart Chamberlain and his ilk,” warning that “the word ‘Aryan’ has become the watchword of dangerous factions and especially the more brutal and blatant forms of anti-Semitism” (p. 164). But despite these cautionary remarks, Childe embraced the core of the Aryan thesis. As he concluded his hallmark 1926 book, The Aryans: A Study of Indo-European Origins: “Thus the Aryans do appear everywhere as promoters of true progress and in Europe their expansion marks the moment when the prehistory of our continent begins to diverge from that of Africa and the Pacific” (p. 211).

Childe was too knowledgeable and intellectually honest to impute all human progress to the Aryans. Indeed, he emphasized the fact that the early Indo-Europeans had repeatedly “annexed areas previously occupied by higher types of culture” (p. 200). How to explain such annexations was an intellectual challenge. In one passage, Childe opined that it was “only explicable in racial terms” (p. 200), which he later specified to be largely a matter of brawn: “the physical qualities of that stock did enable them by the bare fact of superior strength to conquer even more advanced people” (p. 212). But in the end, Childe claimed that it was neither bodily strength nor a more generalized racial superiority that allowed the Aryans to triumph, but their language itself, a view originally put forward by the German philosopher and bureaucrat Wilhelm von Humboldt (1767-1835). The final lines of his text attribute Aryan domination to the “more excellent language and mentality that [they] generated” (p. 212). This supposed excellence is spelled out in the first few pages of Childe’s book:

[T]he Indo-European languages and their assumed parent-speech have been throughout exceptionally delicate and flexible instruments of thought. They were almost unique, for instance, in possessing a substantive verb and at least a rudimentary machinery for building subordinate clauses that might express conceptual relations in a chain of ratiocination. (p. 4)[6]

Childe, the “great synthesizer” of European prehistory, thus returned to the philological roots of inquiry to explain the mushrooming of the Indo-European language family.

Childe’s theories of Aryan linguistic supremacy, however, had little impact, and he later came to regret having written the book. Over the next decade, a new generation of social and cultural anthropologists began to transform the field. Scholars were now committing themselves to learning the languages of the peoples they studied, and in so doing they undermined the idea that primitive peoples have primitive languages, incapable of expressing abstract concepts. Philologists who studied non-Indo-European languages, moreover, knew full well that there was nothing uniquely Aryan about subordinate clauses. Childe’s linguistic understanding had become antiquated, invalidating the key component of his Aryan theory.

Meanwhile, the emerging school of sociocultural anthropology discredited scientific racism on other fronts. Franz Boas, the German founder of the discipline in the United States, showed that head shape is determined in part by parenting practices, as the cranial indices of American-born children of immigrants deviated from those of their mothers and fathers. The behavioral disparities found in different human groups, Boas argued, stemmed from cultural difference rather than innate temperaments. As the students of Boas gained positions of leadership in anthropology departments across the country, racialists such as Madison Grant despaired.

But it is important to recognize that the revisionism of Boas had its limits. Despite his staunch opposition to scientific racism, Boas, like Childe, remained wedded to the idea that language embodies the worldview of the group that speaks it, revealing its volksgeist, or ethnic essence. This idea would be further elaborated by his student Edward Sapir and Sapir’s student Benjamin Whorf into the eponymous Sapir–Whorf hypothesis of linguistic relativism, which claims that language determines thought. Although a “soft” version of this hypothesis has many defenders, most linguists reject outright the stronger version of the original formulation, which denies the universality of basic human cognition.

Regardless of developments in linguistic theory, by the 1930s, scientific racism was in rapid retreat in the United States and Britain, and by the late 1940s it was discredited even in Germany. With the post-war revelations of Nazi atrocities, the thesis of Aryan superiority was thoroughly ejected from mainstream intellectual life. To be sure, it continued—and continues—to fester in odd corners. These days, it is easy to be reminded of its existence by doing ethnographic map and image searches, in which content from the neo-Nazi website Stormfront appears distressingly often.


The Vexatious History of Indo-European Studies, Part I

(Dear Readers,

As mentioned previously, I am now working on our forthcoming book on the Indo-European controversy. I have now finished the chapter on the history of the debates, which I will post here at GeoCurrents, in pieces, over the next two weeks. Bibliographic references are not included, although they may be added later. Comments and criticisms are of course welcome.)

Debates about Indo-European origins and dispersion have played a surprisingly central role in modern intellectual history. At first glance, the ancient source of a group of languages whose very relatedness is invisible to non-specialists would seem to be an obscure issue, of interest only to a few academics. Yet it is difficult to locate a topic of historical debate over the past two centuries that has been more intellectually provocative, ideologically fraught, and politically laden than that of Indo-European origins and expansion. Although the controversies have diminished in the Western public imagination since the middle of the 20th century, they still rage in India, and elsewhere their reverberations persist. As a result, the Indo-European question is anything but trivial or recondite. To understand the significance of the current controversy, it is therefore necessary to examine the historical development of Indo-European studies in detail, paying particular attention to the ideological ramifications of the theories advanced to account for the success of this particular language family.

Before the mid 1800s, most European scholars conceptualized human diversity primarily through the story of the sons of Noah (Ham, Shem, and Japheth), whose descendants supposedly gave rise to the various “nations,” “stocks,” or “races” of humankind, terms that were usually applied interchangeably. Although the geological and biological theories of Charles Lyell and Charles Darwin are rightly viewed as having effectively undermined the religious understanding of prehistory, thus ushering in the secular intellectual age, historical linguistics, or philology as it was then called, played a key role as well. The discovery of deep linguistic connections that cut across the conventional geography of Noah’s descendants unsettled the religious view of the past, encouraging the emergence of a secular conception of human development. As historical linguistics developed over the first half of the 19th century, Bible-based ethnography grew ever less tenable. (The noted linguist Mark Baker nonetheless argues in The Polysynthesis Parameter that the Tower of Babel story,* which recounts the diversification of languages among Noah’s descendants, might convey a non-literal truth, insofar as the macroparameters built into the deep structures of human language necessarily generate “serious linguistic diversity,” which he claims indicates an origin “distinctly spiritual in nature” [p. 514].)

Although the account of Noah’s progeny in Genesis 10 is geographically spare and ambiguous, traditional Jewish accounts usually identified the descendants of Japheth with the north, those of Ham with the south, and those of Shem (the ancient Hebrews and their relatives) with the middle zone. In medieval and early modern Christendom, however, the tripartite continental division of the world led most scholars to identify Ham’s descendants with Africa, those of Shem with Asia (or at least western Asia), and those of Japheth with Europe. Early attempts at serious linguistic classification remained within this general framework. The precursor of formal historical linguistics in England, the physician and antiquarian James Parsons (1705-1770), viewed the deep similarities across many European languages as evidence of descent from a common ancestral tongue, which he linked to Japheth. Although the use of the term “Japhethic” to denote the Indo-European language family was abandoned long ago, the Noahic scheme lingers on: “Semitic,” a subfamily of the Afroasiatic languages, derives its name from Shem, while “Cushitic,” another subfamily in the same group, stems from Cush, the eldest son of Ham. (The term “Hamitic,” long used to cover all of the non-Semitic Afroasiatic languages of Africa, was abandoned only in the 1960s after Joseph Greenberg showed that these languages did not descend from a single common ancestor.)

The celebrated founder of Indo-European studies, Sir William Jones (1746-1794), remained wedded to a Biblical vision of the past. Jones, a well-trained philologist working as a civil servant with the British East India Company in Calcutta, realized that Sanskrit was related to Greek and Latin, and probably to Gothic, Celtic, and Persian as well. As he put it, the resemblances between Sanskrit, Latin, and Classical Greek are so profound that “no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists…” Thus was born the idea of an Indo-European linguistic family, along with that of a long-lost proto-Indo-European ancestral tongue (although these terms were coined much later). But as Thomas Trautmann explains in Aryans and British India, the modernity of Jones’s comparative linguistics was compromised by his pre-modern ethnographic convictions and designs. Jones’s ultimate project apparently aimed at “recovering the lost language of Noah and of Adam through the comparison of vocabularies” (p. 52). To square the kinship of Sanskrit with the languages of Europe within the Biblical narrative, Jones had to reorient the territory of Noah’s three lines of descent. In his retelling, the children of Ham settled in India and Egypt, where they “invented letters, observed and named the stars and planets,” and otherwise created civilization; later movements brought these same people to Greece, India, northern Europe and perhaps even Mexico and Peru (Trautmann 52). In Jones’s idiosyncratic view, the descendants of Japheth were not the Europeans, but rather the pastoral peoples of Central Asia and perhaps even the stateless tribes of the Americas, groups that he claimed “cultivat[ed] no liberal arts” and had “no use for letters” (Trautmann 52). Such a view represented an inversion of mainstream European accounts, which celebrated the Japhethic line of Europe while denigrating the progeny of Ham in Africa and, in some accounts, southern and eastern Asia as well.

Jones’s eccentric revision of the story of Noah’s sons had little influence on other scholars, as it rested on fanciful migration scenarios that challenged mainstream biblical understanding. In the long run, however, his linguistic research led to work that undermined religiously inspired ethnography. To be sure, the Noahic thesis continued to have its adherents throughout the 1800s. In the 1850s, the forerunner of “scientific racism,” Arthur de Gobineau, accepted the narrative of Noah’s sons, although he regarded all three as progenitors of the White race, as he did not think that non-Whites descended from Adam. By the late 1800s, however, academic scholars could no longer invoke the Bible to sketch the contours of prehistory.

The work of Jones and his successors forced European scholars to grapple with the deep connections between the peoples of Europe and those of South Asia. Traditional “universal” histories produced in Christendom had limited their attention to western Asia, Europe, and North Africa, areas known from the Bible and classical literature. Such works typically dispensed with India and areas further east with a few dismissive paragraphs. Such a blinkered view had been challenged by Voltaire and other philosophes of the French Enlightenment, but their assessments were dismissed by both religious stalwarts and European chauvinists. With the rise of comparative philology, however, the Enlightenment’s ecumenical perspective received a temporary boost. Jones’s successors in Britain and India in the early 1800s continued to delve into Sanskrit linguistics and literature, examining as well the relationship between Sanskrit and other South Asian languages. In doing so, these Orientalist scholars emphasized the antiquity and the sophistication of the Indian tradition. At the same time, continental European researchers such as Franz Bopp and Rasmus Rask put the study of historical linguistics on a sound scientific basis, outlining systematic laws of sound change and grammatical transformation. Such work solidified the historical linkages among the languages, and hence the cultures and peoples, of northern India, Persia, and Europe.

Of signal importance to this endeavor was the German scholar of Sanskrit, Max Müller, who long taught at Oxford. Müller coined the term “Aryan,” derived from Sanskrit texts, to denote the original group of people whose language spread so broadly and diversified so extensively. The Aryan homeland, he suspected, lay in Central Asia, probably in Bactria (northern Afghanistan), a theory currently supported by the noted linguist Johanna Nichols. To Müller and many of his fellow Orientalists, the differences in physical appearance between Europeans and their Indian relatives were superficial; the latter had darker skin merely because of their ancestors’ prolonged exposure to the sun. The revealed kinship of what later became known as the Indo-European peoples fostered deep interest in India and, to a lesser extent, Persia. As knowledge accumulated, a veritable “Indomania” grabbed hold in a few corners of European intellectual life.

The resulting respect accorded to India, however, generated a strong reaction, a movement propelled as well by the intensifying economic and technological divergence of Europe and Asia and by the steady advance of Western imperialism. In philosophy, Hegel and most of his heirs disdained all things Indian in withering terms, while in Britain utilitarian thinkers such as James Mill disparaged Indian civilization and attacked its Orientalist defenders, contending that progress in South Asia could only be realized by wholesale Westernization. But at least Mill and his fellow British liberals believed that progress in India was possible; as the 19th century wore on, the rise of so-called racial science led to a ratcheting up of anti-Asian antipathy and other forms of bigotry, a movement that would culminate in the horrors of the Holocaust.


*Genesis 10 explicitly states that the various Noahic descent groups developed their own languages, while the next chapter, Genesis 11, which recounts the story of the Tower of Babel, tells us that all people at the time spoke the same language. Current-day Biblical literalists deal with this seeming contradiction by arguing that the sequencing of the Bible does not necessarily reflect chronological order, and that as a result many of the passages in Genesis 10 recount episodes that occurred after the events outlined in Genesis 11. In Christian literalist circles today, the origin of human diversity is largely explained on the basis of the “confounding of languages” that followed the construction of the Tower of Babel, although the story of the sons of Noah still figures prominently as well.




Questions for Readers Regarding Biblical Ethnography

As mentioned in an earlier post, I am now devoting most of my attention to the book on Indo-European origins that Asya Pereltsvaig and I are writing. I am currently working on a chapter that recounts the intellectual history of the Indo-European concept, which is a fascinating and complex topic. Right now, I am perplexed in regard to an issue stemming from Biblical ethnography, and I am hoping that GeoCurrents readers might have some knowledge that they would be willing to share.

Through the 1700s, most European scholars understood human diversity primarily through the story of the dispersion of the sons of Noah (Shem, Ham, and Japheth) recounted in Genesis 10. Thus, when William Jones determined that Sanskrit, Latin, Greek and other languages of Europe, Persia, and India were related, he tried to fit this pattern into the Biblical narrative, specifically by arguing that the speakers of all of these languages were descended from Ham. This idea went against the established concept, which regarded Europeans as the progeny of Japheth and Africans as the descendants of Ham (see the medieval T-O world map posted to the left). To Jones, the “Japhethic” peoples were rather the nomads of Central Asia and the Americas. (Traditional Jewish accounts, on the other hand, tended to associate Japheth with the north, Ham with the south, and Shem with the middle latitudes, as on the second map.)

Subsequent work by historical linguists contributed to the discrediting of Biblical ethnography, and thus helped usher in the secular intellectual age. Christian fundamentalists who stress Biblical inerrancy, however, still believe that Genesis provides the key to understanding human linguistic and racial diversity. Yet their websites usually downplay Genesis 10 and the sons of Noah and instead focus on Genesis 11, which recounts the story of the Tower of Babel. In reading the relevant passages in the Bible, I am struck by their contradictory nature. I am curious about how these contradictions have historically been handled by both religious thinkers and scholars of human diversity operating in the Biblical framework. Why, in particular, did early European ethnographers stress Genesis 10 rather than Genesis 11?

The text of Genesis 10 seems to claim that descendants of the sons of Noah developed their own separate languages before the Tower of Babel was constructed, which would seemingly explain why early historical linguists stressed these passages. Genesis 10:20, for example, is usually translated into English as, “These are the sons of Ham, after their families, after their tongues, in their countries, and in their nations,” just as 10:31 is translated as “These are the sons of Shem, after their families, after their tongues, in their lands, after their nations.” As Asya notes, in the Hebrew original 10:31 reads “le-mishpaxotam li-leshonotam be-artzotam le-goyehem,” literally, “to their families, to their languages (PLURAL!), in their lands, to their peoples.” (The last word “goyim” is interesting in that in the Bible it means various peoples, as in “ethnic groups,” “ethno-linguistic groups”, “ethno-linguo-religious groups”, or even “clans.” “Nations” seems too big of a word. Over time, however, it came to signify “peoples other than the Jews.”)

Genesis 10 thus seems to claim that the original human language diversified as the descendants of Noah scattered across the world. In the initial passage of Genesis 11, however, a different picture emerges: “And the whole earth was of one language, and of one speech.” Such a single language, however, was “confounded” after the construction of the Tower of Babel (Genesis 11:7). What then do Biblical experts think happened to the languages that had been spoken among the different lineages of Noah before the Tower was built? It is also unclear who actually built the tower, as the relevant Biblical passages do not specify the subject. As Genesis 11:2 reads, “And it came to pass, as they journeyed from the east, that they found a plain in the land of Shinar; and they dwelt there.” But who were “they?” Some modern fundamentalist websites claim that all of humankind gathered at Shinar to build the tower; as a result, the scattering that occurred after Babel was destroyed and the single human language was “confounded” gave rise to subsequent human linguistic and racial diversity. In this interpretation, the early scattering of Noah’s sons and their progeny was of no lasting significance, as it had nothing to do with post-Babel linguistic differentiation. But if this is the case, why did Biblical ethnographers of earlier centuries stress Genesis 10 and the sons of Noah while downplaying Genesis 11 and the Tower of Babel?

One interpretation, seen in the map to the left, claims that the scattering of the sons of Noah happened after the Tower of Babel incident, but this requires a reversal of the sequence of events as recounted in the Bible. Fundamentalist efforts to square the Biblical account with modern science can be quite involved: the diagram posted here, taken from the “Creation Wiki,” tries to fit the Noahic descent groups with a modern mitochondrial DNA tree diagram. I have not encountered the terms “Mrs Ham, Mrs Shem, and Mrs Japheth” elsewhere.



Some Strange Fantasy Maps

The world of science fiction and fantasy is an excellent place to find strange maps, and few are stranger than the Drenai map posted here. David Gemmell’s Drenai series has prompted a number of fans to map the world depicted in the novels. Most are rather straightforward pictures of the author’s fantasy realm. One amateur cartographer, however, decided to map the world on the basis of the Earth analogues of the various societies portrayed in the series. To do this, he has smashed together the British Isles, France, North Africa, Iberia, Mongolia, Korea, and eastern China. Such a maneuver is odd enough, but the really bizarre feature is the doubling of eastern China. Note how the southeastern subcontinent is formed by two mirror images.

Maps used in fantasy game-playing can be quite intricate and sophisticated. Cartographers working in this genre, however, can also get carried away. The political map of Tetrakon posted here is impressively large, as can be gathered from the detail that I have also posted. The map looks fairly realistic at first glance, owing in part to fractal geometry; the use of self-similar patterns allows geographical features to remain distinct as one zooms in on any particular place. The problem is that in the real world, many coastlines are relatively smooth and straight. As a result, the Tetrakon map has a jarring appearance, as all of the land/sea patterns here are much the same.


Ideological Agendas and Indo-European Origins: Master Race, Bloodthirsty Kurgans, or Proto-Hippies?

This final contribution to the Indo-European series turns once again to the potential ideological agendas lurking behind theories of IE origin and expansion. As was noted previously, no other issue in human prehistory has been so ideologically fraught; the original IE speakers have been recruited to serve a variety of fantasies, ranging in temper from naively benign to unimaginably vile. For Nazis and their ilk, the original Indo-Europeans constituted the Aryan super-race whose descendants were destined to rule the world. Followers of a certain feminist school of prehistory, in turn, have turned the “Aryan thesis” on its head, portraying the same people as the bloodthirsty “Kurgans” overrunning the peaceful, matriarchal civilization of “Old Europe” and ushering in a global age of violence and male domination. As was argued in the earlier post, it is understandable that some scholars would want to discredit all such overreaching interpretations based on the crushing might of the horse-empowered original Indo-Europeans. If it could be demonstrated that the IE languages were actually spread by Neolithic farmers slowly pushing into new areas as their numbers increased, all such troublesome theories would be effectively undermined.

Yet it is one thing to hope for such a paradigm switch and another to push it along by a purposeful manipulation of data and analysis. Doing so would be a blatantly ideological act, and hence a betrayal of science and reason. Assessing scholarly motivations, however, is a hopeless task, and we have no way of knowing whether Bouckaert et al. have intentionally selected their data and skewed their model in order to support the Anatolian thesis of IE origins. We do think that it is possible, however, that they have unconsciously let their own ideological commitments guide their research program. Our evidence here comes from two sources. First, as we have demonstrated over the past two months, both the data selection and the model construction are warped to consistently favor the Anatolian hypothesis, most egregiously by ignoring all ancient IE languages spoken in the steppe zone and by ruling out advection as a mechanism of language spread. Second, it seems likely from the comments posted on this website that distaste for the idea of violent incursions, often viewed as a necessary feature of the “steppe hypothesis,” colors the authors’ perspective. Quentin Atkinson, the article’s corresponding author, quotes Larry Trask to make this point:

Nevertheless, the vision of fierce IE warriors, riding horses and driving chariots, sweeping down on their neighbours brandishing bloody swords, has proven to be an enduring one, and scholars have found it difficult to dislodge from the popular consciousness the idea of the PIE-speakers as warlike conquerors in chariots.

Although the desire to wish away the “bloody swords” of the human past is understandable, it is also naïve, as violence unfortunately pervades our history. One does not have to embrace the vision of Thomas Hobbes, recently updated and re-theorized by Steven Pinker in his tome, The Better Angels of Our Nature: Why Violence Has Declined, to accept that this is indeed the case. I suspect that Pinker exaggerates the bloodiness of hunting-gathering societies, a charge made most forcefully by Christopher Ryan, co-author of the intriguing and controversial Sex at Dawn, yet I also suspect that Ryan descends into hyperbole of his own in emphasizing the peacefulness and sexual license of our Paleolithic ancestors. But when it comes to pre-modern agricultural societies, the evidence is overwhelming: enveloping violence was the norm almost everywhere. If one wants to rule out the possibility of bloody swords and other weapons, one would be advised to examine something other than human history.

But even if armed struggle has been pervasive for most of the past 10,000 years, it does not follow that all non-foraging societies have been equally bloody. As is always the case, different groups vary considerably on this score. If one searches the ethnographic literature, one can find a few documented tribal farming societies that shunned warfare and all of its trappings. Yet the unfortunate truth is that such groups were usually victimized by their more aggressive neighbors, and hence were seldom successful in maintaining their numbers and territories.

One of the most interesting groups of historically peaceful peoples is the Hanunó’o of the Philippines, whose social formation was described by the great American anthropologist Harold Conklin roughly a half century ago. The Hanunó’o constitute a small group (roughly 14,000) of tribal cultivators living in the southern interior portion of the lightly populated island of Mindoro. An encyclopedic treatment of Philippine ethnic groups* frames their peaceable inclinations in concise terms: “Warfare, either actual or traditional, is absent.” But the Hanunó’o were able to maintain their irenic way of life only by retreating to rugged and inaccessible areas, and even so they were periodically targeted for centuries by slave raiders from the Sulu Archipelago. Intriguingly, the Hanunó’o seem to be a remnant of what was once a much larger and more sophisticated society, as is evident from the fact that they have long enjoyed widespread literacy in their own script, an essentially unprecedented phenomenon in a small-scale, tribal society. Conflicts between Spain and the Muslim naval powers of the southern Philippines (the so-called Moros) evidently destroyed the formerly prosperous mercantile centers of Mindoro, after which remnant groups fled the bloody swords of both the Spaniards and the Moros into the inaccessible uplands. There they maintained a generally peaceful way of life, although at a fairly significant cost.

But with the exceptions of some hunter-gatherer bands and a few societies of tribal cultivators, nearly continual violence was the common lot of humanity before the contemporary era. Thus even if Indo-European languages spread into Europe and South Asia through the gradual influx of Neolithic farmers, as Bouckaert et al. argue, the process would have almost certainly been marked by generalized conflict and extensive bloodshed as the Mesolithic indigenes were dispossessed of their lands. By the same token, had the IE languages been spread by horse-riders advancing into the lands of the Neolithic farmers, as most versions of the “steppe hypothesis” contend, violence would also have accompanied the process. But would such a scenario have necessarily entailed substantially greater levels of bloodshed than the majority of such cultural “encounters” experienced over thousands of years across the globe? Equestrian warriors would certainly have had profound military advantages over horseless peoples, but that does not necessarily mean that they would have been any more savage than the human norm. It is also quite possible that IE languages spread mostly through gradual incursions supported in large part by economic or other non-military advantages. Anthropological blogger Al West, for example, surmises that the early Indo-European speakers gained power by selling horses and other goods (see below) to other peoples. Certainly the massive non-IE linguistic substrates found in such IE branches as Greek, Germanic, and Indo-Aryan indicate deep levels of cultural exchange with the indigenous inhabitants of the regions into which the early Indo-European speakers moved.

Portraying the early Indo-Europeans as a uniquely fierce or malevolent people, as some of Marija Gimbutas’s followers were inclined to do, involves more ideological projection than sound appraisal. One can certainly stress the violent nature of their social interactions, but one can just as easily place the emphasis elsewhere. In fact, one can even turn the Gimbutas thesis on its head and portray the steppe-dwelling early Indo-Europeans as gender-egalitarian precursors to the hippies of the late 20th century. Although such a portrayal strays again into the realm of fantasy, it is no less reasonable than either the Herrenvolk (“master race”) or the “demonic Kurgan” theses. As such, an inversion of the conventional framing of the original Indo-Europeans makes an interesting thought experiment, and I would ask my readers to indulge me here for a few paragraphs.

The prime evidence for “gender egalitarianism” among early Indo-Europeans derives, ironically, from the realm of war. As was mentioned in an earlier post, the Scythians, an Iranian-speaking group who maintained a largely pastoral way of life in the hypothesized IE steppe homeland, were noted for their female warriors. Herodotus famously wrote of the Amazon fighting women of the region, an observation partially confirmed by recent archeological finds; as David Anthony reports, twenty percent of the Scythian/Sarmatian “warrior graves” of the lower Don and Volga river valleys include female remains that had been dressed for battle in identical fashion to the males whose skeletons were found in the same graves. The mere presence of women warriors does not, of course, imply actual gender egalitarianism, nor does it say anything about the social relations of the actual proto-Indo-European speakers, who lived in earlier times. It does, however, indicate a significant extent of female empowerment in an important IE group that maintained an equestrian mode of life on the Pontic Steppes.

Imagining the early Indo-Europeans as proto-hippies is made possible by the group’s close association with marijuana and perhaps other psychoactive plants. Building on the works of archeologists Andrew Sherratt and David Anthony, Al West argues that, “it’s possible that proto-Indo-European speakers became rich and powerful through selling … intoxicants,” further claiming that “Indo-European-speaking people traded THC-laden hemp from the steppes all the way down into the Near Eastern cities, which were naturally a major centre for trade from all over Eurasia. … If this scenario is right, then to the people of Babylon the arrival of Indo-European speakers must have seemed like one crazy dream.”

Although West is probably off-track in suggesting that proto-Indo-European speakers were responsible for the spread of cannabis as a recreational or spiritual drug, such an association is reasonably made for the progenitors of one of the main branches of the IE family, the proto-Indo-Iranians. Evidence again comes from both Herodotus, who famously wrote of cannabis ingestion among the Scythians, and from archeological digs; Sherratt discovered charred cannabis residue in a Kurgan site dating back some 3,500 years BCE. Linguistic evidence also plays a role. The hemp plant, which produces valuable fibers and seeds in addition to its mind-altering resin, had been known across much of Eurasia for millennia, and thus had undoubtedly been referred to by many different local names. Cognates linked to the word “cannabis,” however, spread across and beyond the Indo-European-speaking realm in the third millennium BCE, which is believed by some experts to indicate that a new pharmaceutical use for the plant had been discovered and was itself expanding. Although the lines of linguistic descent are not clear, the new term for the plant, which eventually gave rise to the Latin word Cannabis, seems to have been associated with proto-Indo-Iranian steppe dwellers (see the discussions here, here, and here).

Cannabis was probably not the only mind-altering substance used by these people. Perhaps the largest mystery in the history of pharmacology is the identification of soma, the ritual intoxicant of the Rigveda, known as haoma in the Avesta (the sacred text of Zoroastrianism). More than a hundred Vedic hymns extol the unknown substance. Linguistic evidence indicates that soma/haoma was probably not cannabis, although it has been speculated that they were often consumed together. Numerous plants and fungi have been proposed as soma candidates, as spelled out in a detailed Wikipedia article. The primary division in the scholarly literature is between those who think that it was a hallucinogenic substance (such as the mushroom Amanita muscaria) and those who think that it was a stimulant, such as ephedra (also known as má huáng or “Mormon tea”). Recent research seems to be inclining in the direction of ephedra.

Regardless of its true identity, “soma” was ensconced in the Western public imagination by the publication of Aldous Huxley’s Brave New World in 1932, in which a drug called soma is used as a mechanism of social control. More recently, the name has been embraced by the hippie community of northern California. Wikipedia includes a “soma” article dedicated to a marijuana breeder of that name; the article itself notes that this particular Soma is “internationally known as a ‘Ganja Guru’ after developing award-winning cannabis strains.” I doubt very much, however, that ancient Indo-Iranian folk pharmacologists would have recognized this Soma as a kindred spirit.

The point of this excursion is not to argue that such a deeply anachronistic “proto-hippie thesis” has any merit. It is rather merely to show that making such an argument is possible. All human cultures are complex assemblages of ideas and practices, any number of which can be selected for emphasis. Especially when it comes to poorly understood cultures of the ancient past, we should be wary of any thesis that is based on any kind of essential traits.

*Ethnic Groups of Insular Southeast Asia. Volume 2: Philippines and Formosa. Edited by Frank M. LeBar. 1975. New Haven: Human Relations Area Files Press. Page 76.



The Different Modes of Language Spread

In this second-to-last post on Indo-European origins and expansion, we turn once again to language diffusion, a cornerstone of the model employed by Bouckaert et al. A previous post asked whether languages actually spread by diffusion, arguing that the much more rapid process of advection is often more important. As was then pointed out, physical geographical factors, such as impassable mountains and fertile river corridors, guided such advectional movement. Today’s post considers language movement more generally, whether conceptualized as diffusion or advection, focusing more on the social than the natural environment.

A root error of Bouckaert et al. is regarding language expansion as a singular process. Actually, it can operate in two completely different modes: sometimes a language spreads with a group of people, and sometimes it does so among different groups of people. To put it in the most schematic terms, language movement occurs when a speaker moves from place A to neighboring place B, but it can also happen when a resident of A imparts his or her language to a resident of B. One process is basically demographic, the other conversional. In geohistorical terms, both forms of language expansion have been ubiquitous. They are generally meshed together in a complex manner, but sometimes one or the other process dominates. As they differ so fundamentally, it is doubtful that they could be realistically modeled in the same manner.

The clearest case of demographic expansion occurs when a single human group arrives on an uninhabited landmass and settles it. As the population expands in numbers and spreads geographically, its language will gradually differentiate into dialects and eventually into separate languages, as sub-populations pushing into new areas become socially separate and their forms of speech drift apart. Such linguistic differentiation could be arrested and reversed by state formation or the emergence of over-arching religious or other cultural institutions, but over the long span of the human past, divergence is usually the rule.

The settlement of Madagascar some 1,500 years ago is a prime example of such virgin-land expansion. Linguistic evidence confirms that the original Austronesian-speaking settlers arrived from Borneo in the Malay Archipelago. As their descendants spread over the mini-continent, their original language differentiated into dialects, some of which are regarded by linguistic splitters as separate languages (the Ethnologue lists ten). Later streams of migrants from the African mainland enhanced the island’s genetic diversity while introducing new linguistic elements, but the newcomers always adopted the language of the original settlers. As a result, all the indigenous forms of speech on Madagascar are very closely related, and are usually classified as variants of the single Malagasy macro-language.

Examples of the opposite process of conversional language expansion are common in today’s world. The process occurs whenever parents neglect to pass on their own mother tongue to their children, in favor of the language of one of their neighboring groups. Hundreds of languages have become endangered over the past generation alone by such changes in behavior. Most disappearing American Indian languages in the United States, for example, are in danger not because their populations are dying out or because their lands are being overrun by English speakers, but rather because decisions are made by parents to raise their children as English speakers.

Such processes of language abandonment and replacement are by no means limited to the modern world. A prime ancient example comes from the Philippine archipelago. Almost all Philippine languages belong to one branch of the Austronesian family, which is almost entirely limited to the Philippines (see the map posted here). Such a pattern would seemingly indicate that the Philippines, like Madagascar, had been initially populated by a single group of settlers whose descendants subsequently spread over the archipelago as their language differentiated. But the actual demographic history of the Philippines was completely different. The original Austronesian settlers came to a land that had already been occupied for tens of thousands of years. Its indigenous* inhabitants were collectively called “Negritos” by Spanish authorities, a word meaning “small, dark-skinned people.” Their languages were undoubtedly unrelated to Austronesian, but we cannot say much beyond that. Although the Philippine indigenes have survived to this day, they abandoned their original tongues many centuries ago in favor of the Austronesian speech of the newcomers.

The social interactions between the Austronesian migrants and the indigenous inhabitants of the Philippines are poorly understood, but the key dynamics are evident. The newcomers were an agricultural people with much more highly developed technologies and forms of political integration than those held by the native foragers. The Austronesian migrants demographically overwhelmed most parts of the archipelago in short order, spreading their language(s) as well as their genes. Yet the indigenes held on in a number of rugged areas, particularly those characterized by heavy, year-round rainfall, such as the Sierra Madre Mountains of eastern Luzon** (in the winter dry season, the Sierra Madre catches rain from trade winds forced up-slope). From such redoubts, however, the indigenous foragers interacted extensively with their Austronesian neighbors, exchanging rain-forest products for agricultural and manufactured goods. Eventually, the languages of their trading partners fully “diffused” across their societies and then began to evolve in their own directions. Today, the several surviving “Negrito languages” are much more closely related to the languages of their neighbors than they are to each other. Strikingly similar processes have occurred elsewhere in the world. The most notable case is that of the “Pygmies” of central Africa, another group of diminutive, rainforest hunter-gatherers who long ago abandoned their own languages in favor of the tongues of their more numerous and powerful neighbors, in this case, languages in the Bantu sub-family of Niger-Congo.

The two cases explored above, Madagascar and northeastern Luzon, are best regarded as ends of a spectrum. Most examples of linguistic expansion involve both processes. When one language group expands it usually does so into the territory of a people speaking another language. As communication between natives and newcomers is essential, many individuals acquire a second language. Over time, such a process often leads to the linguistic conversion of the indigenous group, although the advancing group is sometimes converted instead, in which case the language frontier retreats. Such encounters are generally accompanied by some conflict, as the native inhabitants typically resent the incursions of the newcomers, who in turn often use force to advance into new lands. To the extent that the indigenes are able to resist the settlers, they will delay the linguistic expansion. The effectiveness of any such resistance in turn depends on the relative numbers of the two groups and on their levels of political and technological development. Any realistic modeling of linguistic spread must take such factors into consideration.

Patterns of physical geography play an important role here as well, as resistance by native inhabitants is usually more effective in areas of rough or otherwise difficult-to-traverse topography. In some cases, a particular climatic feature can stop language advance; the spreading Bantu-speakers, for example, encountered a firm barrier in the arid and Mediterranean climates of southwestern Africa, which precluded their farming practices and therefore created a refuge for peoples speaking Khoisan languages. Even the geometry of landmasses can play a role. As Anglo-Saxon speech spread across southern England, Celtic speakers were increasingly concentrated in the funnel-shaped peninsula of Cornwall, increasing their population density, shortening their defensive perimeter, and thereby enhancing their ability to resist the spread of English (further north, it was the rugged uplands of eastern Wales that afforded such protection). Yet again, all such features must also be taken into account by any effective attempt to model language spread.

The movement of one language group into the territory of another typically results in complex and variable linguistic interactions. Outcomes again depend heavily on relative numbers and different levels of technological and political development. When a large group of technically advanced people spreads over a landscape occupied by scant numbers of less technically advanced people, the linguistic impact can be minimal. As English advanced across Australia, for example, it picked up place names, animal designations, and words for unique landscape features (such as billabong) from Aboriginal languages, but not much more. But when two groups with more similar levels of development come into contact, much more intensive linguistic interactions typically result. Sometimes the linguistic substrates bequeathed by vanquished populations can be profound at both the grammatical and lexical levels, at other times they are of little significance, and occasionally they seem to be minor at first glance but turn out to be surprisingly important.***

When a language group moves into the lands of a different people, the initial linguistic development is often that of widespread bilingualism. If the newcomers are dominant, as they often are, the subjugated indigenes will find advantage in learning the new language, but even members of the dominant group sometimes acquire the native tongue. Gender relations typically play a crucial role here as well. Men from the more powerful group often take women from the subordinated people, insisting that their native wives learn their language. Such women do so imperfectly, often imposing upon it sounds, words, and grammatical patterns from their native tongue. When they pass down the transformed language of their husbands to their children, a certain degree of linguistic fusion results.

The preceding discussion only hints at the possible complexities involved in the linguistic interactions that occur when one language group pushes into the territory of another. Even so, it deeply challenges the diffusion model of Bouckaert et al. Rather than advancing by steady progression, an expanding language often moves forward in a spatially dispersed manner, as its speakers establish themselves as a dominant social stratum in a foreign land. Many members of the native population will learn the new language, but they will at first continue rearing their own children in their own tongue. After a number of generations of such bilingualism, most parents in the indigenous group may opt to acculturate their infants in their second languages rather than in their mother tongues. As a result, a language could “spread” almost instantaneously over fairly sizable areas. Over broader areas, however, such a process is likely to be patchy, with some areas “converting” much sooner than others.

A prime example of such uneven processes of language change comes from Anatolia. Most of the region was Greek-speaking in the 11th century when the Turkish influx began. By the 13th century most of Anatolia was firmly under Turkish rule, and by the middle of the 15th century Greek political power had vanished everywhere. Throughout this period, Turkish gradually supplanted Greek, but along both the Black Sea coast and that of the Aegean Sea, largely bilingual but primarily Greek-speaking communities persisted until the expulsions of the early 20th century. And as we saw in an earlier post, mixed “Turkish-Greek” forms of speech emerged in some areas.

A second major challenge to the diffusion model emerging from this analysis involves the unpredictability of language change when two (or more) linguistic communities come to occupy the same general territory. Although one might expect that the language of the dominant group would always prevail, that is obviously not the case: if it were, England would have switched to a Romance language after the Norman conquest, and Russia would have ended up with the North Germanic language of its Varangian rulers. Instead, England kept a Germanic tongue and Russia a Slavic one.

Interesting examples of the uncertain nature of language change after a successful invasion come from the Danubian grasslands of central and southeastern Europe. From the fourth century to the ninth century CE, this area experienced four major incursions by non-Indo-European-speaking, militarily dominant, pastoral peoples from the steppe zone to the east: those of the Huns, the Eurasian Avars, the Bulgars, and the Magyars. All four groups built empires of a sort, and all subjugated the much more numerous local inhabitants. The Huns and the Avars, however, disappeared within a century or so with little trace, linguistic or otherwise. The Bulgars, on the other hand, built a kingdom so powerful that vestiges of it survive to this day in the form of Bulgaria, but their Turkic tongue vanished long ago, failing to maintain itself in the heavily Slavic environment over which the Bulgars ruled. The Magyars, by contrast, were able to firmly establish their language, which is spoken today by roughly 15 million people, even though the Magyars themselves were a relatively small group, substantially outnumbered by the peoples that they dominated.

Could one have predicted the fates of the Hunnic, Avar, Bulgar, and Magyar languages merely from the basic facts of their migrations, conquests, and state formations? I rather doubt it, as far too many contingencies were involved over long periods and broad territories. More to the point, could any such processes be successfully modeled as instances of linguistic diffusion? Here the answer must be a definitive “no.” Of course Bouckaert et al. would object here, as they rule out all episodes involving the “rapid” spread of a single language. Yet over the past several thousand years, the rapid spread of single languages has been the stuff of linguistic history over broad segments of the terrestrial globe. If such processes are ignored, nonsense necessarily results.


*The term “indigenous” becomes problematic wherever multiple waves of settlement have impacted a particular place. The term is used here in the relative sense, referring simply to groups that predated other groups with which they are compared.

**Intriguingly, the most rugged area of northern Luzon, the Cordillera Central, did not serve as a refuge for the indigenous hunter-gatherers, as all of its recorded ethno-linguistic groups are descended from the Austronesian migrants. The Cordillera, the site of my own doctoral research, is an unusual area in many respects, as it was historically characterized by higher population densities than those found in the adjacent lowlands to the east; dense populations, in turn, necessitated the construction of some of the world’s most elaborate agricultural terraces (see the photo to the left). In all likelihood, such high population density in the mountains resulted from Spanish pressure; residents of northern Luzon who did not want to submit to Spanish rule and forced Christianization fled to the uplands, where they had to build terraces in order to survive. Prior to this influx, small numbers of “Negritos” may have lived in parts of the Cordillera.

***Intriguingly, substrate influences that seem insignificant at first glance can actually turn out to be important. For decades, linguists looked for Celtic influences on English in the wrong places and thus could not find them; even such a recent, authoritative text as Baugh and Cable’s A History of the English Language (1993) states that, “Outside of place-names the influence of Celtic upon the English language is almost negligible” (p. 85). Currently, however, many of the linguistic peculiarities of English are being attributed to the Celts. These include the do-support construction (where do is required in questions and for negation), the diphthongization of long vowels (possibly, the first push that started the chain reaction of the Great Vowel Shift), expressing possession inside noun phrases, using the same –self items for reflexives (“John cut himself”) and intensifiers (“The president himself will visit”), using the same verb forms for both causative structures (“I broke the vase”) and inchoative ones (“The vase broke”), and the it-cleft (“It was a car that he bought”).




How Large Was the Area in Which Proto-Indo-European Was Spoken?

As the current series on the origin and expansion of the Indo-European languages nears its completion, only a few remaining issues need to be discussed. Today’s post examines once again the mapping by Bouckaert et al. of the area likely occupied by the speakers of Proto-Indo-European (PIE). The focus here, however, is not on the location of this ancestral linguistic homeland, which they situate in southern Anatolia, but rather on the size of the area over which the language was supposedly spoken. The area so depicted on their maps, it turns out, is almost certainly much too large to be credible. By mapping a Neolithic language as covering almost one hundred thousand square kilometers, Bouckaert et al. demonstrate, yet again, a fundamental failure to understand the basic patterns of linguistic geography.   

Bouckaert et al. give a surprisingly precise figure for the area that their model indicates as the probable homeland of proto-Indo-European: 92,000 km2, roughly equivalent to the extent of Hungary or of the American state of Indiana (see the yellow polygon in the map to the left). But given the characteristically opaque phrasing of the authors, it is not immediately clear if this zone is supposed to represent the actual (likely) spatial extent of the PIE-speaking community, or if it is merely supposed to show the broader area in which a much more spatially restricted language group was located. One can deduce, however, that the former argument is being advanced based on the authors’ framing of the spatial hypotheses supposedly advanced by two different proponents of the steppe theory:

The areas of the hypotheses are approximately 92,000 km2 for the Anatolian hypothesis, 421,000 km2 for the narrow Steppe hypothesis, and 1,760,000 km2 for the wider Steppe hypothesis. So, these areas show a bias toward the Steppe hypothesis; the area covered by the narrow Steppe hypothesis is more than four times larger than that of the Anatolian hypothesis. Likewise, the area covered by the wider Steppe hypothesis is more then (sic) 19 times larger than that of the Anatolian hypothesis.
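
The ratios in this quotation can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python: the three area figures are the ones quoted above, while the dictionary keys and the function name are our own illustrative labels, not anything taken from Bouckaert et al.

```python
# Areas in square kilometers, as quoted above from Bouckaert et al.
# The dictionary keys are illustrative labels, not the authors' terms.
areas_km2 = {
    "anatolian": 92_000,
    "narrow_steppe": 421_000,
    "wider_steppe": 1_760_000,
}


def ratio_to_anatolian(name):
    """Return how many times larger the named area is than the Anatolian one."""
    return areas_km2[name] / areas_km2["anatolian"]


narrow_ratio = ratio_to_anatolian("narrow_steppe")
wider_ratio = ratio_to_anatolian("wider_steppe")

print(f"narrow steppe / Anatolian: {narrow_ratio:.2f}")
print(f"wider steppe / Anatolian: {wider_ratio:.2f}")
```

Both ratios fall within the ranges the authors state: the first lies between four and five, the second between 19 and 20.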

As can be seen in the map posted here, the area outlined by the “narrow Steppe hypothesis” fits precisely within the area demarcated by the “wider steppe hypothesis.” Such a depiction would not be logical if Bouckaert et al. were proposing that these “areas” were merely the proposed zones in which a more spatially restricted language had been located, as opposed to the probable zone that such a language actually covered. If the latter meaning had been intended, the “narrow Steppe hypothesis” would merely be a more precise version of the “wider Steppe hypothesis” rather than a different “hypothesis” altogether. One can thus conclude that the authors intend the yellow polygon to indicate the area over which Proto-Indo-European had been spoken, as posited by their model with the given parameters of uncertainty.


In the modern era, and to a significant extent across the past several thousand years, there is nothing unusual in a single language being spoken over a 92,000 square kilometer block of territory. But for such a situation to obtain, expansive spatial connectivity is necessary, which in turn depends on the power of the state or of some other form of social integration. In the world of Neolithic farmers, such regionally integrative institutions were almost certainly lacking, and as a result linguistic communities would have been much more spatially restricted. Such spatial limitations would have been even more pronounced in areas characterized by rough topography and formidable mountain ranges, as such barriers impede communication and thus enhance social and linguistic fragmentation. Yet as can be seen in the map posted here, Bouckaert et al. place the PIE homeland precisely in such a location. A single language spoken by tribal farmers over such a vast expanse of broken topography is all but impossible.

The situation in regard to the homeland identified by the steppe hypothesis would have been different. Under conditions of equestrian-oriented pastoral nomadism, linguistic communities could have occupied much larger territories than those found among agriculturalists living at the same time. The relatively flat topography of the steppe zone, moreover, would have allowed relatively easy communication among scattered groups. Sizable seasonal aggregations, often of a ceremonial nature, are also common under such circumstances, enhancing social solidarity over a broad expanse of land. But even given all of these considerations, the 421,000 km2 and the 1,760,000 km2 figures noted by Bouckaert et al. for the PIE homeland in two versions of the “steppe hypothesis” are still improbable. Geographically aware theorists thus tend to argue only that the original PIE homeland was situated in the western steppe zone, not over its full extent.

We cannot, of course, determine the areal extent of any prehistoric language, as the needed documentary evidence is lacking. It is tempting to associate specific languages with archeologically attested “cultures” that can be mapped, but it must be recalled that language often fails to correspond to groups defined on the basis of shared material culture; consider, for example, the “Pueblo Indians” and the Northwestern cultures of indigenous North America, both of which were highly multilingual, even at the language family level, yet substantially shared the same material cultures. Material culture, after all, is much more dependent on—and serves in part as an adaptation to—the physical environment, whereas languages seldom co-vary with physical geography; there is no way in which a certain word order pattern, or morphological type, or sound system would be more appropriate for any given landscape. All that we can do, therefore, is argue on the basis of contemporary analogues. Here we find that the areas covered by linguistic communities in those parts of the world that maintained “Neolithic” agricultural systems and forms of socio-political organization into modern times were of a restricted spatial scale. The archetypical location here is New Guinea, which is to this day characterized by pronounced linguistic fragmentation, as can be seen in the map posted here. One might object, however, on the basis that New Guinea is an extreme case and as such should not be used for comparative purposes. But in historically stateless areas elsewhere in the world, even where Neolithic technologies were superseded millennia ago, highly restricted linguistic territories remained the rule, as can be appreciated from the language map of central Nigeria posted here.* Maintaining a single language over an area as large as Hungary in such a context is highly unlikely, to say the least.

Similar objections apply to the mapping of the proto-languages of the major IE branches in Bouckaert et al. One must again consider the authors’ intentions in regard to their portrayal of these languages. It is not exactly clear, for example, what they mean by “the inferred location at the root of each subfamily is shown on the map” (see the map caption posted to the left). The “inferred location” of what? Presumably, they mean the inferred location “of the root,” and presumably “the root” refers to the proto-language that later generated each IE branch. It is still not clear, however, whether the colored areas are supposed to indicate the likely locations over which these proto-languages were spoken, or whether they merely show the probable zones in which much more spatially restricted languages were spoken. If the former scenario is indeed the case, the areas depicted are again much too large.

Of the “root languages” mapped on this figure, that of the Indo-Iranian languages is the most preposterous. The previous post specified most of the problems associated with this inferred location. The map posted here also shows the extraordinary disconnection between the existing archeological evidence and the spatial hypothesis advanced by Bouckaert et al. I would further note that the area they advance for the origin of the Indo-Iranian languages makes no sense from the standpoint of physical geography. Its western apex is located in the middle of the uninhabitable Dasht-e Kavir (Great Salt Desert), its central portion is situated in the heights of the Hindu Kush, and its eastern extremity lies in the fertile plains of Punjab. It is unthinkable that any sedentary Neolithic population would have occupied such a territory at any given point in time.

*One could, however, argue that New Guinea and central Nigeria are highly linguistically diverse in part as a function of time. Both areas have been inhabited by modern humans for a very long period. Most of Eurasia has been populated by Homo sapiens sapiens for considerably less time than West Africa, and to some extent even New Guinea (the presence of Neanderthals probably impeded the movement of modern humans into western Eurasia for millennia). As a result, one might expect somewhat greater linguistic differentiation in those places as compared to southern Anatolia. But it is also true that the Americas, which had been populated by modern humans for less time than western Eurasia, were also characterized by pronounced linguistic diversity. Significantly, agricultural areas in pre-Columbian North and South America that were not occupied by state-level societies were characterized by spatially restricted language groups.



The Consistently Incorrect Mapping of Language Differentiation in Bouckaert et al.

As mentioned in previous GeoCurrents posts, the animated map that accompanies the Science article of Bouckaert et al. depicts their model in action, showing the expansion and differentiation of the Indo-European languages in time and space. Earlier posts criticized the map’s contour shadings, which indicate high probabilities of IE languages being spoken in given areas at given times. Today’s post takes on a related issue, that of the branching lines that spread across the map as the presentation unfolds, indicating both linguistic relationship and the general directions of language-group expansion. Here we can clearly see that the model generates a nearly continuous stream of misleading information and outright error.

Analyzing the ramifying lines on the animated map is challenging. Nothing is labeled, colors are often hard to differentiate, and no key is provided. The companion website does promise a “legend for movie S1,” but provides only a brief caption: “Movie showing the expansion of the Indo-European languages through time. Contours on the map represent the 95% highest posterior density distribution of the range of Indo-European.” One must thus infer what the lines represent based on the supplementary text and on the manner in which different segments lengthen and divide in particular places as time proceeds.

Each line represents a branch of the Indo-European language family. Those that appear early in the animation indicate the deepest divisions, while those that emerge later represent the shallower splits of linguistic “sub-sub-families” and so on. In some cases, minor instances of linguistic differentiation are marked, extending down to the dialectal level. The North Germanic line, for example, begins to bifurcate on the Sweden-Norway border in the late 1700s, showing the divergence of Norwegian and Swedish, and then splits again in central Sweden in the mid 1800s, indicating differentiation that, according to the authors, produced three separate Swedish languages (see the maps below). Over most of the map, however, splits at the level of individual languages, let alone that of dialects, are not noted: if they were, the map would be so cluttered by the end as to be undecipherable. Yet again, consistency does not seem to be a priority.

The lines are not of uniform appearance. Older language stems are clearly depicted in a darker shade than more recent branches. As elsewhere, differentiating the hues employed is difficult, especially after the background color used to denote IE languages in general abruptly changes from yellowish-greens to shades of blue-green. (As a result of this problem, in some of the maps that follow I have changed the green lines under investigation to shades of red.) Interpreting differences in line shape and thickness is another challenge. Almost all lines are equally thick and even, extending uninterrupted across the map. In some instances, however, thin, irregularly shaped spurs emerge from the main stems, some of which eventually thicken and spread into new areas. Certain lines are interrupted, with unexplained gaps appearing on the map. Some of these gaps seem to indicate language divergence without diffusion, but others remain mysterious, as is the case with the differently shaped and colored line fragments that appear in what is now western Germany (see map detail to the left). By the end of the animation, Italy is covered by a jumble of oddly uneven and discontinuous lines that are almost impossible to parse, as can also be seen on the map posted here.

The spatial extension of the lines over time seemingly indicates the pace of expansion of the various IE subgroups into new territories, while the shaded contours depict the expansion of Indo-European as a whole. The two methods of showing expansion, however, do not always correspond. While the 95 percent probability contour for IE as a whole never reaches Russia (except for a tiny zone near Pskov), the East Slavic line pushes well into what is now western Russia, although it does not do so until the early 1600s. Such a depiction is of course absurd at face value, as East Slavic languages had been spoken in this area and well beyond it for many hundreds of years; it must be recalled, however, that the animated map is designed to show only the latest possible time of expansion, not the actual period in which it occurred.

The major significance of the lines, however, is not their depiction of language group expansion but rather of linguistic divergence. The authors emphasize repeatedly that their animated map depicts the locations at which linguistic differentiation occurred, which in turn generated the branching patterns of the Indo-European tree. Although they formally model such divergence as occurring at precise points, they admit that it cannot be pinpointed in such a manner:

Our phylogeographic model allows us to infer the location of ancestral langauge (sic) divergence events corresponding to the root and internal nodes of the Indo-European family tree. Since we model internal node locations as points in space, our posterior estimate for the location of divergence events can be interpreted as a composite of the range over which the ancestral language was spoken and stochastic uncertainty inherent in the model.

Regardless of the uncertainty that the model encompasses, language divergence cannot realistically be modeled as occurring through discrete events that happen in restricted places. The differentiation of languages is rather a process that often occurs over an extended period through an expansive area of related dialects (see the earlier GeoCurrents post on the “wave model”). Leaving such objections aside, however, it must still be asked whether the model of Bouckaert et al. accurately depicts the generalized locations and timings of the divergence “events” that gave rise to the different branches of the Indo-European family, allowing that they did not occur at the precise points indicated on the map, but rather merely in the general vicinity of those places. Here the answer is—yet again—an emphatic “no.” As it turns out, virtually every depiction of linguistic differentiation that can be traced by historical sources is incorrect. Considering as well the erroneous mapping of linguistic expansion given by both the extending lines and the spreading contours, the animated map can only be regarded as a vast compendium of error. It is not that it fails to get everything right, but rather that it gets virtually nothing right.

To illustrate the level of error generated by the model, I will examine in detail the depictions of the expansion and differentiation of several branches of the Indo-European family. One could do the same for all IE sub-families, but such an exercise would be unnecessarily tedious. Before beginning the exercise, a few stipulations are necessary. To begin with, the following analysis is based strictly on the animated map, ignoring material found elsewhere in the article or website, which often runs against the cartographic depiction. While the authors note in their textual supplements, for example, that West Germanic speakers arrived in Britain around 400 CE, the map delays the event for several hundred years. Yet as we have previously seen, what such a cartographic portrayal actually means is that the diffusion of Germanic languages to Britain could have occurred no later than the date indicated by the map, within the general parameters of uncertainty allowed. My point, however, is that we know from historical sources that Germanic languages definitely arrived in Britain at a much earlier period, as the authors themselves acknowledge. If the cartographic depiction of the linguistic “Germanification” of Britain is thus not simply “wrong,” it is both misleading and exceptionally trite.

The Greek and Albanian subfamilies make a good starting point, as their cartographic depiction is particularly telling. Bouckaert et al. idiosyncratically regard Greek and Albanian as together constituting a distinct IE sub-family. (Most linguists regard Albanian as an IE isolate that shares certain affinities with Balto-Slavic, Germanic, and Greek; the Science authors classify it with Greek most likely on the basis of borrowed words, as the two languages have been in intimate contact for millennia.) Their animated map depicts the ancestral Albano-Hellenic group as arriving on the eastern shores of the Greek Peninsula circa 3000 BCE, and then differentiating into the Greek and Albanian branches around 1500 BCE. Greek then pushes southward into Attica (the Athens area), while Albanian moves to the west into Thessaly in what is now central-eastern Greece. Subsequently, virtual stasis ensues for a few thousand years, with no significant movement of either branch and no further linguistic differentiation. Motion finally kicks in during the thirteenth century CE, when Albanian experiences a “divergence event” in central Greece and begins expanding to the west and north. By the 1500s, the northern Albanian branch finally reaches what is now Albania. At about the same time, the southern Albanian line begins a several-hundred-year maritime phase during which it diffuses across the waters of the Adriatic, finally reaching southern Italy in the 1800s.

The actual geo-histories of the Greek and Albanian languages are completely unlike the fantasy version advanced by the model. As it would again be wearisome to recount all of the many errors involved, I will focus instead on explaining why their depiction is so spectacularly wrong. As is generally true, the erroneous portrayals of these two language groups were predetermined by the error-pocked initial map of language distribution, ancient and modern, that informs the mathematical model. As was discussed in earlier posts, Illyrian, the likely progenitor of Albanian, is ignored, Ancient Greek is absurdly shown as limited to Attica, Albanian is unreasonably divided into four languages, and the areas occupied by Albanian-speaking communities in southern Greece are grotesquely exaggerated while those of Albania itself are absurdly reduced. As garbage is fed into the equations, garbage not surprisingly comes out.

The depiction of the Balto-Slavic languages is risible as well. This language sub-family is portrayed as branching off the main western IE stem circa 3000 BCE in the northern Danubian basin, and then as heading northward over the Carpathian Mountains into what is now central Poland. A small gap emerges on this line circa 950 BCE roughly along the Carpathian crest, which might indicate the Slavic languages differentiating from the Baltic ones. The Baltic line then continues to move northward, although it does not reach Lithuania until the fifth century of the Common Era. A Slavic spur, meanwhile, clearly emerges at roughly 300 BCE, again in the Carpathian Mountains, and begins to creep slowly southward in the early centuries of the Common Era. Diffusing back across the Danubian Basin, it reaches what is now Croatia in the 600s CE. By 900, it has extended as far south as Macedonia, at which point it breaks into several segments. East Slavic emerges out of the same Carpathian hub in the mid 900s CE, and then heads in a northeasterly direction; a hundred years later, West Slavic makes its appearance, branching off from roughly the same location. By the early 1600s, West Slavic has moved westward along the modern Czech-Polish border, approaching what is now eastern Germany. Over a hundred years later, it finally reaches the area now occupied by the Lusatian (Sorbian) speakers. Meanwhile, the East Slavic branch generates three smaller branches circa 1600 in the area where modern Poland, Ukraine, and Belarus converge; these twigs presumably represent Ukrainian, Polish, and Belarusian, which Bouckaert et al. (and no one else) regard as forming a minor Slavic sub-family.

Everything that we know about the historical evolution and distribution of the Slavic languages directly contradicts the mapping of Bouckaert et al., as we should now come to expect. As it would again be tiresome to specify all of these errors, I will note only a few of the more glaring examples. First, it has long been established that the Slavic languages had expanded westward all the way to the Elbe River in what is now central northern Germany in the immediate post-Roman period, entering the lands that had been essentially abandoned by the Germanic tribes that invaded the dying Western Roman Empire. It is also understood that the process of Drang nach Osten in the high medieval period resulted in the re-Germanization of the far western Slavic lands, extending as far east as Silesia and Pomerania. The Lusatian-speaking areas, however, resisted this tide, and thus long remained as Slavic enclaves in a Germanic sea. Silesia and Pomerania, however, were in turn “re-Slavicized” after the post-WWII expulsions of German-speakers. The modeled spread of the South Slavic languages is equally off base. It is also well known that Slavic languages pushed southward into Greece beginning in the 500s and especially during the chaotic aftermath of the Byzantine coup of 602, reaching the central Peloponnesus by the end of the century. As Byzantine power collapsed through most of the peninsula, the Greek language retreated to coastal enclaves. The re-Hellenization of the Greek Peninsula did not begin until the reign of the Empress Irene in the late 700s, and was never fully completed. In regard to the East Slavic branch, numerous absurdities have been discussed in previous posts, and hence will not be recounted here.

Perhaps the most amusing depictions concern the expansion of Insular North Germanic, a minor branch that today includes only Icelandic and Faroese. Recall that Bouckaert et al. model the spread of languages over water the same way that they model it over land, only at a much slower pace (with the exception of their “sailor [sub-] model,” which postulates equal rates of expansion over water and land). But they always take expansion over any surface as a gradual, diffusional process; recall that instances of “rapid” expansion are purposively ignored, although the pace required for such a designation is never specified. The expansion of North Germanic languages to the islands of the North Atlantic is thus modeled as an example of conventional diffusion across isotropic space. The animated map thus shows the language group spreading out of northern Denmark in the 700s and heading into the North Sea. Some two hundred years later, these languages are portrayed as reaching the Faroe Islands, and by the mid 1000s they are shown as having finally landed on Iceland.

The only way to make sense out of such mapping is to imagine the speakers of these languages as living at sea on boats that remained relatively stationary over the course of many years, gradually diffusing to the north as the decades passed. The authors, I am almost certain, would object to this characterization, noting that their mapping of Insular North Germanic expansion is not actually meant to depict what it actually does depict (“the language could have arrived any time earlier than the date at which our model shows it as arriving”). The fact remains, however, that the ancestral language of Icelandic, Old Norse, arrived in Iceland by way of a few voyages that lasted weeks, not months or years, let alone centuries. This relatively well-attested process was intentional, can be dated relatively precisely to the late 800s, and is known to have been initiated largely by men from what is now Norway, although most of their wives/female-slaves were Irish (see, e.g., Bryan Sykes’ Saxons, Vikings, and Celts: The Genetic Roots of Britain and Ireland). By the explicit criteria specified by the authors, such a “rapid expansion of a single language” should have been ignored. But regardless of how such particular instances are handled, it is clear that if one insists on modeling the spread of languages to distant islands by a process of diffusion, nonsense necessarily results.

Finally, the portrayal of the Romance languages is equally ludicrous. The history of this group is particularly well known, as the spread and differentiation of the various Romance languages, all descended from Latin, occurred in relatively recent times and has been thoroughly documented in written sources, many of which Bouckaert et al. reference in their supplementary materials. Latin spread rapidly with the armies and administrative hierarchies of the Roman Empire, and its expansion is hence discounted by the model. As Latin expanded, it began to differentiate, a process that began well before the establishment of the Empire; as noted in a previous GeoCurrents post, a non-IE substrate on Sardinia evidently resulted in significant divergences from standard Latin on the island during the Republican period. Elsewhere, various vernacular forms of speech began to diverge under Roman rule, a process that accelerated after the fall of the Western Empire in the fifth century. The result was the establishment of a widespread Romance dialect continuum that eventually gave way, although never completely, to the standardized national languages of the modern era.

Now consider the manner in which Bouckaert et al. model the spread of the Romance languages. As they do not consider the initial expansion of Latin, they keep the Romance branch confined to central Italy until the fall of the Western Empire. As the empire weakens in the third century, new branches seem to emerge and begin to diffuse in this Italian heartland, although the color scheme leaves some doubt about this process (see the map call-outs). Romance languages clearly emerge in the following century, and by the early 600s one branch finally makes its way to what is now southern France, whereas another has extended to the middle of the Adriatic Sea. Three hundred years later, the western branch reaches the Pyrenees. In the twelfth century, another “divergence event” produces the group that encompasses French and Walloon; beginning along the Mediterranean coast, this division does not reach central France until the 1600s. The Iberian branch, however, is even more delayed, not reaching Portugal until the late 1800s. At about the same time, another Romance sub-family finally makes its landfall in Sardinia.

I anticipate that if the authors were to respond to such criticisms, they would charge me with engaging in a naively literal reading of their animated map. Language divergence “events” along a branching pattern of linguistic differentiation, they might insist, have to be mapped as if they took place at a single location, when in actuality the model supposes only that they took place somewhere within the much larger areas in which the given parent languages were spoken. Such an objection would be fair enough, but it still does not hold water if the actual differentiation processes took place hundreds of miles away from the areas indicated on their maps. In actuality, French emerged out of the Germanic-influenced “Vulgar” Latin dialect(s) of the Paris Basin, and subsequently spread outward, due in large part to the power and prestige of Paris and the French state. Significantly, it did not diffuse outward in an even manner, but rather spread to cities and towns well before it penetrated the countryside. French also expanded more slowly where it encountered markedly different dialects/languages, and where other Romance dialects had already established their own prestige registers. Yet again, the issue is not that Bouckaert et al. make a few mistakes and that we are unwilling to tolerate error, as has been charged. The issue is rather that their model gets just about everything wrong, often spectacularly so.



Sykes, Bryan (2007) Saxons, Vikings, and Celts: The Genetic Roots of Britain and Ireland. W. W. Norton & Company.


Linguistic Phylogenies Are Not the Same as Biological Phylogenies

(Note: This post is jointly written by Martin Lewis and Asya Pereltsvaig)

A key assumption of Bouckaert et al. is that the diversification and spread of languages operates so similarly to the diversification and spread of biological organisms that the two processes can successfully be modeled in the same manner. The parallels between organic and linguistic evolution are indeed pronounced. Both processes entail replicating codes that continually change, giving rise to novel varieties that increasingly differ from their progenitors over time. As a result, “phylogenetic trees,” showing descent from common ancestors, are a common feature of both evolutionary biology and linguistics.

But despite their similarities, organic evolution and linguistic evolution are in many ways highly dissimilar. Encoding information for communication is not the same as encoding information that generates life: language is vastly more fluid and complex than the genetic code; individual languages are much less clearly differentiated from each other than are species; and language is a social phenomenon, given to influences largely irrelevant for biological evolution. The key differences can be summarized as follows: biological evolution is unconstrained but governed by natural selection (any mutation can happen, but which mutations remain in the pool depends in large part on natural selection), whereas linguistic variation (seen in terms of deep grammatical properties) is constrained by a system of parameters but is not subject to natural selection. As a result, the branching trees of linguistic descent are merely analogous to the phylogenetic diagrams of biological evolution, and do not indicate the same kind of relationships.

Although organic evolution operates through a much more restricted set of message-carrying units than does human language, it nonetheless produces diversity at a much deeper level. Given the biological constraints of the human brain/mind (as of yet less than fully understood), there are only so many ways in which any given language can be structured. To be sure, the number of possible human languages, both extant and extinct, as well as those that may arise in the future, is vast, but all human languages appear to be “variations on a theme,” guided by the same parameters. Some languages have as few as two vowels (Ubykh, Northwest Caucasian) and others as few as six consonants (Rotokas, North Bougainville); other languages may have as many as 20 vowels (e.g. the Taa language, spoken in Botswana and Namibia, is reported by some sources to have as many as 20 or even 30 vowels, depending on analysis) and as many as 84 consonants (as in Ubykh; the Taa language is reported to have 87 consonants under one analysis, 164 under another). But crucially, all languages differentiate vowels from consonants and use both. Some languages put verbs before subjects and objects, while others place them at the ends of sentences, but all languages have verbs, subjects and objects.* Some languages can build sentence-long words packed with numerous prefixes, infixes, or suffixes, while others use stand-alone, stripped-down words to do the grammatical work of expressing tense, number etc., but all languages make words from morphemes, and all construct sentences. As a result of this limited space of possibilities, completely unrelated languages evolving on their own often come to share major grammatical traits.

Linguistic evolution, unlike that of the biological realm, moves at a rapid clip. In non-literate societies, words change so quickly that after some five to eight thousand years not enough cognates can be traced back to establish linguistic relatedness. In the same time span, grammatical structures can undergo wholesale transformations, and sound inventories can change drastically as well. As a result, even clearly related languages can have next to nothing in common with each other, and can only be linked through investigations into their ancestors. Hindi and English, two of the three most widely spoken Indo-European languages, are dissimilar in almost every respect.** On casual inspection, Hindi would seem to have more in common with the non-Indo-European languages of the Indian sub-continent than it does with English.

Thus, relatedness at the family level and overall linguistic similarity often fail to correspond. Maps showing major language patterns typically bear little if any resemblance to maps depicting linguistic families. Even something as seemingly basic as word order correlates poorly with lines of descent. For example, Indo-European languages can be SVO (subject-verb-object; marked by red dots on the map to the left), such as English, Romance, and most Slavic languages (but Sorbian, a Slavic language, is SOV); SOV (marked by blue dots), such as the Indo-Iranian languages (yet Kashmiri is SVO); or VSO (marked by yellow dots), such as the Insular Celtic languages (yet Cornish is SVO). Some other families, such as Austronesian, have an even greater variability in the basic word order:  Niuean is VSO, Malagasy is VOS, Rotuman is SVO, and Tuvaluan is OVS.

Similarly, features of morphological typology (how words are formed from morphemes) often cross-cut connections established by common descent. Whereas Proto-Indo-European, like most of its daughters, was a synthetic language (building words from multiple non-root morphemes), English and Afrikaans are relatively analytical (with low ratios of morphemes to words), which gives them a certain affinity with Mandarin Chinese (a highly analytical language). As discussed in an earlier GeoCurrents post, isolating languages are found in Africa (Hausa, an Afroasiatic language), Asia (Vietnamese, Austroasiatic), Oceania (Rapanui, Austronesian), and the Americas (Kipea, Kiriri). In phonology as well, similar patterns obtain, as sound inventories often fail to show systematic correspondences with language families. The Indo-European languages of South Asia, for example, are in many respects more phonologically similar to the Dravidian languages of the same region than they are to most other IE languages. One of the characteristic phonological markers of the region, the rich inventory of retroflex consonants, is also scattered across the rest of the world, found in about 20 percent of all languages belonging to a wide variety of families.

One of the best ways to appreciate the relative insignificance of language families in regard to the global distribution of such features is to explore the maps that can be generated on the WALS website, such as the one reproduced above. Few if any of these maps bear much resemblance to the familiar depiction of the world’s major language families.

Again, the contrast with biological evolution is stark. The farther removed organisms are from each other on the tree of life, the fewer genes they necessarily share. Even when convergent evolution results in similarities between distantly related organisms, the parallels are relatively superficial. As a result, modern genetic inquiry can establish precise levels of biological relatedness, a process that has revolutionized taxonomy over the past few decades. In the biological realm, moreover, the farther one moves up different branches of evolutionary descent, the more distinctive the organisms found along them generally become. Chordates (the phylum that includes vertebrates) share a distant common ancestor with echinoderms (sea stars and their relatives), and some tunicates, primitive members of phylum Chordata, might be mistaken by unschooled observers for sea lilies in phylum Echinodermata. (Tunicates more generally look like unrelated jellyfish and other cnidarians; a few could be mistaken for rocks, but such rocks disconcertingly bleed when cut open.) But no one would ever mistake any mammal for a sand dollar, a sea cucumber, or any other echinoderm, animals characterized by radial rather than bilateral symmetry. The two phyla have simply evolved in strikingly different directions. If linguistic evolution worked in the same manner, it is questionable whether translation between distant languages would even be possible. Moreover, the disparate patterns of spatial distribution of deep grammatical properties, such as the ones illustrated by the WALS maps, would not be found.

In language, deep grammatical properties can radically change, often taking on the same forms as those encountered in wholly unrelated tongues. As a result, linguistic relationships are often anything but obvious, and can only be discerned through intensive study; significantly, such hidden connections can hold true even for relatively recently emerged languages. A fluent speaker of the major Germanic languages, for example, might be nonplused to learn that Frisian is more closely related to English than it is to Dutch. Yet according to some specialists, even Low German is “phylogenetically” closer to English than it is to (High) German, even though Low German is generally regarded as a mere dialect (or group of dialects) of German!

Linguistic evolution is only vaguely analogous to organic evolution for a variety of reasons, but a crucial factor is the fact that vastly less sharing occurs across biological lineages. We now know that genes can jump from one species to another, but the process is relatively rare; in this realm, change generally occurs as a result of random mutations acted upon by natural selection, not from the borrowing of elements from other species. When it comes to languages, however, sharing is ubiquitous. Languages are almost always borrowing words, and sometimes they adopt grammatical properties of other languages as well. At times, two completely unrelated languages essentially merge to create a hybrid tongue. To be sure, linguists are almost always able to determine which language contributed more elements and more basic structures, and hence should count as the parent tongue. (It should be noted that the use of the terms “parent” and “daughter” in relation to languages is misleading since, unlike in the biological realm, where individual organisms are discrete, the transition from “parent” to “daughter” language is always gradual.) When it comes to creole languages, however, such determinations are not always easy. In regard to grammar, different creoles of completely different parentage are often more similar to each other than they are to any of their source languages. In some instances of mixed languages, admixtures of vocabulary, grammar, and phonology run so deep that linguists abandon the quest for unambiguous classification. Cappadocian Greek, for example, is slotted by the Wikipedia into the seemingly impossible “Greek-Turkish” language family. Does Indo-European therefore encompass this language? 
Other sources, such as the Ethnologue, place this language in the Greek branch of the Indo-European family, but Turkish influences on Cappadocian Greek are pronounced: it has certain sounds that have been borrowed from Turkish, as well as vowel harmony; it has developed agglutinative inflectional morphology and lost (some) grammatical gender distinctions; and its basic word order is SOV. And Cappadocian Greek is by no means the only example of such a thoroughly “mixed language.” In the biological realm, in contrast, such mixtures are so obviously impossible that they have generated their own nonsense genre, as exemplified by Sara Ball’s delightful flip-book, Crocguphant.

Linguistic family trees must therefore be taken as often showing lines of partial descent, unlike the phylogenetic diagrams of organic evolution. To gain a more complete understanding of linguistic relatedness, it is necessary to complement language families with other kinds of connections. The various languages of a Sprachbund, or a linguistic convergence area, for example, derive from different families, yet nonetheless come to share many features through long histories of mutual interaction. One must also consider linguistic strata, which take into account the influences imposed by one language on another. The role of a linguistic substratum, derived from a previously existing language that was later supplanted by another tongue, can be profound. In many cases, such linguistic substrates were instrumental in generating subfamilies; the Germanic languages, for example, are distinct from other Indo-European languages not merely because they drifted in their own particular direction, but also because they acquired a major substrate from another (unknown) language family. Sometimes, the ghostly presence of a long extinct language or language family can be detected through such substrates. Vedic Sanskrit, for example, was definitely an Indo-European language, but it was influenced not only by the preexisting Dravidian and Munda languages of the Indian subcontinent, but also by an unknown substrate dubbed “Language X” by Colin Masica.

A useful alternative to the linguistic tree is the so-called wave model, or Wellentheorie, originally devised to explain some of the characteristics of the Germanic languages that seemed to defy the phylogenetic approach. In wave theory, fluid dialect continua replace the stable, geographically bounded languages required by models predicated on direct descent from ancestral tongues. Here, innovations can occur at any point within a dialect continuum; such changes then spread outward in a circular manner, eventually dissipating as the distance from the innovation center increases.*** If a bundle of innovations substantially overlap and become entrenched, a new dialect, or even language, can be said to have emerged. But according to wave theory, such a “language” is still best viewed as an “impermanent collection of features at the intersections of multiple circles.”

Wave theory does recognize, however, that a single language or dialect can appropriate an entire dialect continuum, subordinating more localized speech forms and eventually driving them into extinction, as indeed was the case with Standard German over most of Germany. Such a process generally requires the power of the state or of some other overarching institution. Geographically expansive and culturally potent organizations of this kind are a feature of the relatively recent past; for most of humankind’s existence, the institutions necessary for producing linguistic standardization over broad areas were lacking. We are so used to the modern world of mass communication over vast distances and of language-standardizing governments and educational systems that we easily forget that in earlier times, and in many remote areas to this day, different linguistic environments prevailed. Overall, we suspect that for most of human history, the wave theory more accurately captures the process of language change than does the standard phylogenetic model. Yet in the most general terms, the two models complement each other relatively well.

*Debate does rage, however, about whether the so-called “non-configurational languages,” such as the Australian language Warlpiri, have subjects and objects in the same sense as the more familiar “configurational” languages like English or French. The reader is referred to Baker (2001) for evidence of subject-object asymmetries in such non-configurational languages.

**For example, Hindi makes a phonemic distinction between aspirated and unaspirated voiced stops, has fusional case/number morphology, subject-object-verb word order, postpositions, and uses the ergative-absolutive alignment in the preterite and perfect tenses; English, in contrast, has no aspirated voiced stops (and does not use aspiration phonemically at all), has largely abandoned fusional morphology, has lost the case system except with pronouns, employs a subject-verb-object word order, uses prepositions rather than postpositions, and is characterized by nominative-accusative alignment.

***Ironically, the diffusion analogy of Bouckaert et al. may be best suited to describing dialectal continua rather than divergence and expansion of languages and language families; we shall return to this point in a forthcoming post.



Baker, Mark C. (2001) The Natures of Nonconfigurationality. In Mark Baltin and Chris Collins (eds.) The Handbook of Contemporary Syntactic Theory. Oxford: Blackwell. Pp. 407-438.



Two More Weeks of Indo-European Linguistics

Dear Readers,

We have received a few complaints that GeoCurrents is focusing almost exclusively on a specific controversy in historical linguistics, and that as a result it is ignoring other issues of interest and concern. We will readily admit that we have been somewhat obsessive of late, as we find this particular issue deeply fascinating and highly significant from an academic perspective, although maddening as well. But we are approaching the end of the current series. We currently anticipate two more weeks of posts on Indo-European linguistics, after which the site will revert to its more typically varied and eclectic range of topics. But that does not mean that we will be abandoning this particular issue. On December 13th, Asya and I will be delivering a talk at Stanford University on the “Mis-modeling of Indo-European Languages,” which will be co-sponsored by the Stanford Linguistics Department and the Stanford programs in the History of Science and World History. We will subsequently decide whether we should try to publish our work on this subject or merely archive it in a special series on the GeoCurrents site.


103 Errors in Mapping Indo-European Languages in Bouckaert et al. Concluded: Part V, Western Europe

By now, all of the cartographic failings of Bouckaert et al. have become familiar. On the map of France and neighboring areas, for example, we see the unreasonable elevation of minor dialects to the status of discrete languages (three forms of Breton make the list), the replacement of a non-Indo-European language with an Indo-European language (the Basque region is shown as French speaking), the improper use of political boundaries as linguistic boundaries (French is not shown as extending into Switzerland), the preferential classification of dialects as languages when they are associated with states (Walloon counts as a language, unlike the other equally distinctive langues d’oïl of northern France or the langues d’oc of southern France; Flemish counts as a language, unlike other equally distinctive forms of Dutch), and the simple geographical misplacement of languages (Romansh is placed in northwestern Italy rather than southern Switzerland). Of particular note in regard to the linguistic mapping of France is the fact that Corsica is completely obliterated by circle #48 (see the map of the Italian Peninsula in the previous post).


The mapping of the Iberian Peninsula is particularly simplistic. The authors have simply placed Portuguese in Portugal and Spanish, along with Catalan, in Spain. The fact that Galician in northwestern Spain is closer to Portuguese than to Spanish is ignored, and the Basque-speaking region is mapped as if it were Spanish speaking. The Balearic Islands are also neglected, as archipelagoes generally are in the authors’ land-biased approach.


The map of the British Isles severely misconstrues the Celtic tongues. Irish, for example, is shown as extending across all of the Republic of Ireland and as entirely absent from Northern Ireland. In actuality, Irish has long been largely limited to the western margin of the island, and as late as the early 20th century was still spoken in parts of what was to become the political unit of Northern Ireland. The mapping here, in other words, is yet again political rather than linguistic. By the same token, Welsh is placed in the coal-mining districts of southern Wales where it has been absent for generations, just as Cornish is depicted in areas where it has not been spoken for hundreds of years. The mapping of Scottish Gaelic is not bad, but the term used, “Scots Gaelic,” is off the mark. The proper term is “Scottish Gaelic,” as “Scots” refers to a different language altogether. Scots, or Lowland Scots, is usually regarded as a highly distinctive form of English, but some linguists regard it as a language in its own right (CNN has recently reported on the demise of one of its dialects).*




The mapping of extinct languages is also poorly executed. Old English is essentially restricted to the historical kingdom of Wessex, even though the language extended as far north as the Edinburgh region of what is now southeastern Scotland, and included dialects of Kent, Mercia, and Northumbria. Significantly, even the Wessex (West Saxon) dialect of Old English extended farther to the east than what Bouckaert et al. would allow for Old English in its entirety.

The language map of Bouckaert et al. that I have criticized over these past five posts is a cornerstone of their model, yet it is also wholly inadequate for the task. Many of the errors found here ramify through all of the maps that they have produced. But even if a serviceable map had been constructed, the model would still yield nonsense, as most of the assumptions upon which it is based are unwarranted, as we shall see in more detail in subsequent posts.

*Although I am no expert on this topic, I would argue that Lowland Scots is almost but not quite interintelligible with Standard English, especially in its spoken form, and thus deserves to be regarded as a separate language. Although I love the poetry of Robert Burns, I generally need translation. Take, for example, these verses from “Auld Lang Syne”:

In the Original Scots:

We twa hae run about the braes,

and pu’d the gowans fine;

But we’ve wander’d mony a weary fit,

sin auld lang syne.


We twa hae paidl’d i’ the burn,

frae morning sun till dine;

But seas between us braid hae roar’d

sin auld lang syne.


In Standard English:

We two have run about the slopes,

and picked the daisies fine;

But we’ve wandered many a weary foot,

since long long ago.


We two have paddled in the stream,

from morning sun till dinner time;

But seas between us broad have roared

since long long ago.


Or listen to the delightful poem “To a Mouse” on the ScotsIndependent website:


Wee, sleekit, cow’rin, tim’rous beastie,

O, what a panic’s in thy breastie!

Thou need na start awa sae hasty

Wi bickering brattle!

I wad be laith to rin an’ chase thee,

Wi’ murdering pattle.



103 Errors in Mapping Indo-European Languages in Bouckaert et al., Part IV (Central Europe)

(Continued) The main problems with the language map of eastern Central Europe in Bouckaert et al. have already been discussed; to wit, the depiction of “national” languages as coterminous with state boundaries. The authors do occasionally deviate from this norm, showing, for example, a tiny non-Romanian area in northwestern Romania. Note also that they show Latvian as failing to reach Latvia’s northwestern coast. This view is indeed historically accurate, as northern Courland was the land of the Livonians, a Finnic-speaking people. The last native speaker of Livonian, however, died in 2009; for decades before that, Livonian was severely endangered and most speakers were bilingual in Latvian or Russian. If the map purports to depict the present situation, it is flatly wrong here. If it depicts the relatively recent past, as it does for some areas, it is more on target. Unfortunately, no time specification is provided.

Such unspecified chronology is a more intractable problem for the depiction of extinct languages. Major languages of the distant past often experienced major geographical changes, sometimes literally moving en masse when their speakers migrated. The Goths, for example, probably originated in what is now Sweden, later crossed the Baltic into northern Central Europe, subsequently moved into the steppes north and northwest of the Black Sea, and eventually spread with victorious warrior bands over much of the Roman Empire; the final redoubt of the language was the Crimean Peninsula, where it persisted until the ninth century and perhaps until early modern times. Any Gothic language polygon would thus fit a specific place only at a specific time. Bouckaert et al. have apparently selected the period just after the movement of Gothic out of Scandinavia, although the area specified does not seem to match what (little) is known about the early relocation of the language (see the map to the left).

As mentioned in the previous post, the placing of Byelorussian (Belarusian) in a small corner of the Czech Republic is a careless transcription error. But the intended depiction, that of Eastern Czech, is still off base. Czech is not heavily differentiated into dialects. The truly distinctive forms of the language are halfway to Polish. Cieszyn Silesian and other Lach dialects are regarded by most Czech linguists as Polish-influenced forms of Czech and by most Polish linguists as Czech-influenced forms of Polish (politics do tend to intrude into linguistic discussions). Such dialects, however, are not on the map. What is (supposed to be) shown is “Eastern Czech,” placed in a small corner in the southeastern part of the Czech Republic. It is unclear what this designation refers to. Across the entire eastern half of the republic, one finds the Moravian dialect (or dialects), which is not strikingly different from standard Czech.

The linguistic depiction of the Italian Peninsula in Bouckaert et al. contains some curious features. This portion of the map is difficult to decipher, as extinct languages overlay extant languages, and much of the area is covered by the circular labels. It is still clear, however, that the mapping here remains inconsistent. Italian is shown as extending neither into the Po Valley in the north nor to Sicily in the south. Fair enough: the local dialects spoken (or spoken until recently) in those areas are markedly different from Standard Italian, which is based on the Tuscan dialect. Yet the authors place other parts of the peninsula with equally distinctive dialects, such as Apulia in the southeast, in the Italian language category. In regard to the extinct Indo-European languages mapped here, the major issue is why only Umbrian and Oscan were selected to accompany Latin.




Most of the problems found on the map of Germany and environs have already been discussed. Note, for example, how Luxembourgish makes the cut on political grounds, whereas other distinctive German dialects are ignored. Of special note here is the demarcation of two Lusatian (or Sorbian) languages, although only one is labeled on this map segment. These Slavic tongues of eastern Germany are distinctive, and mapping them as separate languages makes linguistic sense. But it is difficult to understand why these relatively minor languages, with 40,000 and 10,000 speakers respectively, have been added to the tally, whereas Iranian and Indic I-E languages with hundreds of thousands to tens of millions of speakers have been ignored.

The language mapping of Scandinavia shows, yet again, striking geopolitical influence. Here we have Danish blanketing Denmark, Riksmål (or the Norwegian “national language”) everywhere in Norway except the islands and Finnmark, and three separate Swedish languages covering all of Sweden except the islands, which remain unmarked. The straight east-west line that separates two supposedly distinct Swedish languages is a curious and highly unlikely feature.

But as one would expect, the continental Scandinavian languages do not actually correspond so well to national territories. Overall, the region is characterized by a dialect continuum so pronounced that some scholars regard all of the mainland North Germanic tongues as a single, regionally differentiated language. Swedish and Danish are almost interintelligible, and Norwegian is often regarded as a kind of bridge: as a common saying puts it, “Norwegian is Danish spoken in Swedish.” (Norwegian vocabulary is similar to that of Danish, whereas its phonology is more like that of Swedish.) But it is more complicated than that, as there is no single Norwegian language at any level. Local dialects cross the border with Sweden, but even in terms of official state recognition, Bokmål (“book language”) competes with Nynorsk (“New Norwegian”), and neither of these two variants is exactly the same as the standardized but non-official Riksmål (“national language”) and Høgnorsk (“High Norwegian”) forms. The differences between Bokmål and Nynorsk are not purely lexical (e.g. Bokmål pike ‘girl’ vs. Nynorsk jente ‘girl’), but concern grammatical patterns too (e.g. Bokmål does not distinguish masculine and feminine genders, whereas Nynorsk does). In a sense, the differences between Bokmål and Nynorsk are more pronounced than those between Bokmål and Danish (e.g. the Danish word for ‘girl’ is pige, and most dialects of Danish and its standardized form do not distinguish masculine and feminine genders). The contention among these different language varieties is at once political, cultural, and historical, tied up with Norway’s former subordination to Denmark. Norwegian linguistic nationalists have often wanted to purge specifically Danish elements from the language, whereas linguistic traditionalists would like to preserve them.

Legacies of geopolitical change are also evident in the Scania region of southern Sweden. The dialects of Sweden’s far south are close to those of Denmark; so close, in fact, that some scholars place them within an “East Danish” category. Significantly, Scania was part of the Kingdom of Denmark until it was lost to the rising power of Sweden in 1658; it did not become an integral part of Sweden, however, until 1719, at which point a policy of linguistic “Swedenization” was initiated. “Eastern Danish” is thus considered by some to be a more historical than a linguistic category.

One of the oddest features of the mapping strategies employed by Bouckaert et al. is their reluctance to include islands within the territories of any language. In some cases, island groups are appended to mainland polygons, as can be seen here in the depiction of Danish (in the same manner, the Hebrides are mapped as Scottish-Gaelic speaking). Most often, however, islands and archipelagos are simply ignored, as one can see here in the cases of Norway’s Lofoten and Sweden’s Gotland and Öland. Had Gotland been considered, I wonder whether it would have been mapped as Gutnish speaking. Gutnish, a disappearing dialect, is distinctive, and is sometimes said to be a direct descendant of ancient Gothic.

The mapping of Old Norse as coinciding with Iceland is also untenable. When Old Norse was spoken in Iceland, it was also spoken in Norway, Sweden, Denmark, northern Scotland, and pockets of the western British Isles.



103 Errors in Mapping Indo-European Languages in Bouckaert et al., Part III: From Western Russia to the Balkan Peninsula

(Continued) The most glaring error in the linguistic map of western Russia and environs by Bouckaert et al. concerns the labeling of Belarus. The number “22,” placed in the center of the country, is listed as signifying the “Czech E,” which presumably means “eastern Czech.” As the authors have correspondingly appended the label “Byelorussian” to a small area in the eastern Czech Republic, the error is obviously one of transposition. Such mistakes can occur inadvertently, although the fact that it has gone undetected indicates a troubling failure to engage in routine proofreading.

A much deeper problem is indicated by the intentional mapping. Note how the polygons indicating the Belarusian and Ukrainian languages correspond precisely to the present-day territories of Belarus and Ukraine respectively. Such exact political-linguistic correspondence is rare, and when it is encountered it generally indicates a recent history of state-led linguistic repression or ethnic cleansing, which should be taken into account in any historical consideration of linguistic geography. In the case of Belarus and Ukraine, however, the current distribution of the national languages does not even come close to fitting precisely within the geographical bodies of the respective countries.

Belarusian is widely spoken in Belarus but it is not the country’s majority language and it is dominant only in the west and the south, as can be seen on the Wikipedia map posted here. Even in these areas, Belarusian is losing ground among the young, and is thus classified as a “threatened language.” The threat stems from Russian, which, according to the 2009 national census, is spoken at home by 72 percent of the people of Belarus. Identifying the Belarusian language with the national territory of Belarus is—yet again—a political rather than a linguistic statement.

Placing the Ukrainian language precisely within the territorial bounds of Ukraine is an even more egregious error. The fact that eastern Ukraine and the Crimean Peninsula are mostly Russian-speaking areas is well known, as it is mentioned almost every time that Ukrainian elections are discussed. According to the Constitution of the Autonomous Republic of Crimea, Russian rather than Ukrainian serves as the “language of interethnic communication.” Moreover, government duties in Crimea are fulfilled mainly in Russian; hence it is a de facto official language. The issue of whether Russian should be made co-official in other areas of Eastern and Southern Ukraine that are already de facto Russian-speaking is hotly debated at the parliamentary level. Before WWII, moreover, the linguistic map of the region was far more complex than it is now, an observation that holds true for most of eastern and central Europe. The southern Crimea, for example, was then dominated by people speaking Crimean Tatar, a language in the Turkic family.

The depiction of European Russia is little better. In this case, political boundaries are not slavishly followed, as large areas of northern Russia are correctly shown as non-Russian speaking. But many northern regions that are Russian-speaking, such as Saint Petersburg, are oddly excluded from the realm. Conversely, sizable areas in eastern European Russia are mapped as Russian-speaking when in actuality they are inhabited by peoples speaking Uralic and Turkic languages. It is admittedly difficult to map such languages as (Volga) Tatar, Mari, and Udmurt, as they are not spoken in geographically contiguous areas but rather form archipelagos in a Russian sea. But do such technical challenges warrant the exclusion of such languages? More than six million citizens of the Russian Federation speak Tatar as their first language, and mapping them as if they were Russian speakers fails to give them the recognition that they deserve. The Udmurt language, spoken by about half a million people, was recently propelled into the public spotlight in Russia and in the rest of Europe when a band of Udmurt-speaking (and -singing) grandmothers won second place at the Eurovision Song Contest.

Such mapping difficulties are by no means limited to western Russia. In many parts of the Indo-European realm, languages are interspersed, forming complex amalgams. As mentioned above, such mixtures were much more intricate before the horrors of the Second World War and its immediate aftermath. Depicting such areas as linguistically uniform, as Bouckaert et al. routinely do, thus results in intrinsic distortions. Such distortions, moreover, seem to be a necessary feature of their basic methodology, as they depict every language within a discrete and uniform polygon. Linking together languages whose speakers are scattered in separate communities over large areas into single bounded spaces results in such absurdities as the gerrymandered Kurdistan mentioned in the previous post.

Such procrustean tendencies reach a laughable extreme in the depiction of the Romani language (that of the so-called Gypsies), seen on the map of the Balkans posted to the left. Romani, labeled 74, is impossible to locate precisely, as the area indicated is covered by circle 16 in western Bulgaria. Presumably, a small, discrete Romani polygon lies below this numerical tag. To restrict the Romani language to this area is beyond absurd. Romani, like the Roma people who (sometimes) speak it, is dispersed over most of Europe. Bouckaert et al., however, do not even manage to adequately locate the language’s center of gravity, as far more people speak Romani in Romania than in Bulgaria. Mapping Romani is, of course, an extraordinarily difficult task, as the linguistic community is not only scattered widely, but its members often relocate. As a result, most cartographers simply indicate the numbers and percentages of Romani speakers (or Roma people more generally) found in different countries.

The rest of the map is not much better. Although the authors differentiate four separate Albanian languages, they depict the northern half of Albania as non-Albanian speaking. They also limit Serbo-Croatian to Serbia and Montenegro, excluding Croatia and Bosnia. Here the categories used and the map itself fail to correspond; what the map shows is the political-linguistic construct of Serbian (plus Montenegrin), used since the break-up of Yugoslavia, whereas the label turns back to the Yugoslavian idea of a single Serbo-Croatian language, which also encompasses Bosnian and Croatian. From a linguistic standpoint, Serbo-Croatian works best, as all of its politically standardized forms are mutually intelligible to some degree. But by the same token, Bulgarian and Macedonian, shown here as separate languages, are similarly interintelligible. The underlying problem here is the lack of uniformity in the treatment of different languages: if they have four Albanian languages as well as separate languages in Bulgaria and Macedonia, they should have separated Serbian, Croatian, Bosnian, and Montenegrin, or, better still, they should have differentiated the non-political dialectal divisions of Serbo-Croatian: Chakavian, Kajkavian, Western Shtokavian, Eastern Shtokavian, and Torlakian.

Finally, the mapping of Greek, both ancient and modern, is bizarrely idiosyncratic.  On what possible basis could the authors limit ancient Greek to Athens and its vicinity? The implicit argument here is that only Attic Greek was Greek, with the other Hellenic polities speaking non-Greek languages, a nonsensical idea. And yet they don’t even manage to map Attic Greek properly, leaving out the islands on which it was spoken. One can only conclude that the authors are incompetent at mapping languages, a cornerstone of their approach.



103 Errors in Mapping Indo-European Languages in Bouckaert et al., Part II: from Afghanistan to Anatolia

(Continued) Moving westward, the linguistic mapping of Iran and environs by Bouckaert et al. contains roughly the same density of error as that of South Asia. As most of these mistakes are noted in map call-outs, and others have been discussed in previous posts, I will focus here on the authors’ misperceptions about the Persian language.

The authors have divided Persian into two languages, labeled “Persian List” and “Tadzik” (a non-standard spelling of “Tajik”). Linguists, however, generally agree that Persian is a single language, albeit one with ten or so dialects, three of which serve as standard literary forms. These three official varieties are labeled Western Persian (or Farsi), found primarily in Iran, Eastern Persian (or Dari), spoken mostly in central and northern Afghanistan, and Tajik Persian (or Tajiki), located in Tajikistan and Uzbekistan. One would have to take an extreme “splitting” position to regard Farsi and Tajik Persian as separate languages. As the Wikipedia notes, “Persian-speaking peoples of Iran, Afghanistan, and Tajikistan can understand one another with a relatively high degree of mutual intelligibility, give or take minor differences in vocabulary, pronunciation, and grammar—much in the same relationship as shared between British and American English.” (It is also significant that the Tajiks historically call their tongue Zabani Farsī.) And if separating Farsi and Tajiki is problematic, ignoring Dari Persian, spoken by 15-18 million people, is absurd. Doing so sunders the geographically contiguous Persian zone into two widely separated language zones.*

The most glaring blunder on the map of Anatolia and environs concerns the delineation of Kurdish. Here the main problem is the opposite of the one encountered in regard to Persian: several clearly separate languages are lumped together. By strictly linguistic criteria, Kurdish is a subfamily of related tongues. As the Wikipedia puts it, “Kurdish is not a unified standard language but a discursive construct of languages spoken by ethnic Kurds, referring to a group of speech varieties that are not necessarily mutually intelligible …” Kurdish proper is itself divided into two (or three) languages: Kurmanji, Sorani, and, sometimes, Kermanshahi. Philip G. Kreyenbroek, cited in the Wikipedia article referred to above, claims that, “From a linguistic or at least a grammatical point of view … Kurmanji and Sorani differ as much from each other as English and German.” The idea of a single Kurdish language is once again a political construct, albeit one based not on an actual political unit, but rather on the aspirations of most Kurdish people for a state rooted in trans-linguistic ethnic solidarity.

Not only do Bouckaert et al. elide the distinction between these Kurdish languages, they also subsume another language into the same category. The language in question is Zazaki (1.5-2.5 million speakers), located in the central part of eastern Turkey. The Zaza people are usually considered by others, and often by themselves, as members of the wider Kurdish ethnic formation, but their language is quite distinctive. It is most closely related to Gorani, spoken in Iran to the south of the Kurdish zone, but it also bears affinity with Talysh, another Iranian language ignored by Bouckaert et al.

Not only are the Kurdish languages misclassified, they are also inaccurately mapped. The Kurdish polygon of Bouckaert et al. is truly peculiar, as it excludes the southern part of the Kurdish region (most of the Sorani-speaking zone) while including a western extension into mostly non-Kurdish-speaking areas. Its longer eastern “panhandle” pushes far enough to take in the Kurdish areas in northeastern Iran, but in the process includes non-Kurdish areas along the Caspian Sea and in the Alborz (Elburz) Mountains. Such a fanciful depiction brings to mind the infamous “Gerry-Mander” of U.S. political history. If oriented conventionally, with north at the top, Bouckaert’s gerrymandered Kurdistan reminds me of a lounging rodent; if tilted on its side, it looks more like a galloping dinosaur.

I have also posted an excellent map of the ancient Anatolian languages, which makes a nice contrast to the simplistic depiction of these tongues in Bouckaert et al.

*As a final note on this map, western Afghanistan, a mixed Dari- and Pashto-speaking area, seems to contain an unlabeled polygon for a modern Indo-European language, which I have marked with a question mark.

