Controversies over Ethnicity, Affirmative Action, and Economic Development in Malaysia

Few issues are more controversial in Malaysia than the country’s National Development Policy, particularly its extensive “affirmative action” provisions that provide economic and social advantages for the majority (61%) indigenous population (“Bumiputeras”) at the expense of the Chinese and Indian communities. Dating back to the early 1970s, this policy has resulted in significant economic gains for the Malay community, but at some economic price for the country as a whole, and at more significant costs for Malaysians of Chinese and Indian background. Although Muslim Malays (and all Malays are automatically registered as Muslims in Malaysia) have been the main beneficiaries, other Bumiputeras (“sons of the soil”), such as the non-Malay-speaking, non-Muslim indigenous peoples of Sarawak and Sabah in northern Borneo, receive the same benefits. Who exactly qualifies as a Bumiputera, however, can be a complicated and controversial matter, as the governing laws vary from state to state. In 2009, for example, a local debate erupted in Sarawak when a local girl was unable to attend college after being denied Bumiputera status because her mother is Chinese, even though her father is an Iban indigene. In neighboring Sabah state, such a mixed-race individual would in theory be granted Bumiputera status.

At the national level, controversies surrounding the policy intensified this week after Malaysian Prime Minister Najib Razak announced new measures focused on providing business training and affordable housing for the country’s Bumiputera majority, even though he had previously vowed to substantially reform the affirmative action policy. As reported in a recent Malaysian news article, “analysts said the announcement, made live on national television, aimed to rally Najib’s United Malays National Organisation (UMNO) ahead of party elections but could hurt an already slowing economy.” Although the Malaysian economy has in general performed well over the past several years, it is currently confronting a marked slowdown apparently caused by “current-account deterioration, fiscal balance deterioration, [and] higher leverage.” During periods of economic distress, Malaysia’s affirmative action program tends to provoke heightened controversy, as is currently the case.

Economic production in Malaysia, as in most developing countries, is quite geographically uneven, as can be seen in the first map. Per capita GDP ranges from US$ 2,694 in Kelantan to US$ 18,218 in Kuala Lumpur. The remainder of today’s post considers whether such discrepancies match Malaysia’s geography of ethnicity.

As the second map reveals, high levels of economic development (as measured by the admittedly crude yardstick of per capita GDP) are found in and around the Kuala Lumpur area in western peninsular Malaysia, although resource-rich Sarawak in northern Borneo falls into the same general category. The country’s least developed states, in contrast, are found in the northern reaches of peninsular Malaysia. As the third map shows, northern peninsular Malaysia also has a high percentage of Bumiputeras. Pahang in the central part of the peninsula and Sabah and Sarawak in northern Borneo, however, also have large Bumiputera majorities, yet Sarawak in particular has a high level of per capita economic development. Yet as the next map shows, the Bumiputera population of Sabah and Sarawak is quite distinct from that of the peninsula, as it is heavily composed of non-Muslim indigenous peoples. Malaysia’s Chinese community, against whom the country’s affirmative action programs are largely directed, is particularly well represented in Kuala Lumpur, the country’s economic core. Large Chinese minorities are also found in Johor and Perak, states of middling economic performance. Perak also has a large Indian population, as can be seen in the final map.

Perhaps the most interesting thing revealed by these maps is the ethnic contrast between Malaysia’s two capitals: Kuala Lumpur and Putrajaya. Kuala Lumpur, the seat of parliament and by far the largest and wealthiest city in the country, is ethnically diverse, with a Bumiputera minority and an especially large Chinese community. The new city of Putrajaya (population 68,000), Malaysia’s administrative capital since 1999, is in contrast almost entirely Malay and hence almost entirely Muslim (97.4 percent of its population follows Islam according to the 2010 census). Intriguingly, Putrajaya’s lack of ethnic diversity is seldom noted in the literature on the city. Most sources stress instead its planned development, parks, and spacious living accommodations.

Discrepancies in Mapping Persian/Farsi in Iran

GeoCurrents is deeply concerned with language mapping, as we find maps of language distribution to be highly useful and, if done properly, aesthetically appealing. But we also tend to be critical of linguistic cartography, as the spatial patterning of language is often too complex to be easily captured in maps. Dialect continua, zones of pervasive bilingualism, overlapping lingua francas, areas of linguistic interspersion, urban/rural language discrepancies, and mobile language communities all present major challenges for the mapmaker. Differences in population density are another tricky issue. Should one map a virtually unpopulated area in the same manner that one depicts a densely populated zone? And if one decides to leave uninhabited (or mostly uninhabited) areas unmarked, how large and how unpopulated do they have to be before they appear on the map?

As a result of these and other issues, linguistic maps, whether of a particular place, an individual language, or a language family, often vary greatly from one cartographer to another. Such differences were recently brought home as I examined various language maps of Iran, many of which are readily available on the internet. In particular, the area covered by Farsi/Persian, the national language, differs significantly. I therefore decided to overlay these different depictions of Persian/Farsi on a uniform base map of Iranian provinces so that they can be easily compared. Eleven such maps are posted here, both in their original form and with the Persian/Farsi zone extracted and placed on the common base map. The overlays are not particularly precise, owing largely to differences in map projection; a large amount of tedious handwork was necessary to make them accord as closely as they do with the originals. It is also important to note that the original maps themselves vary in regard to the area depicted. Some show merely Iran, but others include neighboring countries as well. The overlay maps, however, show only Iran.

The maps are arranged in rough descending order, with the first map showing the largest expanse of Persian/Farsi, and the last map showing the smallest one.

Farsi Language Map1

Depiction 1. The first map is by far the simplest, as it shows Iran as uniformly Persian-speaking. Such a depiction is accurate in one sense, as Persian/Farsi is the national language, and hence is used for official purposes throughout Iran. It also serves as the lingua franca of those parts of the country in which it is not the dominant mother tongue. The source map for Depiction 1, however, is problematic, as it purports to show the overall distribution of Persian, yet it does so entirely on the basis of national boundaries. Depicting Afghanistan and especially Uzbekistan as uniformly Persian-speaking is far from accurate.










Depiction 2. The source map for Depiction 2, found in the Wikipedia Commons, is oddly titled “iranethnics,” implying that it is concerned with ethnic identity rather than language per se. All of the categories mapped, however, are rooted in language, although the term “Fars” (the name of a province and, more generally, a region) is used rather than “Farsi.” In purely linguistic terms, “Fars” refers to a series of Persian dialects that are quite distinct from standard Farsi. As one Wikipedia article puts it: “Northwestern Fars is one of the Central Iranian varieties of Iran. Its name is purely geographical: It is not particularly close to Farsi (Persian), but rather to Sivandi.” The Wikipedia’s family tree of Iranian languages treats Fars as a distinct minor language, with some 100,000 speakers. On the source map for Depiction 2, however, all languages in the Iranian family are subsumed under the “Fars” category except Kurdish and Baluchi. Linguistically, this maneuver makes little sense, as the Iranian languages of northern Iran, such as Gilaki and Talysh, are more closely related to Kurdish than they are to Persian/Farsi. But it is also true that Gilaki- and Talysh-speakers tend to be much less ethnically distinct from Persians than the Kurds. Finally, this map restricts the extent of several minority languages, particularly Arabic, more than many other language maps of Iran.


Depiction 3. The base map used for Depiction 3, also found in the Wikipedia, depicts the various languages of the Iranian family, both in Iran and neighboring countries. As non-Iranian languages such as Arabic and Azeri are not depicted, areas in which they are spoken are generally mapped as Persian speaking (“Persan,” on the French map) or at least as partly Persian speaking* (as in the case of the Azeri-speaking area). The Caspian languages (Gilaki, Mazandarani, etc.) are depicted, but only in the Alborz (Elburz) Mountains; the Caspian coast is instead shown as Persian speaking, a somewhat unusual depiction. The base map is also distinctive in elevating the Mukri dialect to the status of a separate language (even the Ethnologue, which tends to split languages, treats it as a mere dialect), and in depicting a sizable “Sangesar” area in the mountains of northern Iran. Yet according to the Wikipedia, the Sangsari language has only 36,000 speakers and is largely limited to the town of Sang-e Sar** (Mehdishahr), located south of the Alborz Mountains in Semnan Province. Related tongues in the Semnani branch of Iranian languages have similarly restricted distributions.

Depiction 4. The base map used for Depiction 4, found on the website of a Farsi translation service, is crude and politically compromised, as it incorrectly depicts the distribution of several languages as coincident with provincial boundaries. It incorrectly labels Azeri as “Turkish” and Balochi as “Pashto.” (In contrast to Turkish and Azeri, which are closely related, Balochi and Pashto are only distantly related, as they are members of distinct branches within the Iranian family.) It also unconventionally classifies the dialects of Farsi spoken in Khorasan as Dari, a term generally limited to Persian as found in neighboring Afghanistan. But the boundary between Farsi proper and Dari (both forms of Persian) is difficult to draw. As the Wikipedia explains:

 The dialects of Dari spoken in Northern, Central and Eastern Afghanistan, for example in Kabul, Mazar, and Badakhshan, have distinct features compared to Iranian Persian. However, the dialect of Dari spoken in Western Afghanistan stands in between the Afghan and Iranian Persian. For instance, the Herati dialect shares vocabulary and phonology with both Dari and Iranian Persian. Likewise, the dialect of Persian in Eastern Iran, for instance in Mashhad, is quite similar to the Herati dialect of Afghanistan.

Depiction 5. The base map used for Depiction 5, found on a website devoted to Iranian languages, is similar to that of Depiction 3, although it shows a more limited distribution of Persian.

Iranian Languages Map2







Farsi Language Map6






Depiction 6. The base map used for Depiction 6, found in the Wikipedia, is labeled “Languages of Iran.” This map shows a relatively limited distribution of Persian, barely depicting it as reaching the sea. It also shows much larger than usual Arabic- and “Lorish”-speaking areas. It subsumes Mazanderani and the Semnani languages into the “Tabari” category, although according to most analyses Mazanderani is closer to Gilaki (mapped here as a separate language) than it is to the Semnani tongues. (Significantly, the people of Mazandaran call their own tongue “Gileki.”) Oddly, the Qashqai Turkic area in Fars Province is missing.







Farsi Language Map 7


Depiction 7. The base map used for Depiction 7 is found on the “Maps of Net” website and is based on Ethnologue cartography. This map also restricts the distribution of Farsi; again it barely reaches the sea, but it does so in a different place than that indicated on Depiction 6. This map shows much larger than usual areas covered by Azeri (“Azerbaijani” here), Arabic, and “Balouchi.” It also incorrectly portrays the northeastern Kurdish area as Turkic, labeling it “Khorasani Turks” and coloring it as if it were “Azerbaijani.” The extent of the Qashqai Turkic area in Fars province seems surprisingly large. Perhaps the oddest feature of this map is its exaggeration of the area covered by the southernmost Luri dialect, a very minor tongue by most accounts, and its elevation of this dialect to the status of a separate language (designated here as “Lari” to distinguish it from the “Lori” language of the north). This map also shows one uninhabited area, the Dasht-e Kavir (salt desert), in north-central Iran.

Depiction 8. The base map used for Depiction 8 is found in a Wikipedia article on Iranian languages. It shows large areas in central Iran as non-Persian speaking; presumably most of these areas are excluded by virtue of being largely uninhabited rather than by speaking a different language, but the mapping conventions make it impossible to be certain. This map also shows a much larger than usual distribution of the Balochi language, in several discontinuous patches, in northeastern Iran. As in Depiction 3, the Caspian lowland is depicted as Persian speaking.






Depiction 9. The base map used for Depiction 9 is found on yet another Wikipedia page. It leaves large “sparsely populated” areas in eastern and central Iran blank, thus restricting the distribution of Farsi/Persian. It depicts Lur as a separate language, but divides it into two separate areas, mapping the central Luri zone as Persian speaking. It depicts a sizable area along the Afghan border as “other,” which would presumably refer to Pashto.







Farsi Language Map 10



Depiction 10. The base map used for Depiction 10 comes from a Wikipedia map of ethnicity in Iran, although its categories are again based largely on linguistic criteria. This map shows sizable uninhabited areas in east-central Iran, a not uncommon maneuver, but also does the same in southeastern Iran, an uncommon move (also found in the base map for Depiction 9). Again like Depiction 9, this map portrays the central Luri areas, but not the northern and southern ones, as Persian-speaking. It depicts a highly restricted Arabic zone in both Khuzestan Province and farther south along the coast.







Farsi Language Map 11


Depiction 11. The base map used for Depiction 11 comes from an older version of the language map of Iran posted on the Gulf 2000 site, which features the extraordinarily detailed cartography of Mike Izady. This map leaves large areas of sparse population unmarked, and hence restricts the distribution of Persian more than the other maps considered here. It makes several other unusual maneuvers. Luri is mapped as a dialect of Persian, yet the Raji dialect of central Iran is elevated to the status of a separate language. The Minabi dialect of the southeast, described by the Wikipedia as “a dialect which is something between Bandari and Balochi and Persian,” is also mapped as a separate language, and a small Cushitic-speaking zone (labeled “Somali, etc.”) is depicted in the same general area. The extent of Tati, closely related to Talysh, is much greater than in any other language map of Iran that I have investigated.

Iran Languages Izady Map









I am not qualified to assess which of these maps is the most accurate, and I hesitate to say whether such an assessment can even be made. I welcome feedback from readers on these and other issues pertaining to these maps.

*Note: for all depictions, areas shown as mixed between Farsi/Persian and some other language are left unmarked.


**This small city has an interesting recent history. According to the Wikipedia, “The primary religious belief in the area now is Shi‘ite Islam, but before the Islamic Revolution, there were many Bahá’ís in Sangsar, who had to migrate from the city after the revolution, due to a wide range of persecutions. As for other towns of Iran, the name has thus been changed by the Islamic authorities into Mahdishahr as if to signal its imposed pure Muslim identity. Mahdi is the Shia Muslim hidden Imam and Shahr means town in Persian, so Mahdishahr literally means town of Mahdi.”



Xinjiang, China: Ethnicity and Economic Development

An impressive map of China’s per capita GDP by prefecture, reposted here, appeared in late 2012 on the website Skyscraper City, posted by user “Chrissib” Cicerone. According to the map, the two poorest parts of China are in southern Gansu province, an area demographically dominated by Han Chinese, and in southwestern Xinjiang, an area demographically dominated by Uighurs, a Turkic-speaking, Sunni Muslim people.

As noted in the previous GeoNote, the level of economic development in Xinjiang as a whole is slightly lower than the Chinese average, as measured by per capita GDP. But as Chrissib’s map shows, Xinjiang shows striking disparities in its own regional economic patterns. As a comparison of a detail of his map with a Wikipedia map of ethnicity in Xinjiang shows, areas dominated by Han Chinese have much higher levels of economic productivity than those dominated by Uighurs. Also essential to note is the fact that the Han Chinese domination of eastern Xinjiang largely stems from relatively recent immigration to the region, a process much resented by Uighur activists.

China is currently seeking to enhance the economic development of Xinjiang, along with the country’s other western regions. But as Preeti Bhattacharji explains in a recent article published by the Council on Foreign Relations, the project faces a number of ethnic issues:

Xinjiang’s wealth hinges on its vast mineral and oil deposits. In the early 1990s, Beijing decided to spur Xinjiang’s growth by creating special economic zones, subsidizing local cotton farmers, and overhauling its tax system. In August 1991, the Xinjiang government launched the Tarim Basin Project to increase agricultural output. During this period, Beijing invested in the region’s infrastructure, building massive projects like the Tarim Desert Highway and a rail link to western Xinjiang. In a 2000 article for the China Journal, Nicholas Bequelin of Human Rights Watch said these projects were designed to literally “bind Xinjiang more closely to the rest of the PRC.”


Ethnic tension is fanned by economic disparity: not only are the Han-dominated areas more productive, but the Han individuals tend to be wealthier than the Uighurs in Xinjiang. Some experts say the wage gap is the result of discriminatory hiring practices. The CECC reported in 2006 that the XPCC [Xinjiang Production and Construction Corps] reserved approximately 800 of 840 civil servant job openings for Han. This policy was changed in 2011, however, and the XPCC “left almost all positions unreserved by ethnicity.” But the 2011 CECC says both government and private sectors had discriminatory hiring practices against the Uighurs and also denied them religious rights such as observing Ramadan and allowing Muslim men to wear beards and women to wear veils.


Politics and Ethnicity in Ecuador and Bolivia: Twins or Opposites?

On the surface, Ecuador and Bolivia exhibit close political similarities. Both countries are led by popular presidents who pursue leftist agendas, taking on multinational corporations, enacting land redistribution, and opposing U.S. interests. In Ecuador, incumbent president Rafael Correa just won an overwhelming victory, besting second-place finisher Guillermo Lasso by a 34 percent margin. In the most recent Bolivian general election (2009), Evo Morales, the socialist incumbent, won a similarly resounding victory, beating challenger Manfred Reyes Villa by a 38 percent margin. The two countries also evince ethno-geographical similarities. Both feature a highland core area populated by large numbers of indigenous Andean peoples, Quechua-speaking in Ecuador and both Quechua- and Aymara-speaking in Bolivia; both also contain expansive lowlands occupied mostly by Spanish-speakers of mixed (Mestizo) and European ancestry, and both have scattered groups of Amazonian Indians, several of which occupy substantial territories and are politically organized.

If one digs a bit deeper, however, the ethno-political situations in the two countries turn out to be quite different. To begin with, indigenous people form a clear majority in Bolivia (55 percent), but a definite minority in Ecuador (25 percent). In Ecuador, Correa gets most of his support from the Spanish-speaking majority, and is opposed by most indigenous groups, whereas in Bolivia, Morales is supported by the Andean Indio majority, but generally opposed by both the Spanish-speaking and Amazonian Indian populations. The two presidents also take different approaches to governing. Correa, who holds a PhD in economics from the University of Illinois, is a technocrat, and is poorly connected with social movements. Morales, in contrast, was the former leader of the coca-pickers union, and is closely connected with grassroots politics.

In Ecuador’s recent election, Correa won a mandate, easily beating his rivals. But his victory was not as overwhelming as it might seem once one considers the divisions that weakened the opposition. As can be seen on the first map, from Electoral Politics, Correa won a plurality of votes almost everywhere. What is less apparent is the fact that he won a majority of votes in less than half of Ecuador’s territory. To show this pattern, I have remapped the election data at the provincial level, distinguishing the provinces that gave Correa less than half of their votes from those that gave him a majority. On the same map, I have drawn in rough boundaries separating the Pacific coast region, the Andean region, and the Amazonian region. As can be seen, Correa is very popular in the coastal zone but relatively unpopular in Amazonia. The situation in the Andes is mixed. Here some areas supported him fairly strongly, while others favored the various candidates of the opposition.
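For readers who want to reproduce this kind of remapping, the majority-versus-plurality distinction boils down to a simple threshold test on each province's vote share. What follows is only a minimal sketch in Python, not the procedure actually used to build the map. The function names are my own illustrative choices, the 25 percent figure for Napo is the one cited in this post, and "Province X" with its share is a hypothetical placeholder rather than a real result.

```python
from typing import Dict


def classify_share(share_percent: float) -> str:
    """Label a province by the leading candidate's vote share (0-100).

    A share strictly above 50 percent counts as a majority; anything
    else (including exactly 50) is treated as "no majority".
    """
    return "majority" if share_percent > 50.0 else "no majority"


def classify_provinces(shares: Dict[str, float]) -> Dict[str, str]:
    """Apply the threshold test to every province in the input mapping."""
    return {name: classify_share(share) for name, share in shares.items()}


# Napo's 25 percent comes from the post; "Province X" is a made-up
# placeholder used only to show the other branch of the test.
example_shares = {"Napo": 25.0, "Province X": 62.0}
labels = classify_provinces(example_shares)

for province, label in labels.items():
    print(f"{province}: {label}")
```

Treating exactly 50 percent as "no majority" is a design choice made here for definiteness; with real returns the tie case is unlikely to matter.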

The opposition candidates were themselves a highly mixed group, preventing the formation of a united front. Three of them took a plurality of votes in several districts:  Guillermo Lasso, of the center-right; Lucio Gutiérrez, a centrist or center-left candidate; and Alberto Acosta, a Marxist. Only Lasso, however, took a significant share of the votes nationwide. Álvaro Noboa, a right-wing populist banana magnate, took more votes than the Marxist candidate Acosta, as did Mauricio Rodas, but neither won the contest in any district, and hence they do not appear on the first map.

The ethnic correlations in the election are fairly clear. In general, Correa did extremely well in predominantly Spanish-speaking areas, and not so well in highly indigenous zones. To make this pattern visible, I have superimposed a map of Ecuador’s indigenous languages on the depiction of its recent election, following the Ethnologue’s linguistic mapping. As can be seen, most of the indigenous areas, both those in the highlands and those in the lowlands, favored the opposition. The Marxist candidate Acosta did particularly well among the Shuar and other Jivaroan-speaking peoples of the center-east, rainforest tribes that are relatively well organized and have experienced pronounced environmental stress. In general, however, the indigenous groups divided their votes widely among the oppositional candidates. The incumbent Correa performed most poorly in Napo (population 80,000), where he received only 25 percent of the vote. As the Wikipedia describes the province:

The thick rainforest [of Napo] is home to many natives that remain isolated by preference, descendents of those who fled the Spanish invasion in the Andes, and the Incas years before. As of 2000, the province was the sole remaining majority-indigenous province of Ecuador, with 56.3% of the province either claiming indigenous identity or speaking an indigenous language.

In the highlands proper, Correa’s worst showing was in Bolívar province, where he received roughly a third of the vote. As the Wikipedia describes Guaranda, the provincial capital:

Since the 1990s, the indigenous majority has seized political power and most of the local elected officers are of Quechua origin. The city has 25,000 inhabitants (2005) and is growing. It suffers severe problems of electricity and water supply.

Correa’s lack of popularity with indigenous groups is usually attributed to his support for the mining and oil-extraction industries, bulwarks of the Ecuadoran economy. As Freedom House notes:

Indigenous people continue to suffer discrimination at many levels of society. In the Amazon region, indigenous groups have attempted to win a share of oil revenues and a voice in decisions on natural resources and development. The government has maintained that it will not hand indigenous groups a veto on core matters of national interest.

For a left-populist perspective on these issues, one can turn to a recent CounterPunch interview with Marc Becker, an associate professor of Latin American Studies at Truman State University. Becker characterizes Correa as a technocratic proponent of “extractivist modernization” who has little understanding or appreciation of social movements. He further argues that racial discrimination undergirds the voting patterns. As he puts it:

[M]y colleagues in Ecuador say that since Correa was elected there’s been an increase of racist incidents and racism in general.  …  The indigenous movements are so well-organized that they gain political space that exceeds their numerical representation in the country.  In Ecuador there is a certain amount of resentment by non-indigenous of indigenous for gaining political power and political space.  It appears that Correa plays the race card and plays into the latent racist attitudes of the dominant population of white and mestizo people.  You see this in Correa’s rhetoric.  He says very nasty things about indigenous people, environmentalists, and what he terms ‘ultra-leftists’.

In Bolivia, in contrast, the presidency is held by an indigenous person, Evo Morales, a member of the Aymara group. His main base of support comes from the Andean Indio communities. To show this pattern, I have superimposed a map of Bolivia’s indigenous language groups (again following the Ethnologue) on an electoral map (on this map, a “yes” vote on the Constitutional Referendum of 2009 indicates support for Morales). As can be seen, Morales is not popular in the Spanish-speaking lowland economic hub around Santa Cruz. Nor does he receive much support from the Amazonian indigenous communities, several of which have protested vociferously against his road-building and resource extraction policies.

As the political left has gained power across much of Latin America, conflicts between governments and indigenous groups have not abated, and in many areas they have intensified. As Simeon Tegel put it in the telling headline of his 2011 article: Left Vs. Indigenous of Latin America: Once allies, the two have clashed over environmental concerns.

Geographical Patterns in the 2013 Swiss Election, Part I

A three-part referendum held in Switzerland in early March received minimal press attention. Some media reports noted the passage of a measure to restrict executive compensation, but the family policy initiative was virtually ignored, as was the one on land-use planning. Today’s post briefly considers the family policy issue, whereas tomorrow’s will look at the executive compensation measure.

The Swiss election guide description of the family policy measure is not very specific:

Do you want the federal order of 15 June 2012 taken on family policy? The federal order would add an amendment to the federal constitution to require the federal government to take account of the needs of the family when performing its duties, and to work with the cantons to promote balance between family and work and to create more day-care facilities to complement schools.

Even though the measure sounds rather indistinct, it provoked strong reactions, with some parts of the country overwhelmingly favoring it, and others strongly opposing it. The general patterns are clear. I have modified the Electoral Politics map of the election results to highlight them. As can be seen on the first map, the French- and Italian-speaking areas of the country in general strongly favored the initiative, which received 54 percent of the vote nationwide. The more rural parts of the German-speaking zone, as well as the Romansh-speaking areas, opposed it. Such patterns would probably be even more clear-cut if the map showed voting behavior below the canton level. I would not be surprised, for example, if the French-speaking part of Bern, Bernese Jura, actually voted for the measure, although the map would seemingly indicate that it voted against it. By the same token, I would not be surprised if the eastern, German-speaking portion of Valais actually voted against it.

Generalizations can also be made about the areas that voted strongly against the measure. The core “no” area in the center of the country corresponds closely with the original nucleus of the Swiss state in the 14th century. The measure was most overwhelmingly rejected, however, in Appenzell Innerrhoden, a northeastern canton. Appenzell Innerrhoden is strongly conservative on social issues, not having given women the right to vote on local issues until 1991. The Wikipedia article on the canton includes some interesting information:

Somewhat before the early 2000s, the idyllic countryside of Appenzell Innerrhoden apparently became popular with nudists, and at the 2009 Landsgemeinde the canton’s residents voted to prohibit naked hiking. Violators would be fined. However nudists who appealed against their fines to the federal court have been reimbursed by the local authorities, as nudism is not a crime under Swiss federal law which takes precedence. It is common for cars rented in Switzerland to be registered in Appenzell Innerrhoden, and thus having license plates starting with “AI”, because of the reduced tax on cars in this canton.



North American English Dialects: Bad Map – Or Fantastic Map?

An internet search of “bad map” returns, among many other examples, Rick Aschmann’s map of North American English Dialects, reproduced here. Critics complain that the map is so busy and complicated as to be almost unreadable. But what the map lacks in grace and style, it makes up for in detail. On Aschmann’s own website, the map is large and interactive: if one clicks on the green dots representing selected cities, one is taken to videos giving pronunciation samples. The website also contains a great deal of textual information, and is updated regularly, with the most recent addition dated March 13, 2012. For those interested in dialectology, the map and website are veritable treasure troves. Inset maps of local areas, such as New Orleans with its three dialect zones, are particularly interesting, although I do wonder if the New York metropolitan area could have been more finely divided. Another intriguing inset map shows a limited dialect zone in far eastern North Carolina, called here “Down East & Outer Banks.” Having spent time on Ocracoke Island, I can attest that the local dialect is highly distinctive.

Aschmann obviously spends a great deal of time on this project, although he describes it as a mere diversion. As he puts it, “I am a professional linguist and a Christian missionary, working in indigenous Amerindian languages. My work has nothing to do with English, so that is why this project is just a hobby.” I would object to the word “just” in the preceding sentence, as Aschmann’s work constitutes a real contribution to knowledge, in my view.

San Francisco Bay Area dialects map

Mapping such intricate patterns is obviously a challenge, in part because speech patterns at this level of detail can change relatively quickly. Another problem concerns the limited number of data points, which may result in more precise mapping than is actually warranted. Such quibbles come to mind when I examine the inset map showing dialect areas in California’s Bay Area. According to the map, the region is divided on the basis of the so-called cot-caught merger (also reflected in words such as “Don” and “dawn”). Here we are informed that people in the core Bay Area (white on the map) make a distinction between these vowels, unlike those in the more peripheral areas. Such a pattern does not match my personal experience. I grew up in Walnut Creek, which is supposedly situated on this dialect border, but in my dialect, almost all of the paired words in the table below are pronounced identically (I do differentiate “cock” from “caulk” and “box” from “balks,” but only because I pronounce the “l” in the latter two). The map also places Palo Alto in the “cot≠caught” area, but my 14-year-old daughter looked at me with disbelief when I mentioned that some people pronounce these words differently. So did three of her friends who happened to be visiting at the time, all of whom grew up in Palo Alto. Her fourth friend, however, had a different take: “Oh, my parents argue about that all the time, because my dad is from New Jersey…”

Don=Dawn map

Personal reflection and anecdote, however, are poor methods for determining such differences, as people are often unaware of how they actually pronounce specific sounds. Instead, careful investigation is required. We therefore contacted Stanford linguist Penelope Eckert, who has conducted detailed research on the topic, as is evident on her website. She reports that the cot/caught merger is essentially complete among younger speakers throughout the region. To the extent that Aschmann’s inset map of the Bay Area is accurate, it is so only in regard to the region’s oldest speakers.

Research also indicates that parents’ vowel distinctions may not even be apparent to their own children. Children, in general, pick up their pronunciation patterns from peers, not mothers and fathers. As noted in a recent essay:

 The Smiths, natives of Philadelphia, have settled in California and are raising twins Dawn and Don. When Mom or Dad calls either child by name, both kids answer. Even though the parents are pronouncing “Dawn” and “Don” distinctly, the children can’t seem to hear any difference. Why not?

Interestingly, this vowel merger causes anxiety among some people, as they worry that they are speaking incorrectly when they pronounce “Dawn” and “Don” in the same manner. Evidently, the distinction between dialectal differences and correct or incorrect pronunciation is not always clear.



Punjabi and the Problems of Mapping Dialect Continua

Dialects Sometimes Called Punjabi Map

The Wikipedia list of the world’s most widely spoken languages, by mother tongue, puts Punjabi in tenth place, with its roughly 100 million native speakers exceeding the figures given for German, French, Italian, Turkish, Persian and many other well-known languages. The Wikipedia article on the Punjabi language stresses its growing appeal, noting that, “The influence of Punjabi as a cultural language in Indian Subcontinent is increasing day by day mainly due to Bollywood. Most Bollywood movies now have Punjabi vocabulary mixed in, along a few songs fully sung in Punjabi.”

But despite Punjabi’s obvious importance, it is extremely difficult to find a map of the language on the internet. Partly this is due to the fact that Punjabi spans the India-Pakistan border, and most maps of individual languages are country-based. One can thus find many language maps of India that depict Punjabi, and virtually all language maps of Pakistan do so as well. But on Pakistani language maps, the area covered by Punjabi has been diminishing in recent years. Maps made in earlier decades typically showed virtually all of the northeastern quadrant of the country as Punjabi-speaking, whereas many recent maps retain the Punjabi label only for the core zone of this region. On these maps, what used to be the southern Punjabi area is now typically mapped as Saraiki-speaking, whereas the north is depicted as Hindko-speaking. Saraiki and Hindko, moreover, are sometimes merged together as the Lahnda language, sometimes called “Western Punjabi.” This linguistic reclassification scheme, however, is quite controversial, especially in Pakistan. Here Punjabi partisans are often irritated by the diminution of their language, whereas locally based scholars are happy to see their own speech-forms elevated to the status of separate languages.

Such controversies stem from the fact that Punjabi forms a dialect continuum, which means that adjacent dialects may be virtually identical, but the farther one travels, the more distinctive they become. As a result, dialects on the opposite sides of such a continuum may be mutually unintelligible, and hence separate languages by standard linguistic criteria, yet no clear language boundaries can actually be located. The Punjabi dialect continuum is further complicated by the fact that it merges with the Hindi dialect continuum in northern India and with the Sindhi dialect continuum in southern Pakistan. To a certain extent, one can thus imagine a much larger dialect continuum stretching across most of northern South Asia. The standardized form of Hindi is a completely different language from standardized Punjabi, but on the margins the situation is not always so clear-cut. The presence of Urdu adds yet another layer of complexity.

A relatively new Wikipedia language map (dated January 31, 2013) deals with these issues by mapping local dialects in the Punjabi-speaking area in both Pakistan and India. The caption of this map found on the “Punjabi Language” Wikipedia article (but not on other Wiki articles that use it) is delightfully honest: “Dialects Sometimes called Punjabi.” Note that on this map “Hindko” is highly restricted, whereas “Saraiki” does not appear at all. One must wonder how much sub-dialectal variation is found in some of these mapped dialect areas, particularly in the elongated Derawali zone (colored red on the map).

The Wikipedia article on Derawali indicates that a certain degree of linguistic convergence is now occurring: “Today like all other dialects in Punjab, a process of unification and getting closer to Standard Pakistani Punjabi (Urdu influenced Majhi written in Shahmukhi) has made it [Derawali] quite similar morphologically, syntactically and mutually intangible with Standard Punjabi.” The lexical table provided in the same article, however, makes Derawali seem quite different from standard Punjabi. Whereas in the latter, the English words “boy, girl, woman, and man” are rendered “Munda, Kuri, Znaani, Aadmi,” in Derawali they are given as “Chohr, Chohir, Aurat, Mard.”

Changing Italian Voting Patterns?

Italy 2013 election Monti Vote Map

The recently completed 2013 Italian General Election has been avidly discussed in the international media. The contest failed to produce a clear winning coalition in the senate, resulting in a hung parliament. It also saw the eclipse of the centrist, technocratic, austerity-oriented party of Prime Minister Mario Monti, which received only about 10 percent of the vote nationwide, as well as the strong return of Silvio Berlusconi, whose coalition barely missed taking a plurality of votes. Perhaps most striking was the strong third-place showing of the new Five Star Movement, led by comedian Beppe Grillo. Grillo’s party is left-populist in orientation, advocating environmentalism, direct democracy, and free access to the internet. It has also been described as mildly Eurosceptical.

Italy 2013 election Five Star Vote Map

The Wikipedia page on the election includes a regional breakdown of the vote for the senate, which I have mapped. I was curious to see how this contest would compare with other Italian elections, which generally follow a very clear regional pattern; central Italy, especially Tuscany and Emilia-Romagna, usually votes strongly for the left, while the north, Sicily, and much of the southern peninsula usually favor the right (see the map of the 2008 legislative election below).

Italy 2013 election Right-Coalition Vote Map

The maps of the recent election reveal few surprises. Monti did relatively well in the more prosperous Po Valley in the north, although even here he received only about 15 percent of the vote (Monti actually did the best among Italians living abroad). In contrast, the new Five Star Movement performed poorly among expats, and did not do particularly well in the economic core-zone of Lombardy, but across most of the country it received roughly 20-25 percent of the vote. The center-right (Berlusconi) coalition slipped a bit in the Po Valley, although it performed well in the Veneto region, and it did relatively well across most of the south, particularly in Campania, the region that includes Naples. The Common Good, a left-leaning coalition, not surprisingly, did very well in Tuscany and Emilia-Romagna and relatively poorly in the Po Valley. It had its best showing, however, in the far northern autonomous region of Trentino-Alto Adige/Südtirol, a mountainous, relatively lightly populated area that includes a significant German-speaking minority. Unlike most other parts of northern Italy, this region often votes for candidates of the left, although it also gives support to regionalist candidates. Significantly, Trentino-Alto Adige gave almost 14 percent of its votes to “other” parties, by far the highest figure among all Italian regions, with one notable exception. The exception is another northern, autonomous region, Aosta Valley (Valle d’Aosta). Here almost 70 percent of voters opted for none of the top four groups, with roughly half of them favoring two regionalist parties.

Italy 2013 election Left Coalition Vote Map

Italy 2008 election map

Aosta Valley is a culturally distinctive part of Italy, as both French and Italian have official status, while 58 percent of the people speak the local Franco-Provençal dialect called Valdôtain, which is in many respects closer to French than to Italian. Two German dialects are also found in the region. Aosta’s birthrate is extremely low, even by Italian standards, but the region’s population is expanding, as outsiders move in to take jobs in the tourism industry. Such features are likely linked to its strongly regionalist voting patterns.

Italy Regions Map


Remaining Language Families and Geographical Language Groups

Eskimo-Aleut Language Map

Today’s post concludes the brief series on world maps of language families, based on abstracting information from a Wikipedia language-family map to make convenient classroom maps. The only remaining category on the map that is a legitimate language family, however, is Eskimo-Aleut. The others are actually geographical groupings of several different families.

With only around 100,000 people speaking its languages, Eskimo-Aleut cannot be regarded as a major family on demographic grounds. It does, however, cover a vast expanse of sparsely settled land, and it is highly significant for Arctic historical studies. I have modified the depiction in the Wikipedia original by adding a small dot for Siberian Yupik, as well as shading areas in which East Greenlandic and “Polar Eskimo” are spoken. I neglected other minor geographical outliers, such as Pacific Gulf Yupik, spoken in south-central Alaska.

Khoisan Language Map

The next map indicates the so-called Khoisan languages of southern Africa. I have again modified the depiction in the Wikipedia original, in this case by adding the geographical outlier of Sandawe, spoken by some 40,000 people in central Tanzania. Although all Khoisan languages share certain features, most notably “click” consonants, current linguistic thinking regards the group as an amalgamation of four or five separate families. Many languages in the group are extinct or moribund.

Caucasian Language Map

The “Caucasian” category on the Wikipedia map is also a geographical grouping, as at least three separate language families are found here: Kartvelian, Northwest Caucasian, and Northeast Caucasian. The map exaggerates the extent of these families, especially Northwest Caucasian. Here the depiction refers to the situation in the mid-1800s, before the expulsion of the Circassians from the Russian Empire. (A more detailed map of the languages in the Caucasus, created by GeoCurrents, can be viewed here.)

Papuan Language Map

Papuan is also a non-family, as noted in the key of the Wikipedia original map. It encompasses as many as 40 separate families, the largest of which is Trans-New-Guinea. Mapping even the composite “Papuan” category is quite a challenge at this scale, as many coastal areas of New Guinea and nearby islands are occupied by peoples speaking Austronesian languages. I have slightly modified the Wikipedia original by putting the island of Halmahera in the Papuan category, and by adding another Papuan zone in western New Guinea. The extent of “Papuan” on New Guinea is still understated, however, while its extent on New Britain is overstated.

Australian Language Map

American Indian Language Map

The Wikipedia key also notes that “Australian” and “American Indian” are composite categories, composed of “several families.” I have followed the original Wikipedia map closely in both cases, although I would note that the mapping here is far from accurate or precise.

Paleo-Siberian Language Map

The original Wikipedia map depicts northeastern Siberia as an area characterized by one (or more) linguistic isolates. The various languages of this area are often labeled “Paleo-Siberian,” which is another geographical rather than specifically linguistic category. Most of the indigenous languages here fall into the Chukotko-Kamchatkan family.

Vasconic Language Map

Finally, I could not resist including a map of the exiguous Vasconic family, long limited to Basque.


Altaic and Related Languages?

Altaic Language Family Map

Today’s language-family maps take up the controversial issue of Altaic. Several decades ago, many linguists grouped the Altaic languages with the Uralic languages, but that thesis is no longer tenable. Now many linguists are expressing doubt about the Altaic family itself. Languages placed within this group have a number of common features, but such features seem to many experts to result from borrowing. The farther back in history one goes, the less similar the main branches of the Altaic family appear. To the extent that this is true, Altaic cannot be regarded as a legitimate language family. I have therefore included a conventional map of Altaic, based closely on the Wikipedia language-family map found here. But I have also posted maps of the three main branches of Altaic (Turkic, Mongolic, and Tungusic), which may well be first-order language families themselves. Note again that the mapping is approximate at best, and refers to the situation pertaining in the mid-twentieth century rather than that of today. I have again closely followed the Wikipedia original map, although I did add a small Turkic area in northeastern Bulgaria. I wanted to add one as well in northern Cyprus, but the area is too small to be indicated given the tools that I am using.

Macro-Altaic Language Family Map

A few scholars have suggested that Japanese and Korean also fall into the Altaic category, although that view is difficult to support. Others think that both languages have an Altaic superstratum,* but do not belong in the family (it has also been suggested that Japanese has an Austronesian substratum). Although the membership of Japanese and Korean in an Altaic family seems highly unlikely, I have posted a map of “Macro-Altaic” that includes both languages just to be comprehensive.

Japonic language family map

Some scholars have suggested that Japanese and Korean together form a language family of their own, but support for this thesis is also scant. Japanese is usually regarded as the main language of the much more restricted Japonic family. In addition to Japanese, this family includes the languages of the Ryukyu Archipelago, such as Okinawan. These tongues are often classified as dialects of Japanese, but by purely linguistic criteria they are languages in their own right. I have thus added a small dot to the map of the Japonic languages to indicate Okinawan. Note that the Wikipedia original map ignores the Japonic category and instead classifies Japanese as an isolate, or a language that sits alone rather than forming part of a larger family. (On the classification of Japanese, see here and here.)

Koreanic language family map

The Wikipedia map also classifies Korean as an isolate. I have instead placed it in the Koreanic family, as several extinct languages also fall into this group, and as the tongue of Jeju island is considered by many linguists to be distinct enough from standard Korean to be classified as a language in its own right. I have thus added a dot for Jeju. It is too large, but unfortunately I cannot shrink it any further.

*A “superstratum” refers to linguistic elements imposed on a given language by high-prestige people, often rulers, who spoke a different language, whereas a “substratum” refers to the surviving linguistic elements of a group whose language was supplanted by another tongue.

Turkic Language Family Map


Mongolic Language Family Map

Tungusic language family map

World Maps of Language Families, Continued

Uralic Language Family Map

Today’s post provides five more language family maps, based again on the Wikipedia “Human Language Families Map” found here. I must again warn that the boundaries here are approximate, and that many small areas characterized by languages in a given family have been ignored. Some areas simply defy linguistic mapping at this scale; the scattered Uralic languages found in the middle Volga region of Russia, for example, are not noted. On a few of the maps, I have added several areas not depicted on the Wikipedia original. On the Austro-Asiatic map, for example, I have included shading to indicate the Mon, Khasi, and Munda languages.

Nilo-Saharan Language Family Map



Dravidian Language Family Map

Austro-Asiatic Language Family Map

Tai-Kadai Language Family Map

World Maps of Language Families

Wikipedia Language Families World Map

For teaching a class on the history and geography of the world’s major language families, good linguistic maps are essential. Unfortunately, serviceable maps that depict only language families are difficult to find. Most images available online show a combination of families and sub-families, splitting Indo-European, for example, into its main divisions. Such a portrayal is of little use for demonstrating the significance of the Indo-European family, which encompasses languages spoken by almost half the people of the world.

The best map of language families per se that I have found is a Wikipedia product, found here and posted above. I do have a few quibbles with the map. It portrays “Caucasian,” for example, as a single family, whereas in actuality at least three language families are found in the Caucasus Mountains and nowhere else. Like most other family-level linguistic maps, it exaggerates the extent of indigenous languages in places such as Canada, Brazil, and Siberia, where English, Portuguese, and Russian respectively are spreading rapidly as many native tongues slowly fade away. Such mapping, however, captures the situation that existed until fairly recently, and therefore has much to recommend it.

But as good as it may be, this map is of limited utility in the classroom. When I lecture on a specific language family, I want it to stand out on the map, rather than hide among a dozen other pastel-colored groupings. I have therefore used this map as a model for creating a series of family-specific depictions. A few of these are posted here, and the others will appear over the next week. I have simplified the mapping to some extent, partly because the simple program that I use (Keynote, Apple’s equivalent of PowerPoint) does not allow fine distinctions. I have generally followed the contours on the Wikipedia prototype closely, even where I know (or suspect) that the depiction is not quite right. I have done so merely for the sake of convenience. After all of the individual maps have been posted, I hope to put up the original Keynote file, which I will also translate onto PowerPoint. This will allow interested users to manipulate the maps as they see fit, moving the borders between language families, for example, or changing the color scheme.

Indo-European Language Family Map

Sino-Tibetan Language Family Map





Niger-Congo Language Family Map

Afro-Asiatic Language Family Map

Austronesian Language Family Map

Ideological Agendas and Indo-European Origins: Master Race, Bloodthirsty Kurgans, or Proto-Hippies?

This final contribution to the Indo-European series turns once again to the potential ideological agendas lurking behind theories of IE origin and expansion. As was noted previously, no other issue in human prehistory has been so ideologically fraught; the original IE speakers have been recruited to serve a variety of fantasies, ranging in temper from naively benign to unimaginably vile. For Nazis and their ilk, the original Indo-Europeans constituted the Aryan super-race whose descendants were destined to rule the world. Followers of a certain feminist school of prehistory, in turn, have turned the “Aryan thesis” on its head, portraying the same people as the bloodthirsty “Kurgans” overrunning the peaceful, matriarchal civilization of “Old Europe” and ushering in a global age of violence and male domination. As was argued in the earlier post, it is understandable that some scholars would want to discredit all such overreaching interpretations based on the crushing might of the horse-empowered original Indo-Europeans. If it could be demonstrated that the IE languages were actually spread by Neolithic farmers slowly pushing into new areas as their numbers increased, all such troublesome theories would be effectively undermined.

Yet it is one thing to hope for such a paradigm switch and another to push it along by a purposeful manipulation of data and analysis. Doing so would be a blatantly ideological act, and hence a betrayal of science and reason. Assessing scholarly motivations, however, is a hopeless task, and we have no way of knowing whether Bouckaert et al. have intentionally selected their data and skewed their model in order to support the Anatolian thesis of IE origins. We do think that it is possible, however, that they have unconsciously let their own ideological commitments guide their research program. Our evidence here comes from two sources. First, as we have demonstrated over the past two months, both the data selection and the model construction are warped to consistently favor the Anatolian hypothesis, most egregiously by ignoring all ancient IE languages spoken in the steppe zone and by ruling out advection as a mechanism of language spread. Second, it seems likely from the comments posted on this website that distaste for the idea of violent incursions, often viewed as a necessary feature of the “steppe hypothesis,” colors the authors’ perspective. Quentin Atkinson, the article’s corresponding author, quotes Larry Trask to make this point:

Nevertheless, the vision of fierce IE warriors, riding horses and driving chariots, sweeping down on their neighbours brandishing bloody swords, has proven to be an enduring one, and scholars have found it difficult to dislodge from the popular consciousness the idea of the PIE-speakers as warlike conquerors in chariots.

Although the desire to wish away the “bloody swords” of the human past is understandable, it is also naïve, as violence unfortunately pervades our history. One does not have to embrace the vision of Thomas Hobbes, recently updated and re-theorized by Steven Pinker in his tome, The Better Angels of Our Nature: Why Violence Has Declined, to accept that this is indeed the case. I suspect that Pinker exaggerates the bloodiness of hunting-gathering societies, a charge made most forcefully by Christopher Ryan, co-author of the intriguing and controversial Sex at Dawn, yet I also suspect that Ryan descends into hyperbole of his own in emphasizing the peacefulness and sexual license of our Paleolithic ancestors. But when it comes to pre-modern agricultural societies, the evidence is overwhelming: enveloping violence was the norm almost everywhere. If one wants to rule out the possibility of bloody swords and other weapons, one would be advised to examine something other than human history.

But even if armed struggle has been pervasive for most of the past 10,000 years, it does not follow that all non-foraging societies have been equally bloody. As is always the case, different groups vary considerably on this score. If one searches the ethnographic literature, one can find a few documented tribal farming societies that shunned warfare and all of its trappings. Yet the unfortunate truth is that such groups were usually victimized by their more aggressive neighbors, and hence were seldom successful in maintaining their numbers and territories.

One of the most interesting groups of historically peaceful peoples is the Hanunó’o of the Philippines, whose social formation was described by the great American anthropologist Harold Conklin roughly a half century ago. The Hanunó’o constitute a small group (roughly 14,000) of tribal cultivators living in the southern interior portion of the lightly populated island of Mindoro. An encyclopedic treatment of Philippine ethnic groups* frames their peaceable inclinations in concise terms: “Warfare, either actual or traditional, is absent.” But the Hanunó’o were able to maintain their irenic way of life only by retreating to rugged and inaccessible areas, and even so they were periodically targeted for centuries by slave raiders from the Sulu Archipelago. Intriguingly, the Hanunó’o seem to be a remnant of what was once a much larger and more sophisticated society, as evidenced by the fact that they have long enjoyed widespread literacy in their own script, an essentially unprecedented phenomenon in a small-scale, tribal society. Conflicts between Spain and the Muslim naval powers of the southern Philippines (the so-called Moros) evidently destroyed the formerly prosperous mercantile centers of Mindoro, after which remnant groups fled the bloody swords of both the Spaniards and the Moros into the inaccessible uplands. There they maintained a generally peaceful way of life, although at a fairly significant cost.

But with the exceptions of some hunter-gatherer bands and a few societies of tribal cultivators, nearly continual violence was the common lot of humanity before the contemporary era. Thus even if Indo-European languages spread into Europe and South Asia through the gradual influx of Neolithic farmers, as Bouckaert et al. argue, the process would have almost certainly been marked by generalized conflict and extensive bloodshed as the Mesolithic indigenes were dispossessed of their lands. By the same token, had the IE languages been spread by horse-riders advancing into the lands of the Neolithic farmers, as most versions of the “steppe hypothesis” contend, violence would also have accompanied the process. But would such a scenario have necessarily entailed substantially greater levels of bloodshed than the majority of such cultural “encounters” experienced over thousands of years across the globe? Equestrian warriors would certainly have had profound military advantages over horseless peoples, but that does not necessarily mean that they would have been any more savage than the human norm. It is also quite possible that IE languages spread mostly through gradual incursions supported in large part by economic or other non-military advantages. Anthropological blogger Al West, for example, surmises that the early Indo-European speakers gained power by selling horses and other goods (see below) to other peoples. Certainly the massive non-IE linguistic substrates found in such IE branches as Greek, Germanic, and Indo-Aryan indicate deep levels of cultural exchange with the indigenous inhabitants of the regions into which the early Indo-European speakers moved.

Portraying the early Indo-Europeans as a uniquely fierce or malevolent people, as some of Marija Gimbutas’s followers were inclined to do, involves more ideological projection than sound appraisal. One can certainly stress the violent nature of their social interactions, but one can just as easily place the emphasis elsewhere. In fact, one can even turn the Gimbutas thesis on its head and portray the steppe-dwelling early Indo-Europeans as gender-egalitarian precursors to the hippies of the late 20th century. Although such a portrayal strays again into the realm of fantasy, it is no less reasonable than either the Herrenvolk (“master race”) or the “demonic Kurgan” theses. Such an inversion of the conventional framing of the original Indo-Europeans makes an interesting thought experiment, and I would ask my readers to indulge me here for a few paragraphs.

The prime evidence for “gender egalitarianism” among early Indo-Europeans derives, ironically, from the realm of war. As was mentioned in an earlier post, the Scythians, an Iranian-speaking group who maintained a largely pastoral way of life in the hypothesized IE steppe homeland, were noted for their female warriors. Herodotus famously wrote of the Amazon fighting women of the region, an observation partially confirmed by recent archeological finds; as David Anthony reports, twenty percent of the Scythian/Sarmatian “warrior graves” of the lower Don and Volga river valleys include female remains that had been dressed for battle in identical fashion to the males whose skeletons were found in the same graves. The mere presence of women warriors does not, of course, imply actual gender egalitarianism, nor does it say anything about the social relations of the actual proto-Indo-European speakers, who lived in earlier times. It does, however, indicate a significant extent of female empowerment in an important IE group that maintained an equestrian mode of life on the Pontic Steppes.

Imagining the early Indo-Europeans as proto-hippies is made possible by the group’s close association with marijuana and perhaps other psychoactive plants. Building on the works of archeologists Andrew Sherratt and David Anthony, Al West argues that, “it’s possible that proto-Indo-European speakers became rich and powerful through selling … intoxicants,” further claiming that “Indo-European-speaking people traded THC-laden hemp from the steppes all the way down into the Near Eastern cities, which were naturally a major centre for trade from all over Eurasia. … If this scenario is right, then to the people of Babylon the arrival of Indo-European speakers must have seemed like one crazy dream.”

Although West is probably off-track in suggesting that proto-Indo-European speakers were responsible for the spread of cannabis as a recreational or spiritual drug, such an association is reasonably made for the progenitors of one of the main branches of the IE family, the proto-Indo-Iranians. Evidence again comes from both Herodotus, who famously wrote of cannabis ingestion among the Scythians, and from archeological digs; Sherratt discovered charred cannabis residue in a Kurgan site dating back to roughly 3500 BCE. Linguistic evidence also plays a role. The hemp plant, which produces valuable fibers and seeds in addition to its mind-altering resin, had been known across much of Eurasia for millennia, and thus had undoubtedly been referred to by many different local names. Cognates linked to the word “cannabis,” however, spread across and beyond the Indo-European-speaking realm in the third millennium BCE, which is believed by some experts to indicate that a new pharmaceutical use for the plant had been discovered and was itself expanding. Although the lines of linguistic descent are not clear, the new term for the plant, which eventually gave rise to the Latin word Cannabis, seems to have been associated with proto-Indo-Iranian steppe dwellers (see the discussions here, here, and here).

Cannabis was probably not the only mind-altering substance used by these people. Perhaps the largest mystery in the history of pharmacology is the identification of soma, the ritual intoxicant of the Rigveda, known as haoma in the Avesta (the sacred text of Zoroastrianism). More than a hundred Vedic hymns extol the unknown substance. Linguistic evidence indicates that soma/haoma was probably not cannabis, although it has been speculated that they were often consumed together. Numerous plants and fungi have been proposed as soma candidates, as spelled out in a detailed Wikipedia article. The primary division in the scholarly literature is between those who think that it was a hallucinogenic substance (such as the mushroom Amanita muscaria) and those who think that it was a stimulant, such as ephedra (also known as má huáng or “Mormon tea”). Recent research seems to be inclining in the direction of ephedra.

Regardless of its true identity, “soma” was ensconced in the Western public imagination by the publication of Aldous Huxley’s Brave New World in 1932, in which a drug called soma is used as a mechanism of social control. More recently, the name has been embraced by the hippie community of northern California. Wikipedia includes a “soma” article dedicated to a marijuana breeder of that name; the article itself notes that this particular Soma is “internationally known as a ‘Ganja Guru’ after developing award-winning cannabis strains.” I doubt very much, however, that ancient Indo-Iranian folk pharmacologists would have recognized this Soma as a kindred spirit.

The point of this excursion is not to argue that such a deeply anachronistic “proto-hippie thesis” has any merit. It is rather merely to show that making such an argument is possible. All human cultures are complex assemblages of ideas and practices, any number of which can be selected for emphasis. Especially when it comes to poorly understood cultures of the ancient past, we should be wary of any thesis that is based on essential traits of any kind.

*Ethnic Groups of Insular Southeast Asia. Volume 2: Philippines and Formosa. Edited by Frank M. LeBar. 1975. New Haven: Human Relations Area Files Press. Page 76.


The Different Modes of Language Spread

In this second-to-last post on Indo-European origins and expansion, we turn once again to language diffusion, a cornerstone of the model employed by Bouckaert et al. A previous post asked whether languages actually spread by diffusion, arguing that the much more rapid process of advection is often more important. As was then pointed out, physical geographical factors, such as impassable mountains and fertile river corridors, guided such advectional movement. Today’s post considers language movement more generally, whether conceptualized as diffusion or advection, focusing more on the social than the natural environment.

A root error of Bouckaert et al. is regarding language expansion as a singular process. Actually, it can operate in two completely different modes: sometimes a language spreads with a group of people, and sometimes it does so among different groups of people. To put it in the most schematic terms, language movement occurs when a speaker moves from place A to neighboring place B, but it can also happen when a resident of A imparts his or her language to a resident of B. One process is basically demographic, the other conversional. In geohistorical terms, both forms of language expansion have been ubiquitous. They are generally meshed together in a complex manner, but sometimes one or the other process dominates. As they differ so fundamentally, it is unlikely that they could be realistically modeled in the same manner.

The clearest case of demographic expansion occurs when a single human group arrives on an uninhabited landmass and settles it. As the population expands in numbers and spreads geographically, its language will gradually differentiate into dialects and eventually into separate languages, as sub-populations pushing into new areas become socially separate and their forms of speech drift apart. Such linguistic differentiation could be arrested and reversed by state formation or the emergence of over-arching religious or other cultural institutions, but over the long span of the human past, divergence is usually the rule.

The settlement of Madagascar some 1,500 years ago is a prime example of such virgin-land expansion. Linguistic evidence confirms that the original Austronesian-speaking settlers arrived from Borneo in the Malay Archipelago. As their descendants spread over the mini-continent, their original language differentiated into dialects, some of which are regarded by linguistic splitters as separate languages (the Ethnologue lists ten). Later streams of migrants from the African mainland enhanced the island’s genetic diversity while introducing new linguistic elements, but the newcomers always adopted the language of the original settlers. As a result, all the indigenous forms of speech on Madagascar are very closely related, and are usually classified as variants of the single Malagasy macro-language.

Examples of the opposite process of conversional language expansion are common in today’s world. The process occurs whenever parents neglect to pass on their own mother tongue to their children, in favor of the language of one of their neighboring groups. Hundreds of languages have become endangered over the past generation alone by such changes in behavior. Most disappearing American Indian languages in the United States, for example, are in danger not because their populations are dying out or because their lands are being overrun by English speakers, but rather because parents decide to raise their children as English speakers.

Such processes of language abandonment and replacement are by no means limited to the modern world. A prime ancient example comes from the Philippine archipelago. Almost all Philippine languages belong to one branch of the Austronesian family, which is almost limited to the Philippines (see the map posted here). Such a pattern would seemingly indicate that the Philippines, like Madagascar, had been initially populated by a single group of settlers whose descendants subsequently spread over the archipelago as their language differentiated. But the actual demographic history of the Philippines was completely different. The original Austronesian settlers came to a land that had already been occupied for tens of thousands of years. Its indigenous* inhabitants were collectively called “Negritos” by Spanish authorities, a word meaning “small, dark-skinned people.” Their languages were undoubtedly unrelated to Austronesian, but we cannot say much beyond that. Although the Philippine indigenes have survived to this day, they abandoned their original tongues many centuries ago in favor of the Austronesian speech of the newcomers.

The social interactions between the Austronesian migrants and the indigenous inhabitants of the Philippines are poorly understood, but the key dynamics are evident. The newcomers were an agricultural people with much more highly developed technologies and forms of political integration than those held by the native foragers. The Austronesian migrants demographically overwhelmed most parts of the archipelago in short order, spreading their language(s) as well as their genes. Yet the indigenes held on in a number of rugged areas, particularly those characterized by heavy, year-round rainfall, such as the Sierra Madre Mountains of eastern Luzon** (in the winter dry season, the Sierra Madre catches rain from trade winds forced up-slope). From such redoubts, however, the indigenous foragers interacted extensively with their Austronesian neighbors, exchanging rain-forest products for agricultural and manufactured goods. Eventually, the languages of their trading partners fully “diffused” across their societies and then began to evolve in their own directions. Today, the several surviving “Negrito languages” are much more closely related to the languages of their neighbors than they are to each other. Strikingly similar processes have occurred elsewhere in the world. The most notable case is that of the “Pygmies” of central Africa, another group of diminutive, rainforest hunter-gatherers who long ago abandoned their own languages in favor of the tongues of their more numerous and powerful neighbors, in this case, languages in the Bantu sub-family of Niger-Congo.

The two cases explored above, Madagascar and northeastern Luzon, are best regarded as ends of a spectrum. Most examples of linguistic expansion involve both processes. When one language group expands it usually does so into the territory of a people speaking another language. As communication between natives and newcomers is essential, many individuals acquire a second language. Over time, such a process often leads to the linguistic conversion of the indigenous group, although the advancing group is sometimes converted instead, in which case the language frontier retreats. Such encounters are generally accompanied by some conflict, as the native inhabitants typically resent the incursions of the newcomers, who in turn often use force to advance into new lands. To the extent that the indigenes are able to resist the settlers, they will delay the linguistic expansion. The effectiveness of any such resistance in turn depends on the relative numbers of the two groups and on their levels of political and technological development. Any realistic modeling of linguistic spread must take such factors into consideration.

Patterns of physical geography play an important role here as well, as resistance by native inhabitants is usually more effective in areas of rough or otherwise difficult-to-traverse topography. In some cases, a particular climatic feature can stop language advance; the spreading Bantu-speakers, for example, encountered a firm barrier in the arid and Mediterranean climates of southwestern Africa, which precluded their farming practices and therefore created a refuge for peoples speaking Khoisan languages. Even the geometry of landmasses can play a role. As Anglo-Saxon speech spread across southern England, Celtic speakers were increasingly concentrated in the funnel-shaped peninsula of Cornwall, increasing their population density, shortening their defensive perimeter, and thereby enhancing their ability to resist the spread of English (further north, it was the rugged uplands of eastern Wales that afforded such protection). Yet again, all such features must also be taken into account by any effective attempt to model language spread.

The movement of one language group into the territory of another typically results in complex and variable linguistic interactions. Outcomes again depend heavily on relative numbers and different levels of technological and political development. When a large group of technically advanced people spreads over a landscape occupied by scant numbers of less technically advanced people, the linguistic impact can be minimal. As English advanced across Australia, for example, it picked up place names, animal designations, and words for unique landscape features (such as billabong) from Aboriginal languages, but not much more. But when two groups with more similar levels of development come into contact, much more intensive linguistic interactions typically result. Sometimes the linguistic substrates bequeathed by vanquished populations can be profound at both the grammatical and lexical levels, at other times they are of little significance, and occasionally they seem to be minor at first glance but turn out to be surprisingly important.***

When a language group moves into the lands of a different people, the initial linguistic development is often that of widespread bilingualism. If the newcomers are dominant, as they often are, the subjugated indigenes will find advantage in learning the new language, but even members of the dominant group sometimes acquire the native tongue. Gender relations typically play a crucial role here as well. Men from the more powerful group often take women from the subordinated people, insisting that their native wives learn their language. Such women do so imperfectly, often imposing upon it sounds, words, and grammatical patterns from their native tongue. When they pass down the transformed language of their husbands to their children, a certain degree of linguistic fusion results.

The preceding discussion only hints at the possible complexities involved in the linguistic interactions that occur when one language group pushes into the territory of another. Even so, it deeply challenges the diffusion model of Bouckaert et al. Rather than advancing by steady progression, an expanding language often moves forward in a spatially dispersed manner, as its speakers establish themselves as a dominant social stratum in a foreign land. Many members of the native population will learn the new language, but they will at first continue rearing their own children in their own tongue. After a number of generations of such bilingualism, most parents in the indigenous group may opt to acculturate their infants in their second languages rather than in their mother tongues. As a result, a language could “spread” almost instantaneously over fairly sizable areas. Over broader areas, however, such a process is likely to be patchy, with some areas “converting” much sooner than others.

A prime example of such uneven processes of language change comes from Anatolia. Most of the region was Greek-speaking in the 11th century when the Turkish influx began. By the 13th century most of Anatolia was firmly under Turkish rule, and by the middle of the 15th century Greek political power had vanished everywhere. Throughout this period, Turkish gradually supplanted Greek, but along both the Black Sea coast and that of the Aegean Sea, largely bilingual but primarily Greek-speaking communities persisted until the expulsions of the early 20th century. And as we saw in an earlier post, mixed “Turkish-Greek” forms of speech emerged in some areas.

A second major challenge to the diffusion model emerging from this analysis involves the unpredictability of language change when two (or more) linguistic communities come to occupy the same general territory. Although one might expect that the language of the dominant group would always prevail, that is obviously not the case; if it were, England would have switched to a Romance language after the Norman conquest, and Russia would have ended up with the North Germanic language of its Varangian rulers. Instead, England kept a Germanic tongue, and Russia a Slavic one.

Interesting examples of the uncertain nature of language change after a successful invasion come from the Danubian grasslands of central and southeastern Europe. From the fourth century to the ninth century CE, this area experienced four major incursions by non-Indo-European-speaking, militarily dominant, pastoral peoples from the steppe zone to the east: those of the Huns, the Eurasian Avars, the Bulgars, and the Magyars. All four groups built empires of a sort, and all subjugated the much more numerous local inhabitants. The Huns and the Avars, however, disappeared within a century or so with little trace, linguistic or otherwise. The Bulgars, on the other hand, built a kingdom so powerful that vestiges of it survive to this day in the form of Bulgaria, but their Turkic tongue vanished long ago, failing to maintain itself in the heavily Slavic environment over which the Bulgars ruled. The Magyars, by contrast, were able to firmly establish their language, which is spoken today by roughly 15 million people, even though the Magyars themselves were a relatively small group, substantially outnumbered by the peoples that they dominated.

Could one have predicted the fates of the Hunnic, Avar, Bulgar, and Magyar languages merely from the basic facts of their migrations, conquests, and state formations? I rather doubt it, as far too many contingencies were involved over long periods and broad territories. More to the point, could any such processes be successfully modeled as instances of linguistic diffusion? Here the answer must be a definitive “no.” Of course Bouckaert et al. would object here, as they rule out all episodes involving the “rapid” spread of a single language. Yet over the past several thousand years, the rapid spread of single languages has been the stuff of linguistic history over broad segments of the terrestrial globe. If such processes are ignored, nonsense necessarily results.


*The term “indigenous” becomes problematic wherever multiple waves of settlement have impacted a particular place. The term is used here in the relative sense, referring simply to groups that predated other groups with which they are compared.

**Intriguingly, the most rugged area of northern Luzon, the Cordillera Central, did not serve as a refuge for the indigenous hunter-gatherers, as all of its recorded ethno-linguistic groups are descended from the Austronesian migrants. The Cordillera, the site of my own doctoral research, is an unusual area in many respects, as it was historically characterized by higher population densities than those found in the adjacent lowlands to the east; dense populations, in turn, necessitated the construction of some of the world’s most elaborate agricultural terraces (see the photo to the left). In all likelihood, such high population density in the mountains resulted from Spanish pressure; residents of northern Luzon who did not want to submit to Spanish rule and forced Christianization fled to the uplands, where they had to build terraces in order to survive. Prior to this influx, small numbers of “Negritos” may have lived in parts of the Cordillera.

***Intriguingly, substrate influences that seem insignificant at first glance can actually turn out to be important. For decades, linguists looked for Celtic influences on English in the wrong places and thus could not find them; even such a recent, authoritative text as Baugh and Cable’s A History of the English Language (1993) states that, “Outside of place-names the influence of Celtic upon the English language is almost negligible” (p. 85). Currently, however, many of the linguistic peculiarities of English are being attributed to the Celts. These include the do-support construction (where do is required in questions and for negation), the diphthongization of long vowels (possibly, the first push that started the chain reaction of the Great Vowel Shift), expressing possession inside noun phrases, using the same –self items for reflexives (“John cut himself”) and intensifiers (“The president himself will visit”), using the same verb forms for both causative structures (“I broke the vase”) and inchoative ones (“The vase broke”), and the it-cleft (“It was a car that he bought”).



How Large Was the Area in Which Proto-Indo-European Was Spoken?

As the current series on the origin and expansion of the Indo-European languages nears its completion, only a few remaining issues need to be discussed. Today’s post examines once again the mapping by Bouckaert et al. of the area likely occupied by the speakers of Proto-Indo-European (PIE). The focus here, however, is not on the location of this ancestral linguistic homeland, which they situate in southern Anatolia, but rather on the size of the area over which the language was supposedly spoken. The area so depicted on their maps, it turns out, is almost certainly much too large to be credible. By mapping a Neolithic language as covering almost one hundred thousand square kilometers, Bouckaert et al. demonstrate, yet again, a fundamental failure to understand the basic patterns of linguistic geography.   

Bouckaert et al. give a surprisingly precise figure for the area that their model indicates as the probable homeland of proto-Indo-European: 92,000 km², roughly equivalent to the extent of Hungary or of the American state of Indiana (see the yellow polygon in the map to the left). But given the characteristically opaque phrasing of the authors, it is not immediately clear if this zone is supposed to represent the actual (likely) spatial extent of the PIE-speaking community, or if it is merely supposed to show the broader area in which a much more spatially restricted language group was located. One can deduce, however, that the former argument is being advanced based on the authors’ framing of the spatial hypotheses supposedly advanced by two different proponents of the steppe theory:

The areas of the hypotheses are approximately 92,000 km² for the Anatolian hypothesis, 421,000 km² for the narrow Steppe hypothesis, and 1,760,000 km² for the wider Steppe hypothesis. So, these areas show a bias toward the Steppe hypothesis; the area covered by the narrow Steppe hypothesis is more than four times larger than that of the Anatolian hypothesis. Likewise, the area covered by the wider Steppe hypothesis is more then (sic) 19 times larger than that of the Anatolian hypothesis.
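
The ratios in the passage quoted above are easy to check. What follows is only a minimal Python sketch: the three area figures are taken from the quotation, while the dictionary and function names are my own, chosen purely for illustration.

```python
# Areas as quoted from Bouckaert et al., in square kilometers.
areas_km2 = {
    "Anatolian": 92_000,
    "narrow Steppe": 421_000,
    "wider Steppe": 1_760_000,
}


def ratio_to_anatolian(name: str) -> float:
    """Return how many times larger the named area is than the Anatolian one."""
    return areas_km2[name] / areas_km2["Anatolian"]


for name in ("narrow Steppe", "wider Steppe"):
    print(f"{name}: {ratio_to_anatolian(name):.2f} times the Anatolian area")
```

The two results come to roughly 4.58 and 19.13, which agrees with the authors' "more than four times" and "more than 19 times."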

As can be seen in the map posted here, the area outlined by the “narrow Steppe hypothesis” fits precisely within the area demarcated by the “wider steppe hypothesis.” Such a depiction would not be logical if Bouckaert et al. were proposing that these “areas” were merely the proposed zones in which a more spatially restricted language had been located, as opposed to the probable zone that such a language actually covered. If the latter meaning had been intended, the “narrow Steppe hypothesis” would merely be a more precise version of the “wider Steppe hypothesis” rather than a different “hypothesis” altogether. One can thus conclude that the authors intend the yellow polygon to indicate the area over which Proto-Indo-European had been spoken, as posited by their model with the given parameters of uncertainty.


In the modern era, and to a significant extent across the past several thousand years, there is nothing unusual in a single language being spoken over a 92,000 square kilometer block of territory. But for such a situation to obtain, expansive spatial connectivity is necessary, which in turn depends on the power of the state or of some other form of social integration. In the world of Neolithic farmers, such regionally integrative institutions were almost certainly lacking, and as a result linguistic communities would have been much more spatially restricted. Such spatial limitations would have been even more pronounced in areas characterized by rough topography and formidable mountain ranges, as such barriers impede communication and thus enhance social and linguistic fragmentation. Yet as can be seen in the map posted here, Bouckaert et al. place the PIE homeland precisely in such a location. A single language spoken by tribal farmers over such a vast expanse of broken topography is all but impossible.

The situation in regard to the homeland identified by the steppe hypothesis would have been different. Under conditions of equestrian-oriented pastoral nomadism, linguistic communities could have occupied much larger territories than those found among agriculturalists living at the same time. The relatively flat topography of the steppe zone, moreover, would have allowed relatively easy communication among scattered groups. Sizable seasonal aggregations, often of a ceremonial nature, are also common under such circumstances, enhancing social solidarity over a broad expanse of land. But even given all of these considerations, the 421,000 km² and the 1,760,000 km² figures noted by Bouckaert et al. for the PIE homeland in two versions of the “steppe hypothesis” are still improbable. Geographically aware theorists thus tend to argue only that the original PIE homeland was situated in the western steppe zone, not over its full extent.

We cannot, of course, determine the areal extent of any prehistoric language, as the needed documentary evidence is lacking. It is tempting to associate specific languages with archeologically attested “cultures” that can be mapped, but it must be recalled that language often fails to correspond to groups defined on the basis of shared material culture; consider, for example, the “Pueblo Indians” and the Northwestern cultures of indigenous North America, both of which were highly multilingual, even at the language family level, yet substantially shared the same material cultures. Material culture, after all, is much more dependent on—and serves in part as an adaptation to—the physical environment, whereas languages seldom co-vary with physical geography; there is no way in which a certain word order pattern, or morphological type, or sound system would be more appropriate for any given landscape. All that we can do, therefore, is argue on the basis of contemporary analogues. Here we find that the areas covered by linguistic communities in those parts of the world that maintained “Neolithic” agricultural systems and forms of socio-political organization into modern times were of a restricted spatial scale. The archetypical location here is New Guinea, which is to this day characterized by pronounced linguistic fragmentation, as can be seen in the map posted here. One might object, however, on the basis that New Guinea is an extreme case and as such should not be used for comparative purposes. But in historically stateless areas elsewhere in the world, even where Neolithic technologies were superseded millennia ago, highly restricted linguistic territories remained the rule, as can be appreciated from the language map of central Nigeria posted here.* Maintaining a single language over an area as large as Hungary in such a context is highly unlikely, to say the least.

Similar objections apply to the mapping of the proto-languages of the major IE branches in Bouckaert et al. One must again consider the authors’ intentions in regard to their portrayal of these languages. It is not exactly clear, for example, what they mean by “the inferred location at the root of each subfamily is shown on the map” (see the map caption posted to the left). The “inferred location” of what? Presumably, they mean the inferred location “of the root,” and presumably “the root” refers to the proto-language that later generated each IE branch. It is still not clear, however, whether the colored areas are supposed to indicate the likely locations over which these proto-languages were spoken, or whether they merely show the probable zones in which much more spatially restricted languages were spoken. If the former scenario is indeed the case, the areas depicted are again much too large.

Of the “root languages” mapped on this figure, that of the Indo-Iranian languages is the most preposterous. The previous post specified most of the problems associated with this inferred location. The map posted here also shows the extraordinary disconnection between the existing archeological evidence and the spatial hypothesis advanced by Bouckaert et al. I would further note that the area they advance for the origin of the Indo-Iranian languages makes no sense from the standpoint of physical geography. Its western apex is located in the middle of the uninhabitable Dasht-e Kavir (Great Salt Desert), its central portion is situated in the heights of the Hindu Kush, and its eastern extremity lies in the fertile plains of Punjab. It is unthinkable that any sedentary Neolithic population would have occupied such a territory at any given point in time.

*One could, however, argue that New Guinea and central Nigeria are highly linguistically diverse in part as a function of time. Both areas have been inhabited by modern humans for a very long period. Most of Eurasia has been populated by Homo sapiens sapiens for considerably less time than West Africa, and to some extent even New Guinea (the presence of Neanderthals probably impeded the movement of modern humans into western Eurasia for millennia). As a result, one might expect somewhat greater linguistic differentiation in those places as compared to southern Anatolia. But it is also true that the Americas, which had been populated by modern humans for less time than western Eurasia, were also characterized by pronounced linguistic diversity. Significantly, agricultural areas in pre-Columbian North and South America that were not occupied by state-level societies were characterized by spatially restricted language groups.


