Language and National Identity, Part 1

(Author’s Note: This is a preliminary draft of a chapter that might be included in a forthcoming book, tentatively entitled Seduced by the Map: How the Nation-State Model Prevents Us from Thinking Clearly About the World. It includes some bibliographical citations, but they are woefully incomplete.)  

There are good reasons why students of nationalism often emphasize language.[1] Building a community, even an imaginary one, requires communication. For ethnonationalists, language has even greater significance, as it is seen as a key indicator of the deeply rooted relatedness that supposedly generates strong and stable nations. Up to the mid-1900s, scholars often used language as a proxy for genetic bonds; many maps of “race” made at the time actually depicted languages or language families. In the ethnonationalist discourse of late nineteenth- and early twentieth-century Europe, people who spoke a single language that originated in the territory they inhabited were often viewed as forming a cohesive ethnic group that, if large enough, deserved its own state.

Such ideas are no longer easily supported. It is now obvious that languages can spread independently of genes, but it is less commonly realized that they have done so for millennia. The ethnolinguistic communities that supposedly form the age-old bedrock of national solidarity are not as clearly separated from each other as they have been imagined to be, and in many cases their emergence has been relatively recent. As we shall see, the national languages that underpin contemporary ethnonational states were in almost all cases politically molded or even created to enhance state cohesion and national solidarity. They were not, in other words, natural features of preexisting populations.

In countries founded on extensive immigration from distant lands, traditional ethnonationalism is not applicable. The American, Australian, and Brazilian nations, for example, cannot be depicted as rooted in local ancestral populations united by cultural and genetic ties. But that does not mean that language plays no role in their national imaginations, nor does it make them immune from ethnonationalism. The neoethnonationalism found in some extremist quarters in the United States and other immigration-based countries, however, necessarily rests on different foundations from the older creed. It generally turns to race as the key indicator of relatedness while framing language and religion as cultural adhesives necessary for national bonding.

As a result of this disparity, the racism inherent in the “white ethnonationalist” fringe in the United States and a few other immigration-based countries differs markedly from the racism that generally accompanied traditional European ethnonationalism. The notion that the Germans and the Poles, or the English and the Irish, could find political solidarity in their common “Caucasian” racial identity would have struck most nineteenth-century observers as absurd. In the era’s popular wisdom, each linguistically defined national community formed its own race. Scholars of the time, in contrast, distinguished broader races ostensibly based on such physical attributes as head shape and skin color but often linked for convenience to language groups. Until the post-WWII era, however, the emphasis was on dividing Europeans by race, not uniting them. This situation changed in the late twentieth century as diverse immigration streams began to transform European demography.

To understand the linkage between language and nationalism it is therefore necessary to differentiate countries that have potential grounds for traditional language-based ethnonationalism from those that do not. States in the first category have clear majority populations that speak a single language that originated (or is perceived as having originated) within their national territory. As can be seen in the map posted here, such countries are clustered in Europe, East Asia, Mainland Southeast Asia, and Central Asia. But as we shall see, many complications intrude on this simple binary classification. This map shows one such problem: situations in which more than one state can lay claim to the same ethnolinguistic legacy.

The present chapter takes on countries linked to a particular locally rooted ethnolinguistic group. As we shall see, the correspondence between state and language does not reflect ancient conditions but is instead a feature that has been politically engineered over the past several centuries. The more complicated situations faced by countries that have no indigenous ethnolinguistic core group are taken up in the following chapter. Both discussions explore how language has been used to invoke the imagined communities that lie at the heart of the nation-state project. In both settings, the relations between nation and language can be intricate indeed.

Before delving into specific cases, it is worth noting that most Americans are probably perplexed by the subtleties and importance of language politics in the rest of the world. Distinctions that matter passionately elsewhere may seem arcane or even moot from the standpoint of the speakers of the planet’s key international language. It takes time and patience to drill down into these emotionally charged histories and geographies to appreciate their continuing significance.

The Deep Historical Development of National Languages

Language is usually imagined as the most important factor in separating “them” from “us” in early human societies, delimiting discrete social groups that conceptualized themselves as a single people denoted by their own ethnonym. The noted linguist Mark Baker has suggested that this differentiating facility is one of the main reasons why languages diverge from each other as quickly as they historically have.[2] But such separation by language is only one side of the coin. Many individuals have always learned the tongues of neighboring peoples, while trade languages have long enabled communication across ethnolinguistic lines. Some evidence suggests that such practices were widespread before the development of agriculture. In some contemporary hunter-gatherer societies, particularly those of Australia, some individuals cross multiple language lines while wandering over vast distances, generally finding themselves accepted into tiny but often multilingual bands.[3] Somewhat similar dynamics may have been at play in the Paleolithic Age over most of the world.

As sedentary social organization emerged and spread, languages probably tended to become more firmly fixed in place. Before the advent of the state and other institutions of broad social integration, local languages tended to be restricted to small territories and were spoken by groups seldom numbering beyond the tens of thousands. To be sure, some languages expanded relatively quickly over vast areas through the demographic expansion and social domination of the peoples who used them. Such a process was usually propelled by some technological advantage, such as the crops and iron tools and weapons of the Bantus in Africa, the seagoing boats and navigational techniques of the Austronesians in the Pacific and Indian oceans,[4] and the horses of the early Indo-Europeans in Eurasia. But the expanding languages of such linguistic “spread zones” were simultaneously diverging into distinct dialects and then into separate languages, undermining any linguistic unity that the initial process seemed to promise. To sustain a single, standardized language over an area the size of most modern countries requires integrative mechanisms, which have usually relied on governmental power. In a word, most ethnonational states substantially built the languages that supposedly serve as their primordial glue.

This drive for political-linguistic coherence is relatively new. Not all states have historically sought to disseminate the language of their ruling elites. Pre-modern polities, especially large ones, were often strikingly multilingual. Elites and commoners, moreover, often spoke different languages. A clear example from the early modern period is the Grand Duchy of Lithuania, a vast and powerful polity whose leaders never sought to govern through their own tongue; even in their capital city of Vilnius/Vil’nya, “the dominant administrative language was Slavonic ruski, not Baltic Lithuanian.”[5] Many early states employed multiple languages of administration, often including “extinct” classical tongues of high prestige. Latin was the official language of multilingual Hungary until 1844. The earliest known polities of Southeast Asia apparently employed Sanskrit rather than their own tongues. Later, other languages gained prominent roles. The Burmese kingdom/empire of Pagan (849-1297), for example, seems to have used Burmese, Pyu, and Mon, employing the classical Indian language Pali, as well as Sanskrit, for religious purposes. Similar examples are legion in the ancient Near East, as recounted in Nicholas Ostler’s insightful Empires of the Word.[6] The Achaemenid Persian Empire (550-330 BCE), for example, used Elamite, a little-known but once important language, and then Aramaic as its main administrative language, downplaying the Old Persian of its ruling elite.[7] (In the empire’s grand stone inscriptions, the texts are written in Elamite, Akkadian, and Old Persian.) Ostler concludes that “the life and death of languages are in principle detached from the political fortunes of their associated states.”[8]

To be sure, some ancient empires did spread the language of their ruling class widely, generating something like a national tongue in the process. Latin was originally a minor language limited to a small area in west-central Italy, but by the fourth century CE it was spoken over most of the western half of the Roman Empire.[9] Similarly, Arabic was originally restricted to the central Arabian Peninsula, but after the Arab conquests that began in the seventh century and continued under the Umayyad Caliphate it spread over much of Southwest Asia and most of North Africa. In South America, Quechua was likewise disseminated far beyond its small homeland in the southern highlands of Peru by the Inca Empire, a process that continued under Spanish authority well into the eighteenth century.

Despite these precedents, a close fit of state with language is generally a feature of the modern world. To be sure, state-level centralization coupled with linguistic consolidation has proceeded in a fitful manner, but it eventually yielded scores of countries that are closely associated with their own language. This pattern mainly characterizes the Eurasian rimlands of Europe, East Asia, and mainland Southeast Asia. Let us now look at each of these regions in turn.

[1] Elie Kedourie, Nationalism. 1960. London: Hutchinson and Co. Page 68.

[2] Mark Baker, The Atoms of Language: The Mind’s Hidden Rules of Grammar. 2002. Basic Books.

[3] See the discussion in David Graeber and David Wengrow, The Dawn of Everything: A New History of Humanity. 2021. Farrar, Straus and Giroux.

[4] In many parts of their expansion zone, including Madagascar and Polynesia, the early Austronesians were the first people to settle the lands.

[5] Norman Davies, Vanished Kingdoms: The History of Half-Forgotten Europe. 2011. Allen Lane. Page 261. Ruski refers here not so much to Russia as to the language ancestral to modern Belarusian.

[6] Nicholas Ostler, Empires of the Word: A Language History of the World. 2005. Harper Perennial.

[7] Ostler, pages 47, 57.

[8] Ostler, page 63.

[9] Local languages, however, persisted in odd corners, as attested by the survival of Welsh and Basque, while the eastern half of the empire mostly used Greek and Aramaic/Syriac. The spread of Latin is attributed not merely to its administrative functions, but also to the emulation of local elites and the widespread experience of service in the Roman army. Its final fillip may have been the spread of Christianity, which reached deeper into everyday life than the empire ever had.