Novgorod Birch Bark Documents, Second Slavic Palatalization, and the Wave Model—The ‘Whole’ Story

March 27, 2013  

An earlier GeoCurrents post examined birch bark documents from Veliky Novgorod, Russia. With letters scratched into the inside surface, these scraps of birch bark, well preserved in waterlogged soils near Lake Ilmen, contain a wealth of information for historians and linguists alike. One of the most fascinating puzzles of Slavic historical linguistics was posed by birch bark document #247. It is the oldest birch bark document discovered to date, dating from 1025-1050 CE, which makes it older than the Ostromir Gospels, the second-oldest extant Russian book (it was considered the oldest before the Novgorod Codex was discovered in 2000). This document was unearthed early on, in 1956, but for a long time its interpretation was the subject of fierce debate. Particularly mystifying was the second line, given in English transliteration below:


As mentioned in a previous post, the writing system used for birch bark letters did not employ spaces between words or punctuation, so figuring out where one word ends and another one begins is one of the first tasks of those who try to decipher these documents. In the early years after the discovery of this document, the widely accepted analysis of this line was to break it down as follows (punctuation likewise added for clarity):


The string KѢLEA/KѢLѢA was interpreted as meaning ‘of the room’, making the whole line translatable as ‘and the lock of the room, the doors of the room, the master…’. However, analyzed this way, the sentence is very odd indeed. First, two phrases have subjects but no predicates: the lock of the room what? the doors of the room what? The rest of the document reads as a description of a crime. The second problem concerns the word KѢLEA/KѢLѢA, which was initially interpreted as a misspelled version of KELЬѢ, the genitive singular form of the word KELЬA ‘room’. Certainly, in the context of the words ‘lock’ and ‘doors’ this interpretation appeared sensible, but nonetheless it would later be proven wrong. The misspelling hypothesis, however, was immediately suspect, as misspellings, though not unknown, are relatively rare in birch bark documents. As far as document #247 is concerned, this analysis implied that the word was misspelled twice in two different ways. This is odd for several reasons. First, it means that both times the writer made three errors in a five-letter word, getting all the vowels wrong. Given that the spelling of the time represented pronunciation fairly closely, this possibility seems unlikely. Second, the two instances of the string KѢLEA/KѢLѢA are separated by merely one word, while alternative spellings are more typically found farther apart from each other. Third, as the author did not misspell any other words, is it then reasonable to assume that he (or she?) stumbled over this particular word? If so, why? Finally, phrases in narrative birch bark letters often start with the word A ‘and’ (a comparison to the narrative text of the Bible suggests that this is a common pattern well beyond Old Russian). If the mystery line is broken down as shown above, only the first phrase, the one about the lock, starts with A ‘and’, but not the other two phrases, those about the doors and the master (of the house).

This last problem indicates the probable solution: the line should be broken down as follows:


Now each of the three phrases starts with the word A ‘and’, and the first two phrases acquire a predicate, the same predicate, in fact: KѢLE/KѢLѢ. The two spellings of this word, which differ in the last vowel (E or Ѣ), are no longer a problem, as they reflect different case/number endings in agreement with their subjects: the ending -E is the nominative singular masculine ending agreeing with the noun ZAMЪKE ‘lock’ (as in ŽIZNOBOUDE, POGOUBLENE, NOVGORODSKE, and SMЬRDE, in document #607/562, discussed in an earlier post), whereas the ending -Ѣ is the nominative plural feminine ending showing agreement with the noun DVЬRI ‘doors’. While the forms of KѢLE/KѢLѢ now make sense, the meaning of this word was still baffling, especially because of the first consonant. In both Modern Russian and Old Russian documents from other areas, this root is pronounced as [tsel]. Ironically, the English cognate, whole, helps shed some light on the mystery: the word in birch bark document #247 means exactly that, making the line ‘and the lock (is) whole (=unbroken), and the doors (are) whole, and the lord (of the house)…’. The letter in its entirety denies previous reports of a break-in/robbery. (Note that in Old Russian the present tense copula ‘is/are’ was omitted, as it is in Modern Russian.) To relate the Old Novgorod KѢLЪ (the nominative singular masculine form) to its Modern Russian and English cognates, we need to examine the sound changes that resulted in the correspondence between Novgorod [k], [ts] elsewhere in Russian, and English [h].
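The segmentation problem discussed above can be made concrete with a small sketch. The test string below is not the transcription of document #247; it is an assumption, stitched together from the words quoted in the text, and it stops at the A that could either close the second KѢLѢA or open the third phrase. The word list is likewise a hypothetical mini-lexicon containing only those quoted forms.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words drawn from `lexicon`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for word in lexicon:
        if text.startswith(word):
            for rest in segmentations(text[len(word):], lexicon):
                results.append([word] + rest)
    return results


# Hypothetical mini-lexicon: only the forms quoted in the discussion above.
LEXICON = ["A", "ZAMЪKE", "DVЬRI", "KѢLE", "KѢLѢ", "KѢLEA", "KѢLѢA"]

# Assumed test string built from those forms, written without spaces
# the way birch bark letters were (NOT the actual text of document #247).
LINE = "AZAMЪKEKѢLEADVЬRIKѢLѢA"

ALL = segmentations(LINE, LEXICON)

# Heuristic from the text: narrative phrases tend to start with A 'and',
# so prefer the segmentation with the most occurrences of A.
BEST = max(ALL, key=lambda seg: seg.count("A"))

for seg in ALL:
    print(" ".join(seg))
print("preferred:", " ".join(BEST))
```

Both the early analysis (with KѢLEA and KѢLѢA as single words) and the later one (with A split off) come out as valid splits of the same string; only a further criterion, such as the tendency of phrases to begin with A, picks between them.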

Despite some recent claims to the contrary (e.g. Greenhill & Gray 2012: 525), in comparative linguistics “the distinction between innovations and retentions” is not “an outcome of subgrouping hypotheses”. Instead, innovations can be identified solely based on the type of change involved, as many phonological (and grammatical) changes work only “one-way”, with reverse changes unattested. Two types of such “one-way” sound changes well-documented in numerous languages around the world are lenition and palatalization. Lenition (or weakening), which may involve (some combination of) voicing, spirantization (i.e. change from stop into fricative), or deletion of consonants, is very common cross-linguistically, while the reverse process, known as fortition, is much rarer (especially in intervocalic position). Consider, for example, the development of the intervocalic consonant of the Vulgar Latin vita ‘life’ in its Romance descendants: Italian retained the voiceless stop (vita), while Portuguese voiced the stop (vida), (European) Spanish both voiced the consonant and turned it into a fricative (pronounced [viða]), and French deleted the consonant altogether (vie).

Lenition, that is, a change from a stop /k/ into a fricative /h/, explains what happened in the development of the English whole. Despite the spelling with wh-, this word does not derive from an earlier form that had a hw- or hv- in the beginning, as is the case with many other wh-words in English (e.g. what derived from the Old English hwæt and whale derived from the Old English hwæl; compare with the Norwegian cognates hva and kval, respectively). The spelling of whole with wh- developed in the early 15th century. The etymological source of whole is the Old English form hal meaning ‘entire, unhurt, healthy’, which can be traced to Proto-Germanic *khailaz meaning ‘undamaged’. Cognate words from other Germanic languages (all deriving from the same form) include the Old Saxon hel, Old Norse heill, Old Frisian hal, Middle Dutch hiel, Modern Dutch heel, and Modern German heil meaning ‘salvation, welfare’. An even older source is the Proto-Indo-European *koylo-/*koilas. The change from /k/ to /h/ happened as Proto-Germanic developed out of Proto-Indo-European (in fact, the initial change was from /k/ to /x/, with an additional change from /x/ to /h/, another instance of lenition, applying later). This change was part of a larger chain of phonological changes in the history of Germanic languages known as the First Germanic Sound Shift, or Grimm’s Law, named after Jacob Grimm (1785–1863), of the Brothers Grimm fame, who in 1822 elaborated on the earlier discovery of this law by Rasmus Rask. Additional examples of words affected by Grimm’s Law include foot (cf. Latin pēs, pedis), third (cf. Latin tertius), and heart (cf. Latin cor, cordis).
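The two-step lenition described above (/k/ to /x/, then /x/ to /h/) and its sister changes can be written out as a pair of tiny rewrite rules. This is only a sketch of the voiceless-stop part of Grimm’s Law applied to a word-initial consonant; the function names are made up for illustration, and the Latin consonants simply stand in for the unshifted Proto-Indo-European sounds.

```python
# Voiceless stops become fricatives (one part of Grimm's Law).
GRIMM_VOICELESS = {"p": "f", "t": "θ", "k": "x"}


def grimm_shift(consonant):
    """Apply the stop-to-fricative shift to a single consonant."""
    return GRIMM_VOICELESS.get(consonant, consonant)


def later_lenition(consonant):
    """Later weakening of /x/ to /h/; other sounds are left alone."""
    return "h" if consonant == "x" else consonant


def germanic_onset(consonant):
    """Chain the two changes in their historical order."""
    return later_lenition(grimm_shift(consonant))


# Initial consonant of the Latin cognate, paired with the English word.
# Latin <c> is written here as the sound /k/.
PAIRS = [("p", "foot"), ("t", "third"), ("k", "heart"), ("k", "whole")]

for latin_onset, english in PAIRS:
    print(latin_onset, "->", germanic_onset(latin_onset), "as in", english)
```

The order of the two functions matters: /x/ to /h/ has nothing to apply to until the stop /k/ has first become the fricative /x/.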

In addition to lenition, palatalization of velar sounds such as /k/, /g/, and /x/ is another sound change that happens to work “one-way”. This change, which can turn /k/ into /s/, /ʃ/, /ts/, or /tʃ/, is also widely attested cross-linguistically, while the reverse process is unknown. English pairs such as kirk vs. church, dike vs. ditch, and skirt vs. shirt are witnesses of palatalization from the earlier Germanic /k/ into /tʃ/ or /ʃ/. In the Romance grouping, palatalization turned the /k/ of Vulgar Latin into /ʃ/ in French; hence, the Vulgar Latin causam ‘thing’ and cattum ‘cat’ became the French chose and chat (this change happened only before /a/, so that the /k/ in Vulgar Latin corem ‘heart’ was retained in the French coeur). In Slavic languages, two (or possibly three) waves of palatalization occurred, producing different outcomes and affecting different sets of languages. The wave that is of interest to us here is the so-called Second Slavic Palatalization, which turned /k/ into /ts/ (as well as /g/ into /dz/ or /z/ and /x/ into /s/). According to historical linguists, this change was necessitated by an earlier change that turned /aj/ into /e:/. Slavic languages have a phonotactic (from “phono” = sound, and “tac” = attach) constraint that does not allow velar consonants to appear next to the vowels /i/ and /e/. These vowels are pronounced with the tongue in a fronted position, so it is easier to pronounce them next to a consonant that is likewise pronounced with the tongue in a fronted position, whereas velar sounds are pronounced with the tongue further back in the mouth. (Note that such a constraint is not universal, as English speakers manage words like keep or even kick perfectly well.) In the case of ‘whole’, the Proto-Slavic form was *kajlu ‘whole, healthy’, with a diphthong /aj/ as in the Proto-Indo-European form *koylo-/*koilas.
The vowel change turned *kajlu into *kēlu (the macron over a vowel means it’s pronounced as long), in which the velar /k/ appeared next to a front vowel /e:/ in violation of the phonotactic constraint mentioned above. Then, Second Slavic Palatalization applied turning this word into *tsēlu, which in turn produced the Modern Russian целый [tselyj] and the Modern Polish cały.
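The derivation just described is a matter of applying two rules in a fixed order, which a short sketch can make explicit. The rules below are deliberately simplified assumptions: the long vowel is written `e:` as in the /e:/ notation above, and the palatalization rule only looks at a velar immediately followed by /e/ or /i/.

```python
import re


def monophthongize(form):
    """Earlier change: the diphthong /aj/ becomes the long vowel /e:/."""
    return form.replace("aj", "e:")


def second_palatalization(form):
    """Second Slavic Palatalization (simplified): velar before /e/ or /i/."""
    outcomes = {"k": "ts", "g": "dz", "x": "s"}
    return re.sub(r"[kgx](?=[ei])", lambda m: outcomes[m.group()], form)


def derive(form, rules):
    """Apply the rules in order and keep every intermediate stage."""
    stages = [form]
    for rule in rules:
        stages.append(rule(stages[-1]))
    return stages


# Most of Slavic: both changes, in their historical order.
general = derive("kajlu", [monophthongize, second_palatalization])

# Old Novgorod: the vowel change applies, the palatalization does not.
novgorod = derive("kajlu", [monophthongize])

# Reversed order: palatalization finds no front vowel to react to.
reversed_order = derive("kajlu", [second_palatalization, monophthongize])

print(" > ".join(general))
print(" > ".join(novgorod))
print(" > ".join(reversed_order))
```

The reversed run shows why the palatalization is said to depend on the vowel change: while the word still has /aj/, the velar stands before a back vowel and the rule has nothing to apply to.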


The Second Slavic Palatalization affected all Slavic languages, including East Slavic (e.g. Russian), West Slavic (e.g. Polish) and South Slavic (e.g. Bulgarian).* The only Slavic variety that is exceptional in this regard is the Novgorod dialect of Old Russian: recall that it had KѢLЪ ‘whole’ with a /k/ rather than /ts/. Until the discovery of the relevant birch bark documents, the accepted view was that the Second Slavic Palatalization happened in Proto-Slavic, the ancestor of all Slavic languages, before it split into the daughters from which the three geographical groupings (East, West, and South Slavic) arose. But the unearthing of the birch bark document with the form KѢLЪ, as well as the discovery of some other documents with additional forms further supporting the absence of Second Slavic Palatalization in Novgorod, threw a spanner into the works.

Several hypotheses have been proposed to explain the exceptionality of the Novgorod dialect. According to the birch-bark-letter expert Andrey Zaliznjak, the Old Novgorod dialect must have split off from the common Slavic tree before the Second Slavic Palatalization applied. But this theory is problematic, as it would make the Old Novgorod dialect far more distinct from its other East Slavic brethren than it really is. According to Zaliznjak’s theory, we would not expect speakers of Modern Russian to be able to get the gist of these birch bark missives, contrary to fact. Moreover, apart from the non-application of the Second Slavic Palatalization, there does not seem to be any evidence for an early split of the Old Novgorod dialect.


Another hypothesis relies on the idea of substrate influence, that is, the influence from the language of a neighboring group with whom the speakers of the Old Novgorod dialect were in contact. The groups living in the proximity of Veliky Novgorod were the now-lost “middle Finns”, who spoke now-extinct Finnic languages such as Merya, Meschera, or Murom (and who, at the time birch bark document #247 was written, were beginning to be acculturated into the Russian ethnos), and speakers of the (barely) surviving Finnic languages such as Votic and Veps. As can be seen from the map on the left, the “lost middle Finns” lived immediately to the east of Veliky Novgorod, which is located just north of Lake Ilmen (find the northern tip of the Slavic speaking territory and you will see a small light blue patch, Lake Ilmen). Unlike Slavic languages, Finnic tongues do not have the phonotactic constraint that disallows velar consonants next to front vowels, making the substrate theory possible. However, the influence of Finnic-speaking neighbors in the case of the non-application of the Second Slavic Palatalization in Novgorod is difficult to prove beyond reasonable doubt, as is typically the case with substrate theories. It is like the problem of finding a black cat in a dark room, except in this case we do not know what animal we are looking for and we are not even sure which room we should search. In a typical situation, there exists a lag in time between the coexistence of the two languages in the same area and the penetration of the substrate’s features into the superstrate language. Given that, it is not clear if the influence of Finnic languages should be observed in East Slavic (vis-à-vis West and South Slavic), in Russian (vis-à-vis other East Slavic languages), or in northern dialects of Russian.
The matter is further complicated by the fact that the languages of the “middle Finns” are themselves extinct; as a result, the rules or properties they might have had are not known directly but can only be reconstructed on the basis of other Finnic languages.


An alternative solution to the abovementioned conundrum (that the Old Novgorod dialect is simultaneously the first tongue to split off the Slavic tree and a member of the East Slavic grouping) is to abandon the tree model of language divergence completely, or at least to supplement it with another model, such as the wave model. According to this theory, innovations do not create sharp divergences of languages but rather spread through related linguistic varieties like waves on the water. It has been observed that innovations often start in the geographic core of a given language family and spread outwards, sometimes leaving the most peripheral areas unaffected. One example of such an innovation happens to be an instance of palatalization as well: it created the distinction between centum and satem languages, named after the word for ‘hundred’ in two representative languages, Latin and Avestan. Centum languages, including those in the Germanic, Romance, and Celtic branches, retained a velar sound of Proto-Indo-European; in contrast, satem languages, including Indo-Iranian, Balto-Slavic, Armenian, and Albanian, turned the Proto-Indo-European velar into an /s/. Originally, it was thought that this distinction offered proof that Proto-Indo-European had split into two daughter languages, corresponding to a west-east division, but this picture was called into question by the discovery of Tocharian, which was spoken in the east but was nevertheless a centum language. If we hypothesize that Tocharian descended from the western, centum daughter of Proto-Indo-European, its geographic location becomes difficult to explain.
A more adequate view is that the innovation that led to the emergence of satem languages affected only the centrally-located branches of the family but not those at the periphery of the Indo-European realm, both to the west and east of the core.
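The core-versus-periphery logic of the wave model can be illustrated with a toy simulation. Everything specific in it is an assumption made for the sake of the sketch: the branches named above are lined up along a single schematic west-to-east axis, the innovation is arbitrarily started at one central position, and it spreads to immediate neighbors for a fixed number of steps. It is not a claim about where or how satemization actually began.

```python
def spread(varieties, origin, steps):
    """Spread an innovation from `origin` to adjacent varieties, step by step."""
    affected = {origin}
    for _ in range(steps):
        frontier = set()
        for i in affected:
            for j in (i - 1, i + 1):  # immediate neighbors on the line
                if 0 <= j < len(varieties):
                    frontier.add(j)
        affected |= frontier
    return affected


# Schematic west-to-east ordering (an illustrative assumption only).
BRANCHES = ["Celtic", "Romance", "Germanic", "Balto-Slavic",
            "Armenian", "Indo-Iranian", "Tocharian"]

# Hypothetical starting point in the middle of the chain, one step of spread.
hit = spread(BRANCHES, origin=4, steps=1)

satem = [b for i, b in enumerate(BRANCHES) if i in hit]
centum = [b for i, b in enumerate(BRANCHES) if i not in hit]

print("innovating (satem):", satem)
print("unaffected (centum):", centum)
```

With a limited number of steps, the unaffected varieties end up at both ends of the chain, which is the pattern a two-way tree split has trouble accommodating.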

In the case of the Second Slavic Palatalization, the change must have started somewhere to the southwest of Veliky Novgorod and spread through Slavic languages, yet left the far northern reaches of the East Slavic speaking realm unscathed. (Note that the Russian expansion further north, which led to the emergence of northern dialects such as that of the Pomors, did not start until the twelfth century, well after the Second Slavic Palatalization must have applied.)



*The only difference between the various Slavic languages with respect to the Second Slavic Palatalization concerns its application in words where the velar and the front vowel are separated by a glide /w/. In such words, the Second Slavic Palatalization applied only in South Slavic languages (including Old Church Slavonic), but not in West or East Slavic languages. Modern Russian, apparently, borrowed words where /kw/ had been replaced by /tsv/ from Old Church Slavonic; hence, in Modern Russian we have the stem [tsvet] ‘flower’, whereas Polish and Czech have forms with the original /kw/ (kwiat and květ, respectively). Other East Slavic varieties, like West Slavic languages, retained the /kw/: Ukrainian (kvitka), Belarusian (kvetka) and even some Russian dialects (kvet).



  • Barbara H Partee

    Interesting! And have you discussed your alternative hypothesis with Zalizniak? Has he reacted anywhere? (I just saw from one of your links that you wrote about this on your other blog in 2011.) I just saw him yesterday, at a wonderful film evening devoted to interviews and fieldwork expedition films of the late Aleksander Evgenevich Kibrik. Today I sent a link to your first installment to him and Elena Paducheva, in case they hadn’t seen it. I guess I’d better send them a link to this one, too!

    • Asya Pereltsvaig

      Thank you for sending them the links, Barbara! I don’t know Zalizniak personally, alas, though I’d love to meet him in person some day or at least over email or in this blog. I would love to know what he thinks about my hypotheses, and in general I am very happy to publicize their work, which I find fascinating. If you speak to him, please encourage him to write me if he would (and it’s okay to share my email with him). Thank you!!!

  • German Dziebel

    “A more adequate view is that the innovation that led to the emergence of satem languages affected only the centrally-located branches of the family but not those at the periphery of the Indo-European realm, both to the west and east of the core.”

    As far as I know, satemization was a convergent process in several branches of IE, hence am not sure the core vs. periphery distinction fully applies here. It may just be another coincidence (west/east or core-periphery doesn’t matter). Tocharian is a heavily palatalized language (more palatalized than satem languages, it seems) but in different ways. Hence, palatalization may have been a more ancient trend in eastern Indo-European languages that affected different branches independently.

    Overall, Novgorod KѢLЪ ‘whole’ is fascinating. But, truth be told, there’re many examples of the violation of palatalization rules in Slavic. E.g., at the earlier stage, *svekru ‘father-in-law’ retained -k- while Lith sesuras attests for k’ there, so Slavic should have svesru**. And it’s not the only example. Many explanations have been ventured but I don’t think it has been resolved. My fear is that KѢLЪ will remain a thorny question.

    • Asya Pereltsvaig

      Thanks for bringing up ‘father-in-law’: what are some of the reflexes of this root in other Indo-European languages, I am curious?

      • German Dziebel

        Gk ‘hekuros, Lat socer, Skrt svasura, OHG swehur, Gothic swaihra, Alb vjeherr, Arm skesur. Skrt and Lith attest for k’, Alb and Arm are aberrant. The Albanian form should be vjether. The Armenian one suggests that k > s transition happened very late in Armenian history, independently of the same transition in Lithuanian and Old Indic, because if it was part of the eastern satemization process, then -s- would have been regularly lost in early Armenian between vowels.

        • Asya Pereltsvaig

          Fascinating! So I take it Albanian, Armenian and Lithuanian still use this word? What do Romance languages have for in-laws, I wonder?

          • German Dziebel

            Spanish suegro, Italian suocero, Portuguese sogro.


          • David Marjanović

            Modern German has Schwager (clearly somebody had fun with Verner’s Law) “brother-in-law” and Schwägerin “sister-in-law”, and the other in-laws are constructed with Schwieger- “-in-law”, like Schwiegervater “father-in-law”.

  • TimUpham

    Are not the descendants of the Volga Finns today’s Mari and Mordvin peoples? That was the reason why I questioned if any of the birch bark documents were in Mordvin or a closely related Finno-Ugric language. When the Slavs expanded, they took on the place names of the Finno-Ugric peoples. The name of the Neva River comes from the Finno-Ugric “nevo,” meaning “sea.” The Finno-Ugric people living there now are the Ingrians, of whom only 500 are left who still speak the Ingrian language.

    • Asya Pereltsvaig

      Yes, Mari and Mordvin are related to those languages of the “lost Finns” that have been acculturated. Why some Finnic groups were acculturated and others not, that’s an excellent question.

      • TimUpham

        Whichever language became the language of trade and commerce determined its viability. When the Saami took on their Finno-Ugric language, that was the language they were trading in. The same goes for the Baka (Pygmy) people: they were trading in the Bantu languages, so that became their native language. Those in the Mbuti Forest still retain elements of the original Baka language.

  • Jonathan

    Late to the game, but were it a wave that swept most Slavic languages, would we be likely to find that it had influenced some of their neighbors as well?

    • Asya Pereltsvaig

      Good question, Jonathan. In theory it can, but it would be hard to prove. Palatalization of velars is so common cross-linguistically that it’s hard to tell if it applies in a neighboring language independently or as part of “a wave that swept most Slavic languages”, as it won’t apply in exactly the same way…