How to create an “exotic” language: Na’vi and Dothraki

Although Klingon is by far the best-known example of a “sci-fi language”, it is not the only one. Two more recent examples of languages artificially created for sci-fi and fantasy productions are Na’vi, the language of the aliens in the film Avatar, and Dothraki, a language created for HBO’s adaptation of George R.R. Martin’s epic fantasy tale, A Game of Thrones. Both languages were created by professional linguists: Na’vi by Paul Frommer, a professor at the University of Southern California and a linguistics consultant, and Dothraki by David J. Peterson, a professional language creator for whom this is the fourteenth constructed language.

As was the case for Klingon, the producers of Avatar and of A Game of Thrones wanted a language that is human-like enough to be manageable for the human actors and yet alien enough to fit the genre. To achieve that, both Frommer and Peterson turned to linguistic typology for answers. They looked for sounds and structures that are within the bounds of what is possible in a human language, but are at the same time fairly rare among the languages of the world.

One example of making an artificial language sound exotic by using an exotic linguistic feature is the use of the so-called ejective sounds in Na’vi. These are sounds like /p/, /t/, and /k/ but with a certain “spat out” quality to them. English does not have ejective sounds so it is difficult for an untrained English-speaker to pronounce them. But let’s consider how they are pronounced anyway.

To pronounce a “plain”, non-ejective version of, say, a /p/ we press the lips together, which prevents the air from escaping the mouth for a short moment, and then release abruptly, which lets the air explode out of the mouth (try it!). To say a non-ejective /t/ we use the same “close-and-release” strategy, but instead of pressing the lips together, we press the tip of the tongue at or behind the upper teeth; a non-ejective /k/ is achieved by the same “close-and-release” technique but pressing the back portion of the tongue against the soft palate (that’s the soft part at the back of the roof of the mouth).

Their ejective counterparts are pronounced in the same fashion, but in addition to creating a closure in the mouth (with the lips, tip or back portion of the tongue), the space between the vocal cords in the larynx (called the “glottis”) is closed and released as well. While it sounds exotic enough, we are familiar with this glottal closure in English too: we do it in the middle of uh-oh! (and it sounds like a moment of silence). In the case of ejective sounds, it is this additional closure of the glottis that creates the dramatic burst of air that distinguishes an ejective sound from a “plain” one. You can hear some ejective sounds pronounced here.

Do they sound exotic enough? As it turns out, quite a few of the world’s languages have such ejective sounds. In fact, 92 out of 567 languages in the World Atlas of Language Structures (WALS) have them (see map below). Perhaps the best known examples of languages with ejective sounds come from the Caucasus region, between the Black and Caspian seas. Languages from all four language families indigenous to this region have ejective sounds, including Abkhaz (Northwest Caucasian family), Ingush (Nakh family), Dido (Northeast Caucasian family), and Georgian (South Caucasian, or Kartvelian, family). Moreover, ejectives are found even in languages from other language families that are spoken in the region, most notably (some dialects of) Armenian and Ossetian (more on the latter in a later GeoCurrents post). Outside the Caucasus region, ejective sounds can be heard in Athabaskan, Siouan and Salishan languages of North America; Quechua and Aymara, spoken in Bolivia; Amharic, one of the major languages of Ethiopia; Hadza and Sandawe, two Khoisan languages spoken in Tanzania; their Khoisan relatives in southern Africa; and Itelmen, an endangered language spoken in Kamchatka.

The sound system of the Dothraki language from A Game of Thrones is also rather exotic; however, it is not because of specific sounds that it has, but rather because of a sound that it does not have. Specifically, Dothraki has a 4-vowel system which includes /i/, /e/, /a/ and /o/ sounds, but no /u/ sound. This sort of vowel inventory is very rare for a natural human language. In fact, most of the world’s languages have larger vowel systems, with 5 or more vowels. Thus, the World Atlas of Language Structures Online lists 288 languages (51%) as having 5 or 6 vowels and 183 languages (32%) as having 7 or more vowels (English is one of them). Only 93 languages (16%) have as few as 2 to 4 vowels.
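The proportions behind these figures can be checked with a few lines of Python. This is only a sketch: the counts are the ones quoted in the paragraph above, while the category labels and the helper name `shares` are made up here for illustration and are not WALS terminology or part of any WALS tool.

```python
# Language counts per vowel-inventory size, as quoted in the text above.
# The labels are informal descriptions chosen for this sketch.
vowel_counts = {
    "small (2-4 vowels)": 93,
    "average (5-6 vowels)": 288,
    "large (7 or more vowels)": 183,
}


def shares(counts):
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


vowel_shares = shares(vowel_counts)
for label, pct in vowel_shares.items():
    print(f"{label}: {pct}%")
```

Whatever the exact rounding, the point stands: only about one language in six in this sample gets by with four vowels or fewer.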

Most of the languages in this “small vowel inventory” category feature a 3-vowel system with /i/, /u/ and /a/, which can be schematized like this:

To pronounce the i-sound, the tongue moves to the top front part of the mouth (shown on the top left in these diagrams); for the u-sound, the tongue is in the top back part of the mouth (shown on the top right); and for the a-sound, the jaw is lowered and the tongue is at the bottom of the mouth (that’s why a doctor asks you to say ah rather than eeh or ouh when he wants a glimpse of your throat!).


Richer vowel systems normally include the three basic vowels: /i/, /u/, and /a/. Thus, a 5-vowel system would typically include those three basic vowels, plus the two intermediate vowels, /e/ and /o/, as for example in Spanish (see diagram on the left).



A 7-vowel system has the basic vowels /i/, /u/, and /a/, and also distinguishes a higher and a lower version of the e- and o-sounds, as in Brazilian Portuguese and some southern dialects of Italian (see diagram on the left).



An 8-vowel system, like the one used in Turkish (shown in diagram on the left), would include the basic vowels, /i/, /u/, and /a/ too. Even English, with its 15-20 vowels (depending on the dialect), includes the basic (long) /i/ (peel), /u/ (pool), and /a/ (father). The reason that human languages insist on having /i/, /u/, and /a/ is that these three vowels delineate the three corners of the vowel space: they are pronounced and perceived acoustically as the most distinct. Thus, the lack of the u-sound makes Dothraki quite unusual, as far as human languages are concerned.
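The “three corners” idea can be expressed as a simple set comparison. The following Python sketch is purely illustrative: the inventories are the ones described in this post, and the names `CORNER_VOWELS` and `missing_corner_vowels` are invented for the example.

```python
# The three "corner" vowels that delimit the vowel space.
CORNER_VOWELS = {"i", "u", "a"}


def missing_corner_vowels(inventory):
    """Return the corner vowels that a given vowel inventory lacks."""
    return CORNER_VOWELS - set(inventory)


# Vowel-quality inventories as described in the post.
inventories = {
    "typical 3-vowel system": ["i", "u", "a"],
    "Spanish": ["i", "e", "a", "o", "u"],
    "Dothraki": ["i", "e", "a", "o"],
}

for name, vowels in inventories.items():
    missing = missing_corner_vowels(vowels)
    status = "all corners present" if not missing else f"missing {sorted(missing)}"
    print(f"{name}: {status}")
```

Of the three inventories, only Dothraki comes back with a missing corner, namely /u/.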

The grammar of Dothraki also underscores another important point: what may seem exotic to speakers of one language (say, English) may not be exotic if a wider range of languages is considered. Take, for example, the English phrase the big house. To an English speaker it seems perfectly natural to place the adjective big before the noun house (and *the house big is perceived as a “word salad”). But as it turns out, many – in fact, most – languages of the world go with the opposite order. For example, the same phrase translates into Spanish as la casa grande. Other Romance languages vary as to which order they use and where. For example, in French most adjectives appear after the noun, as in Spanish (e.g. consider the expression “to have a carte blanche”, which English borrowed from French, where it literally translates to ‘card white’). Yet, other adjectives appear before the noun in French: thus, we speak of the Belle Époque, literally the ‘beautiful era’. And yet other adjectives can appear either before or after the noun, but with a different meaning: for example, while Napoleon can be characterized as un grand homme ‘a great man’, he cannot be truthfully called un homme grand ‘a large man’.

However, if we consider a bigger sample of human languages, as in the WALS data, we quickly realize that the noun-adjective order is far more common than the opposite, adjective-noun order, familiar from English. The former is found not only in Spanish (and to some extent in other Romance languages), but also in Hebrew and Arabic (Semitic), Malagasy and Rapanui (Austronesian), as well as Hixkaryana (spoken in northern Brazil). Altogether 768 languages in the WALS sample feature this word order, whereas the opposite, adjective-noun order is found in only 341 languages (with another 104 languages indeterminate in this respect). Thus, English is on the more exotic side in terms of adjective-noun order. And the Dothraki language, the creator of which aimed to make it as exotic as possible, actually exhibits the noun-adjective order, which is exotic to an English ear but not from a wider cross-linguistic perspective.
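To see how lopsided the word-order counts are, here is a rough Python calculation. It uses only the three figures quoted above; the dictionary keys and variable names are informal labels chosen for this sketch.

```python
# Counts of languages per adjective/noun order, as quoted in the text above.
order_counts = {
    "noun-adjective": 768,
    "adjective-noun": 341,
    "indeterminate": 104,
}

total = sum(order_counts.values())

# Share of each order in the sample, in percent (one decimal).
order_shares = {
    order: round(100 * n / total, 1) for order, n in order_counts.items()
}

# How many noun-adjective languages there are per adjective-noun language.
ratio = order_counts["noun-adjective"] / order_counts["adjective-noun"]

for order, pct in order_shares.items():
    print(f"{order}: {pct}%")
print(f"noun-adjective vs. adjective-noun: {ratio:.2f} to 1")
```

In other words, for every language in the sample that orders its adjectives the English way, more than two do it the Spanish (and Dothraki) way.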

[Readers interested in word order in Yoda-speak might want to read a piece written by my colleague Geoffrey K. Pullum in the Language Log]

  • Maju

    Doesn’t standard Arabic have only three vowels (/a/, /i/, /u/)… plus three long versions of the previous? When I tried to learn some Arabic years ago, I got that quite clear impression. Depending on dialect vowels can slide to a nearby mode (/i/ -> /e/ and such) but that doesn’t change the basic concept, just its real implementation.

    Do linguists hence consider long and short vowels distinct? I would not, sincerely, because they are at the same vertices of the vowel triangle, regardless of length.

    • Asya Pereltsvaig

      You are right about Arabic having three vowels (/a/, /i/, /u/) but in both short and long forms.

      More generally, yes, short and long vowels are distinct, as they can be used to distinguish meaning: for example, in Estonian sada means ‘hundred’ and saada means ‘to get’ (actually, Estonian makes a 3-way length distinction, with the medium length here meaning ‘send!’). The vowel diagrams in this post (and the WALS map provided) concern only the so-called “vowel quality” (i.e. the position of the tongue and the lips), but other features (known as “suprasegmental features”), including length, may be superimposed on the vowel quality to create more meaningfully distinct vowels. Besides length, the same vowel quality, say [a], may be pronounced nasal or non-nasal, with high/low/etc. tone, stressed or unstressed, and so on. All of these features allow a language to have a larger vowel inventory than the one provided by the vowel quality alone (so languages with 3-5 vowel qualities often rely on length, just like Arabic does).

      • Maju

        “The vowel diagrams in this post (and the WALS map provided) concern only the so-called “vowel quality” (i.e. the position of the tongue and the lips)”

        That’s why I asked, Asya, because the map suggests that Arabic (Egypt) is “average (5-6)”, when it only has three vowels, not counting length distinction.

        Of course I know that length and other variations like tone can change meaning and are important but do not make a different vowel as such, i.e. a different “vowel quality”.

        “so languages with 3-5 vowel qualities often rely on length”

        Well, not the ones I am familiar with: Basque, Spanish and Italian. All of them have a 5-vowel system that does not make any distinction of length – nor of any other aspect, except stress (in Romance languages only), which concerns the syllable as such more than the vowel. Ancient Latin, however, did make such a short-long distinction, from what I’ve read.

        • Asya Pereltsvaig

          Egyptian Arabic (a colloquial variety, not the Modern Standard Arabic one would learn at school) has six vowel qualities: /i, e, æ, u, o, ɑ/, three of them front, three back. That’s exactly what WALS lists there. And it is different from Modern Standard Arabic, which employs only three vowel qualities: /i, u, a/.

          And I didn’t say that *all* languages with up to 5 vowel qualities employ length, only that many of them do (in fact, it is languages with 3 vowel qualities that rely on length the most, I’d say).

        • John Cowan

          In fact, Italian has seven vowel qualities, as Spanish once did also until the close e and o broke to the diphthongs ie ue, allowing the open e and o to move up to their current close positions.

          • Maju

            Can you illustrate with examples? While I cannot really speak Italian, I am familiar with the language (my grandfather was Italian and my mother is half-so and born in Italy) and I can’t think of any example of what you say (I wonder if it’s some Sicilian dialectal variant or what: I can’t understand closed Sicilian speech but I have no big problem with standard Italian, especially as spoken in the North).

            Whatever the case, the map says of Italian “average: 5-6” and I think it is correct. I’d say it has also been the case with Castilian (Spanish) since “always” – the pronunciation of some consonants like “c” has probably changed, but otherwise… Were it not for Italian and Latin, I’d dare say it’s yet another Basque influence on Spanish, but Italian and Latin are 5-vowel, so surely not.

          • Asya Pereltsvaig

            It is my understanding that Standard Italian (the kind one would learn in school abroad) has a 5-vowel system not unlike what is shown for Spanish in the post. It is southern dialects (perhaps Sicilian among them, but I’d have to check some sources to say for sure) that have a seven-vowel system.

          • Maju

            Wikipedia claims 7 vowels (two variants of e and o). Suspecting the claim might be biased, I headed for the Italian Wikipedia, which holds the same notion. So I guess we have to concede to what John says.

            The differences are however not written in any way and don’t seem to indicate any distinction of meaning but rather they appear to be more comfortable variants of a single letter, depending on the surrounding consonants and such.

          • Asya Pereltsvaig

            John and I are talking about vowel phonemes, that is vowels that serve to distinguish meaning and are not determined contextually. In Standard Italian the situation is a bit more complicated though, as the more open and the more close versions of e and o are distinguished in stressed syllables and the distinction disappears in unstressed syllables. Whether this distinction is to be treated as a phonemic one or not is a matter that can be debated among phonologists and the decision depends in part on various theoretical assumptions one makes.

          • John Cowan

            Asya: As you know, it’s common for there to be fewer vowel phonemes in unstressed syllables: in standard Russian there are only /a/, /i/, and /u/, and after palatalized consonants only /i/ and /u/.  There are exceptions for foreign words and certain endings; see

          • Asya Pereltsvaig

            Of course.

  • Anonymous

    Martin, off topic but I thought this may be of interest

    The Long-run Effects of the Scramble for Africa

    PHIL D

    • Martin W. Lewis

      Many thanks! An interesting, and concise, article with great maps.

  • Britton Watkins

    Lì’upam aylì’fyayä alu Seylìsyì lu stxong nìfya’o a heiek to pum leNa’vi nìwotx. The pronunciation of the Salishan languages is far more fascinating and exotic than that of Na’vi.

    • Miles Rout

      In your opinion. Remember that Na’vi was also required to be pronounceable, with little training, by English-speaking actors.

      • Asya Pereltsvaig

        Great point! Given how badly many English-speaking actors pronounce other (human) languages, it’s not an easy task.

    • Asya Pereltsvaig

      I am not sure it’s a fair statement: after all, how do we measure “fascinating” and “exotic” objectively? Personally, I am a big fan of the way languages of the Caucasus sound; more on that next week.

  • Stefano Lazzaro

    1 – Italian ‘e’ & ‘o’

    In Standard/Neutral Italian the distinction between /e, ɛ/ and /o, ɔ/ is lost in unstressed syllables (as a consequence of half-opening or half-closing, which yield two intermediate vowel qualities). As far as stressed syllables are concerned, however, the distinction between /e, ɛ/ and /o, ɔ/ is in fact phonemic, i.e. there exist minimal pairs. For instance,


    1) /ˈpɛska/ (peach-F.SG), /ˈpeska/ (fishing-F.SG);

    2) /ˈfosse/ (be-IMPF.SBJV-3SG), /ˈfɔsse/ (pit/grave-F.PL);

    3) /korˈresse/ (run-IMPF.SBJV-3SG), /korˈrɛsse/ (correct-PRET-3SG).

    Contrary to what Maju suggested, these differences are not “more comfortable variants of a single letter [=grapheme], depending on the surrounding consonants”.

    It might be true that, as Asya wrote, “whether this distinction is to be treated as a phonemic one or not is a matter that can be debated among phonologists and the decision depends in part on various theoretical assumptions one makes.” Nevertheless, I am not aware of any theoretical generalization that, within the domain of stressed syllables, allows us to treat /e, ɛ/ and /o, ɔ/ as phonetic realizations of only two phonemes and confidently make empirically sound predictions regarding their occurrences. Can anyone provide useful references?

    (An above-average sophisticated descriptive
    analysis of Italian phonology and phonetics can be freely downloaded from

    PART 2 – Sci-fi languages: the Engineer language in Prometheus

    There have been discussions about the sci-fi language created by a team of linguists from SOAS and used in the movie Prometheus, directed by Ridley Scott, as a language spoken by an alien species called Engineers. An interesting exchange of ideas can be found on Language Log, where many comments (including some comments by Anil Biltoo, one of the linguists who created the language) have been posted below Mark Liberman’s article Proto-Indo-European in Prometheus?

    While the discussions about issues related to
    Historical Linguistics are fascinating, nowhere could I find a comment about a
    peculiar fact I noticed when I first watched the movie.

    The Engineers are believed to have created
    humans. A team of scientists is hired by a powerful corporation to travel into
    outer space in search of the Engineers’ home planet. During the voyage an
    android studies ancient human languages in order to reconstruct the language
    supposedly spoken by the Engineers when they created humans and originally
    spoken by humans themselves.

    The underlying assumption is that the Engineers’
    language has remained roughly the same for thousands of years – whereas during
    the same time span it has undergone profound changes and diversification within
    human society.

    • Asya Pereltsvaig

      Thank you, Stefano, for your comments. I might have been wrong to say that the distinction between high-mid and low-mid vowels is not phonemic in standard Italian. Perhaps it is phonemic in standard Italian but not in some other dialects.

      And thank you for the interesting point about the Engineers’ language. If indeed it remained unchanged over millennia, that makes it very different from human language!

      • Stefano Lazzaro

        Asya: The
        pronunciation of Standard Italian ‘e’ and ‘o’ does vary considerably among
        speakers as a result of their different dialectal backgrounds. In the
        north-eastern area of Italy where I grew up, for instance, my example 1) above
        is almost invariably pronounced /ˈpeska/ and the actual meaning is derived from
        context. Nevertheless, to my ear it would sound quite odd to use one and the
        same pronunciation for other Standard Italian minimal pairs such as those in my
        examples 2) and 3).

        It would be useful to come up with a theoretical generalization as powerful as the
        one I mentioned in my previous post – hence my question about useful references
        on the subject (the scientific literature on Italian phonology that I am
        familiar with always presents Italian as a system with 7 vowel phonemes).

        The Vowel
        Quality Inventories map on WALS shows no data for the Italian language. But in
        the relevant chapter Ian Maddieson writes that “large vowel inventories in some of the other languages in this area [Europe]
        came about (in part, at least) as a
        result of earlier distinctions between sets of long and short vowels being
        transmuted into contrasts of vowel quality. This occurred (subject to other influences
        as well) in English, German and Italian, amongst others.” According to the
        legend on the map, a large vowel
        inventory comprises 7 – 14 elements. Therefore, we can infer that Italian is
        classified as a system with 7 phonemic vowels – it is certainly not larger.

        off topic:

        While searching
        the WALS website for official data on Italian, I also noticed that the chapter
        on Polar Questions contains the following statement: “The sixth type shown on the map involves the same words, morphemes and
        word order as the corresponding declarative sentence, but with a distinct
        intonation pattern as the sole indication that it is a question. An example of
        such a language is colloquial Italian (Maiden and Robustelli 2000: 147)”. It
        seems to me that the use of the term colloquial
        may be rather confusing. If colloquial
        Italian simply means spoken Italian (as opposed to the written form of the
        language where one can see question marks), the statement is correct. But if
        colloquial means what it usually means, i.e. informal, then a misleading distinction
        is drawn because interrogative intonation is the sole feature that
        differentiates polar questions from declaratives, in both informal and formal (spoken) Italian.

        I apologize for writing long posts on this minor subject.

        • Asya Pereltsvaig

          Thanks for your contributions, Stefano!

          I am not entirely sure what the WALS authors meant by colloquial Italian, as the term is ambiguously used in linguistics.

          As for the vowel phoneme inventory in Italian, I don’t have any references off the top of my head, but I think that standard Italian has a 7-vowel system, but certain dialects reduced it to five.