The Architecture of Words

Generating a vocabulary for an invented language is a stupendous task. For basic functionality, a natural language probably needs about 5,000 to 10,000 words; a language that’s fully capable of dealing with all sorts of specialized and technical topics may have upwards of 100,000 words.

Naturally, when I started work on neo-Khuzdul, I did not have such a complete vocabulary. Nor, after many years, does such a vocabulary exist today. What I created instead was something like what Tolkien had done with his Elvish languages; instead of making a dictionary of thousands of words, he created a system by which new vocabulary items could be generated based on existing words and “roots” — the sounds which carry the fundamental meaning of each word.

But Khuzdul was very different in structure from Elvish. In this, as with the phonology of Khuzdul, Tolkien had left clues that I was bound to follow. In The Lord of the Rings itself, Tolkien had had little to say about Khuzdul as a language; only that it was a “strange tongue, changed little by the years,” and “a tongue of lore rather than a cradle-speech” which few had succeeded in learning. We recall that, at the West-gate of Moria, Gandalf speculates that it will be unnecessary for him to ask Gimli for “words of the secret dwarf-tongue that they teach to none.” Obviously, even Gandalf is not a master of Khuzdul!

A vital clue about the nature of Khuzdul came from a text that is not really concerned with Khuzdul or the Dwarves at all. This is “Lowdham’s Report on the Adunaic Language” (in Sauron Defeated, pp. 413-440). This text purports to be a description, by the fictional character Alwin Arundel Lowdham, of the languages of Númenor, an “Atlantis” of a distant, semimythical past, which he has been able to view by entering into the experiences of his remote ancestors (who may include Elendil of Númenor!). The machinery by which all this is justified is extremely complex, and is described in The Notion Club Papers (also in Sauron Defeated); it also raises all sorts of intriguing issues which are beyond the scope of this particular discussion. Suffice it to say that Lowdham provides some elementary grammatical information about the primary language of Númenor, Adûnaic (or, as he spells it, Adunaic), and contrasts it with the Elvish or “Nimrian” languages; nimir is the Adunaic word for “elf.”

Adunaic, Lowdham speculates,

came under some different influence [than Elvish]. This influence I call Khazadian; because I have received a good many echoes of a curious tongue, also connected with what we should call the West of the Old World, that is associated with the name Khazad. Now this resembles Adunaic phonetically, and it seems also in some points of vocabulary and structure; but it is precisely at the points where Adunaic most differs from Avallonian [sc. Quenya] that it approaches nearest to Khazadian.

Lowdham does not identify Khazadian specifically as a language of Dwarves, doubtless because he does not know; his psychic information is largely focused on language-details, and occasionally visions of manuscripts, with other aspects of the visualized culture being scant or absent. But there is no doubt that, within Tolkien’s mythology, Khazad refers to the Dwarves and that Khazadian is Khuzdul.

This is all well and good, but one would like to know more precisely in what points of structure “Khazadian” resembles Adunaic. Lowdham happily comes through:

The majority of the word-bases of Adunaic were triconsonantal. This structure is somewhat reminiscent of Semitic; and in this point Adunaic shows affinity with Khazadian rather than Nimrian.

No more is said about “Khazadian” in this text, but this is enough. It echoes, somewhat obliquely, a comment made by Tolkien in a letter to Naomi Mitchison: that the Dwarves were “like Jews… speaking the languages of the country [i.e., of whatever country they happen to be living in], but with an accent due to their own private tongue.” It’s difficult to reconcile this with Tolkien’s statement that Khuzdul was “a tongue of lore rather than a cradle-speech”; and in reality it’s more likely that the pronunciation of Hebrew was influenced by “the languages of the country” than the other way around, and that whatever accents the Jews of Tolkien’s acquaintance had more likely came from Yiddish than from Hebrew.

But to return to the main point: it seemed evident that Tolkien intended Khuzdul to be somewhat Semitic in structure, particularly as regarded the system of roots. The Semitic language family is a large but fairly tightly-knit group of languages found mostly in the Middle East. Its representatives with the most speakers today are the Arabic languages (descended from Classical Arabic), modern Israeli Hebrew, and some but not all of the languages of Ethiopia and Eritrea. Extinct varieties include the Akkadian languages spoken in Mesopotamia, including Assyria and Babylon; Phœnician, spoken along the coasts of what are now Syria, Lebanon, and Northern Israel, and its descendant, the Punic of Carthage; and Aramaic, spoken originally in Syria but later throughout the Middle East. Aramaic is not quite extinct; some descendant dialects are still spoken in a few villages, though more than a century of upheaval has not been kind to them, and they are now on the edge of extinction.

What these languages have in common is a peculiar structure, in which basic meaning is carried by a group of consonants (normally three, but sometimes one, two, or four) which are then modified by the addition of prefixes, suffixes, and infixes, by the doubling of consonants (normally the second), and, most notably, by the insertion or deletion of vowels between these consonants. For instance, in Arabic the three consonants k-l-m carry the notion of “speaking” or “speech”. From this root are derived (among others) the verbs kallama “address”, kâlama “converse”, and takallama “utter”; the nouns kalimah “word, speech”, kalâm “expression”, mukâlama “discussion”, and takallum “talk”; and the adjectives kalâmî “pertaining to speech”, tiklâm “eloquent”, and mutakallim “speaking”. The sequence k-l-m (which, for convenience’s sake, I’ll express in capital letters henceforth without dashes, thus: KLM) is the “root”, which in Arabic is called jidhr and in Hebrew shoresh.

A standard set of affixes or pattern of vowels can be applied to many different roots. These patterns (called wazn in Arabic and binyan in Hebrew) can indicate the part of speech, the person, number, mood, or tense of the verb, the comparative or superlative forms of the adjective, and so forth. For instance, in Arabic, the adjectives meaning “big” and “near” have the pattern CaCîC, where C=one of the consonants of the root: kabîr, qarîb. The superlatives of these same adjectives have the pattern aCCaC: akbar “biggest”, aqrab “nearest”.
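The pattern notation above can be read almost mechanically, and a small Python sketch may make that concrete. It assumes a root is given as a list of consonants and a pattern as a string in which each capital C is a slot for the next root consonant; the function name apply_pattern is invented for this illustration and is not taken from any library or from the original discussion.

```python
def apply_pattern(root, pattern):
    """Fill each 'C' slot in the pattern with the next consonant of the root.

    root    -- list of consonant strings, e.g. ["k", "b", "r"]
    pattern -- string such as "CaCîC"; every other character is copied as-is
    (Both conventions are assumptions made for this sketch.)
    """
    if pattern.count("C") != len(root):
        raise ValueError("pattern slots and root consonants do not match")
    consonants = iter(root)
    pieces = []
    for ch in pattern:
        # A capital C is a slot; anything else is a fixed vowel or affix.
        pieces.append(next(consonants) if ch == "C" else ch)
    return "".join(pieces)


# The two Arabic adjectives from the text, in both patterns.
for root in (["k", "b", "r"], ["q", "r", "b"]):
    print(apply_pattern(root, "CaCîC"), apply_pattern(root, "aCCaC"))
```

Keeping the root as a list rather than a plain string is a deliberate choice here: it lets a consonant written with two letters occupy a single slot.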

Can we verify that Khuzdul has this kind of construction, in general if not in detail? The word Khuzdul itself is evidently related to Khazâd “dwarves”, to the prefixed form Khazad- in “Khazad-dûm” (“Mansion of the Dwarves”), and probably also to Nulukkhizdîn*, a Dwarvish name for Nargothrond (where Nuluk probably = Narog, the name of the river on which Nargothrond was built). Each of these shows the same root KhZD (remember that kh is a single consonantal sound in Khuzdul) with a variety of vowel patterns and suffixes: CaCaC, CaCâC, CuCCul, CiCCîn. The ending -ul in Khuzdul is probably the same as that seen in Mazarbul and Fundinul, in the latter case appended to a name of Mannish origin. We also have the word Rukhs “orc”, plural Rakhâs “orcs” (The War of the Jewels, p. 391), which shows that patterns are repeated: Rakhâs has the same pattern as Khazâd, but with the root RKhS. If the patterns are consistent, then most likely the singular of Khazâd is Khuzd, which in turn explains Khuzd-ul, basically equivalent to Dwarf-ish.
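The inference in the paragraph above (shared pattern, different root, hence a predicted singular) can be traced step by step in a short Python sketch. Everything in it is an assumption made for illustration: the tiny consonant inventory, the lowercase spellings, and the function names analyse and apply_pattern. It covers only the suffixless forms, since splitting off an ending such as -ul would need extra rules.

```python
# Hypothetical consonant inventory, just large enough for these examples.
# "kh" is listed as one unit so it is never split into k + h.
CONSONANTS = ["kh", "r", "s", "z", "d"]


def analyse(word):
    """Split a lowercase word into (root consonants, pattern with C slots)."""
    root, pattern = [], []
    i = 0
    while i < len(word):
        for c in CONSONANTS:
            if word.startswith(c, i):
                root.append(c)
                pattern.append("C")
                i += len(c)
                break
        else:
            # Not a known consonant: treat it as part of the vowel pattern.
            pattern.append(word[i])
            i += 1
    return tuple(root), "".join(pattern)


def apply_pattern(root, pattern):
    """Fill the C slots of a pattern with the consonants of a root."""
    slots = iter(root)
    return "".join(next(slots) if ch == "C" else ch for ch in pattern)


dwarves_root, plural = analyse("khazâd")   # root kh-z-d
orcs_root, orc_plural = analyse("rakhâs")  # root r-kh-s
_, singular = analyse("rukhs")             # pattern of the attested singular

print(plural == orc_plural)                   # do the two plurals share a pattern?
print(apply_pattern(dwarves_root, singular))  # predicted singular for KhZD
```

The predicted form is only as good as the assumption that the singular pattern of Rukhs carries over to other roots, which is exactly the hedge the text makes with “if the patterns are consistent.”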

Assuming a Semitic style of construction, generating a Khuzdul vocabulary was therefore — in principle — as simple as producing a lot of triliteral roots and a suitable set of patterns, like (but not identical to) those found in real Semitic languages. In actual application, things were a little more complicated.

*Misspelled in The Silmarillion; see The War of the Jewels, p. 180.

5 Responses to “The Architecture of Words”

  1. Michelle

    Great linguistic explanations! Having attempted to learn Hebrew and the ‘shoresh’ root system, and having heard Aramaic in Coptic villages in Egypt and the city of Mardin in Turkey (a modern day Minas Tirith!!!), I can understand now how sophisticated your research in Khuzdul is. Of the Semitic languages, is there one particular language that influenced you most?

    • David Salo

      I think that the greatest single influence on neo-Khuzdul, structurally at any rate, was Arabic — before neo-Khuzdul started having a life of its own and developing in unexpected ways, that is! But the Semitic language which I studied most, and most enjoyed, was Aramaic (from ancient Aramaic through to Syriac — I know pretty much nothing about the modern Aramaic languages).

  2. Menelion Elensúle

    Thanks for your great explanations!
    I have a question about phonetics: Tolkien states that kh is an aspirated K sound. And that’s OK if we have a vowel after it (as in Khazad). But what will we do in the case of words like “rukhs”? This is an after-stress position without a vowel, so I find this quite difficult to pronounce correctly. Do you think the sound would become somewhat plainer and closer to K or, maybe, there will be a schwa between kh and s?

  3. Mad Latinist

    Modern Hebrew allows 5-letter roots in loaned verbs, such as √פלרטט PLRṬṬ “to flirt” (presumably the extra Ṭ is from “flirtation,” as if the verb were “to *flirtate.”)

    • Mad Latinist

      Digressing further: I asked an Israeli friend if he knew of any other 5-letter verbs. He suggested √סספנד SSPND “to suspend disbelief.” This one is particularly interesting in that it’s only used in the fandom community, and even there it’s virtually limited to the infinitive form. (I gave him an example sentence with the hypothetical future tense, and he said he could maybe imagine someone saying that, but, he said, the past tense would be so awkward he could only see it being used as a joke.)

