September 4, 2016

Why is it rare to see Chinese etymology?

Native English speakers are used to dictionaries in which each entry gives not only the definition of the headword and example phrases or sentences, but also a brief etymology, as in this Merriam-Webster entry for the word "word".

Middle English, from Old English; akin to Old High German wort word, Latin verbum, Greek eirein to say, speak, Hittite weriya- to call, name
First Known Use: before 12th century

A Chinese dictionary, on the other hand, almost never gives the etymology. In this blog posting, I'll try to explain why.

For the sake of discussion, we need to make a distinction between two types of Chinese dictionaries. Due to the nature of the Chinese language, the English word dictionary (or its equivalent in most other languages) can mean either "字典" (literally "character-dictionary") or "词典", also written as "辞典" (literally "word-dictionary"), in Chinese. I have not seen a dictionary of general Chinese words, from any publisher, that contains etymological information for the headwords.[note1] Hereinafter, "Chinese etymological dictionary" refers only to a character-dictionary.

The lack of an etymological dictionary of Chinese words is disappointing, but the situation is different for dictionaries of Chinese characters, or 字典. Back in the Eastern Han dynasty (25–220 AD), the scholar Xu Shen (c. 58 – c. 147 AD) wrote the monumental dictionary Shuowen Jiezi (literally "Explaining Graphs and Analyzing Characters" according to Wikipedia). Since Xu lived a thousand years or less after a large number of Chinese characters were invented, the etymology he gave in the book for each of the more than 9,000 characters is mostly trustworthy. Take the character "秦" (qín) as an example. (This character is significant in that it is the ultimate source of the word China in English and its equivalents in most other languages in the world. Two other sources of words referring to China are Khitan, as in the case of Russian, and silk.)

(The fief given to the descendant of Boyi. The land is suitable for crops. The character has a meaning based on "禾" ("crop") and contains an abbreviation or syncope of the character "舂". Another theory claims that this character is the name of a crop. This character in Zhouwen script [a script used just before the time of the First Emperor], "𥠼", is based on "秝". Pronounced with the initial consonant of 匠 combined with the final of 鄰.)

This is an excellent example of Chinese character etymology; it not only describes the source of the character but also analyzes its morphology or form, as shown by the construction of "秦" from "禾" and part of "舂". The significance of Xu's book in the history of the Chinese language is such that almost two millennia later, scholars are still using it in research. The only major revision came after the 1899 discovery of oracle bones, which the people of the Shang dynasty (c. 1600 BC–c. 1046 BC) used for divination. The oracle bone script predates Xiaozhuan script, which was the primary source for Xu Shen's character etymology because it was the earliest script known to him. Owing to this gap in knowledge, Xu inevitably made numerous mistakes in his otherwise near-perfect dictionary. One good example illustrates the point. In the article 许慎为何将象释成母猴——“为”字趣释 (Why did Xu Shen interpret an elephant as a female monkey: interesting interpretation of character "为"), the author explains how the simple character "为", meaning "for" or "to do" nowadays, evolved from an oracle-bone pictograph depicting a man holding an elephant's leash, which Xu Shen mistook for a female monkey. (By the way, elephants did indeed roam central and northern China three thousand years ago, but the species was not the same as the one found in southern China or India today.)

With all this background, we can now address the question of why it is rare to see Chinese etymology. By that I don't mean you can't find character etymology at all. Books such as 《汉语字源字典》 ("Dictionary of Chinese Character Etymology") and the Web site Chinese Etymology by Richard Sears are available. But this information is almost never incorporated into a Chinese dictionary other than a specialized etymological dictionary. If a general English reader is no more academically inclined than a Chinese reader, why does a common English dictionary such as Webster's, American Heritage, or the OED (Oxford English Dictionary) include etymology without hesitation? The reason may be that Chinese (character) etymology almost never helps a reader in studying the Chinese language, because of the long history and evolution of the characters. (Can you stretch your imagination far enough to associate the scene of a man and an elephant with the sense of "for" or its slightly older sense of "to do"? See above.) In addition to the long history, I believe there's another, more subtle, factor that clouds Chinese etymology. Most languages in the world use an alphabetic writing system. Studying the internal history of their vocabulary primarily means analyzing phonological and morphological changes through time; e.g., there was a systematic change of f to h in Spanish for a large number of words. The semantic evolution of words is studied less often, because it is "more hazardous to attempt to reconstruct meaning than to reconstruct linguistic form" as linguist Calvert Watkins said. Chinese characters, however, rarely went through systematic morphological changes that apply to a large number of characters and, since Chinese is not based on an alphabetic writing system, phonological changes are not conducive to the study of etymology per se. This leaves a large part of Chinese etymology to the study of semantic evolution, which is, as stated, more error-prone in scholarly reconstruction.
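
To make the contrast concrete, here is a minimal sketch of what a "systematic" sound change looks like when written down as a rule, using the Spanish f to h change mentioned above. The function name and the rule itself are my own simplification for illustration only: the real change has many conditions and exceptions, and the rule below deliberately handles only Latin words whose Spanish descendants differ by this single change.

```python
import re

# Simplified rule (an assumption for illustration): a word-initial Latin "f"
# followed by a vowel becomes Spanish "h". The real sound change has
# exceptions this rule ignores, e.g. Spanish "fuente" and "fiesta" kept f.
INITIAL_F = re.compile(r"^f(?=[aeiou])")

def apply_f_to_h(latin_word: str) -> str:
    """Apply the simplified initial f > h rule to one lowercase word."""
    return INITIAL_F.sub("h", latin_word)

# Latin words whose Spanish forms differ only by this one change.
for word in ["farina", "faba"]:
    print(word, "->", apply_f_to_h(word))
```

One rule of this kind covers many alphabetically written words at once, which is exactly what has no close counterpart for Chinese characters.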

There is another reason for not incorporating etymology in Chinese dictionaries. Many characters originate from pictographs or pictograph-like glyphs such as those of Xiaozhuan script. A publication has to render them as images instead of text, which is an editorial inconvenience. The images with their explanatory texts take up a significant amount of space relative to the definitions and usage examples, which a regular user cares more about. This is in contrast with the etymology in an English dictionary, which can be kept brief and still make sense to the minority of interested readers. And a third reason may be that it's simply the custom of Chinese lexicography, i.e. no etymology except in specialized dictionaries. This is probably also the reason why dictionaries of languages other than English lack etymology. (Try to find etymology in any dictionary of Spanish, French, German or Italian in a bookstore or library!) But nobody knows the original cause or reason for this custom.

Therefore, unlike a language where a student may make use of etymology in vocabulary study, optionally combined with some mnemonics (as demonstrated in my book for Spanish), Chinese characters have to be studied in a different way. Etymology comes in handy only for the very first few characters, such as "火" ("fire") and "山" ("mountain"), which are frequently used to impress complete beginners. After 10 or 20 such "pictographs", rote memorization is commonly adopted, although books such as Tuttle Learning Chinese Characters that laboriously make up mnemonics are helpful. Fortunately, a large portion of the character repertoire consists of characters combining two parts, one more or less representing the meaning and the other representing the sound. However, in none of these cases does etymology play any role.

[note1] By emphasizing "general", I'd like to point out that a special group of Chinese words, 成语 (idioms), are an exception, in that dictionaries of Chinese idioms almost always give the first occurrence of the idioms and sometimes even briefly describe the sense development as well.
With regard to dictionaries of words in general, one may think of the book 《辭源》, literally "word origin". First published in 1915, it has a misleading title because it's no more than a dictionary (albeit a high-quality one) of Chinese words with no etymology. In fact, even if we take an alternative interpretation of "辭源" as "first occurrence of word", this book fails as well; e.g., the entry for "中国" does not list its first occurrence in the Book of Documents, or the bronze inscription which the Book records. Another book we can even more readily dismiss is the 《詞源》 by Zhang Yan in the Song dynasty, because the book is on the subject of the literary genre 詞 (ci poetry), not "words".


XoF at October 24, 2016 at 8:50 AM said...

I find the article interesting, but I'm still wondering why there seems to be no study of Chinese word etymology, at least in English.

I've always been told that 2-character words are a recent creation in the Chinese vocabulary, but I cannot find any documented reference for that, with examples of words and their first appearance in a textual document, with an author, publishing date, etc.

When were apparently old words using simple characters, like ma3shang0 (immediately) or tian1cai2 (genius), coined? What about semi-modern words like gong4he2 (republic), or very recent words like jin1zhuan1si4guo2 for the BRIC countries? Who coins them? I can't find any information about this...

If you have information, even in Chinese, I would be very curious to know about it!



Yong Huang at October 24, 2016 at 7:07 PM said...

Hi Christophe,

Excellent questions! Chinese *word* etymology is indeed an area no scholar or even amateur has attempted to explore, except on a sporadic basis. I can't think of a good explanation. Maybe people just never gave serious thought to it.

Two-character words are not a recent creation. According to Wikipedia, polysyllabic or multi-character words started to appear even during the Western Zhou dynasty (one millennium BC). Of course, they were not used nearly as much as in the last few centuries. The Wikipedia page has some references, but they may not meet your requirement. If I find any more in the future, I'll post them here. If you find any, please let me know as well.

The Chinese character 源 as in 词源 literally means "origin", and so 词源 means "origin of words", not necessarily "etymology", which is the study of not only the origin of a word but also its evolution, especially the rules involved in this evolution. If we only want to find the origin or first occurrence of a word instead of its etymology in the true sense, 21st-century technology is our friend. Take 马上 as an example.
(1) On Google or any major search engine, search with the keyword "马上" and browse through the results to see which one is from the oldest book.
(2) On, search for 马上. In this case, I see that "漢書.卷四十三.陸賈傳:「乃公居馬上得之,安事詩書。」", which may be one of the earliest, if not the earliest, occurrences.
(3) Search the word on Google Ngram (note: do not include quotation marks). But Google Ngram doesn't tell us which book or document the word occurs in.
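
If you happen to have dated passages in digital form, the same "find the oldest hit" idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Passage` record, the function name, the tiny corpus, and the rough year given for 漢書 are placeholders, not real search results.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    source: str
    year: int   # rough year of composition (placeholder values below)
    text: str

def earliest_occurrence(word, passages):
    """Return the oldest passage whose text contains `word`, or None."""
    hits = [p for p in passages if word in p.text]
    return min(hits, key=lambda p: p.year, default=None)

# Tiny illustrative corpus. The first quote is the one cited in step (2);
# the second is a made-up modern sentence.
corpus = [
    Passage("漢書·陸賈傳", 100, "乃公居馬上得之,安事詩書。"),
    Passage("hypothetical modern text", 2016, "我馬上就來。"),
]

# Search in traditional characters because the old text is written that way;
# a real tool would need to normalize simplified and traditional forms first.
hit = earliest_occurrence("馬上", corpus)
print(hit.source if hit else "not found")
```

Note that a plain string match only finds the written form. In the 漢書 passage 馬上 still has the literal sense "on horseback", so a human reader has to decide where the sense "immediately" begins.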

Some important words have been extensively studied, even etymologically. I know off the top of my head that 共和 was first used in pre-Qin times. As for finding the first occurrence of a word coined in the last ten to thirty years, important words have probably already been studied; see, for example, the Wikipedia article on 金砖四国, or my short study of the origin of 民主. Otherwise, I would laboriously read through each of the Google search result pages to see which page containing the word has the earliest timestamp. In fact, I do that quite often to find the origin of a specific quote (see one example).

