September 4, 2016

Why is it rare to see Chinese etymology?

Native speakers of English are used to dictionaries in which each entry gives not only the definition of the headword and example phrases or sentences, but also a brief etymology, as in this Merriam-Webster entry for the word word.

Middle English, from Old English; akin to Old High German wort word, Latin verbum, Greek eirein to say, speak, Hittite weriya- to call, name
First Known Use: before 12th century

A Chinese dictionary, on the other hand, almost never gives the etymology. In this blog posting, I'll try to explain why.

For the sake of discussion, we need to distinguish between two types of Chinese dictionaries. Due to the nature of the Chinese language, the English word dictionary (or its equivalent in most other languages) can mean either "字典" (literally "character-dictionary") or "词典", also written "辞典" (literally "word-dictionary"), in Chinese. I have not seen any dictionary of general Chinese words, from any publisher, that contains etymological information for the headwords.[note1] Hereinafter, a Chinese etymological dictionary refers only to a character-dictionary.

The disappointment at the lack of an etymological dictionary of Chinese words does not extend to dictionaries of Chinese characters, or 字典. Back in the Eastern Han dynasty (25–220 AD), the scholar Xu Shen (c. 58 – c. 147 AD) wrote the monumental dictionary Shuowen Jiezi (literally "Explaining Graphs and Analyzing Characters" according to Wikipedia). Since Xu lived only a thousand years or less after a large number of Chinese characters were invented, the etymology he gave in the book for each of its 9,000-plus characters is mostly trustworthy. Take the character "秦" (qín) as an example.[note2]

(The fief given to the descendant of Boyi. The land is suitable for crops. The character has a meaning based on "禾" ("crop") and contains an abbreviation or syncope of the character "舂". Another theory claims that this character is the name of a crop. This character in Zhouwen script [a script used just before the time of the First Emperor], "𥠼", is based on "秝". Pronounced with the initial consonant of 匠 combined with the final of 鄰.)

This is an excellent example of Chinese character etymology; it not only describes the source of the character but also analyzes its morphology or form, as evidenced by the construction of "秦" from "禾" and part of "舂". The significance of Xu's book in the history of the Chinese language is such that almost two millennia later, scholars still use it in research. The only major revision came after the 1899 discovery of oracle bones, which the people of the Shang dynasty (c. 1600 BC–c. 1046 BC) used for divination. The oracle bone script predates the Xiaozhuan script, which was the primary source for Xu Shen's character etymology because it was the earliest script known to him. Owing to this gap in knowledge, Xu inevitably made numerous mistakes in his otherwise near-perfect dictionary. One good example illustrates the point. In the article 许慎为何将象释成母猴——“为”字趣释 (Why did Xu Shen interpret an elephant as a female monkey: an interesting interpretation of the character "为"), the author explains how the simple character "为", meaning "for" or "to do" nowadays, evolved from an oracle-bone pictograph depicting a man holding an elephant by a leash, which Xu Shen mistook for a female monkey. (By the way, elephants did roam central and northern China three thousand years ago, but the species was not the same as the one in southern China or India today.)

With all this background information, we may now answer the question of why it is rare to see Chinese etymology. By that I don't mean you can't find character etymology at all. Books such as 《汉语字源字典》 ("Dictionary of Chinese Character Etymology") and the Web site Chinese Etymology by Richard Sears are available. But this material is almost never incorporated into a Chinese dictionary other than a specialized etymological one. If a general English reader is not more academically inclined than a Chinese reader, why does a common English dictionary such as Webster's, American Heritage, or the OED (Oxford English Dictionary) include etymology without hesitation? The reason may be that Chinese (character) etymology almost never helps a reader in studying the Chinese language, due to the long history and evolution of the characters. (Can you stretch your imagination far enough to associate the scene of a man and an elephant with the sense of "for" or its slightly older sense of "to do"? See above.) In addition to the long history, I believe there's another, more subtle, element clouding Chinese etymology. Most languages in the world use an alphabetic writing system. Studying the internal history of their vocabulary primarily means analyzing phonological and morphological changes through time; e.g., there was a systematic change of f to h in Spanish in a large number of words. A second, less common line of study is the semantic evolution of words; it's less common because it is "more hazardous to attempt to reconstruct meaning than to reconstruct linguistic form", as linguist Calvert Watkins said. And yet Chinese characters rarely went through systematic morphological changes that apply to a large number of characters and, since Chinese is not written with an alphabet, phonological changes are not conducive to the study of etymology per se. This leaves a large part of Chinese etymology to the study of semantic evolution, which is, as stated, more error-prone in scholarly reconstruction.

There is another reason for not incorporating etymology in Chinese dictionaries. Many characters originate from pictographs or pictograph-like glyphs such as those of the Xiaozhuan script. A publication has to render them as images instead of text, which is an editorial inconvenience. The images with their explanatory texts take up a significant amount of space relative to the definitions and usage examples, which a regular user cares more about. This is in contrast with the etymology in an English dictionary, which can be made brief and still make sense to the minority of interested readers. And a third reason may be that it's just the custom of Chinese lexicography, i.e. no etymology except in specialized dictionaries. This is probably also the reason why dictionaries of languages other than English lack etymology. (Try to find etymology in any dictionary of Spanish, French, German or Italian in a bookstore or library!) But nobody knows the original cause of this custom.

Therefore, unlike a language in which a student may make use of etymology in vocabulary study, optionally combined with some mnemonics (as demonstrated in my book for Spanish), Chinese characters have to be studied in a different way. Etymology comes in handy only for the very first few characters, such as "火" ("fire") and "山" ("mountain"), which are frequently used to impress complete beginners. After 10 or 20 such "pictographs", rote memorization is commonly adopted, although books such as Tuttle Learning Chinese Characters that laboriously make up mnemonics are helpful. Fortunately, a large portion of the character repertoire consists of characters combining two parts, one more or less representing the meaning and the other representing the sound. However, in none of these cases does etymology play any role.

Note: This posting has a sequel, Comparison of Chinese and Western Etymology.

[note1] By emphasizing "general", I'd like to point out that a special group of Chinese words, 成语 (idioms), are an exception, in that dictionaries of Chinese idioms almost always give the first occurrence of the idioms and sometimes even briefly describe the sense development as well.
With regard to dictionaries of words in general, one may think of the book 《辭源》, literally "word origin". First published in 1915, it has a misleading title because it's no more than a dictionary (albeit a high-quality one) of Chinese words with no etymology. In fact, even if we take an alternative interpretation of "辭源" as "first occurrence of word", this book fails as well; e.g., the entry for "中国" does not list its first occurrence in the Book of Documents, or the bronze inscription which the Book records. Another book we can even more readily dismiss is the 《詞源》 by Zhang Yan of the Song dynasty, because that book is about the literary genre 词 (ci poetry), not "words".
[note2] Incidentally, the character "秦" is significant in that many scholars, including Paul Pelliot, traditionally believed it to be the ultimate source of the word China in many of the world's languages, although more recent research attributes the origin to "晋". Two other sources of words referring to China are Khitan, as in the case of Russian, and silk.


XoF at October 24, 2016 at 8:50 AM said...

I find the article interesting, but I'm still wondering why there seems to be no study of Chinese word etymology, at least in English.

I've always been told that 2-character words are a recent creation in the Chinese vocabulary, but I cannot find any documented reference for that, with examples of words and their first appearance in a textual document, with an author, publishing date, etc.

When were apparently old words using simple characters, like ma3shang0 (immediately) or tian1cai2 (genius), coined? What about semi-modern words like gong4he2 (republic), or very recent words like jin1zhuan1si4guo2 for the BRIC countries? Who coins them? I can't find any information about this...

If you have information, even in Chinese, I would be very curious to know about it!



Yong Huang at October 24, 2016 at 7:07 PM said...

Hi Christophe,

Excellent questions! Chinese *word* etymology is indeed an area no scholar or even amateur has attempted to explore, except on a sporadic basis. I can't think of a good explanation. Maybe people just never gave serious thought to it.

Two-character words are not a recent creation. According to Wikipedia, polysyllabic or multi-character words started to appear as early as the Western Zhou dynasty (about one millennium BC). Of course, they were not used nearly as much as in the last few centuries. The Wikipedia page has some references, but they may not meet your requirement. If I find any more in the future, I'll post them here. If you find any, please let me know as well.

The Chinese character 源 as in 词源 literally means "origin", and so 词源 means "origin of words", not necessarily "etymology", which is the study of not only a word's origin but also its evolution, especially the rules involved in this evolution. If we only want to find the origin or first occurrence of a word rather than its etymology in the true sense, 21st-century technology is our friend. Take 马上 as an example.
(1) On Google or any major search engine, search with the keyword "马上" and browse through the results to see which one is from the oldest book.
(2) On, search for 马上. In this case, I see that "漢書.卷四十三.陸賈傳:「乃公居馬上得之,安事詩書。」", which may be one of the earliest, if not the earliest, occurrences.
(3) Search the word on Google Ngram (note: do not include quotation marks). But Google Ngram doesn't tell us which book or document the word occurs in.

Some important words have been extensively studied, even etymologically. I know off the top of my head that 共和 was first used in pre-Qin times. As for finding the first occurrence of a word coined in the last ten to thirty years, important words have probably already been studied; see, for example, Wikipedia for 金砖四国, or my short study of the origin of 民主. Otherwise, I would laboriously read through each of the Google search result pages to see which page containing the word has the earliest timestamp. In fact, I do that quite often to find the origin of a specific quote (see one example).

Yong Huang at October 25, 2016 at 6:02 AM said...

I must add that any dictionary of Chinese idioms (成语) is a truly etymological one; it not only gives the first occurrence or origin of the idiom in ancient literature, but in most cases also briefly explains the development of the new senses. Nevertheless, dictionaries of Chinese words in general do not follow this tradition.

XoF at October 27, 2016 at 4:48 AM said...

Hi Yong Huang,

Thank you for the precise and documented answer! I'm still in the process of discovering the resources you're pointing me to. It's very interesting indeed.

Why not try to derive a new resource from this material, oriented toward word origins?

I'll come back to you when I understand better where to go from there.


Yong Huang at November 6, 2016 at 11:08 AM said...

I have to make a correction to my blog posting. Although not widely known, Chinese dictionaries of *word* etymology do exist (in addition to dictionaries of idioms). After I posted a question to (2018 note: that website has been inaccessible for a long time), the following dictionaries have been identified, in reverse chronological order of publication:
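Author: 岑麒祥
Publisher: 商务印书馆
Published: 2015-8
Pages: 450

Author: 庄钦永 / 周清海
Publisher: 新加坡青年书局
Published: 2010-8
Pages: 321

2020-11-06 update: Huang Heqing (黄河清), 《近现代辞源》, 上海辞书出版社, June 2010. (That website lets you search the words included in the dictionary; the page shows the first ten-odd to several dozen characters of each entry.) Huang Heqing's 《近现代汉语辞源》, published in 2020, is an expansion of the former to several dozen times its size.

《近现代汉语词源词典》
Author: 香港中国语文学会
Publisher: 汉语大词典出版社
Published: 2001-2
Pages: 426

《现代汉语词名探源词典》
Author: 王艾录
Publisher: 山西人民出版社
Published: 2000-01-01

《汉语外来词词典》
Author: 刘正埮 / 高名凯 et al. (eds.)
Publisher: 上海辞书出版社
Published: 1984
Pages: 422
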
Some day, when I have reviewed all these dictionaries, I will blog about them.

XoF at November 8, 2016 at 3:39 AM said...

This is a really interesting list of dictionaries!

It looks as though, with the possible exception of 《近现代汉语词源词典》, all these dictionaries focus on Chinese words of foreign origin. Isn't this somehow implying that there is only one Chinese root to all "non foreign" words? Or maybe that's how most native Chinese speakers feel... Can you confirm?


Yong Huang at November 8, 2016 at 6:47 AM said...

王艾录《现代汉语词名探源词典》 may also be a dictionary of words in general, not limited to those of foreign origin. I'll have to read it to confirm. But I agree that Chinese scholars seem to be more interested in compiling etymological dictionaries of words of foreign origin than of native Chinese words. I'd love to know the origin and sense development of, e.g., 马上 in the sense of "immediately". I'm not sure if I can find it in these dictionaries.

I'm not sure what you mean by only one Chinese root to all "non foreign" words. (Does "root" here mean "origin", or "word stem" or something similar?) Loan words imported into Chinese in the past two or three centuries are clearly considered to be of foreign origin. But the foreign origin of those that came in earlier, e.g. 番茄 ("tomato") and 刹那 ("a short moment"), is not so obvious unless a specialist points it out.

XoF at November 8, 2016 at 8:33 AM said...

I meant origin, in the sense of a unique language that all the Chinese vocabulary would derive from. Actually, I cannot believe that it is the case...

If scholars are mostly interested in tracking words of foreign origin, it somehow implies that ancient Chinese texts were less influenced by linguistic variations, due to geographical or political circumstances - dynasties, kingdoms etc - that would characterize different lexicons within the Chinese language and be worth describing. Of course I believe that such variations existed and my point is that it looks as though these variations were not acknowledged.

But they probably are by some authors. I will eagerly follow what you have to say about these dictionaries when you have the opportunity to enlighten us about their content ;-)

And thank you for 刹那... I will try to use it!

Mircea at December 26, 2016 at 2:33 AM said...

XoF, I am really impressed by your astute observation. It seems that the connotation of etymology in the Western tradition differs from that in the Chinese tradition. Until the turn of the 19th century the Chinese did not even have a grammar in the Western sense. Much can be learnt about the origins of words and other etymological issues from books on Chinese historical linguistics by Western authors such as Karlgren, Gabelentz, Boltz and others. There is also a superb book by a Chinese tandem, published by OUP, which deals with the history of Chinese lexicography from its inception until 1911. You should also find it quite informative.

Yong Huang at December 30, 2016 at 7:42 PM said...

Hi Mircea,

You're absolutely correct about the lack of a Chinese grammar in the Western sense until a little over a century ago (specifically until the publication of 《馬氏文通》 in 1898). But grammar is not the same as etymology. I'm somewhat familiar with Karlgren's work, which is one of the most important contributions to Chinese phonology. But as alluded to earlier, the peculiarity of the Chinese language (or any language not using an alphabetic writing system) dissociates phonology from etymology. I'll blog about this dissociation unique to Chinese.

I'm not familiar with Gabelentz or Boltz. Just did some reading. According to the Wikipedia page, Mr. Gabelentz has a very interesting view of the Beijing dialect as inappropriate for science(!). By the way, what is the "superb book by a Chinese tandem published by OUP"? Thank you.
