Characteristics of Languages

In the early 1990s, I was an English translator at the government-designated agency in China for translating United Nations documents. When we received a document in a foreign language other than English, we passed it on to the translators who knew that language. The challenge was to identify the language so we would know whom to assign the document to. We had plenty of books and dictionaries, but no computer database or software for this purpose, and of course no Internet. It didn't take me long to notice some patterns in various languages and to create the following table, which soon appeared on other translators' desks. With this table, you can easily identify the language from just part of a document. Nowadays, though, translation software such as Google Translate obviates the need to consult this table if language identification is all you want.

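Language; Common Words (four columns, in the order of the English row below: the, of, a(n), and; blank where none is listed); Characteristics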
English the of a(n) and  
Spanish
(español)
el*,la(s),lo(s) de(l) un,uno,una y,e
  • Characteristic letters: ñ*
  • Typical words: en [in], por [for], otro(-a) [other], a [to, for, at]
  • -ción
  • "¿" at the start of questions*
French
(français)
le(s),la,l'*(also see Italian) de,d',du,des un(e) et* (also see Latin)
  • Characteristic letters: ç, é, è, ê
  • Typical words: en [at,in], C'est* [This is]
  • -aire, -oir, -aux, -eau
German
(Deutsch)
der,den,des,das,die,dem von (also for "from") ein(e) und*
  • Characteristic letters: ü, ö, ä, ß
  • -en, -ung, (-)sch(-), -cht, -tz(-), (-)eu(-)
  • Every noun begins with an uppercase letter (so you'll see many capitals)*
Portuguese
(português)
o* de,da   e
  • Characteristic letters: é, ó, ã*, á, ô, ç
  • Typical words: é [is], a [to,for,with...]
  • -ção
Italian
(italiano)
il,i*,la,le,lo,l' de,di,dei,del un,uno,una e
  • Characteristic letters: è
  • Typical words: è [is]
  • Most words end with a vowel* (also see Swahili and some words in Japanese)
Norwegian
(norsk)
  til en,ei,et og*
  • Characteristic letters: ø, å, æ
  • Typical words: mot [toward], fra [from], eller [or]
Swedish
(svenska)
de(n),det,dem,dom,dens,deras   en och*
  • Characteristic letters: å, ä*, ö* (Note: Finnish also has many ä and some ö, but almost never å)
  • Typical words: om [if,whether,of], i [in,at]
  • A consonant followed by j and a vowel, e.g. (-)sjö(-), (-)hja(-), (-)tjär(-)
Dutch
(Nederlands)
Flemish
(Vlaams)
de,het* van een en
  • Characteristic letters: ç, é
  • Typical words: dat [that]
  • -en, -ij(-), words containing consecutive vowels are common (e.g. aan [to])*
Danish
(dansk)
de(n),det,des,desto   en,et,én,ét og*
  • Characteristic letters: æ, ø, å
  • Typical words: den [it,that], de [they], til [to,for,at,with,into,until]
Polish
(polski)
      i
  • Characteristic letters: ł*, ę*, ą*, ć, ś, ź, ń, ó; no q, v, x
  • Typical words: w [in,into,at]*, z [with,from]
  • (-)szcz(-), prz-, (-)brn(-), (-)sr(-), (-)drz(-)
Hungarian
(magyar)
a     és,meg
  • Characteristic letters: ö, ó, á, é, ő*, ű*
  • (-)sz(-), (-)ék(-), (-)cs(-)
Vietnamese
(tiếng Việt)
      và
  • Characteristic letters: Đ, đ, ê, ô, ỏ*, ủ*
  • Complicated diacritics above vowels, e.g. ắ, ổ, ữ*
Turkish
(Türkçe)
    bir ve*
  • Characteristic letters: ı, ç, ö, ş, ü, ğ, İ*; no q, w, x
Latin       et
  • -us, -um
Swahili
(Kiswahili)
      na
  • No letters q or x, and c appears only in ch; no symbols above or below letters
  • Words end with a vowel* (also see Italian and Japanese)

Note
1. * indicates a very important characteristic.
2. If a characteristic letter is composed of a regular letter plus a symbol above or under it, then the regular letter also exists in that language. E.g., Spanish has "ñ", so it also has "n".
3. Brackets after typical words contain their English translation.
4. If a hyphen precedes a typical letter combination, the combination is a word suffix; if the hyphen follows it, the combination is a prefix. A hyphen in parentheses is optional, i.e. the combination may also occur elsewhere in the word.
5. (Outdated by today's software) If you still can't identify the language by using this table, please try Doug Beeferman's Stochastic Language Identifier. Failing that, read Kenneth Katzner's excellent book, The Languages of the World, Routledge; 3rd ed., 2002.
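
For readers who want to try the table mechanically, here is a minimal Python sketch of the idea, not a replacement for real language-identification software. It copies the characteristic letters and some common words from five rows of the table and counts how often each occurs in a piece of text. The names PROFILES, LETTER_WEIGHT, score and identify are made up for this illustration, and the choice to weight a characteristic letter twice as much as a common word is an assumption, not something the table prescribes.

```python
import re
import unicodedata

# A few rows of the table, reduced to characteristic letters and common words.
# Which rows and words are included here is an illustrative choice.
PROFILES = {
    "Spanish": {"letters": "ñ¿",
                "words": {"el", "la", "los", "las", "de", "del",
                          "un", "uno", "una", "y", "e", "en", "por"}},
    "French": {"letters": "çéèê",
               "words": {"le", "les", "la", "de", "du", "des",
                         "un", "une", "et"}},
    "German": {"letters": "üöäß",
               "words": {"der", "den", "des", "das", "die", "dem",
                         "von", "ein", "eine", "und"}},
    "Polish": {"letters": "łęąćśźńó",
               "words": {"i", "w", "z"}},
    "Turkish": {"letters": "ıçöşüğ",
                "words": {"bir", "ve"}},
}

# Assumption: a characteristic letter is stronger evidence than a common word.
LETTER_WEIGHT = 2


def score(text):
    """Return a {language: score} dict for the given text."""
    # NFC makes sure "n" + combining tilde is compared as the single letter "ñ".
    normalized = unicodedata.normalize("NFC", text).lower()
    # Split into runs of letters (digits, underscores and punctuation dropped).
    tokens = re.findall(r"[^\W\d_]+", normalized)
    scores = {}
    for language, profile in PROFILES.items():
        letter_hits = sum(normalized.count(ch) for ch in profile["letters"])
        word_hits = sum(1 for token in tokens if token in profile["words"])
        scores[language] = LETTER_WEIGHT * letter_hits + word_hits
    return scores


def identify(text):
    """Return the best-scoring language, or None if nothing matched."""
    scores = score(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(identify("El niño come en la casa de su tío y por la mañana"))  # Spanish
```

Because several languages share letters (ç and ü, for instance, appear in more than one row), the sketch relies on the combined count rather than on any single hit; very short texts can still be misjudged.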
