The Dgraph uses a language code to identify the language of a specific attribute. The supported languages and their codes are:
Arabic: ar | Danish: da | Indonesian: id | Norwegian Bokmal: nb | Spanish, Latin American: es_lam |
Afrikaans: af | Divehi: dv | Italian: it | Norwegian Nynorsk: nn | Spanish, Mexican: es_mx |
Albanian: sq | Dutch: nl | Japanese: ja | Oriya: or | Swedish: sv |
Amharic: am | English, American: en | Kannada: kn | Persian: fa | Swahili: sw |
Armenian: hy | English, British: en_GB | Kazakh, Cyrillic: kk | Persian, Dari: prs | Tagalog: tl |
Assamese: as | Estonian: et | Khmer: km | Polish: pl | Tamil: ta |
Azerbaijani: az | Finnish: fi | Korean: ko | Portuguese: pt | Thai: th |
Bangla: bn | French: fr | Kyrgyz: ky | Portuguese, Brazilian: pt_BR | Telugu: te |
Basque: eu | French, Canadian: fr_ca | Lao: lo | Punjabi: pa | Turkish: tr |
Belarusian: be | Galician: gl | Latvian: lv | Romanian: ro | Turkmen: tk |
Bosnian: bs | Georgian: ka | Lithuanian: lt | Russian: ru | Ukrainian: uk |
Bulgarian: bg | German: de | Macedonian: mk | Serbian, Cyrillic: sr_Cyrl | Urdu: ur |
Catalan: ca | Greek: el | Malay: ms | Serbian, Latin: sr_Latn | Uzbek, Cyrillic: uz |
Chinese, simplified: zh_CN | Gujarati: gu | Malayalam: ml | Sinhala: si | Uzbek, Latin: uz_latin |
Chinese, traditional: zh_TW | Hebrew: he | Maltese: mt | Slovak: sk | Valencian: vc |
Croatian: hr | Hungarian: hu | Marathi: mr | Slovenian: sl | Vietnamese: vn |
Czech: cs | Icelandic: is | Nepali: ne | Spanish: es | unknown (i.e., none of the above languages): unknown |
The language codes are case insensitive.
Note that an error is returned if you specify an invalid language code.
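The case-insensitive matching and the invalid-code error can be illustrated with a small client-side check. This is only a sketch: the function name normalize_language_code and the use of ValueError are assumptions made for illustration, not part of the Dgraph interface, and only a subset of the codes from the table is included.

```python
# Hypothetical client-side check; not part of any Dgraph API.
# Only a subset of the codes from the table is listed, for brevity.
SUPPORTED_CODES = {
    "en", "en_GB", "es", "es_lam", "fr", "fr_ca",
    "pt", "pt_BR", "zh_CN", "zh_TW", "unknown",
}

# Key the lookup by the lowercase form so that matching ignores case.
_BY_LOWER = {code.lower(): code for code in SUPPORTED_CODES}


def normalize_language_code(code: str) -> str:
    """Return the table spelling of a code, or raise ValueError if it is not listed."""
    try:
        return _BY_LOWER[code.strip().lower()]
    except KeyError:
        raise ValueError(f"invalid language code: {code!r}") from None
```

With this sketch, normalize_language_code("PT_br") returns "pt_BR", while a code that is not in the set, such as "xx", raises ValueError.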
With the language codes, you can tell the Dgraph the language of the text in a record search or value search query, so that it can perform language-specific operations correctly.
A country locale code is a combination of a language code (such as es for Spanish) and a country code (such as MX for Mexico or AR for Argentina). Thus, the es_MX country locale means Mexican Spanish while es_AR is Argentinian Spanish.
If you specify a country locale code for a Language element, the software ignores the country code and uses only the language code. In other words, a country locale code is mapped to its language code, and only that part is used for tokenizing queries or generating search indexes. For example, specifying es_AR is the same as specifying just es. The exceptions to this rule are the country locale codes listed in the table above (such as pt_BR), which are treated as distinct languages.
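The mapping rule can be sketched as follows. The function name effective_language_code is hypothetical, the exception set is an illustrative subset of the locale-style codes in the table, and the sketch assumes that the underscore separates the language part from the country part. It shows the rule as described here, not the Dgraph's actual implementation.

```python
# Illustrative subset of the locale-style codes that appear in the table
# and are therefore kept whole (lowercased for case-insensitive matching).
LISTED_LOCALE_CODES = {"en_gb", "fr_ca", "pt_br", "zh_cn", "zh_tw"}


def effective_language_code(code: str) -> str:
    """Return the code assumed to be used for tokenizing and indexing."""
    lowered = code.strip().lower()
    if lowered in LISTED_LOCALE_CODES:
        # Listed codes are exceptions: the whole code is significant.
        return lowered
    # Otherwise the country part is ignored and only the language part is kept.
    return lowered.split("_", 1)[0]
```

Under these assumptions, effective_language_code("es_AR") returns "es", while effective_language_code("pt_BR") returns "pt_br" because that code is listed in the table.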
Note, however, that if you create a Dgraph attribute and specify a country locale code in the Language field, the attribute will be tagged with the country locale code, even though the country code will be ignored during indexing and querying.
The Dgraph has two spelling correction engines. If the Language property in an attribute is set to en, then spelling correction will be handled through the English spelling engine (and its English spelling dictionary). If it is set to any other value, then spelling correction will use the non-English spelling engine (and its language-specific dictionaries). All dictionaries are generated from the data records in the Dgraph, and therefore require that the attribute definitions be tagged with a language code.
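The engine selection described above amounts to a single comparison. The sketch below uses a hypothetical function name and hypothetical return labels, and it follows the text literally: only the value en selects the English engine, compared without regard to case.

```python
def spelling_engine_for(language: str) -> str:
    """Pick a spelling engine label from an attribute's Language value.

    Literal reading of the rule: "en" selects the English engine and
    any other value selects the non-English engine.
    """
    if language.strip().lower() == "en":
        return "english"
    return "non_english"
```

For example, spelling_engine_for("EN") returns "english" and spelling_engine_for("fr") returns "non_english".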
All dictionary files are stored in the index directory.
Note that you cannot set languages on a per-attribute basis.