If most of the text in an application is in a single language, you can specify
a global language ID by providing the --lang option and a <lang-id> argument to
the Dgidx and dgraph components. The MDEX Engine treats all text as being in
the language specified by <lang-id>, unless you tag text with a more specific
language ID (that is, a per-record, per-dimension, or per-query language ID).
The <lang-id> defaults to en (US English) if left unspecified.
For example, to indicate that text is English (United Kingdom), specify:
--lang en-GB.
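The fallback behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not MDEX Engine code: the function name effective_lang_id and its parameters are hypothetical, and the sketch does not model how per-record, per-dimension, and per-query IDs rank against each other.

```python
from typing import Optional

# Documented default when no --lang value is supplied.
DEFAULT_LANG_ID = "en"


def effective_lang_id(global_lang_id: Optional[str] = None,
                      specific_lang_id: Optional[str] = None) -> str:
    """Return the language ID that applies to a piece of text.

    specific_lang_id stands in for any more specific tag (per-record,
    per-dimension, or per-query). It is a hypothetical parameter used
    only for this illustration.
    """
    if specific_lang_id:
        # A more specific tag overrides the global language ID.
        return specific_lang_id
    if global_lang_id:
        # Otherwise the global --lang value applies.
        return global_lang_id
    # Nothing specified: fall back to en (US English).
    return DEFAULT_LANG_ID
```

With this model, text tagged fr in an application started with --lang en-GB is treated as French, while untagged text is treated as English (United Kingdom).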
In addition to specifying a language identifier, you can optionally specify a
collation order as part of the argument to the --lang option. A collation is
specified in the form: --lang <lang-id>-u-co-<collation>, where:
<lang-id> is the language ID and may also include a sub-tag. If unspecified, the value of <lang-id> is en (US English).
-u is a separator value between the language identifier portion of the argument and the collation identifier portion of the argument.
-<collation> is the collation type: either endeca, standard, or in some cases, other language-specific ICU collations such as phonebk. If unspecified, the value of <collation> defaults to endeca, so the full default value is en-u-co-endeca.
For example, --lang de-u-co-phonebk instructs Dgidx and the dgraph to treat all
the text as German and collate the text in phonebook order.
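As a rough sketch, the <lang-id>-u-co-<collation> form can be assembled and taken apart with a small Python helper, for instance in a script that prepares the same --lang value for both Dgidx and the dgraph. The function names build_lang_args and split_lang_value are hypothetical, and the sketch assumes that a value without a collation part uses the endeca collation, as described above.

```python
from typing import List, Optional, Tuple

# Marker that separates the language ID from the collation type.
COLLATION_MARKER = "-u-co-"
# Assumed default collation when none is given.
DEFAULT_COLLATION = "endeca"


def build_lang_args(lang_id: str = "en",
                    collation: Optional[str] = None) -> List[str]:
    """Return the --lang option and its value as an argument list."""
    if not lang_id:
        raise ValueError("lang_id must not be empty")
    value = lang_id
    if collation:
        # Append the collation part only when one is requested.
        value = f"{lang_id}{COLLATION_MARKER}{collation}"
    return ["--lang", value]


def split_lang_value(value: str) -> Tuple[str, str]:
    """Split a --lang value into (lang_id, collation)."""
    lang_id, marker, collation = value.partition(COLLATION_MARKER)
    if not marker:
        # No collation part present: use the assumed default.
        return lang_id, DEFAULT_COLLATION
    return lang_id, collation
```

For example, build_lang_args("de", "phonebk") yields the two arguments --lang and de-u-co-phonebk, and split_lang_value("en-GB") reports the language ID en-GB with the endeca collation.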
Dgidx sorts records by the value of particular properties and/or dimensions.
The properties and dimensions used for sorting are specified by the --sort
option of the Dgidx command. The --sort option also specifies whether ascending
or descending sort order is used with each property or dimension. For
information about the --sort option, refer to the MDEX Engine Developer's
Guide. If you do not specify a particular sort order through the --sort option,
Dgidx sorts records by their internal record IDs.
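The sorting behavior can be pictured with a small Python model. It is a conceptual sketch only: it does not reproduce the --sort syntax or the Dgidx implementation. The record layout (a dictionary with a record_id key), the function name sort_records, and the use of record ID order to break ties are all assumptions made for the illustration.

```python
from typing import Any, Dict, List, Sequence, Tuple

Record = Dict[str, Any]


def sort_records(records: List[Record],
                 sort_keys: Sequence[Tuple[str, bool]] = ()) -> List[Record]:
    """Return records ordered by the given (name, descending) pairs.

    sort_keys lists property or dimension names, most significant first.
    With no sort keys, records come back in record ID order.
    """
    # Start from internal record ID order, the documented fallback.
    ordered = sorted(records, key=lambda r: r["record_id"])
    # Python's sort is stable, so applying the keys from least to most
    # significant produces a multi-key ordering.
    for name, descending in reversed(sort_keys):
        ordered = sorted(ordered, key=lambda r: r.get(name, ""),
                         reverse=descending)
    return ordered
```

Calling sort_records(records) with no keys returns the records in record ID order, while sort_records(records, [("price", True)]) orders them by a hypothetical price property in descending order.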

