If most of the text in an application is in a single language, you can specify
a global language ID by providing the --lang option and a <lang-id> argument to
the Dgidx and dgraph components. The MDEX Engine treats all text as being in
the language specified by <lang-id>, unless you tag text with a more specific
language ID (that is, per-record, per-dimension, or per-query language IDs).
The <lang-id> defaults to en (US English) if left unspecified.
For example, to indicate that text is English (United Kingdom), specify:
--lang en-GB

In addition to specifying a language identifier, you can optionally specify a
collation order using an argument to the --lang option. A collation is
specified in the form --lang <lang-id>-u-co-<collation>, where:
<lang-id> is the language ID and may also include a sub-tag. If unspecified, the value of <lang-id> is en (US English).
-u is a separator between the language identifier portion of the argument and the collation identifier portion of the argument.
<collation> is the collation type: either endeca, standard, or, in some cases, another language-specific ICU collation such as phonebk. If unspecified, the collation defaults to endeca, so the full default value is en-u-co-endeca.

For example, --lang de-u-co-phonebk instructs Dgidx and the dgraph to treat all
the text as German and collate the text in phonebook order.
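
The --lang value format described above can be sketched in a few lines of Python. This is only an illustration of how the <lang-id> and <collation> parts combine; the helper names build_lang_option and split_lang_value are hypothetical and are not part of Dgidx or the dgraph, and the sketch does not check whether a given collation is actually supported for a language.

```python
def build_lang_option(lang_id="en", collation=None):
    """Return the --lang option as an argument list (hypothetical helper).

    lang_id   -- language ID, optionally with a sub-tag, e.g. "en-GB"
    collation -- collation type such as "endeca", "standard", or "phonebk";
                 None means no collation part is appended
    """
    if not lang_id:
        raise ValueError("lang_id must be a non-empty string")
    if collation is None:
        value = lang_id
    else:
        # Form taken from the text: <lang-id>-u-co-<collation>
        value = f"{lang_id}-u-co-{collation}"
    return ["--lang", value]


def split_lang_value(value):
    """Split a --lang value into (lang_id, collation); collation may be None."""
    lang_id, separator, collation = value.partition("-u-co-")
    return lang_id, (collation if separator else None)


# The two examples from the text
print(" ".join(build_lang_option("en-GB")))          # --lang en-GB
print(" ".join(build_lang_option("de", "phonebk")))  # --lang de-u-co-phonebk
```

The resulting list could be appended to whatever argument list is used to launch Dgidx and the dgraph, so that both components receive the same language setting.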

Dgidx sorts records by the value of particular properties and/or dimensions.
The properties and dimensions used for sorting are specified by the --sort
option of the Dgidx command. The --sort option also specifies whether ascending
or descending sort order is used with each property or dimension. For
information about the --sort option, refer to the MDEX Engine Developer's
Guide. If you do not specify a particular sort order through the --sort option,
Dgidx sorts records by their internal record IDs.
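
The sorting behavior described above (several sort keys, each ascending or descending, with record ID as the fallback) can be pictured with a small Python sketch. It is a conceptual model only, not how Dgidx is implemented and not the syntax of the --sort option; the record layout, the "id" field, and the sort_records function are assumptions made for illustration.

```python
def sort_records(records, sort_keys=None):
    """Sort records by (name, ascending) pairs; fall back to record ID.

    records   -- list of dicts; each has an "id" plus property values
                 (hypothetical layout)
    sort_keys -- list of (property_name, ascending) pairs, most significant
                 first; None or empty means no sort order was specified
    """
    if not sort_keys:
        # No sort order given: order by the (assumed) internal record ID.
        return sorted(records, key=lambda r: r["id"])
    result = list(records)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a correct multi-key order.
    for name, ascending in reversed(sort_keys):
        result.sort(key=lambda r: r[name], reverse=not ascending)
    return result


records = [
    {"id": 3, "price": 10, "name": "b"},
    {"id": 1, "price": 20, "name": "a"},
    {"id": 2, "price": 10, "name": "a"},
]

by_id = sort_records(records)
by_price_then_name = sort_records(records, [("price", True), ("name", False)])
print([r["id"] for r in by_id])               # [1, 2, 3]
print([r["id"] for r in by_price_then_name])  # [3, 2, 1]
```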