Oracle recommends that you limit each MDEX Engine to one language whenever this is feasible. Before using a single MDEX Engine to process more than one language, be sure to consider the combined advantages and disadvantages of doing so, as described in this section.

An MDEX Engine is a Dgraph process that uses as input a single index – that is, a set of files produced by a Dgidx process. This set of files is the customer's source data, indexed by the Dgidx process for use by Guided Search operations.

An index can contain customer data in a single language or in more than one language.
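If each language has its own index and MDEX Engine, the application has to decide which engine serves a given request. The following is a minimal sketch of that selection step, not part of any Oracle API. The host names, ports, the `EngineAddress` type, and the `engine_for` function are all hypothetical, and falling back to a default language is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineAddress:
    """Location of one MDEX Engine (hypothetical host and port)."""
    host: str
    port: int


# Hypothetical deployment: one MDEX Engine, and so one index, per language.
ENGINES = {
    "en": EngineAddress("mdex-en.example.com", 15000),
    "fr": EngineAddress("mdex-fr.example.com", 15001),
    "de": EngineAddress("mdex-de.example.com", 15002),
}
DEFAULT_LANGUAGE = "en"  # assumed fallback for unsupported languages


def engine_for(language_tag: str) -> EngineAddress:
    """Pick the engine for a tag such as 'fr', 'fr-CA', or 'de_DE'."""
    primary = language_tag.strip().lower().replace("_", "-").split("-")[0]
    return ENGINES.get(primary, ENGINES[DEFAULT_LANGUAGE])
```

With a single multi-language index this lookup is unnecessary, which is one of the reasons deployment is simpler in that configuration.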

The following table lists some of the factors that favor either a single index (and MDEX Engine) for all languages or a separate index (and MDEX Engine) for each language.

| Factor | Reasons to use a single index | Reasons to use a separate index for each language |
| --- | --- | --- |
| Data Ingest | Data for multiple languages comes from the same data source and has the same structure. Consistent pipeline logic and data manipulation can be applied to all your languages. | Data for multiple languages comes from separate data sources with different data structures. Each of your languages requires its own pipeline logic and data manipulation. |
| Baseline and Partial Updates | Data for all languages can be updated at the same time, which simplifies the development of control scripts. | |
| Deployment | Deployment is simpler and less likely to require coordination of updates. | |
| Dimensions and Languages | Your customer data includes one language or a small number of languages, your application defines a small number of dimensions, or both. In these cases, providing a version of each dimension in each language within a single index is unlikely to slow a customer's access to the data. | Your customer data includes several languages and your application defines a large number of dimensions. The large number of dimension versions required (one for each dimension in each language) may slow the customer's access to the data if provided within a single index. Access time is less likely to be affected if a separate index is used for each language. |
| Query Results | Hit rates on queries may be higher when there is only one cache on the server where the MDEX Engine is installed. Note: Each index requires its own cache. | |
| Records | | For most purposes, multiple languages require separate indexes only when the amount of indexed data in each language is large, for example, 100 GB. |
| Keyword Redirects | | An index can support only a single set of keyword redirects. When an index contains keywords in different languages, keyword redirects may be executed inappropriately. This can happen when a keyword in one language is mistaken for a keyword in another language, and the keywords have different redirect behaviors. When each language is handled by a different index, there is no possibility of confusion among keywords written in different languages. |
| Merchandising Triggers | | An index can support only a single set of keywords. If an index contains keywords in different languages, business rules may be executed inappropriately, because keywords in different languages can be mistaken for each other. When each language is handled by a different index, there is no possibility of confusion among keywords written in different languages. Note: Keywords that trigger merchandising rules must be specified in each supported language. For example, if English, French, and Spanish are supported by a single index, the English keyword "pants" must also be specified in French ("pantalon") and in Spanish ("pantalones"). |
| OLT Languages | Your source data includes text in several languages that do not require or benefit from OLT analysis, such as English, French, Spanish, and Italian. | Your source data includes more than one language that is better processed using OLT language analysis, such as Chinese, Japanese, Korean, or German. In this case, use a different index for each OLT language. |
| Stop Words | | An index can use only one set of stop words. When an application uses a single index for more than one language, a word in one of these languages can be mistaken for a stop word in one of the other languages. For example, the French word "thé" (tea) might be mistaken for the English stop word "the" (the definite article). This type of confusion is avoided when a separate index is used for each language. |
| Thesaurus | | An index can use only one thesaurus. When an application uses a single index for more than one language, a word in one language can be mistaken for a thesaurus entry in another. For example, the German word "Gift" (poison) might be mistaken for the English word "gift" (a present). This type of confusion is avoided when a separate index is used for each language. |
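The stop-word and thesaurus collisions described above can be illustrated with a small sketch. It is not MDEX Engine code and makes no claim about how the engine matches terms. It assumes matching that ignores case and diacritics (the hypothetical `fold` function), and the word lists are tiny illustrative samples.

```python
import unicodedata


def fold(word: str) -> str:
    """Lowercase and strip diacritics (an assumed normalization step)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Illustrative per-language stop-word lists, not lists shipped with any product.
STOP_WORDS = {
    "en": {"the", "a", "of"},
    "fr": {"le", "la", "de"},
}


def remove_stop_words(terms, stop_words):
    """Drop every term whose folded form appears in the stop-word set."""
    return [t for t in terms if fold(t) not in stop_words]


def expand(term, thesaurus):
    """Return the term followed by any thesaurus synonyms for it."""
    return [term] + thesaurus.get(fold(term), [])


# Single index: one combined stop-word list and one thesaurus for all languages.
shared_stop_words = set().union(*STOP_WORDS.values())
shared_thesaurus = {"gift": ["present"]}  # an English entry

french_query = ["thé", "vert"]  # "green tea"

# With the shared list, "thé" folds to "the" and is dropped as a stop word.
single_index_terms = remove_stop_words(french_query, shared_stop_words)
# With a French-only list, "thé" survives.
per_language_terms = remove_stop_words(french_query, STOP_WORDS["fr"])

# German "Gift" (poison) picks up the English synonym from the shared thesaurus.
shared_expansion = expand("Gift", shared_thesaurus)
# A German-only thesaurus with no such entry leaves the term unchanged.
german_expansion = expand("Gift", {})
```

Under these assumptions the shared lists lose "thé" from the French query and add "present" to the German term, while the per-language lists leave both queries intact.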

