The MDEX Engine 6.1.2 uses a new mechanism for processing wildcard search queries that greatly simplifies user configuration. In most cases, the size of the on-disk index is reduced considerably, and at the same time indexing performance is improved compared with previous releases.
The new mechanism replaces the regular- and dictionary-based wildcard search methods used in previous releases.
The following points describe the new method and how it differs from previous releases:
Configuration. The configuration to enable wildcard search remains the same as in previous releases. You should enable wildcard search using the Developer Studio. For configuration information, see the MDEX Engine Development Guide, or the Developer Studio Help.
Fewer tuning settings. The new wildcard search mechanism deprecates a number of command-line flags and attributes that existed previously for tuning purposes.
The following wildcard configuration attributes in the `searchindex.dtd` are deprecated: `MAX_NGRAM_LENGTH`, `DICTIONARY_MAX_NGRAM_LENGTH`, and `DICTIONARY_WILDCARD`.
Note: Do not remove these settings from the XML configuration files, since they continue to be part of the `searchindex.dtd` used to validate the interface between Developer Studio and Dgidx. If these attributes are present in the XML configuration files, the Dgraph ignores them during startup without issuing a warning.
Performance. For optimal performance, Oracle recommends using wildcard search queries with at least two or three non-wildcarded characters in them, such as `abc*` and `ab*de`, and avoiding wildcard searches with one non-wildcarded character, such as `a*`. Wildcard queries with extremely low information, such as `a*`, require more time to process.
`--wildcard_max` remains the only tuning option for wildcard search. For the majority of wildcard search patterns, the MDEX Engine does not rely on `--wildcard_max` and you should not adjust it.
The maximum number of matching terms of a wildcard expression is set to 100 by default. You can continue to modify this value with the `--wildcard_max` flag for the Dgraph to balance the performance of wildcard search against the desired completeness of results.
Consider increasing this value only if you have wildcard search queries that use punctuation syntax, such as `ab*c.def*`, you would like to receive more complete wildcard query results, and you can afford slower-running wildcard search queries in such cases. This value does not affect other wildcard search queries.
If in previous releases you used `--wildcard_max` in cases other than the one described above, such as for `a*b*` queries (which do not contain punctuation), consider resetting the value of this flag to its default (100) after upgrading to this release, and then testing the performance of the MDEX Engine (it should be improved). For detailed information about tuning `--wildcard_max`, see the Performance Tuning Guide.
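The query guidance above can be checked on the application side before a query is sent to the MDEX Engine. The following is a minimal sketch, not part of the product: the function name `classify_wildcard_query`, its `min_literal_chars` parameter, and the returned keys are all hypothetical, and it assumes that `*` is the only wildcard character and that any non-alphanumeric literal counts as punctuation.

```python
def classify_wildcard_query(query: str, min_literal_chars: int = 2) -> dict:
    """Summarize a wildcard query against the guidance in this section.

    Hypothetical helper: assumes '*' is the only wildcard character and
    treats every non-alphanumeric, non-space literal as punctuation.
    """
    # Literal (non-wildcarded) characters, ignoring whitespace.
    literal_chars = [ch for ch in query if ch != "*" and not ch.isspace()]

    # Punctuation inside the pattern is the case where --wildcard_max may matter.
    has_punctuation = any(not ch.isalnum() for ch in literal_chars)

    return {
        "literal_count": len(literal_chars),
        # Queries such as "a*" carry little information and are slower to process.
        "low_information": len(literal_chars) < min_literal_chars,
        "wildcard_max_may_apply": has_punctuation,
    }


if __name__ == "__main__":
    for q in ("abc*", "ab*de", "a*", "ab*c.def*"):
        print(q, classify_wildcard_query(q))
```

An application could use the `low_information` result to reject or warn on queries such as `a*`, and the `wildcard_max_may_apply` result to identify the punctuation patterns for which raising `--wildcard_max` is worth testing.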
The MDEX Engine ignores the `MAX_NGRAM_LENGTH`, `DICTIONARY_MAX_NGRAM_LENGTH`, and `DICTIONARY_WILDCARD` settings in the XML configuration files. The `--wildcard_approx` Dgraph flag is deprecated and is ignored by the MDEX Engine. The `--ngram_min` Dgidx flag is deprecated and ignored.
The following settings and flags for wildcard search have been deprecated or their usage has been changed:
| Setting or flag that is deprecated in 6.1.2 | Description |
|---|---|
| `MAX_NGRAM_LENGTH` | This setting is deprecated and ignored by the MDEX Engine, as it is no longer necessary for the wildcard search implementation. In previous releases, this setting represented the maximum substring length that was being indexed. It belongs to the `searchindex.dtd`. |
| `DICTIONARY_WILDCARD` | This attribute is deprecated and ignored by the MDEX Engine. In previous releases, this attribute enabled dictionary-based wildcard search, and indicated whether the dictionary-based index had to be created. The dictionary-based index is no longer used by the new wildcard mechanism. It belongs to the `searchindex.dtd`. |
| `DICTIONARY_MAX_NGRAM_LENGTH` | This setting is deprecated and ignored. In previous releases, this setting represented the maximum substring length that was indexed for the dictionary-based wildcard index. It belongs to the `searchindex.dtd`. |
| `--wildcard_approx` | This Dgraph flag is deprecated and ignored. The Dgraph issues a warning if it is specified. In previous releases, you could use this flag in some cases to improve performance of wildcard search by allowing approximate wildcard search query matching and not validating substring match results. The new wildcard method significantly reduces the complexity associated with post-filtering of the result set. This eliminates the need for this flag. |
| `--ngram_min` | This Dgidx flag is deprecated and ignored since it no longer applies to wildcard indexing. Dgidx issues a warning if it is specified. |
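
When upgrading, it can help to audit existing configuration for the deprecated names listed above without modifying anything, since the attributes should stay in the XML files. The following is a minimal sketch under stated assumptions: the helper names and the sample XML (including the `SEARCH_INDEX_CONFIG` and `SEARCH_INDEX` element names) are hypothetical and do not reflect the actual `searchindex.dtd` structure. Only the attribute and flag names come from this document.

```python
import xml.etree.ElementTree as ET

# Names taken from the table above.
DEPRECATED_ATTRIBUTES = (
    "MAX_NGRAM_LENGTH",
    "DICTIONARY_MAX_NGRAM_LENGTH",
    "DICTIONARY_WILDCARD",
)
DEPRECATED_FLAGS = {"--wildcard_approx": "Dgraph", "--ngram_min": "Dgidx"}


def find_deprecated_attributes(xml_text: str) -> list:
    """Return (element tag, attribute, value) for each deprecated attribute.

    Read-only: the attributes are reported, never removed.
    """
    hits = []
    for element in ET.fromstring(xml_text).iter():
        for name in DEPRECATED_ATTRIBUTES:
            if name in element.attrib:
                hits.append((element.tag, name, element.attrib[name]))
    return hits


def find_deprecated_flags(args: list) -> list:
    """Return the deprecated flags present in a list of command-line arguments."""
    # Accept both "--flag value" and "--flag=value" spellings (an assumption).
    return [a.split("=", 1)[0] for a in args if a.split("=", 1)[0] in DEPRECATED_FLAGS]


# Hypothetical sample; real configuration files will look different.
SAMPLE_XML = """<SEARCH_INDEX_CONFIG>
  <SEARCH_INDEX NAME="Title" MAX_NGRAM_LENGTH="3" DICTIONARY_WILDCARD="TRUE"/>
  <SEARCH_INDEX NAME="Brand"/>
</SEARCH_INDEX_CONFIG>"""

if __name__ == "__main__":
    for tag, name, value in find_deprecated_attributes(SAMPLE_XML):
        print(f"{tag}: {name}={value} is ignored by the MDEX Engine")
    for flag in find_deprecated_flags(["--wildcard_max", "100", "--wildcard_approx"]):
        print(f"{flag} is deprecated ({DEPRECATED_FLAGS[flag]} flag)")
```

The sketch only reports what it finds. Leaving the attributes in place keeps the files valid against the `searchindex.dtd`, while the deprecated flags can be dropped from startup scripts to avoid the warnings that the Dgraph and Dgidx issue.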