Cluster Discovery configuration is passed as a part of the MDEX Engine configuration from a CAS-based application.
You must create the refinement configuration at <EAC_APP>\config\mdex\<EAC_APP>.refinement_config.xml and make an entry for the configured dimension.
In the following example of the refinement configuration file, P_Terms is the dimension that is mapped to the extracted terms property (OUTPUT_PROP_NAME). The CLUSTERS tag contains the clustering parameters that the MDEX Engine reads during indexing.
<EndecaApp>.refinement_config.xml

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE REFINEMENTS_CONFIG SYSTEM "refinement_config.dtd">
<REFINEMENTS_CONFIG>
  ......
  <REFINEMENTS DVAL_COLLAPSE_THRESHOLD="" NAME="P_Terms" SORT_TYPE="ALPHA">
    <STATS NUM_RECORDS="FALSE"/>
    <DYNAMIC_RANKING COUNT="10" MORE="FALSE" TOP_REFINEMENTS_SORT="DEFAULT" TYPE="FREQUENCY"/>
  </REFINEMENTS>
  <CLUSTERS COHERENCE="8" MAX_CLUSTERS="10" MAX_CLUSTER_OVERLAP="5" MAX_CLUSTER_SIZE="5" MAX_REFINEMENT_PRECISION="0.250000" NAME="P_Terms" REC_SAMPLE_SIZE="500"/>
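Before indexing, it can help to confirm that every CLUSTERS entry names a dimension that also has a REFINEMENTS entry, as P_Terms does in the example. The Python sketch below is one possible check and is not part of the product. The helper name unmatched_cluster_dimensions is hypothetical, the rule that the two NAME attributes must match is an assumption drawn from the example, and the embedded sample omits the elided entries and adds a closing </REFINEMENTS_CONFIG> tag so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Sample based on the example above. The elided entries are left out and a
# closing root tag is added (an assumption) so the fragment parses standalone.
SAMPLE = """<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE REFINEMENTS_CONFIG SYSTEM "refinement_config.dtd">
<REFINEMENTS_CONFIG>
  <REFINEMENTS DVAL_COLLAPSE_THRESHOLD="" NAME="P_Terms" SORT_TYPE="ALPHA">
    <STATS NUM_RECORDS="FALSE"/>
    <DYNAMIC_RANKING COUNT="10" MORE="FALSE" TOP_REFINEMENTS_SORT="DEFAULT" TYPE="FREQUENCY"/>
  </REFINEMENTS>
  <CLUSTERS COHERENCE="8" MAX_CLUSTERS="10" MAX_CLUSTER_OVERLAP="5" MAX_CLUSTER_SIZE="5" MAX_REFINEMENT_PRECISION="0.250000" NAME="P_Terms" REC_SAMPLE_SIZE="500"/>
</REFINEMENTS_CONFIG>
"""


def unmatched_cluster_dimensions(xml_text):
    """Return the NAME of each CLUSTERS element with no matching REFINEMENTS entry.

    Hypothetical helper: it assumes that a CLUSTERS NAME should equal the NAME
    of some REFINEMENTS element, as in the documented example.
    """
    # Parse from bytes so the encoding declaration in the prolog is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    refinement_names = {r.get("NAME") for r in root.iter("REFINEMENTS")}
    return [
        c.get("NAME")
        for c in root.iter("CLUSTERS")
        if c.get("NAME") not in refinement_names
    ]


if __name__ == "__main__":
    missing = unmatched_cluster_dimensions(SAMPLE)
    if missing:
        print("CLUSTERS entries without a REFINEMENTS entry:", missing)
    else:
        print("All CLUSTERS entries have a matching REFINEMENTS entry.")
```

To check a real application, read the contents of <EAC_APP>\config\mdex\<EAC_APP>.refinement_config.xml into a string and pass it to the same function.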
The following topics provide descriptions of the parameters and some guidelines for tuning them.