In a CAS-based application, the Cluster Discovery configuration is passed to the MDEX Engine as part of the MDEX Engine configuration.

You must create the refinement configuration file at <EAC_APP>\config\mdex\<EAC_APP>.refinement_config.xml and add an entry to it for the configured dimension.

In the following example of the refinement configuration file, P_Terms is the dimension that is mapped to the extracted terms property (OUTPUT_PROP_NAME). The CLUSTERS tag contains the clustering parameters that the MDEX Engine reads during indexing.

<EAC_APP>.refinement_config.xml

 

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE REFINEMENTS_CONFIG SYSTEM "refinement_config.dtd">
<REFINEMENTS_CONFIG>

......

  <REFINEMENTS DVAL_COLLAPSE_THRESHOLD="" NAME="P_Terms" SORT_TYPE="ALPHA">
    <STATS NUM_RECORDS="FALSE"/>
    <DYNAMIC_RANKING COUNT="10" MORE="FALSE" TOP_REFINEMENTS_SORT="DEFAULT" TYPE="FREQUENCY"/>
  </REFINEMENTS>

  <CLUSTERS COHERENCE="8" MAX_CLUSTERS="10" MAX_CLUSTER_OVERLAP="5" MAX_CLUSTER_SIZE="5" MAX_REFINEMENT_PRECISION="0.250000" NAME="P_Terms" REC_SAMPLE_SIZE="500"/>

</REFINEMENTS_CONFIG>
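In the example above, the CLUSTERS element and the REFINEMENTS element both name the same dimension, P_Terms. A quick way to check that pairing before indexing is a short script such as the one below. This is only a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes, as the example suggests, that every CLUSTERS element should name a dimension that also has a REFINEMENTS entry. The function names and the SAMPLE constant are hypothetical and are not part of any Endeca tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example file; entries elided in the original are left out.
SAMPLE = """
<REFINEMENTS_CONFIG>
  <REFINEMENTS DVAL_COLLAPSE_THRESHOLD="" NAME="P_Terms" SORT_TYPE="ALPHA">
    <STATS NUM_RECORDS="FALSE"/>
    <DYNAMIC_RANKING COUNT="10" MORE="FALSE" TOP_REFINEMENTS_SORT="DEFAULT" TYPE="FREQUENCY"/>
  </REFINEMENTS>
  <CLUSTERS COHERENCE="8" MAX_CLUSTERS="10" MAX_CLUSTER_OVERLAP="5" MAX_CLUSTER_SIZE="5" MAX_REFINEMENT_PRECISION="0.250000" NAME="P_Terms" REC_SAMPLE_SIZE="500"/>
</REFINEMENTS_CONFIG>
"""


def clusters_without_refinements(xml_text):
    """Return names of CLUSTERS elements that have no REFINEMENTS entry with the same NAME.

    Assumption (not stated as a rule in the documentation): each CLUSTERS
    element should be paired with a REFINEMENTS element for the same dimension.
    """
    root = ET.fromstring(xml_text.strip())
    refinement_names = {r.get("NAME") for r in root.iter("REFINEMENTS")}
    return [
        c.get("NAME")
        for c in root.iter("CLUSTERS")
        if c.get("NAME") not in refinement_names
    ]


def cluster_settings(xml_text, dimension):
    """Return the CLUSTERS attributes for one dimension as a dict, or None if absent."""
    root = ET.fromstring(xml_text.strip())
    for c in root.iter("CLUSTERS"):
        if c.get("NAME") == dimension:
            # Attribute values are kept as strings, exactly as written in the file.
            return {k: v for k, v in c.attrib.items() if k != "NAME"}
    return None


if __name__ == "__main__":
    print(clusters_without_refinements(SAMPLE))
    print(cluster_settings(SAMPLE, "P_Terms"))
```

To run the same check against a real file, read its contents as bytes and pass them to ET.fromstring directly, because a file that begins with an XML declaration is safest parsed from bytes rather than from a decoded string.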

The following topics provide descriptions of the parameters and some guidelines for tuning them.

