This appendix describes the procedure to configure the Similarity Score Calculation.
Item-to-item similarity is estimated as the fraction of attributes that match between a new item and a potential like item. The similarity score calculation must take attribute weights into account. The calculations that follow derive attribute weights from user-supplied attribute ranks. You can override the derived weights if weights are readily available from an upstream system.
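As an illustration of the idea above, the following Python sketch computes a weighted fraction of matching attributes. It is not the RPAS implementation. The function names are hypothetical, and the rank-to-weight mapping (inverse rank, normalized to sum to 1) is an assumption made only for this example, because the derivation used by the product is not specified here.

```python
from typing import Dict


def weights_from_ranks(ranks: Dict[str, int]) -> Dict[str, float]:
    """Hypothetical rank-to-weight mapping, where rank 1 is most important.

    ASSUMPTION: inverse-rank weights normalized to sum to 1. This is an
    illustrative choice, not the product's documented derivation.
    """
    raw = {attr: 1.0 / rank for attr, rank in ranks.items()}
    total = sum(raw.values())
    return {attr: value / total for attr, value in raw.items()}


def weighted_match_fraction(
    new_item: Dict[str, str],
    like_item: Dict[str, str],
    weights: Dict[str, float],
) -> float:
    """Return the weighted fraction of attributes whose values match."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        weight
        for attr, weight in weights.items()
        # An attribute missing from either item never counts as a match.
        if attr in new_item
        and attr in like_item
        and new_item[attr] == like_item[attr]
    )
    return matched / total


# Example: color matches, size does not, and color carries more weight.
weights = weights_from_ranks({"color": 1, "size": 2})
score = weighted_match_fraction(
    {"color": "red", "size": "M"},
    {"color": "red", "size": "L"},
    weights,
)
```

With equal weights, the score reduces to the plain fraction of matching attributes. Supplying weights directly, instead of calling `weights_from_ranks`, corresponds to overriding the derived weights.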
The following libraries must be registered in any domains that will use the clone solution extension:
.RdfFunctions
The following sections detail the Similarity Score Calculation procedure.
The following notes provide information about Similarity Score Calculation functionality.
Refer to the appropriate input parameters and output measures when using the Similarity Score Calculation procedure:
A mask measure controls when the Similarity Score Calculation is performed. When the mask measure is True, the calculation is performed for the new item/like item pair; when it is False, the calculation is skipped. You can define a business rule (using RPAS rules) that sets the mask measure to False to stop calculating the similarity score for a given new item/like item pair.
The Similarity Score Calculation has two destination arrays. One holds the summarized score between a new item and a like item. The other holds the detail score for each attribute between a new item and a like item.
An attribute weight must be provided for each attribute as an input parameter of the Similarity Score Calculation expression.
An attribute type must be provided for each attribute. The valid attribute types are string and numeric. For a numeric attribute, an attribute tau must also be provided as an input parameter. It serves as the threshold for deciding whether a new item and a like item are the same on that numeric attribute.
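The notes above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the RPAS expression itself. The function names are hypothetical. The sketch assumes that a numeric attribute matches when the absolute difference between the two values is at most tau, and that each detail score is the attribute weight multiplied by a 0/1 match indicator. Neither detail is confirmed by this appendix.

```python
from typing import Dict, Optional, Tuple, Union

Value = Union[str, float]


def attribute_match(new_value: Value, like_value: Value,
                    attr_type: str, tau: float = 0.0) -> float:
    """Return 1.0 if the two values count as the same, otherwise 0.0."""
    if attr_type == "numeric":
        # ASSUMPTION: tau is an absolute-difference threshold.
        return 1.0 if abs(float(new_value) - float(like_value)) <= tau else 0.0
    if attr_type == "string":
        return 1.0 if new_value == like_value else 0.0
    raise ValueError(f"unsupported attribute type: {attr_type!r}")


def similarity_scores(
    mask: bool,
    new_item: Dict[str, Value],
    like_item: Dict[str, Value],
    attr_type: Dict[str, str],
    attr_weight: Dict[str, float],
    attr_tau: Dict[str, float],
) -> Tuple[Optional[float], Dict[str, float]]:
    """Return (summarized score, per-attribute detail scores) for one pair."""
    if not mask:
        # Mask is False: the pair is skipped and no scores are produced.
        return None, {}
    detail: Dict[str, float] = {}
    for attr, weight in attr_weight.items():
        match = attribute_match(new_item[attr], like_item[attr],
                                attr_type[attr], attr_tau.get(attr, 0.0))
        # ASSUMPTION: detail score = weight * match indicator.
        detail[attr] = weight * match
    total_weight = sum(attr_weight.values())
    summary = sum(detail.values()) / total_weight if total_weight else 0.0
    return summary, detail


# Example pair: color differs; price is within tau = 0.5.
summary, detail = similarity_scores(
    True,
    {"color": "red", "price": 10.0},
    {"color": "blue", "price": 10.4},
    {"color": "string", "price": "numeric"},
    {"color": 0.25, "price": 0.75},
    {"price": 0.5},
)
```

In this example only the price attribute matches, so the summarized score equals the share of total weight carried by price. The two return values correspond to the two destination arrays described above.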
The syntax for using the Similarity Score Calculation procedure is shown in Example L-1. The input and output parameter tables explain the specific usage of the parameter names used in the procedure.
Table L-1 provides the input parameters for the Similarity Score Calculation procedure and special expressions.
Table L-1 Input Parameters for the Similarity Score Calculation Procedure
Parameter Name | Description |
---|---|
SIM_MASK | Mask for each pair of new item and like item (precalculated by a rule expression) |
PROD_ATTR | Product attribute values for both new items and like items |
ATTR_TYPE | Product attribute value type (string or numeric) |
ATTR_WEIGHT | Attribute weight for new items |
ATTR_TAU | Tau of each numeric attribute for new items |
SIM_SCORE | Output array containing the similarity score between a new item and a like item |
ATTR_SCORE | Output array containing the similarity score between a new item and a like item for each attribute |