Targeted Items Recommendation

23 Targeted Items Recommendation

This chapter explains the steps needed in AIF to use the Targeted Items Recommendation (TIR) service from XStore or any other relevant app.

TIR can be primarily classified into four types:

Out-of-stock substitutions: similar items that are available/allocatable in the specific store/channel.
Up-sell: higher-margin, comparable alternatives, respecting price bands and customer context.
Cross-sell: complementary items using item-item and basket affinity across categories.
Repetition/replenishment: based on individual or cohort purchase cadence and preferences.

Each of these four TIR types involve specific kinds of ML models and approaches to recommend appropriate items. The recommendations may also vary by stores and customer segments.

Attribute Similarity Based TIR

This section explains the science, workflow steps, data requirements, and validation checks to use the Attribute Similarity based TIR. The science is the same as the Attribute Similarity calculation in the Demand Transference (DT) module.

Currently, only the Up-sell feature is supported using this approach.

Attribute Similarity Science

The idea is to compare products based on their similarities calculated from product attributes in order to determine what, if any, products customers might buy if the product they want to buy is for some reason unavailable. In this way, planning, ordering, and other business decisions can be optimized.

Role of Attributes in Calculating Similarities

The similarity of two SKUs is based in part on how many attribute values they have in common (the more in common, the higher the similarity of the two SKUs). Because the attributes play such an important role in calculating similarities, attribute quality is important when performing the calculation. The similarities calculated are only at the Customer Segment-Chain and Location-Company level.

Attribute Data Requirements

The attribute values for the similarity calculation must meet the following requirements:

Set of Attributes: Each product category is characterized by a unique set of attributes. These attributes differ from category to category. For example, for yogurt, the attributes might be size, flavor, brand, fat percentage, and pack size. For chocolate, the attributes might be size, brand, milk/dark, nut type, and package type. Two categories can both have brand, but the brand attribute will have different values for each of the categories. So, brand is actually a different attribute for each category.

Mapping: Each item in the category must be mapped to its set of attribute values. This information must be obtained from the retailer. Null values are acceptable as long as they are not too numerous. The calculation can still run even if some attribute values are listed as null for some items in a category, but too many null values decrease the reliability of the generated similarities. In particular, too many SKU pairs may come out as less similar than they should be. Null values have a particular use in accommodating categories that are actually a union of more than one category.

Significance: The attributes for a category must be the ones that the customers actually pay attention to when shopping in the category. They are attributes that actually affect the customers' purchasing decisions. Note that the similarity calculation will still complete even with attributes that do not affect customer behavior, but the similarities produced will be less distinguishing. For example, a category has a Supplier attribute, which indicates for a given product which supplier shipped an item to the grocer. This attribute may be important to the grocer for accurate bookkeeping, but it has no effect on the customer's purchasing behavior because it is not reflected in the item itself nor is it something that the customer is concerned about. However, if it is included when setting up attributes, then the effect would be to increase the similarity of items that were from the same supplier. This is a false similarity, since it does not reflect how the customer actually views these items. In particular, if the supplier is a duplicate of the Brand attribute, then the similarity of products within the same Brand would be unintentionally increased. The process of obtaining attributes for a category and performing a mapping of items in the category to attribute values is likely to require a significant amount of time and labor, even if the retailer has the information available, since this must be done for every category.

Guidelines on Number of Attributes and Attribute Values

The number of attributes and attribute values must be enough to distinguish the SKUs within a category. That is, for a given set of attribute values, the number of SKUs in the category all having those values must be a small number. A maximum of seven SKUs is recommended. For example, the Cookie category at a grocer has only three attributes, Brand, Package Size, and Organic. If Brand has seven values, Package Size has three values, and Organic has two values (either Yes or No), then the total number of combinations of attribute values is 7 x 3 x 2 = 42. For 600 different cookie SKUs, the average for each combination of attribute values will represent 600 / 42 = approximately 14 different SKUs. The distribution of SKUs among the 42 different sets of values will not be an even 14, as some sets of values will have much more than 14, while others will have less. The three attributes alone are not enough to provide enough distinguishing power among the cookie SKUs. If Flavor is an important determinant of customer purchases, it should be added to the Cookie category.

The guideline of a maximum of seven indicates that additional attributes are necessary. It is worth examining those sets of attribute values that have the largest number of SKUs associated with them in order to see what attributes can be added to reduce the number of SKUs. It is not just the number of attributes that is important, but how many values each attribute has. For example, if Brand had 100 values instead of seven values, then the total number of attribute-value combinations is 100 x 3 x 2 = 600. It might seem that an easy way to achieve the maximum of seven is to expand the number of values in the attributes. However, this results in each SKU being similar to only a small number of other SKUs. For a single attribute for cookies with 600 different values, it might then be possible to assign one value to each cookie SKU, separating all of the 600 cookies SKUs with a single attribute. However, this would make each cookie SKU completely dissimilar (similarity of 0) from all other cookie SKUs. Putting all 600 SKUs each into a separate Brand causes a complete loss of any similarity information among the SKUs.

For an opposite example, consider 11 attributes, each with only two values. There is a total of 2^11 = 2,048 combinations of values, so that may be enough to encode 2,000 SKUs, even though there are only 2 x 11 = 22 distinct attribute values over the 11 attributes. In general, having more attributes is better, and it is better to increase the number of attributes rather than increase the number of attribute values of a single attribute. However, this is not always possible, and it is better to have the attribute with many values than not have the attribute at all. Flavor, for example, can have many values, as can Color. The more SKUs in the category, the more attributes and attribute values will be needed to achieve the maximum of seven.

Effect on Similarity Values

Suppose the set A of SKUs consists of 22 SKUs, all with the same attribute values, and the set B of SKUs consists of 25 SKUs, all with the same attribute values (but a different set of attribute values from set A). If the set A consists of cookie SKUs all with a package size of Small, and set B consists of cookie SKUs with the same attribute values as A except the size is Medium, then every SKU in A has a similarity of 1 to every other SKU in A, and every SKU in A is similar to at least 22 other SKUs. Every SKU in A is similar to every SKU in B, since they only differ in one attribute value (namely size). So, a SKU in set A is similar to at least 21 + 25 = 46 SKUs. It is possible that a SKU in A being similar to 46 other SKUs in fact represents reality, but if it does not, then using additional attributes that distinguish the SKUs in A and in B will reduce the number of similar SKUs.

Avoiding Attributes with Many Values

Attributes with a large number of values occur frequently. For example, a color attribute in any clothing category might have several shades of each color. Midnight blue, sea blue, and sky blue may all be separate attribute values of the Color attribute; the problem is that in the similarity calculation, a midnight blue item and a sea blue item would be considered completely dissimilar colors, because these two color attribute values are different; in reality, because they are both shades of blue, they should be somewhat similar. One solution is to split the color attribute into two separate attributes, a primary color attribute and a modifier. In this example, the primary color would be blue and the modifiers midnight, sea, and sky.

Functional-Fit Attributes

A functional fit attribute is one where there is no substitution across the attribute's values. For example, batteries of different sizes cannot be substituted for one another. Any category where size determines the functional suitability of the item will have size as a functional-fit attribute.

The functional-fit attributes are used to set the similarity of two SKUs to be zero if the SKUs differ in any functional-fit attribute. Without the functional-fit information, the two SKUs may have non-zero similarity, and that would be erroneous, such as batteries of different sizes.

Designating an attribute as functional fit can also be useful any time the attribute is unlikely to have substitution across it (for example, caffeinated versus decaffeinated coffee). This is not exactly functional fit; however, substitution is unlikely, so it is better to mark the attribute as functional fit.

One approach to avoiding having to define large numbers of attributes and attribute values is to use functional-fit attributes. This approach does not help achieve the maximum of seven, but it can help decrease the number of SKUs that are similar to a given SKU. For example, with the sets A and B of cookie SKUs, if size were designated as functional fit, then the similarity between SKUs in A and SKUs in B would become zero. However, that designating size as functional fit does nothing about the 22 SKUs in A that all have a similarity of 1 to each other, since their attribute values are all the same. (Similar comments apply to set B.)

If the attributes and attribute values are insufficient to reach the maximum of seven SKUs per set of attribute values, functional-fit attributes can be used to decrease the number of SKUs that will share a positive similarity. This is a second-best approach, and it is better to design a proper set of attributes and attribute values, in order to:

Achieve the maximum of seven SKUs.
Provide positive similarity between SKUs that should have a positive similarity. Using functional-fit attributes reduces the number of SKUs that will share a positive similarity, but it may reduce it too much and remove similarities from pairs of SKUs that should have a positive similarity. For example, in the sets A and B, the similarity between a SKU in A and a SKU in B becomes zero, which does not reflect reality since the SKUs in A and in B share common attribute values except for size.
Keep the second-best approach as a last resort, in case time is insufficient for designing a good set of attributes for a category.

Using Null as an Attribute Value

Null is a legitimate attribute value for an item to have; it denotes either of the following:

The attribute value is unknown. In this case, the item has a value for this attribute, but the value is unknown. Attribute data for a category is rarely perfect, it is entirely possible that attribute values for some items were never recorded or are known to be incorrect. The use of null because of imperfections in the attribute data must only be for those cases where it is truly necessary.
The attribute does not apply to the item. Not all attributes in a category apply to all items, especially when the category consists of sub-categories. The ideal is to define the items in the category so that every attribute for the category applies to every item, but this may not be possible. This usually occurs when an attribute applies to an item only if the item has a particular value for another attribute. For example, in the meat snacks category, the attribute grass fed only applies if the item has beef for the product type attribute. If the item is turkey, then the grass-fed attribute must be set to null for the item. The grass-fed attribute must be true or false only for beef products.

Using null is not the ideal way of handling the second case listed above. If possible, it is preferable to split the items in the category into separate categories. For example, this means putting beef products into their own category, so that the grass-fed attribute applies only to this new category and can be removed from the non-beef products. This is only reasonable if the beef products are not substitutes for the other products in the meat snacks category (which is probably not true since the customer might reasonably substitute turkey jerky for beef jerky). In the case of meat snacks, the best solution is to use null for the grass-fed attribute for non-beef products.

Effect of Null Attribute Values on Similarity Values

A null attribute value for an item means that the item does not match any other item in that attribute, even if the other item also has null for the same attribute. Null as the value of an attribute makes the item less similar to every other item, including items that also have null for the same attribute. If an item has null for every attribute, then this item has a similarity of zero to every other item, regardless of what values the other items have for their attributes. A value of null for an attribute is thus treated as special in similarity calculations. It is recommended to keep the use of null to a minimum in order to avoid spuriously indicating that items are dissimilar when they are actually similar.

If the intention is to have null for an attribute value, it is essential to actually use null and not create a string such as "none" to use where null is actually meant. Designating a string such as "none" to use in place of null actually does the opposite of what is described above, because two items with a value of "none" for the same attribute will now be considered similar in that attribute since they have the same value for the attribute. This is directly opposite to the intention of null, as described above.

Categories Containing Sub-Categories

It is possible for a category to consist of items from several categories. As it may be impractical to separate the category into its component categories, it is necessary to create attributes for the category that can approximate the separation. The following process describes how to do this, using the category Oral Care. This category consists of toothbrushes, toothpaste, dental floss, and mouth wash; it is really four separate categories whose items have all been classified together into a single category.

Create one attribute that can separate the category into its component categories. For example, for Oral Care, create an attribute called Product Type that has four values: "toothbrush," "toothpaste," "floss," and "mouthwash." The values of this separating attribute should describe the function of the product. It should separate the category into groups of items in which the items within each group are unlikely to substitute for items in another group. For example, toothbrushes are unlikely to substitute for toothpaste. Constructing a separating attribute is not always as simple as this, as discussed below.
Designate this separating attribute as functional fit. This means that there will be no similarity between items that are in the different groups specified by the separating attribute. In this case, there is no similarity between the four component groups of Oral Care.
Create a single set of attributes suitable for all of the component categories. This typically involves taking the union of the attributes that describe each component category separately. For example, the set of attributes that describe toothpaste are different than the set of attributes that describe toothbrushes, but in this scheme, just take the union of both sets of attributes. Toothbrushes have an attribute called "Bristle hardness," which does not apply to toothpaste, and likewise toothpaste has a flavor attribute that does not apply to toothbrushes. Nonetheless, take all of the attributes together, including the ones for floss and mouthwash. It is possible that an attribute can apply across several of the component categories. For example, only a single flavor attribute is necessary, instead of having three attributes called Floss-Flavor, Toothpaste-Flavor, and Mouthwash-Flavor. Either approach is correct, but having three separate flavor attributes may require more work to create.
Assign null values appropriately. For example, for attributes that apply only to toothpaste, set them to null for any toothbrush. Likewise, for attributes that apply only to toothbrushes, set them to null for any toothpaste.

Here is an example of the Oral Care category and how to set null for its attributes. Assume a separating attribute for Oral Care as described above, and assume, in addition, the following attributes: Thickness, Length, Contains Alcohol, and Flavor. (In reality, there would be more.) The following table shows which attributes apply to which products. Null indicates that the attribute does not apply to the product, and therefore, should be set to null for the product.

Table 23-1 Oral Care Category Example

Product Type	Thickness	Length	Contains Alcohol	Flavor
Floss	Applies	Applies	Null	Applies
Mouthwash	Null	Null	Applies	Applies
Toothpaste	Null	Null	Null	Applies
Toothbrush	Null	Null	Null	Null

In Step 1, you may not find a separating attribute. Instead, you may need to use an attribute that separates the items into groups but in which some of the groups may have items that can substitute for each other. For example, consider a category that consists of Shampoo, Shampoo and Conditioner, and Conditioner, where Shampoo and Conditioner are products that have shampoo and conditioner in one bottle. In this case, a separating attribute has three values, Shampoo, Shampoo and Conditioner, and Conditioner. If this attribute is set as functional fit, there is no similarity between Shampoo and Shampoo and Conditioner. In this example, the approximation may be good enough, since although substitution can take place between Shampoo and Shampoo and Conditioner, it is probably not large.

Setting Up Categories

In general, a category is a set of items that are substitutable with each other (if there are no functional-fit attributes). The categories at a retailer can all be derived by choosing the correct level of the merchandise hierarchy at the retailer. The configuration supports choosing which level of the merchandise hierarchy is to be used as the category level.

Similarity calculations only occur within the category, since the categories define the sets of items that substitute each other.

Frequently, retailers will have categories that are actually unions of categories. For example, a retailer might have a Hair Care category that contains shampoo, conditioner, and hair oil. The retailer may not want to separate out this category into three separate categories of Shampoo, Conditioner, and Hair Oil, if, for example, a single person in the organization is responsible for all three. The problem is that these three types of products do not share a common set of attributes. The attributes describing Hair Oil are not the same ones needed for describing Shampoo or Conditioner. These types of products may share common attributes, such as Scent, but each type of product also needs its own set of attributes. The solution is to define, in addition to the common attributes, a set of attributes for each product type. If an attribute applies only to Shampoo, and not to Conditioner or Hair Oil, then Conditioner SKUs and Hair Oil SKUs should have Null for the value of that attribute. This is a common use of null attribute values and makes it possible to handle the case of a category that is really the union of smaller sub-categories.

Historical Sales Data

Attribute Similarity calculation requires SKU-store-week sales-units aggregates. Typically, the data should not be aggregated to a higher location level than store because different stores usually have different assortments. Some atypical cases can occur in which aggregation across some stores is legitimate because the assortments are the same or nearly so, but this is generally not the case.

In addition to the SKU-store-week sales-units aggregates, it also requires promotion data. A flag indicates which SKU-store-weeks contain major promotions (ones that caused a very large increase in sales units, such as three times normal). It uses the promotion data to flatten the promotional spikes in the SKU-store-week sales-units aggregates. The flattened data is called the baseline sales. An alternative to flattening the promotions is to ignore the SKU-store-weeks where promotions occurred. However, to implement this, it is necessary to ignore all SKU-store-weeks in any week where a SKU-store was promoted, because it is not clear what the effect is on the other items when one SKU is promoted and the others are not. Removing that many SKU-store-weeks can leave little data remaining, especially since many retailers promote quite frequently. For this reason, it is better to flatten promotions and keep more data instead.

Similarly to the way in which promotions are handled, the calculation of the baseline sales also involves removing short-term downward spikes that are due to very short-term stock outs. Note that inventory information is not an input, and so the algorithm finds large but short-lasting dips in weekly sales units and fills those in. This handling of out-of-stock is not related to long-term out-of-stock conditions.

Loading Product Attribute Data into AIF

The following table shows the relevant parameters from RSE_CONFIG in the Manage System Configurations screen.

Table 23-2 Relevant Parameters from RSE_CONFIG

APPL_CODE	PARAM_NAME	PARAM_VALUE	DESCR
RSE	CMGRP_LEVEL_ID	Default value is 5.	The hierarchy level id which contains the level of the product hierarchy where the Category Management (CM) Group level exists, that is, the level at which the product attribute groups are loaded into AIF.
PMO	PMO_PROD_HIER_TYPE	Default value is 3.	The hierarchy type id to use for the product. Keep the value at 3 (if Extended Merchandise Hierarchy has been loaded into AIF) or change to 1 (otherwise, for Normal Merchandise Hierarchy).

The following table shows the relevant jobs to load product attribute data into AIF. The same can also be achieved by the RSE_MASTER_ADHOC_AIF_JOB (rse_master.ksh) with parameter -P (process Product Attribute specific steps) or -A (process all AIF steps).

Table 23-3 Relevant Jobs to Load Attribute Data

JobName	Description	RmsBatch
RSE_CDA_ETL_LOAD_PROD_JOB	Populates the RSE_PROD_ATTR and RSE_ATTR_DESCR tables with product attributes pulled from the W_PRODUCT_D and W_RTL_ITEM_GRP1_D tables.	rse_cda_etl_load_product.ksh
RSE_PROD_ATTR_GRP_VALUE_STG_JOB	Stages associations of product attribute groups to product attributes from rse_prod_attr_grp_value_stg.txt	rse_prod_attr_grp_value_stg.ksh
RSE_PROD_ATTR_GRP_VALUE_COPY_JOB	Copies rse_prod_attr_grp_value_stg.txt file previously extracted from a ZIP to an internal location so it can be loaded through an external table.
RSE_PROD_ATTR_GRP_VALUE_STG_CNE_JOB	Stages associations of product attribute groups to product attributes from rse_prod_attr_grp_value_stg.txt file into the RSE_PROD_ATTR_GRP_VALUE_STG table through an external table.
RSE_PROD_ATTR_GRP_VALUE_LOAD_JOB	Populates the RSE_PROD_ATTR_GRP and RSE_PROD_ATTR_GRP_VALUE tables with product attribute group data pulled from the RSE_PROD_ATTR_GRP_VALUE_STG table.	rse_prod_attr_grp_value_load.ksh
RSE_PROD_ATTR_VALUE_XREF_STG_JOB	Stages cross references of product attribute values to product attribute groups from rse_prod_attr_value_xref_stg.txt file into the RSE_PROD_ATTR_VALUE_XREF_STG table through SQL Loader.	rse_prod_attr_value_xref_stg.ksh
RSE_PROD_ATTR_VALUE_XREF_COPY_JOB	Copies rse_prod_attr_value_xref_stg.txt file previously extracted from a ZIP to an internal location so it can be loaded through an external table.
RSE_PROD_ATTR_VALUE_XREF_STG_CNE_JOB	Stages cross references of product attribute values to product attribute groups from rse_prod_attr_value_xref_stg.txt file into the RSE_PROD_ATTR_VALUE_XREF_STG table through an external table.
RSE_PROD_ATTR_VALUE_XREF_LOAD_JOB	Populates the RSE_PROD_ATTR_GRP_VALUE_MAP table with cross references of product attribute values to product attribute groups pulled from the RSE_PROD_ATTR_GRP_VALUE/RSE_PROD_ATTR_VALUE_XREF_STG tables.	rse_prod_attr_value_xref_load.ksh
RSE_CDA_ETL_LOAD_PROD_ADHOC_JOB	Ad hoc job that populates the RSE_PROD_ATTR and RSE_ATTR_DESCR tables with product attributes pulled from the W_PRODUCT_D and W_RTL_ITEM_GRP1_D tables.	rse_cda_etl_load_product.ksh
RSE_PROD_ATTR_GRP_VALUE_COPY_ADHOC_JOB	Ad hoc job that copies rse_prod_attr_grp_value_stg.txt file previously extracted from a ZIP to an internal location so it can be loaded through an external table.
RSE_PROD_ATTR_GRP_VALUE_STG_CNE_ADHOC_JOB	Ad hoc job that stages associations of product attribute groups to product attributes from rse_prod_attr_grp_value_stg.txt file into the RSE_PROD_ATTR_GRP_VALUE_STG table through an external table.
RSE_PROD_ATTR_GRP_VALUE_LOAD_ADHOC_JOB	Ad hoc job that populates the RSE_PROD_ATTR_GRP and RSE_PROD_ATTR_GRP_VALUE tables with product attribute group data pulled from the RSE_PROD_ATTR_GRP_VALUE_STG table.	rse_prod_attr_grp_value_load.ksh
RSE_PROD_ATTR_VALUE_XREF_COPY_ADHOC_JOB	Ad hoc job that copies rse_prod_attr_value_xref_stg.txt file previously extracted from a ZIP to an internal location so it can be loaded through an external table.
RSE_PROD_ATTR_VALUE_XREF_STG_CNE_ADHOC_JOB	Ad hoc job that stages cross references of product attribute values to product attribute groups from rse_prod_attr_value_xref_stg.txt file into the RSE_PROD_ATTR_VALUE_XREF_STG table through an external table.
RSE_PROD_ATTR_VALUE_XREF_LOAD_ADHOC_JOB	Ad hoc job that populates the RSE_PROD_ATTR_GRP_VALUE_MAP table with cross references of product attribute values to product attribute groups pulled from the RSE_PROD_ATTR_GRP_VALUE/RSE_PROD_ATTR_VALUE_XREF_STG tables.	rse_prod_attr_value_xref_load.ksh

Table 23-4 Relevant Queries for Data Validation

-- query 1: gives the BUSINESS_OBJECT_MD_ID for product data

select * from RSE_BUSINESS_OBJECT_MD where NAME = 'PRODUCT';

-- query 2: gives the BUSINESS_OBJECT_DB_SRC_ID for product attribute groups

select * from RSE_BUSINESS_OBJECT_DB_SRC where DB_OBJECT_NAME = 'W_RTL_ITEM_GRP1_D';

-- query 3: gives the BUSINESS_OBJECT_ATTR_MD_ID based on BUSINESS_OBJECT_MD_ID and BUSINESS_OBJECT_DB_SRC_ID

select * from RSE_BUSINESS_OBJECT_ATTR_MD where BUSINESS_OBJECT_MD_ID = <<ID from query 1>> and BUSINESS_OBJECT_DB_SRC_ID = <<ID from query 2>> and EXCLUDE_FLG = 'N';

--query 4: gives the PROD_ATTR_GRP_ID based on BUSINESS_OBJECT_ATTR_MD_ID, HIER_TYPE_ID, and PROD_HIER_ID (note that the PROD_HIER_ID in this table is at the level of CMGRP_LEVEL_ID in RSE_CONFIG)

select * from RSE_PROD_ATTR_GRP where BUSINESS_OBJECT_ATTR_MD_ID = <<ID from query 3>> and HIER_TYPE_ID = <<PMO_PROD_HIER_TYPE from RSE_CONFIG>> and PROD_HIER_ID = <<any particular product ID>> and CURRENT_FLG = 'Y' and DELETE_FLG = 'N';

--query 5: gives the PROD_ATTR_GRP_VALUE_ID based on PROD_ATTR_GRP_ID

select * from RSE_PROD_ATTR_GRP_VALUE where PROD_ATTR_GRP_ID = <<ID from query 4>> and CURRENT_FLG = 'Y' and DELETE_FLG = 'N';

--query 6: gives the ATTR_VALUE_EXT_CODE based on PROD_ATTR_GRP_VALUE_ID

select * from RSE_PROD_ATTR_GRP_VALUE_MAP where PROD_ATTR_GRP_VALUE_ID = <<ID from query 5>>;

--query 7: gives the ATTR_DESCR based on BUSINESS_OBJECT_ATTR_MD_ID (BOAM_ID) and ATTR_VALUE_EXT_CODE

select * from RSE_ATTR_DESCR where BOAM_ID = <<ID from query 3>> and ATTR_VALUE_EXT_CODE = <<ATTR_VALUE_EXT_CODE from query 6>>;

--query 8: gives the leaf-level PROD_HIER_ID (SKUs) based on BUSINESS_OBJECT_ATTR_MD_ID, ATTR_VALUE_EXT_CODE, and having the same PARENT_PROD_HIER_ID at the level of CMGRP_LEVEL_ID

select * from RSE_PROD_ATTR where BUSINESS_OBJECT_ATTR_MD_ID = <<ID from query 3>> and ATTR_VALUE_EXT_CODE = <<ATTR_VALUE_EXT_CODE from query 6>> and PROD_HIER_ID in (select CHILD_PROD_HIER_ID

from RSE_PROD_HIER_TC where HIER_TYPE_ID = <<PMO_PROD_HIER_TYPE from RSE_CONFIG>> and PARENT_HIER_LEVEL_ID = <<CMGRP_LEVEL_ID from RSE_CONFIG>> and PARENT_PROD_HIER_ID = <<PROD_HIER_ID from query 4>> and CURRENT_FLG = 'Y');

--query 9: checking the HIER_LEVEL_ID for the relevant product

select * from RSE_PROD_HIER where HIER_TYPE_ID = <<PMO_PROD_HIER_TYPE from RSE_CONFIG>> and ID = <<PROD_HIER_ID from query 4 or query 8>> and CURRENT_FLG = 'Y' and DELETE_FLG = 'N';

Functional-Fit Attributes are very important for TIR. Size may be a functional-fit attribute, but Color may not be. Whether an attribute is functional-fit or not may depend on the specific business needs of the retailer. The idea is that if an attribute is functional-fit then an item should not be recommended if the value of the functional-fit attribute does not match that of the main item. For example, if Size is considered a functional-fit attribute, then the recommended item’s Size must match that of the main item. Functional-Fit Attributes must be loaded as FUNC_ATTR_FLG = ‘Y’ into the table RSE_PROD_ATTR_GRP by using the table RSE_PROD_ATTR_GRP_VALUE_STG. All other (non-functional-fit) attributes must be loaded as FUNC_ATTR_FLG = ‘N’.

Calculating Product-Location Range Data and Baseline Sales in AIF

The following table shows the relevant parameters from RSE_CONFIG in the Manage System Configurations screen.

Table 23-5 Relevant Parameters from RSE_CONFIG

APPL_CODE	PARAM_NAME	PARAM_VALUE	DESCR
RSE	RSE_PROD_LOC_STATUS_SOURCE	Default value is CALC.	The source of data for product location ranging data (RI or CALC).
RSE	IMPLY_OUT_OF_RANGE_AFTER	Default value is 8.	The number of weeks with no sales required before a SKU is considered to be out of range for a location.

The following table shows relevant jobs to load or calculate product-location range data and calculate baseline sales in AIF.

Table 23-6 Relevant Jobs to Load or Calculate

JobName	Description	RmsBatch
RSE_PROD_LOC_RANGE_SETUP_JOB	Sets up a queue for the RSE_PROD_LOC_RANGE service that populates the RSE_PROD_LOC_STATUS table with product-location range data using data in the W_RTL_IT_LC_D/RSE_SLS_PR_LC_WK tables. Default parameters for this job are (-n 2 -f Y).	rse_prod_loc_range_setup.ksh
RSE_PROD_LOC_RANGE_PROCESS_JOB	Processes a queue for the RSE_PROD_LOC_RANGE service that populates the RSE_PROD_LOC_STATUS table with product-location range data using data in the W_RTL_IT_LC_D/RSE_SLS_PR_LC_WK tables.	rse_prod_loc_range_process.ksh
RSE_BASELINE_SETUP_JOB	Sets up a queue for the RSE_BASELINE service that executes baseline sales calculation using data in the RSE_SLS_PR_LC_CS_WK/RSE_PROD_LOC_STATUS tables and populates the RSE_BL_SLS_PR_LC_CS_WK table.	rse_baseline_setup.ksh
RSE_BASELINE_PROCESS_JOB	Processes a queue for the RSE_BASELINE service that executes baseline sales calculation using data in the RSE_SLS_PR_LC_CS_WK/RSE_PROD_LOC_STATUS tables and populates the RSE_BL_SLS_PR_LC_CS_WK table.	rse_baseline_process.ksh
RSE_PROD_LOC_RANGE_SETUP_ADHOC_JOB	Ad hoc job that sets up a queue for the RSE_PROD_LOC_RANGE service that populates the RSE_PROD_LOC_STATUS table with product-location range data using data in the W_RTL_IT_LC_D/RSE_SLS_PR_LC_WK tables. Default parameters for this job are (-n 2 -f Y).	rse_prod_loc_range_setup.ksh
RSE_PROD_LOC_RANGE_PROCESS_ADHOC_JOB	Ad hoc job that processes a queue for the RSE_PROD_LOC_RANGE service that populates the RSE_PROD_LOC_STATUS table with product-location range data using data in the W_RTL_IT_LC_D/RSE_SLS_PR_LC_WK tables.	rse_prod_loc_range_process.ksh
RSE_BASELINE_SETUP_ADHOC_JOB	Ad hoc job that sets up a queue for the RSE_BASELINE service that executes baseline sales calculation using data in the RSE_SLS_PR_LC_CS_WK/RSE_PROD_LOC_STATUS tables and populates the RSE_BL_SLS_PR_LC_CS_WK table.	rse_baseline_setup.ksh
RSE_BASELINE_PROCESS_ADHOC_JOB	Ad hoc job that processes a queue for the RSE_BASELINE service that executes baseline sales calculation using data in the RSE_SLS_PR_LC_CS_WK/RSE_PROD_LOC_STATUS tables and populates the RSE_BL_SLS_PR_LC_CS_WK table.	rse_baseline_process.ksh

Table 23-7 Relevant Queries for Data Validation

--query 10: checking the presence of a particular SKU-store in the product-location range data

select * from RSE_PROD_LOC_STATUS where PROD_HIER_ID = <<leaf level PROD_HIER_ID>> and LOC_HIER_ID = <<leaf level LOC_HIER_ID>>;

--query 11: checking the last sales of a particular SKU-Store

select * from RSE_PROD_LOC_LAST_SLS where PROD_HIER_ID = <<leaf level PROD_HIER_ID>> and LOC_HIER_ID = <<leaf level LOC_HIER_ID>>;

--query 12: getting the CHAIN customer segment and all other child segments

select * from RSE_CUSTSEG_HIER order by ID;

--query 13: checking the presence of a particular SKU-store in the baseline sales data

select * from RSE_BL_SLS_PR_LC_CS_WK where PROD_HIER_ID = <<leaf level PROD_HIER_ID>> and LOC_HIER_ID = <<leaf level LOC_HIER_ID>> order by CUSTSEG_HIER_ID, CAL_HIER_ID;

--query 14: getting the summary statistics of a particular SKU-store from the baseline sales data

select CUSTSEG_HIER_ID, min(CAL_HIER_ID), max(CAL_HIER_ID), sum(BL_SLS_QTY)

from RSE_BL_SLS_PR_LC_CS_WK

where PROD_HIER_ID = <<leaf level PROD_HIER_ID>> and LOC_HIER_ID = <<leaf level LOC_HIER_ID>> group by CUSTSEG_HIER_ID order by CUSTSEG_HIER_ID;

Presence of a SKU-Store in the product-location range data and baseline sales tables are critical to calculate attribute similarities. If there is difficulty in populating the RSE_PROD_LOC_STATUS table with specific product-location records when RSE_PROD_LOC_STATUS_SOURCE = CALC in RSE_CONFIG, then try changing the values of IMPLY_OUT_OF_RANGE_AFTER in RSE_CONFIG and/or the parameter -n (number of latest historical weeks to process) in RSE_PROD_LOC_RANGE_SETUP_JOB/ RSE_PROD_LOC_RANGE_SETUP_ADHOC_JOB and then re-run the jobs to re-populate the table. For the population of RSE_BL_SLS_PR_LC_CS_WK with specific product-location records, those records must be present in RSE_PROD_LOC_STATUS and RSE_SLS_PR_LC_CS_WK.

Attribute Similarity Calculation as Part of AIF Forecasting

The expectation here is that the implementer is familiar with the AIF Forecasting module in the Manage Forecast Configurations screen. This subsection will go over the specific parameters and settings required for the attribute similarity calculation for TIR. The main goal is here is not to generate an accurate forecast, rather just generate the attribute similarities as part of the forecasting process.

The following table shows relevant parameters from RSE_CONFIG in the Manage System Configurations screen.

Table 23-8 Relevant Parameters from RSE_CONFIG

APPL_CODE	PARAM_NAME	PARAM_VALUE	DESCR
PMO	MAX_NUM_WEEKS_FOR_SIM_MDL_SLS	Default value is 104.	The maximum number of weeks which should be used during Similarity model sales calculation.
PMO	PMO_SIM_MIN_REQ_LC_SLS_WKS	Default value is 10.	The minimum number of weeks per location required for Similarity model build.
PMO	PMO_SIM_MIN_REQ_PR_LC_SLS_WKS	Default value is 10.	The minimum number of weeks per product-location required for Similarity model build.
PMO	SIM_CALC_INT_LENGTH	Default value is 8.	The number of weeks to group together in an interval for the Similarity, Attribute Weight calculation.

In the Manage System Configurations screen, set the USED_BY_FCST_APP_FLG value to Y for the TIR row in the RSE_APP_SOURCE table.
In the Setup train stop of the Manage Forecast Configurations screen, create a new forecast run type with the following parameters:
- Data Source: Store Sales
- Forecast Method: Sales and Promo or Life Cycle
- Forecast Measure: Any available option
- Forecast by Customer Segments: Turned Off
- Forecast by Price Zone: Turned Off
- Hierarchy Type - Merchandise: Extended Merchandise Hierarchy (if PMO_PROD_HIER_TYPE = 3 in RSE_CONFIG) or Normal Merchandise Hierarchy (if PMO_PROD_HIER_TYPE = 1 in RSE_CONFIG)
- Hierarchy Type - Location: Normal Location Hierarchy or any Alternate Location Hierarchy
- Hierarchy Type - Calendar: Fiscal Calendar Hierarchy
- Forecast Level - Merchandise: Style Color Size (if using Extended Merchandise Hierarchy) or SKU (if using Normal Merchandise Hierarchy)
- Forecast Level - Location: Location (if using Normal Location Hierarchy) or the Leaf Level (if using Alternate Location Hierarchy)
- Forecast Level - Calendar: Week
If a run type already exists with these configurations, then that run type could also be potentially used for TIR and there is no need to create a new one in that scenario. After the run type is created, data aggregation for historical sales in PMO_ACTIVITIES and historical cumulative sales in PMO_CUM_SLS (this table is only needed for the Life Cycle Forecast Method) needs to be done from the UI, if the respective AGGR_LVL_KEY does not have the aggregated data.

Before using the Extended Merchandise Hierarchy (RSE_PROD_SRC_XREF, RSE_PROD_HIER, RSE_PROD_HIER_TC, RSE_PROD_HIER_DH) and/or an Alternate Location Hierarchy (RSE_LOC_SRC_XREF, RSE_LOC_HIER, RSE_LOC_HIER_TC, RSE_LOC_HIER_DH), the respective hierarchy tables in AIF must be verified to ensure that proper and complete hierarchy data exists.
In the Map train stop of the Manage Forecast Configurations screen, map the run type to the TIR app. At any point in time, at most one run type should be mapped to TIR. If another run type is mapped to TIR, then it is important to delete the mapping of the previous run type from TIR. Otherwise, among the run types that are mapped to TIR, the one with the greatest run type ID will be used.
In the Test train stop of the Manage Forecast Configurations screen, select the TIR run type row in the top table. In the bottom table, create a new run by opening the ‘Scope’ tab and selecting the radio button ‘Run estimation only’ or ‘Run estimation and base demand’ and optionally turning ON the ‘Run Forecast’ toggle button. Note that for the purpose of TIR, only creating an estimation run is sufficient. The base demand and forecast steps are optional. The run should be set up with appropriate parameters for a Sales and Promo or Life Cycle execution. While creating the run, make the following choices under the ‘New Item Forecast Method’ section of the ‘New Item Fcst’ tab:
- Radio button selection: Calculate like items and generate forecast for new items
- Like item calculation method: Attribute similarity
- Source model for attribute similarity: Re-estimate attribute weights with New Model
Submit the new run from the UI. It suffices for the purpose of TIR if the run only completes the estimation portion successfully and the relevant attribute similarity tables get populated. The execution and results of the base demand and forecast generation steps do not influence the attribute similarity calculations.
Once the data is validated in the attribute similarity tables, approve the estimation run by clicking on the ‘Approve Demand Parameters’ button on the UI. There is no need to approve the base demand and forecast for the purpose of TIR. Only one estimation run remains approved under a run type. When a new estimation run is approved, the previous approved estimation run gets unapproved automatically. TIR uses the attribute similarity data from the approved estimation run under the TIR run type.

Table 23-9 Relevant Queries for Data Validation

--query 15: gives merchandise hierarchy type ID for the TIR run type

select * from RSE_FCST_DFLT_PARAMETER where FCST_RUN_TYPE_ID = <<ID of the TIR run type>> and PARAMETER_NAME = 'PROD_HIER_TYPE';

--query 16: gives location hierarchy type ID for the TIR run type

select * from RSE_FCST_DFLT_PARAMETER where FCST_RUN_TYPE_ID = <<ID of the TIR run type>> and PARAMETER_NAME = 'LOC_HIER_TYPE';

--query 17: gives the PROD_ATTR_GRP_ID that has been copied over from RSE_PROD_ATTR_GRP to the TIR run type based on BUSINESS_OBJECT_ATTR_MD_ID (BOAM_ID), CM_GRP_ID (this is PROD_HIER_ID at the level of CMGRP_LEVEL_ID in RSE_CONFIG), and RSE_PROD_ATTR_GRP.FUNC_ATTR_FLG = 'N'

select * from RSE_FCST_RUN_TYPE_ATTR_XREF where FCST_RUN_TYPE_ID = <<ID of the TIR run type>> and BOAM_ID = <<ID from query 3>> and CM_GRP_ID = <<PROD_HIER_ID from query 4>>;

--query 18: gives the list of estimation runs that are suitable for attribute-similarity based TIR usage (the run of interest should be in this list)

select * from PMO_RUN_HDR where FCST_RUN_TYPE_ID = <<ID of the TIR run type>> and ATTR_WGT_FLG = 'Y' and RUN_STATUS_ID = (select ID from PMO_RUN_STATUS where NAME = 'COMPLETE') order by ID desc;

--query 19: gives the PROD_ATTR_GRP_ID that has been copied over from the TIR run type to the estimation run based on BUSINESS_OBJECT_ATTR_MD_ID (BOAM_ID) and CM_GRP_ID (this is PROD_HIER_ID at the level of CMGRP_LEVEL_ID in RSE_CONFIG)

select * from PMO_RUN_ATTR_XREF where RUN_HDR_ID = <<ID of the estimation run>> and BOAM_ID = <<ID from query 3>> and CM_GRP_ID = <<PROD_HIER_ID from query 4>>;

--query 20: gives the SIM_ID/SRC_ID based on the estimation run and CM_GRP_ID (note that the TRADE_AREA_HIER_ID in this table is the Company/top-most level of the Local Hierarchy)

select * from PMO_SIM where RUN_HDR_ID = <<ID of the estimation run>> and CM_GRP_ID = <<PROD_HIER_ID from query 4>> and ACTIVE_FLG = 'Y';

--query 21: gives the SIM_INTERVAL_ID based on the groupings of the historical weeks (INTERVAL_LENGTH comes from SIM_CALC_INT_LENGTH in RSE_CONFIG)

select * from PMO_SIM_INTERVAL order by START_FISCAL_WK_ID;

--query 22: gives the leaf-level LOC_HIER_ID (Stores) based on the estimation run and SIM_ID

select * from PMO_SIM_LOC_CUSTSEG where RUN_HDR_ID = <<ID of the estimation run>> and SIM_ID = <<ID from query 20>>;

--query 23: gives the leaf-level PROD_HIER_ID (SKUs) and LOC_HIER_ID (Stores) pairs based on the estimation run, SIM_ID, and having the same PARENT_PROD_HIER_ID at the level of CMGRP_LEVEL_ID

select * from PMO_SIM_PR_LOC where RUN_HDR_ID = <<ID of the estimation run>> and SIM_ID = <<ID from query 20>> and PROD_HIER_ID in (select CHILD_PROD_HIER_ID from RSE_PROD_HIER_TC

where HIER_TYPE_ID = <<PARAMETER_VALUE from query 15>> and PARENT_HIER_LEVEL_ID = <<CMGRP_LEVEL_ID from RSE_CONFIG>> and PARENT_PROD_HIER_ID = <<PROD_HIER_ID from query 4>> and CURRENT_FLG = 'Y');

--query 24: gives the leaf-level PROD_HIER_ID (SKUs) and LOC_HIER_ID (Stores) pairs based on the estimation run, SIM_ID, SIM_INTERVAL_ID, and having the same PARENT_PROD_HIER_ID at the level of CMGRP_LEVEL_ID (SLS_IDX is Baseline Sales for the SKU-Store-Interval / Total Baseline Sales for the Store-Interval; SLS_SHARE is Baseline Sales for the SKU-Store-Interval / Average Baseline Sales for the Store-Interval)

select * from PMO_SIM_SLS where RUN_HDR_ID = <<ID of the estimation run>> and SIM_ID = <<ID from query 20>> and SIM_INTERVAL_ID = <<ID from query 21>> and PROD_HIER_ID in (select CHILD_PROD_HIER_ID from RSE_PROD_HIER_TC

where HIER_TYPE_ID = <<PARAMETER_VALUE from query 15>> and PARENT_HIER_LEVEL_ID = <<CMGRP_LEVEL_ID from RSE_CONFIG>> and PARENT_PROD_HIER_ID = <<PROD_HIER_ID from query 4>> and CURRENT_FLG = 'Y');

--query 25: gives the ATTR_WGT based on SIM_ID, BUSINESS_OBJECT_ATTR_MD_ID (BOAM_ID), and PROD_ATTR_GRP_ID

select * from PMO_SIM_ATTR_WGT where SIM_ID = <<ID from query 20>> and BOAM_ID = <<ID from query 3>> and PROD_ATTR_GRP_ID = <<ID from query 4>>;

--query 26: gives the ID of the source for attribute similarity

select * from PMO_PROD_SIM_SRC where SRC_NAME = 'PMO';

--query 27: gives the SIM_VALUE for pairs of leaf-level PROD_HIER_ID (SKUs) based on PROD_SIM_SRC_ID, HIER_TYPE_ID, SRC_ID, and having the same PARENT_PROD_HIER_ID (CM_GRP_ID)

select * from PMO_PROD_SIM where PROD_SIM_SRC_ID = <<ID from query 26>> and HIER_TYPE_ID = <<PARAMETER_VALUE from query 15>> and SRC_ID = <<ID from query 20>> and CM_GRP_ID = <<PROD_HIER_ID from query 4>> and SIM_VALUE > 0;

Fetching Similar Items Based on Attribute Similarity Calculated in AIF

Given the list of items in a shopping cart at a particular store, the TIR for Up-sell service fetches the most similar items from the same store for each item in the cart based on the attribute similarity results.

Table 23-10 Relevant Queries for Data Validation

--query 28: gives the leaf-level LOC_HIER_ID for the input Store

select * from RSE_LOC_SRC_XREF WHERE APP_SOURCE_ID = (select ID from RSE_APP_SOURCE where NAME = 'RA') and HIER_TYPE_ID = <<PARAMETER_VALUE from query 16>> and CURRENT_FLG = 'Y' and LEAF_NODE_FLG = 'Y' and LOC_EXT_KEY = <<input Store>>;

--query 29: gives the leaf-level PROD_HIER_ID for the input SKU

select * from RSE_PROD_SRC_XREF WHERE APP_SOURCE_ID = (select ID from RSE_APP_SOURCE where NAME = 'RA') and HIER_TYPE_ID = <<PARAMETER_VALUE from query 15>> and CURRENT_FLG = 'Y' and LEAF_NODE_FLG = 'Y' and LEVEL_ID = <<input SKU>>;

--query 30: gives the similar items ranked in descending order of similarity

with SIMS as (

SELECT PS.PROD_HIER_ID1 AS DRIVING_PROD_HIER_ID, PS.PROD_HIER_ID2 AS SIM_PROD_HIER_ID, PS.SIM_VALUE

from PMO_PROD_SIM PS

join PMO_SIM_PR_LOC SPL

on ( PS.PROD_HIER_ID2 = SPL.PROD_HIER_ID )

where SPL.LOC_HIER_ID = <<LOC_HIER_ID from query 28>>

and PS.PROD_HIER_ID1 = <<PROD_HIER_ID from query 29>>

and SPL.RUN_HDR_ID = <<ID of the approved estimation run>>

and SPL.SIM_ID in ( select ID from PMO_SIM S where S.RUN_HDR_ID = <<ID of the approved estimation run>> and S.ACTIVE_FLG = 'Y' )

and PS.SRC_ID = SPL.SIM_ID

and PS.PROD_SIM_SRC_ID = <<ID from query 26>>

and PS.SIM_VALUE > 0

union all

select PS.PROD_HIER_ID2 as DRIVING_PROD_HIER_ID, PS.PROD_HIER_ID1 as SIM_PROD_HIER_ID, PS.SIM_VALUE

from PMO_PROD_SIM PS

join PMO_SIM_PR_LOC SPL

on ( PS.PROD_HIER_ID1 = SPL.PROD_HIER_ID )

where SPL.LOC_HIER_ID = <<LOC_HIER_ID from query 28>>

and PS.PROD_HIER_ID2 = <<PROD_HIER_ID from query 29>>

and SPL.RUN_HDR_ID = <<ID of the approved estimation run>>

and SPL.SIM_ID in ( select ID from PMO_SIM S where S.RUN_HDR_ID = <<ID of the approved estimation run>> and S.ACTIVE_FLG = 'Y' )

and PS.SRC_ID = SPL.SIM_ID

and PS.PROD_SIM_SRC_ID = <<ID from query 26>>

and PS.SIM_VALUE > 0

)

, RANKED_SIM as (

select SIM.*, dense_rank() over (partition by SIM.DRIVING_PROD_HIER_ID order by SIM.SIM_VALUE desc, SIM.SIM_PROD_HIER_ID) as RNK

from SIMS SIM

)

select PSX.LEVEL_ID as PROD_LEVEL_ID, PSXS.LEVEL_ID as SIM_PROD_LEVEL_ID

from RANKED_SIM RS

join RSE_PROD_SRC_XREF PSX

on ( PSX.HIER_TYPE_ID = <<PARAMETER_VALUE from query 15>>

and PSX.APP_SOURCE_ID = (select ID from RSE_APP_SOURCE where NAME = 'RA')

and PSX.PROD_HIER_ID = RS.DRIVING_PROD_HIER_ID

and PSX.CURRENT_FLG = 'Y'

and PSX.LEAF_NODE_FLG = 'Y')

join RSE_PROD_SRC_XREF PSXS

on ( PSXS.HIER_TYPE_ID = <<PARAMETER_VALUE from query 15>>

and PSXS.APP_SOURCE_ID = (select ID from RSE_APP_SOURCE where NAME = 'RA')

and PSXS.CURRENT_FLG = 'Y'

and PSXS.LEAF_NODE_FLG = 'Y'

and PSXS.PROD_HIER_ID = RS.SIM_PROD_HIER_ID )

where RS.RNK <= <<Number of similar items to fetch per item in the shopping cart>>;