Advanced Clustering

4 Advanced Clustering

This chapter describes the Advanced Clustering Cloud Service module.

Introduction

Advanced Clustering is an enterprise-specific clustering solution that uses data mining to create store groupings at different product levels using a variety of inputs. These inputs include performance data (sales dollars, sales units, and gross profit), product attributes (brand, color, and size/fit), store attributes (climate, store format, size, and servicing distribution center), third-party data such as demographics (income, ethnicity, and population density), and customer segments.

The application's embedded science and automation help you to identify unique patterns within your data that you can use to create the necessary customer-centric and targeted clusters. These can be used by the assortment planning, allocation and replenishment, pricing, and promotion processes.

The application optimizes clusters to determine the minimum number of clusters that best describes the historical data used in the analysis and that best meets the business objectives you define when you design your clusters.

You can use Advanced Clustering to execute localized or customer-centric assortments and for pricing. In addition, the application can help you when forecasting, for example, if you want to cluster stores based on similar seasonal patterns. You can also use the application for allocation, by clustering stores based on similar selling patterns.

Features

The key features of Advanced Clustering Cloud Service include:

Scenario-based cluster generation, based on store or product attributes, customer segment profiles, or performance.
Three-step cluster-generation process.
What-if capabilities that can be used to create multiple clustering scenarios and then measure them against one another. This can help ensure that the most appropriate clusters are used by the applicable planning and execution processes.
Automatic ranking of cluster scenarios to support what-if comparisons. Recommendations for the optimal cluster scenario and number of clusters are provided.
Dynamic nesting of clusters, in which nested or mixed attribute clusters are created based on multiple attributes, performance data, and customer segments.
Two types of algorithms are used.
- Proprietary BaNG (Batch Neural Gas) algorithm for convergent cluster parameters
- K-means approach for creating clusters in a hierarchical manner, which automatically determines the best attributes to split into an additional cluster.
A variety of distance metrics that are suitable for real-value attributes, categorical attributes, profile-based measurements, and time-based performance.
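
The following is a minimal sketch of how distances for real-value and categorical attributes can differ. It is an illustration only and does not represent the metrics used by the application: the Euclidean and simple-matching measures, the function names, and the store values are all assumptions.

```python
from math import sqrt

def numeric_distance(a, b):
    """Euclidean distance between two equal-length lists of real values."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def categorical_distance(a, b):
    """Simple matching distance: the fraction of categorical attributes that differ."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)

# Hypothetical stores, each described by scaled numeric attributes
# and by categorical attributes (area type, store format).
store_a = {"numeric": [0.2, 0.5], "categorical": ["urban", "mall"]}
store_b = {"numeric": [0.2, 0.9], "categorical": ["urban", "strip"]}

d_num = numeric_distance(store_a["numeric"], store_b["numeric"])
d_cat = categorical_distance(store_a["categorical"], store_b["categorical"])
print(d_num, d_cat)
```

A categorical attribute has no natural ordering, so the sketch counts mismatches instead of subtracting values.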

Overview of Advanced Clustering Process

To use Advanced Clustering, follow this general process to create and manage clusters, working in the Generate Store Clusters tab and the Manage Store Clusters tab:

Cluster Criteria. View all available clusters for specified merchandise, location, and calendar. Review cluster criteria or scenario details for each cluster. Use an existing cluster as the basis for creating a new cluster.
Explore Data. Examine the input data for the cluster. Review multiple data points and significant attributes using the contextual information.
Cluster Setup. Define multiple what-if scenarios. Such scenarios can be compared with one another throughout the clustering process.
Cluster Results. View the scenario results and compare scenarios.
Cluster Insights. Gain an understanding about cluster results and cluster performance prior to approval by examining the contextual information.
Manage Store Clusters. Manage existing cluster criteria. Perform manual overrides and approve clusters.

Cluster Criteria Overview Tab

The Cluster Criteria Overview tab displays a list of the most recently defined cluster criteria and provides the status, clustering criteria, applicable merchandise, location, and calendar nodes. You can click on the criteria name in order to access it within the Generate Store Clusters tab.

Figure 4-1 Cluster Criteria Overview Tab


Table 4-1 Cluster Criteria Overview Tab

Field	Description
Name	The criteria ID and user-assigned name of the cluster.
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.
Created By	The name of the user who created the cluster.
Last Updated By	The name of the user who most recently updated the cluster.
Last Updated On	The date when the cluster was most recently updated.
Status	Created, Ready for Approval, Completed with Errors, Approved, Rejected.
Period Count	The number of calendar nodes defined for the criteria. Hover over the count in order to see a list of the calendar keys associated with the criteria.
Merchandise Count	The number of merchandise nodes defined for the criteria. Hover over the count in order to see a list of the merchandise keys associated with the criteria.
Location Count	The number of location nodes defined for the criteria. Hover over the count in order to see a list of the location keys associated with the criteria.

Clustering Criteria

The following clustering criteria (which are also called "Cluster by") are the defaults:

Consumer Profile

Cluster stores based on the similarities in the customer profile mix whose members shop in the stores or trading areas. These clusters form the basis for additional analysis that can provide an understanding of which customers shop in which stores and how they shop. Information from market research firms such as the Nielsen Corporation can help retailers develop customer profiles. Such information can be provided via a data interface.

Location Attributes

Cluster stores based on how shopping behavior varies by store attribute. In combination with the profile mix, this provides an understanding of demographic details such as income level, ethnicity, education, household size, and family characteristics. Such knowledge can help the retailer to make assortment and pricing decisions. By analyzing cluster composition and studying business intelligence, the retailer can make informed decisions based on shopper demographics.

Product Attributes

Store share is generated based on product attributes. The store clusters produced can be used in an assortment. In this type of cluster, stores with a similar share of sales for one or more attributes are grouped together. For example, for the product coffee, stores can be differentiated by the sales patterns for premium, standard, and niche brands. The percentage of each store contribution is calculated using Sales Retail $ for each product attribute value to the total sales retail for the category or subcategory in a specified location. Product attributes can only be configured at the category or subcategory level.
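
The share calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: the denominator is taken to be each store's total Sales Retail $ for the category, and the rows, store IDs, and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows for the coffee category: (store, brand tier, Sales Retail $).
sales = [
    ("S1", "premium", 600.0),
    ("S1", "standard", 300.0),
    ("S1", "niche", 100.0),
    ("S2", "premium", 100.0),
    ("S2", "standard", 800.0),
    ("S2", "niche", 100.0),
]

def attribute_share(rows):
    """Return {store: {attribute value: share of the store's category sales}}."""
    totals = defaultdict(float)
    by_value = defaultdict(lambda: defaultdict(float))
    for store, value, dollars in rows:
        totals[store] += dollars
        by_value[store][value] += dollars
    return {
        store: {value: dollars / totals[store] for value, dollars in values.items()}
        for store, values in by_value.items()
        if totals[store] > 0  # skip stores with no sales to avoid dividing by zero
    }

shares = attribute_share(sales)
print(shares["S1"])
print(shares["S2"])
```

In this sample, store S1 skews toward premium brands and store S2 toward standard brands, so a share-based clustering would tend to separate them even though their total sales are equal.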

Performance Criteria

Cluster stores based on the historical sales metrics by performance at various merchandise levels. Determine how shopping behavior varies by category. This information can be helpful in identifying low, medium, and high-volume stores that all have similar sales patterns.

Mixed Criteria

Mixed criteria combine discrete and continuous attributes. This allows a retailer to cluster stores using attributes from all four of the preceding cluster criteria at the same time.

Generate Store Clusters Tab

The Generate Store Clusters tab is used to create clusters and then model the clusters with various scenarios in order to determine the best clusters. It consists of three stages: Cluster Criteria, Cluster Results, and Cluster Insights.

Figure 4-2 Generate Store Clusters Tab


Cluster Criteria Stage

In this stage, you can view summary data about existing clusters and define the characteristics of new clusters.

Process

Here is the high-level process for defining a cluster.

Provide a unique name for the cluster.
Define the type of data used to characterize the cluster.
Select merchandise and location nodes.
Define the time period for the cluster.
Define the historical time period for the data.

All Cluster Criteria

In this area of the page you can view information about existing clusters.

Use the View list to select existing cluster criteria. You can tailor your search for existing clusters by Merchandise, Location, and Calendar. Once you select a cluster, the defining details for that cluster are displayed in the Worksheet area.

Figure 4-3 All Cluster Criteria


In addition, you can use the toolbar buttons to:

Figure 4-4 All Cluster Criteria Toolbar


Create a cluster criteria. You can define an initial cluster. The criteria include cluster name, the category for the cluster, the effective date for the cluster, and the history to use.
Create a nested cluster criteria. In this way you can subdivide an existing cluster in order to analyze it further. Once you create a nested cluster, the name "Nested of <name of original cluster>" appears in the Cluster Criteria area. You can then define its characteristics in the same way you define any cluster.
Create or edit a scenario. Modify the configured primary scenario or create another scenario to perform what-if analysis for a selected cluster criteria.
View or edit a cluster criteria. The characteristics of the cluster are displayed in the Cluster Criteria pop-up.
Copy a cluster criteria. Once you have copied it, you can modify it.
Delete a cluster criteria. Delete the selected cluster criteria.
Execute a cluster criteria. Execute all non-executed scenarios for the selected criteria.
Execute all cluster criteria. Execute the entire cluster hierarchy criteria for the selected criteria at once.
Refresh all. Refresh all cluster criteria in order to view any updates to the existing cluster criteria.

All Cluster Criteria Filter

The All Cluster Criteria filter provides the following:

Merchandise allows the user to filter the existing cluster criteria by searching for or selecting the merchandise for the supported hierarchies.
Location allows the user to filter the existing cluster criteria by searching for or selecting the location for the supported hierarchies.
Calendar allows the user to filter the existing cluster criteria by searching for or selecting the calendar for the supported hierarchies.

Figure 4-5 All Cluster Criteria Filter


All Cluster Criteria Summary

Once you have highlighted a cluster criteria to examine, details about that cluster criteria are displayed in a pop-up. The details include information about the cluster criteria and the scenarios created for that cluster criteria.

Table 4-2 Pop-Up Details

Field Name	Description
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.
Shared Criteria	A check mark indicates that more than one merchandise or location node is used in the cluster criteria.
Merchandise Type	The merchandise type.
Scenario Created	The number of scenarios created for the cluster.
Scenario Executed	The number of scenarios executed for the cluster.
Location Type	The location type.
Parent Cluster Level	The name of the ancestor cluster that has been further clustered.

All Cluster Criteria Scenario List

This displays the scenarios for the selected cluster criteria in the All Cluster Criteria tree.

Table 4-3 Scenario List

Field Name	Description
Name	The name assigned to each scenario that has been created for the cluster.
Status	Created, Ready for Approval, Completed with Errors, Approved, Rejected.
User Preferred	Indicates whether or not the user prefers the cluster.
System Preferred	Indicates whether or not the system prefers the cluster.
# of Attributes	The number of attributes that were used in the cluster.
Max. # of Clusters	A user-provided value for the maximum number of cluster centers that the clustering process should consider.

Cluster Criteria

In this pop-up, you define the initial clustering parameters for the cluster criteria of a new cluster. Note that multiple hierarchies are supported in order to facilitate comparisons between clusters. For example, you can compare clusters for the market and retail location hierarchy.

Figure 4-6 illustrates how to use a simple approach to clustering by selecting attributes from a Cluster by. For example, you can select the performance Cluster by and generate clusters using store sales units or revenue.

Figure 4-6 Cluster Criteria


Figure 4-7 illustrates the use of a nested approach to clustering. Select a Cluster by hierarchy using the dynamic hierarchy pop-up or the pre-configured template hierarchies. For example, you can first create a cluster using the performance Cluster by and then further cluster using location attributes.

Figure 4-7 Cluster Criteria - Selection


The following information defines a cluster:

Table 4-4 New Cluster Definition

Field Name	Description
Name	A unique name to identify the cluster.
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.
Merchandise	Once you choose the merchandise level for the cluster, you must select the hierarchy type, the hierarchy level, and the hierarchy node. These are specific to the merchandise level you select.
Location	Once you choose the location level for the cluster, you must select the hierarchy type, the hierarchy level, and the hierarchy node. These are specific to the location level you select.
Template	Select by name a predefined template that can be used to create a cluster hierarchy.

Effective Period

You can define a time interval for the cluster by either choosing a period from the list provided or by selecting a start date and an end date.

To define the Effective Period, select Planning Period, Fiscal Period, or Select Dates:

Table 4-5 Effective Period

Option	Description
Fiscal Period	If you select this option, choose the period and the subdivisions of that period from the drop-down lists.
Planning Period	Select from the range of values provided for the period. Planning periods are user-defined buying periods for a season or a season subset.
Select Dates	If you select this option, choose the start and end dates using the calendar pop-up.

Summarization

Data summarization is available when you select the product performance Cluster by. Select the hierarchy dimension (merchandise or calendar) by which to summarize the data so that the clustering process considers each position in that dimension. For example, when you use the category/week sales data to generate store clusters, you can select the week dimension in the calendar hierarchy summarization. The clustering process then treats every week (week1, week2, ..., week52) as an attribute and clusters stores based on their weekly sales patterns.
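
As a rough illustration of treating each week as an attribute, the sketch below turns weekly sales rows into one feature vector per store. The data, the function name, and the choice to divide by the store total are assumptions made for the example and are not taken from the application.

```python
# Hypothetical (store, fiscal week, sales units) rows for one category.
rows = [
    ("S1", 1, 10), ("S1", 2, 30), ("S1", 3, 60),
    ("S2", 1, 50), ("S2", 2, 30), ("S2", 3, 20),
]

def weekly_profile(rows, weeks):
    """Build one feature vector per store, with one attribute per week."""
    units = {}
    for store, week, value in rows:
        # Start every store with a zero for each week so vectors align.
        units.setdefault(store, dict.fromkeys(weeks, 0))[week] += value
    profiles = {}
    for store, by_week in units.items():
        total = sum(by_week.values())
        # Dividing by the store total compares stores on the shape of their
        # weekly pattern rather than on volume (an assumption of this sketch).
        profiles[store] = [by_week[w] / total if total else 0.0 for w in weeks]
    return profiles

profiles = weekly_profile(rows, weeks=[1, 2, 3])
print(profiles)
```

With 52 weeks selected, each store would be represented by a 52-element vector in the same way.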

Figure 4-8 Summarization


Source Time Period

Select historical sales data for clustering and view contextual data to analyze cluster performance. You can specify more than one time period and assign different weights to different periods in order to place more or less emphasis on different periods.

Source time periods are available for all cluster criteria. With product, performance, or mixed criteria, when performance metrics are used for clustering, the source time periods define the historical data used for the calculation. They also define the historical data used to display BI when sales metrics are shown.

Table 4-6 Source Time Period

Field	Description
Period Level	Select from Fiscal Year, Fiscal Quarter, Fiscal Period, or Fiscal Week.
Start Period	Once you select the Period Level, you select the starting subdivision within that period.
End Period	Once you select the Period Level, you select the ending subdivision within that period.
Weight (%)	Used to define the weight given to the historical data from the defined time period.
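
One way to picture the effect of period weights is a weighted average of the same metric across the source time periods. The sketch below assumes that interpretation; the function name and sample figures are hypothetical, and the application may combine periods differently.

```python
def blend_periods(period_values, weights_pct):
    """Weighted average of one metric across several source time periods."""
    if len(period_values) != len(weights_pct):
        raise ValueError("one weight is needed per period")
    total_weight = sum(weights_pct)
    if total_weight <= 0:
        raise ValueError("at least one period needs a positive weight")
    weighted_sum = sum(v * w for v, w in zip(period_values, weights_pct))
    return weighted_sum / total_weight

# Hypothetical store sales for two source periods, with more emphasis
# placed on the second (more recent) period.
blended = blend_periods([1000.0, 1400.0], [40.0, 60.0])
print(blended)
```

Dividing by the total weight keeps the result on the same scale as the input metric even if the weights entered do not add up to 100.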

Contextual Area

When you are creating a new cluster criteria, you can see details about the following parameters that can help you understand the cluster you are creating.

Cluster By Hierarchy

The following information is displayed when you select a template or use the icon to select the Cluster by hierarchy.

Template

Table 4-7 Template Display

Property	Description
Template Name	Name of template configured during deployment.
Description	Description of template.
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.

Hierarchy

A dynamic Cluster by hierarchy is displayed. For example, the template PE-ST-ST has a Cluster by hierarchy of performance/store attribute/store attribute.

Cluster By Primary Scenario

You see this when you select Cluster by in the Criteria panel when you are setting the cluster parameters or when you select Cluster by in the contextual area for the hierarchy.

The attributes configured and the primary scenario properties defined for the selected cluster are displayed. The attributes listed are those that are significant for the clustering defined during deployment.

The primary scenario is the default scenario defined during deployment. The following information is displayed.

Table 4-8 Primary Scenario

Property	Description
Name	The name of the primary scenario.
Status	Created, Ready for Approval, Completed with Errors, Approved, Rejected.
Maximum # clusters	The maximum number of clusters. The default value is 100. This is used for analyzing the clusters.
Minimum # clusters	The minimum number of clusters. The default value is 1. This is used for analyzing the clusters.
Attribute	A list of the attributes configured during clustering.
Attribute weight	The weights associated with each attribute. This is used to calculate distance.

Planning Period

This list displays the time period you selected for the cluster definition. This information is available only for planning periods, where it provides the start and end dates of the planning period. This content changes whenever a planning period is selected in Effective Period when you are setting cluster parameters.

Figure 4-9 Contextual Information


Explore Data

Use the Explore Data pop-up to examine data for the cluster you defined. You can view the stores that provide input into the clustering process.

Process

In this pop-up you can only view the data, so the only actions you can perform are drilling down through the data in the table and altering the arrangement of the table.

Summary

This area lists the criteria you initially selected to define the cluster.

Figure 4-10 Cluster Criteria Summary


Table 4-9 Explore Data: Summary

Field	Description
Name	The name you provided for the cluster in the Cluster Criteria stage.
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.
Merchandise	The merchandise level and nodes for the cluster.
Location	The location level and nodes for the cluster.
Fiscal Period	The time period for the cluster.
Is Nested	Indicates whether or not the cluster is nested within another cluster.
Merchandise Hierarchy Type	Provides details about which type of hierarchy the cluster criteria have been created for.

View Stores

This area displays a nested list of the stores in the cluster you have defined, along with data for each store for each attribute relevant to the Cluster by option you selected. You can see data at an aggregated, higher level as well as at the individual store level. Filters are provided so that you can filter the display, for example, by category.

Figure 4-11 Explore Data


Product Performance

This section displays stores, store sales metrics, location attributes, customer profiles, and product attribute profiles.

Table 4-10 Product Performance

Field	Description
Sales Unit	The sales units for the merchandise, location, and source time period selected when setting up clustering parameters.
Sales Average Unit Retail	The average unit retail sales for the merchandise, location, and source time period selected when setting up clustering parameters.
Sales Retail	The sales revenue for the selected merchandise, location, and source time period.
Gross Margin	The retail sales minus the cost of goods sold for the merchandise, location, and source time period selected when setting up clustering parameters.
Gross Margin Percent	The gross margin (retail sales minus the cost of goods sold) divided by the retail sales for the merchandise, location, and source time period selected when setting up clustering parameters.
Location Attribute	The retailer-configured location attributes.
Customer Profile	The retailer-configured customer profiles.
Product Attribute	The retailer-configured product attributes.
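
The two margin metrics in the table can be expressed as a short sketch. The function names and sample amounts are hypothetical, and returning None for zero sales is a choice made for this example.

```python
def gross_margin(sales_retail, cost_of_goods_sold):
    """Retail sales minus the cost of goods sold."""
    return sales_retail - cost_of_goods_sold

def gross_margin_percent(sales_retail, cost_of_goods_sold):
    """Gross margin divided by retail sales, expressed as a percentage."""
    if sales_retail == 0:
        return None  # undefined when there are no sales
    return gross_margin(sales_retail, cost_of_goods_sold) / sales_retail * 100.0

# Hypothetical store totals for one category and source time period.
margin = gross_margin(2000.0, 1500.0)
margin_pct = gross_margin_percent(2000.0, 1500.0)
print(margin, margin_pct)
```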

Contextual Area

This area provides a graphical illustration of the detailed data distribution about the cluster.

Analyze Stores

In Explore Data, the BI displays the data distribution of the location by each participating attribute as well as other configured informational attributes. Advanced Clustering identifies the bins based on the underlying data and displays the histograms. It provides the percentage of stores that are present in a location. For example, a company may have 45 percent of stores in cold regions.
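
The percentages behind such a histogram can be sketched as below for a categorical attribute, where each attribute value acts as a bin. The store list and function name are hypothetical, and the way the application derives bins for numeric attributes is not shown.

```python
from collections import Counter

def percent_by_bin(values):
    """Return the percentage of stores that fall into each bin."""
    counts = Counter(values)
    total = len(values)
    return {bin_name: 100.0 * n / total for bin_name, n in counts.items()}

# Hypothetical climate attribute for 20 stores in the selected location.
climates = ["cold"] * 9 + ["temperate"] * 6 + ["hot"] * 5
distribution = percent_by_bin(climates)
print(distribution)
```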

Figure 4-12 Clustering Analyze Stores


Category Variability

By analyzing store variability, you can determine if it is worth creating store clusters for the selected categories in the selected location. Three sections are displayed.

A grid is displayed for the selected categories and the sales contribution for a selected location.

Table 4-11 Category Variability

Property	Description
Categories	A list of the selected categories that are used for store variability analysis.
Variability	The relative standard deviation of the stores in the category. A larger value for the standard deviation indicates greater store variability for the category. Such a category is a possible candidate for store clustering.
Index to average	For a selected location, an indication of how the store performs compared to the all store base. A value close to 1 indicates that the selected location is similar to the all store base. If the value is lower or higher, it indicates that the sales averages for the stores in the selected location are different from the all store base and that you should consider creating store clusters for the selected location.
Average store retail	Average store retail $ for the category for the selected location.
Average store unit	Average store units for the category for the selected location.
Positive/negative index to average	The difference in the index to average between the all store base and the selected location. For example, a value of (1 - index to average) < 1 or a value of (index to average - 1) > 1.
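
The variability and index to average measures can be approximated with the sketch below. It assumes that variability is the population standard deviation divided by the mean and that the index is the ratio of the selected location's store average to the all store average; the application's exact formulas are not given in this chapter, and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def relative_std_dev(store_sales):
    """Standard deviation of store sales divided by their mean."""
    avg = mean(store_sales)
    return pstdev(store_sales) / avg if avg else 0.0

def index_to_average(location_store_sales, all_store_sales):
    """Average store sales in the selected location relative to all stores."""
    return mean(location_store_sales) / mean(all_store_sales)

# Hypothetical category sales per store.
all_stores = [100.0, 200.0, 300.0, 400.0]
selected_location = [300.0, 400.0]

variability = relative_std_dev(selected_location)
index = index_to_average(selected_location, all_stores)
print(round(variability, 3), round(index, 3))
```

An index well above or below 1, as in this sample, suggests that the selected location behaves differently from the all store base.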

A graph is displayed for the index to average. This shows whether the average sales metric for the selected location is below, above, or the same as the all store base average. A red color indicates a value below the all store base average. A blue color indicates a value above the all store base average.

A graph for standard deviation is displayed. This shows the standard deviation for the selected category. If the store value is greater than two standard deviations, then store clustering should be considered for the selected merchandise because the store sales variability is sufficient.

Figure 4-13 Category Variability


Attribute Group Variability

This is used to analyze the most significant attributes (product or location) for a specific merchandise in a selected location.

The attribute group variability section shows attribute graphs for location and product, indicating the key attributes that are driving sales performance.

The attribute variability section, for each attribute group, shows the sales variability for each attribute value in the selected stores. It calculates the index for the attribute group, indicating attribute significance.

Figure 4-14 Attribute Group Variability


Cluster Setup Stage

You can use this stage to perform what-if analysis by defining one or more scenarios that are based on a specified number of clusters and attributes. You can select one or more attributes and assign different weights to them. The attributes and weights you assign are then fed to the clustering analytics to calculate the weighted distance. Using these scenarios, you can experiment with different numbers of clusters, participating attributes, and attribute weights. You can define either a minimum and maximum number of clusters or a specific number of clusters to generate. Once the scenarios are generated, you can compare them. You can also use other features in this stage to copy or delete scenarios.
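
To make the idea of a weighted distance concrete, the sketch below computes a weighted Euclidean distance between two stores. The formula, attribute names, and values are assumptions chosen for illustration; the distance calculation used by the application is not documented here.

```python
from math import sqrt

def weighted_distance(a, b, weights_pct):
    """Weighted Euclidean distance; weights are percentages per attribute."""
    return sqrt(sum(
        (w / 100.0) * (a[name] - b[name]) ** 2
        for name, w in weights_pct.items()
        if w > 0  # an attribute with zero weight does not contribute
    ))

# Hypothetical scaled attribute values for two stores.
store_a = {"sales_units": 1.0, "store_size": 0.0, "climate_index": 5.0}
store_b = {"sales_units": 0.0, "store_size": 1.0, "climate_index": 0.0}
weights = {"sales_units": 50.0, "store_size": 50.0, "climate_index": 0.0}

distance = weighted_distance(store_a, store_b, weights)
print(distance)
```

Because climate_index has a weight of 0 in this sample, the large difference in that attribute has no effect on the result.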

Process

Here is the high-level process for setting up scenarios.

Either select the name of a scenario you want to modify or enter a name for the new scenario you want to create.
If you want the application to optimize the number of clusters, enter minimum and maximum values for the number of clusters.
If you want the application to generate a specific number of clusters, enter that value. In this case, the application generates the exact number of clusters and provides the optimal number of clusters as informational data.
Optionally, configure the weights assigned to the attributes. The total must add up to 100 percent. Use a value of 0 percent if you do not want a specific attribute to be part of the clustering process.
Click the Execute icon to execute the scenario. Once the processing is complete, you see the results in the Cluster Results stage.
To see a list of all scenarios and the status for each, go to the Scenario List tab.
To compare the defining characteristics of two different scenarios, go to the Scenario Compare tab.

Summary

This lists the criteria you initially selected to define the cluster.

Figure 4-15 Cluster Criteria Summary


Table 4-12 Cluster Criteria Summary

Field	Description
Name	The name you provided for the cluster in the Cluster Criteria stage.
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.
Merchandise	The merchandise level and nodes for the cluster.
Location	The location level and nodes for the cluster.
Fiscal Period	The time period for the cluster.
Is Nested	Indicates whether or not the cluster is nested within another cluster.
Merchandise Hierarchy Type	Provides details about which type of hierarchy the cluster criteria have been created for.

Scenario Definition Section

This area has three tabs: Scenario Definition, Scenario List, and Scenario Compare.

Scenario Definition Tab

Figure 4-16 Clustering Scenario Definition


The following information is needed to define a scenario.

Table 4-13 Scenario Definition

Field Name	Description
Select Scenario	Select an existing scenario if you want to modify it.
Name	A unique name that identifies the scenario being defined.
Max. # of Clusters	Set the maximum number for the total number of clusters that can be generated. The application determines the optimal number of clusters during the generation process.
Min. # of Clusters	Set the minimum number for the total number of clusters that can be generated. The application determines the optimal number of clusters during the generation process.
Exact # of Clusters	Indicates that the exact number of clusters should be generated. The application does not determine the optimal number of clusters.

Attributes

The Attributes table is used to define which attributes are included in the cluster criteria and the weights that should be assigned to each participating attribute. You can:

search by attribute, attribute value, and attribute weight
assign equal weight to the selected attributes
assign weight to the selected attribute or attribute value
reset weights to default values

The following information defines the attributes that are participating or non-participating.

Table 4-14 Attributes

Field Name	Description
Participating	A check in this column indicates that the attributes participate in the cluster criteria.
Groups	Identifies the group.
Attributes	A description of the attribute.
Weights	The weight assigned to the attribute. All participating attributes can have the same weight or each participating attribute can have a unique weight. The total of all the weights must add up to 100 percent.

The Attributes toolbar, shown in Figure 4-17, includes the following functionality:

Table 4-15 Attribute Toolbar

Function	Description
Action menu	Resets the weights to the default value overrides that the user provided during configuration.
Weight status	Provides the weight validation status. If the weights do not add up to 100 percent, then a warning is displayed and the scenario cannot be executed.
Include or exclude attributes	Any attribute with a weight equal to zero is not included in the clustering process.
Normalize	Scales attribute weights to ensure that they are valid. User-provided weights are normalized so that they add up to 100 percent.
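
The normalization and zero-weight behavior described in the table can be sketched as follows. Proportional scaling is an assumption about how the weights are normalized, and the attribute names and function names are hypothetical.

```python
def normalize_weights(weights):
    """Scale weights proportionally so that they add up to 100 percent."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one attribute needs a positive weight")
    return {name: 100.0 * w / total for name, w in weights.items()}

def participating(weights):
    """Attributes with a weight of zero are left out of the clustering."""
    return [name for name, w in weights.items() if w > 0]

# Hypothetical user-provided weights that add up to 120 rather than 100.
raw = {"store_format": 30.0, "climate": 30.0, "store_size": 0.0, "income": 60.0}
normalized = normalize_weights(raw)
print(normalized)
print(participating(normalized))
```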

Figure 4-17 Attribute Toolbar


Contextual Area

The contextual business intelligence lists a set of attributes and weights that the current scenario includes as the participating attributes for the clustering process.

Figure 4-18 Scenario Attributes


Scenario List Tab

The Scenario List summarizes the characteristics for each scenario.

Figure 4-19 Scenario List


You can make a copy of a specific scenario in order to modify it in some way, delete a specific scenario, execute a specific scenario, or save a specific scenario. You can also refresh the scenario list in order to view the updated scenario status.

Table 4-16 Scenario List

Field Name	Description
Name	The unique name that identifies the scenario.
Status	Created, Ready for Approval, Completed with Errors, Approved, Rejected.
# of Attributes	The number of attributes is defined by the Cluster by option you select and the weights you optionally assign.
Max. # of Clusters	If you provided a value for this in the scenario definition, that number is displayed here.
Min. # of Clusters	If you provided a value for this in the scenario definition, that number is displayed here.

Scenario Compare Tab

You can select two scenarios from the list to compare. The scenarios you select from the Scenario list are shown side-by-side to facilitate the comparison.

Figure 4-20 Scenario Compare

Cluster Results Stage

After you select a scenario and execute it, you can see the results in this stage. The application uses the data and the parameters you defined in order to group stores together that are most similar according to the characteristics you selected and to separate stores that are most dissimilar. You can also use this stage to rename a cluster.

Process

You use this stage to review clusters, their composition, and the cluster hierarchy, using the grid view and the graph view. This includes the following tasks:

Review a cluster to see the goodness of fit by using the scores, and determine whether any clusters are outliers that warrant further analysis.
Rank the scenarios (cluster sets) to see how well they are separated and how compact the stores are within each cluster.
View the optimality of the clusters recommended by the application to determine if increasing the number of clusters beyond the optimal number is significant.
Rename the cluster after analyzing the centroids and before the cluster is approved.

Summary

This lists the criteria you initially selected to define the cluster.

Figure 4-21 Cluster Criteria Summary

Table 4-17 Cluster Criteria Summary

Field	Description
Name	The name you provided for the cluster in the Cluster Criteria stage.
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.
Merchandise	The merchandise level and nodes for the cluster.
Location	The location level and nodes for the cluster.
Fiscal Period	The time period for the cluster.
Is Nested	Indicates whether or not the cluster is nested within another cluster.
Merchandise Hierarchy Type	Provides details about which type of hierarchy the cluster criteria have been created for.

Scenario Results Section

The Scenario Results section displays the following:

The Scenario Summary, which provides key cluster set attributes for the executed scenario as well as its status.
The Scenario Results, which has three tabs: Clusters, Cluster Composition, and Cluster Hierarchy.

Figure 4-22 Clustering Scenario Results

The Summary section provides an overview of the characteristics of the clusters.

Table 4-18 Scenario Results Summary

Field	Description
Status	Created, Ready for Approval, Completed with Errors, Approved, Rejected.
Optimal # of Clusters	The optimal number of clusters determined by the optimization.
Rank	The application compares executed scenarios and ranks them. A value of 1 indicates the best scenario.
Max./Min # of Clusters	The numbers you provided for the maximum and minimum number of clusters to be calculated.
Largest/Smallest Cluster	Provides the sizes of the largest cluster and the smallest cluster in order to show the range of values.
Is System Preferred	Indicates whether or not the system prefers the scenario.
Is User Preferred	Indicates whether or not the user prefers the scenario.

The Clusters section provides the cluster results for each individual cluster in the scenario in either a Graph View or a Table View. The attributes displayed depend on the Cluster by option chosen in the Cluster Criteria stage.

The Graph View shows the percentage for each attribute in the cluster.

Figure 4-23 Cluster Results Graph View

The Table View provides details that can help you analyze the cluster.

Table 4-19 Scenario Results - Clusters: Table View

Field	Description
Name	The name you assigned to the cluster.
# of Stores	The number of stores in the cluster.
% of Total Stores	The percentage of the total stores that the number of stores represents.
Nearest Cluster Name	The name of the cluster that is most similar to this cluster.
Score %	This value is calculated at the level of store and then averaged to the cluster. The probability, expressed as a percentage, of a store being present in this cluster rather than any of the other clusters.
Has Outlier	Indicates a cluster in which the number of stores is below a threshold. For example, the number of stores in the cluster is below a certain percentage of the total number of stores.
Has Low Score	Indicates a cluster whose score is below the score threshold. The score threshold can be defined in two ways. The default threshold is calculated as the probability that a store exists in any one of the clusters. You can override this threshold at the time of deployment for each Cluster by option.
Attributes and their %	For each attribute specific to the Cluster by option, the value indicates the percentage that attribute represents within the total cluster.
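
The % of Total Stores and Nearest Cluster Name fields in Table 4-19 can be sketched as follows. This is an illustration only: it assumes that "most similar" means the smallest Euclidean distance between cluster centroids, which the application does not state, and the function names and sample data are hypothetical.

```python
import math


def nearest_cluster(centroids):
    """Map each cluster name to the name of the cluster with the closest centroid."""
    nearest = {}
    for name, center in centroids.items():
        others = [
            (math.dist(center, other_center), other)
            for other, other_center in centroids.items()
            if other != name
        ]
        nearest[name] = min(others)[1]
    return nearest


def percent_of_total(store_counts):
    """Express each cluster's store count as a percentage of all stores."""
    total = sum(store_counts.values())
    return {name: 100.0 * count / total for name, count in store_counts.items()}


# Hypothetical centroids (two attributes each) and store counts.
centroids = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 5.0)}
store_counts = {"A": 40, "B": 50, "C": 10}
print(nearest_cluster(centroids))
print(percent_of_total(store_counts))
```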

The Cluster Composition sub-tab breaks down the cluster into its component parts and shows the percentages for each attribute.

The Table View, shown in Figure 4-24, shows attributes and the score percent.

Figure 4-24 Cluster Composition Table View

The Graph View shows the centroid of a cluster and allows the user to compare each store with the cluster centroid.

Cluster Hierarchy

The cluster hierarchy shows the parent-to-child cluster relationship with the cluster centers. Any additional store attributes are rolled up as averages or modes for each cluster level. You can view the cluster hierarchy for each scenario.

In the case of nested hierarchies, the cluster hierarchy is displayed in tree format. You can select the attributes to view and export them, together with the cluster hierarchy tree, in Excel format.
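
The roll-up of additional store attributes as averages or modes can be pictured with a short sketch. It assumes that numeric attributes are averaged and categorical attributes take their most common value. The record layout, attribute names, and function name are hypothetical.

```python
from statistics import mean, mode


def roll_up(stores, numeric_attrs, categorical_attrs):
    """Aggregate store attributes per cluster: mean for numeric, mode for categorical."""
    by_cluster = {}
    for store in stores:
        by_cluster.setdefault(store["cluster"], []).append(store)

    summary = {}
    for cluster, members in by_cluster.items():
        row = {attr: mean(s[attr] for s in members) for attr in numeric_attrs}
        row.update({attr: mode(s[attr] for s in members) for attr in categorical_attrs})
        summary[cluster] = row
    return summary


# Hypothetical store records.
stores = [
    {"cluster": "Urban", "size_sqft": 1000, "format": "Mall"},
    {"cluster": "Urban", "size_sqft": 2000, "format": "Mall"},
    {"cluster": "Urban", "size_sqft": 3000, "format": "Street"},
    {"cluster": "Rural", "size_sqft": 5000, "format": "Outlet"},
]
summary = roll_up(stores, ["size_sqft"], ["format"])
print(summary)
```

For a nested hierarchy, the same roll-up could be repeated at each parent level by grouping on the parent cluster instead of the child cluster.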

Figure 4-25 Cluster Hierarchy Grid View

Export to Excel

You can select the attributes to view in the pivot and export them to Excel. When you select the Export to Excel icon, you can open or save the Excel file. See the Excel format below to review the cluster hierarchy and its aggregates.

Figure 4-26 Save as Excel File

The generated Excel file can then be used by external visualization tools to generate graphs that are not supported in the application.

Figure 4-27 Excel File

Scenario List

The Scenario List section contains one table with details about each cluster. Here, you can approve or reject a cluster.

After a cluster is approved, it is available for other applications.

Figure 4-28 Clustering Scenario List

Table 4-20 Scenario List

Field	Description
Name	Name assigned to the scenario.
Status	Created, Ready for Approval, Completed with Errors, Approved, Rejected.
Parent Cluster Level #	The parent cluster name to which the scenario results apply if nested cluster criteria is selected.
# of Attributes	The total number of attributes used, as determined by the weight assigned to each attribute.
Max. # of Clusters	The value used for the maximum in the scenario execution, if this option is used.
Min. # of Clusters	The value used for the minimum in the scenario execution, if this option is used.
Optimal # of Clusters	The value used for the optimal number of clusters in the scenario execution, if this option is used.
System Preferred	Indicates whether the scenario is the one the application prefers.
User Preferred	Indicates whether the scenario is one the user prefers.
Rank Sequence	Indicates the ranking the scenario is given by the application.

Scenario Compare

The Scenario Compare section shows two clusters of your choosing side by side so that you can compare the main results of each, using the same characteristics used in Scenario Results and Scenario List.

Figure 4-29 Compare Scenarios

The information displayed includes:

Table 4-21 Scenario Compare

Field Name	Description
Max. # of Clusters	The value used for the maximum in the scenario execution, if this option is used.
Min. # of Clusters	The value used for the minimum in the scenario execution, if this option is used.
Optimal # of Clusters	The value used for the optimal number of clusters in the scenario execution, if this option is used.
Rank	The value for the rank.
Is System Preferred	Indicates whether the scenario is the one the application prefers.
Is User Preferred	Indicates whether the scenario is the one the user prefers.
Smallest Cluster Size	The size of the smallest cluster.
Largest Cluster Size	The size of the largest cluster.
Has Outlier	Indicates a cluster in which the number of stores is below a threshold. For example, the number of stores in the cluster is below a certain percentage of the total number of stores.
Attributes	A list of relevant attributes.

Scenario System Recommendations

The application provides the following recommendations at the scenario (cluster set), cluster, and store levels.

Figure 4-30 Scenario Recommendations

Scenario Optimality

This graph indicates how the system identifies the best number of clusters for a given data set. It starts with a small number of cluster centers and searches for the number beyond which there is little improvement in the mean squared distance (MSD). Beyond this point, adding more cluster centers decreases the MSD by only a small amount.
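
The search for the point of diminishing returns can be sketched as follows. This is a simplified illustration, not the application's optimization: the relative-gain threshold, the function name, and the sample MSD values are all assumptions.

```python
def optimal_cluster_count(msd_by_k, min_relative_gain=0.10):
    """Return the smallest k after which one more cluster improves MSD by less than the threshold."""
    ks = sorted(msd_by_k)
    for k, next_k in zip(ks, ks[1:]):
        if msd_by_k[k] == 0:
            return k  # Already a perfect fit; more clusters cannot help.
        gain = (msd_by_k[k] - msd_by_k[next_k]) / msd_by_k[k]
        if gain < min_relative_gain:
            return k
    return ks[-1]


# Hypothetical MSD values for scenarios run with 2 to 6 cluster centers.
msd = {2: 100.0, 3: 60.0, 4: 40.0, 5: 37.0, 6: 35.5}
print(optimal_cluster_count(msd))
```

With these sample values, moving from four to five clusters reduces the MSD by 7.5 percent, which is below the assumed 10 percent threshold, so four is returned.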

Scenario Rank

You can see the ranking of all scenarios in the Cluster Results step. The scenario with the highest rank is designated as System Preferred. The ranking is based on the following:

How many similar stores are contained in the cluster.
How well separated the clusters are from each other.
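
The two ranking criteria, compactness and separation, can be combined in many ways. The following sketch uses one simple option, the smallest gap between centroids divided by the mean store-to-centroid distance. This ratio is an assumption chosen for illustration and is not the application's documented ranking formula. All names and data are hypothetical, and each scenario needs at least two clusters.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def quality(clusters):
    """Higher is better: smallest centroid gap divided by mean store-to-centroid distance."""
    centers = {name: centroid(pts) for name, pts in clusters.items()}
    spread = [math.dist(p, centers[name]) for name, pts in clusters.items() for p in pts]
    compactness = sum(spread) / len(spread)
    separation = min(math.dist(a, b) for a, b in combinations(centers.values(), 2))
    return separation / compactness if compactness else math.inf


def rank_scenarios(scenarios):
    """Rank 1 is the scenario with the highest quality value."""
    ordered = sorted(scenarios, key=lambda name: quality(scenarios[name]), reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical scenarios: store coordinates grouped by cluster.
scenarios = {
    "tight": {"c1": [(0.0, 0.0), (0.0, 2.0)], "c2": [(10.0, 0.0), (10.0, 2.0)]},
    "loose": {"c1": [(0.0, 0.0), (6.0, 0.0)], "c2": [(4.0, 0.0), (10.0, 0.0)]},
}
print(rank_scenarios(scenarios))
```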

Outlier Indicator

This provides an outlier indicator in the cluster list if the cluster has an outlier store. Two outlier rules are supported. Under the distance-from-centroid rule, if a store lies beyond the configured threshold distance from the centroid, then the cluster to which the store belongs is marked as an outlier. Under the cluster-size rule, if the number of stores in the cluster is below a certain configured percentage of the total stores, then the cluster is marked as an outlier.
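
The two outlier rules can be expressed as a short sketch. The use of Euclidean distance, the parameter names, and the threshold values are assumptions for illustration. In the application, the thresholds come from configuration.

```python
import math


def cluster_outlier_reasons(cluster_stores, center, total_stores, max_distance, min_share_pct):
    """Apply both rules and return the names of the rules that mark the cluster."""
    reasons = []
    # Rule 1: any store farther from the centroid than the configured limit.
    if any(math.dist(store, center) > max_distance for store in cluster_stores):
        reasons.append("distance_from_centroid")
    # Rule 2: the cluster holds too small a share of all stores.
    if 100.0 * len(cluster_stores) / total_stores < min_share_pct:
        reasons.append("cluster_size")
    return reasons


# Hypothetical cluster of three stores out of 100 in total.
small = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0)]
print(cluster_outlier_reasons(small, (0.5, 0.0), 100, max_distance=5.0, min_share_pct=5.0))
```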

Cluster Scores

The application provides scores for clusters, based on the calculated threshold score. The score is based on the assumption that each store has an equal chance of being a member of the cluster. A high score indicates that the store is close to the centroid. A low score indicates that the store is an outlier.
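
One way to picture the score and its default threshold is sketched below. The conversion of distances into membership percentages (a softmax of negative distance) is purely an assumption, since the application does not document how the probability is calculated. The default threshold of 100/k percent follows from the equal-chance assumption for k clusters. All names and data are hypothetical.

```python
import math


def membership_scores(store, centers):
    """Turn distances to every centroid into percentages that sum to 100."""
    weights = {name: math.exp(-math.dist(store, c)) for name, c in centers.items()}
    total = sum(weights.values())
    return {name: 100.0 * w / total for name, w in weights.items()}


def cluster_score(stores, cluster_name, centers):
    """Average the store-level scores of the stores assigned to one cluster."""
    scores = [membership_scores(s, centers)[cluster_name] for s in stores]
    return sum(scores) / len(scores)


def has_low_score(score_pct, centers, threshold_pct=None):
    """Default threshold: an equal chance of belonging to any one of the clusters."""
    if threshold_pct is None:
        threshold_pct = 100.0 / len(centers)
    return score_pct < threshold_pct


# Hypothetical centroids and the stores assigned to cluster A.
centers = {"A": (0.0, 0.0), "B": (4.0, 0.0)}
stores_in_a = [(0.0, 0.0), (1.0, 0.0)]
score_a = cluster_score(stores_in_a, "A", centers)
print(round(score_a, 1), has_low_score(score_a, centers))
```

In this sketch, a store that lies exactly halfway between two centroids scores 50 percent for each cluster, which is the default threshold when there are two clusters.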

New Stores and Stores with a Poor History

Advanced Clustering supports post-processing rules in order to allocate stores that are new or that have a poor history. These rules can be configured for each criterion and can be changed during deployment.

Like Stores. This rule allocates new stores or stores with a poor history to the same clusters that the like location belongs to. It requires data to be provided to Advanced Clustering that defines the mapping between the location and like locations. This mapping can be configured by merchandise, and one location can be mapped to multiple locations with different weights. For example, a like location can be used to correct a store with poor history or to allocate a new store to a valid performance cluster.
Largest Clusters. This rule allocates new stores or stores with a poor history to the largest cluster identified by Advanced Clustering. Stores can be allocated to a bigger group of stores. For example, a store that has not yet formed a customer base can be allocated to the largest cluster.
Cohesive Clusters. This rule allocates new stores or stores with a poor history to the most compact cluster identified by Advanced Clustering. Stores can be allocated to a compact group of stores. For example, stores can be assigned to a cluster that has not been affected by outliers.
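
The three post-processing rules can be compared with a minimal sketch. How the application combines the weights of multiple like locations is not documented, so the weighted vote below is an assumption, as is the use of mean store-to-centroid distance as the measure of compactness. All names and data are hypothetical.

```python
from collections import defaultdict


def like_store_cluster(like_weights, assignments):
    """Like Stores: pick the cluster with the largest total weight among the like stores."""
    votes = defaultdict(float)
    for like_store, weight in like_weights.items():
        votes[assignments[like_store]] += weight
    return max(votes, key=votes.get)


def largest_cluster(assignments):
    """Largest Clusters: pick the cluster that contains the most stores."""
    sizes = defaultdict(int)
    for cluster in assignments.values():
        sizes[cluster] += 1
    return max(sizes, key=sizes.get)


def most_cohesive_cluster(mean_distance_to_centroid):
    """Cohesive Clusters: pick the cluster with the smallest mean distance to its centroid."""
    return min(mean_distance_to_centroid, key=mean_distance_to_centroid.get)


# Hypothetical existing assignments and a new store mapped to two like stores.
assignments = {"S1": "High", "S2": "High", "S3": "Low", "S4": "Low", "S5": "Low"}
like_weights = {"S1": 0.6, "S3": 0.4}
print(like_store_cluster(like_weights, assignments))
print(largest_cluster(assignments))
print(most_cohesive_cluster({"High": 1.2, "Low": 2.5}))
```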

Insights Stage

Use the Insights stage to analyze a scenario, its clusters, and its hierarchy, based on performance and attribute contributions, prior to the approval of the scenario. This stage includes the following tasks:

Approve a cluster scenario
Create a new cluster within a scenario
Rename clusters within a scenario
Rank scenarios, if not completed earlier
Flag a cluster scenario as "system preferred"
Review a cluster hierarchy in a nested cluster

Select from the following views in this stage:

Criteria view. Displays the parent cluster, if it exists, or the scenario for root-level clusters.
Parent cluster. Displays any child clusters.
Cluster. Displays the stores under the selected cluster.

Each cluster is identified by the following information:

Table 4-22 Insights Stage

Field Name	Description
Name	The name assigned to the cluster.
Nearest Cluster Name	The name of the cluster that is most similar to the named cluster.
# of Stores	The number of stores in the cluster.
Is Outlier	Whether or not the cluster is considered an outlier. If it is an outlier, you may want to review its stores.
All Sales Metric	Financial information about the store.
Score %	This value is calculated at the level of store and then averaged to the cluster. The probability, expressed as a percent, of a store being present in this cluster rather than any of the other clusters.
Has Low Score	Indicates a cluster whose score falls below a defined threshold.
% Total Stores	The percentage of the total stores that the number of stores represents.
Sales Retail	The sales revenue for the cluster or store. The merchandise and the source time period are selected when the cluster parameters are set up.
Sales Unit	The sales units for the cluster or store. The merchandise and the source time period are selected when the cluster parameters are set up.
Sales Average Unit Retail	The average unit retail sales for the cluster or store. The merchandise and the source time period are selected when the cluster parameters are set up.
Gross Margin Retail	The retail sales minus the cost of goods sold for the cluster or store. The merchandise and the source time period are selected when the cluster parameters are set up.
Gross Margin Percent	The retail sales minus the cost of goods sold divided by the retail sales for the cluster or store. The merchandise and the source time period are selected when the cluster parameters are set up.
Cluster Centers	Additional attributes for each cluster.
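
The derived sales metrics in Table 4-22 can be written out as a short sketch. The gross margin calculations follow the descriptions in the table, with the percent value multiplied by 100. Sales Average Unit Retail is assumed here to be sales retail divided by sales units. The function name and sample values are hypothetical.

```python
def sales_metrics(sales_retail, sales_units, cost_of_goods_sold):
    """Compute the derived metrics for one cluster or store."""
    gross_margin_retail = sales_retail - cost_of_goods_sold
    return {
        # Assumed definition: revenue per unit sold.
        "sales_average_unit_retail": sales_retail / sales_units if sales_units else None,
        "gross_margin_retail": gross_margin_retail,
        "gross_margin_percent": (
            100.0 * gross_margin_retail / sales_retail if sales_retail else None
        ),
    }


# Hypothetical totals for one cluster over the source time period.
metrics = sales_metrics(sales_retail=1000.0, sales_units=50, cost_of_goods_sold=600.0)
print(metrics)
```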

You use this stage to view all scenarios, examine and compare sales metrics for the various clusters, and manage clusters by approving, rejecting, merging, or deleting clusters.

Figure 4-31 Clustering Insights

Summary

This lists the criteria you initially selected to define the cluster.

Figure 4-32 Cluster Criteria Summary

Table 4-23 Cluster Criteria Summary

Field	Description
Name	The name you provided for the cluster in the Cluster Criteria stage.
Cluster By	A predefined group of attributes that include Consumer Profile, Product Performance, Store Attribute, Product Attribute, and Mixed Attribute. These criteria types are sets of attributes. For example, store attributes are the properties of a store. These properties can include ethnicity, store format, and store size.
Merchandise	The merchandise level and nodes for the cluster.
Location	The location level and nodes for the cluster.
Fiscal Period	The time period for the cluster.
Is Nested	Indicates whether or not the cluster is nested within another cluster.
Merchandise Hierarchy Type	Provides details about which type of hierarchy the cluster criteria have been created for.

Contextual Information

The graphs displayed in the Clustering Insights stage are also displayed in the Manage Store Clusters tab. For a discussion, see Contextual Information.

Manage Store Clusters Tab

You can use the Manage Store Clusters tab to view a list of the already executed cluster criteria and associated summary details. You can create manual clusters in a scenario; rank, approve, or reject a scenario; mark a scenario as user preferred; rename clusters; and override any cluster composition within a scenario.

Figure 4-33 Clustering Manage Cluster Criteria

The following information is displayed.

Table 4-24 Manage Store Clusters

Field	Description
Is Nested	Indicates whether or not the cluster is nested within another cluster.
Is Deployed	Indicates that the cluster has been deployed.
Is Shared	A check mark indicates that more than one merchandise or location node is used in the cluster.
Scenario Created	The number of scenarios that were created.
Scenario Executed	The number of scenarios that were executed.
Name	The name of the scenario.
Status	Created, Ready for Approval, Completed with Errors, Approved, Rejected.
# Attributes	The total number of attributes used, as determined by the weight assigned to each attribute.
Max # Clusters	The value used for the maximum number of cluster centers in the scenario execution, if this option is used.
Min # Clusters	The value used for the minimum number of cluster centers in the scenario execution, if this option is used.
Optimal # Clusters	The value used for the optimal number of clusters in the scenario execution, if this option is used.
System Preferred	Indicates whether the scenario is the one the application prefers.
Rank Sequence	Indicates the ranking the scenario is given by the application.

Contextual Information

The following charts are available.

Attribute Value Dist By Cluster

The chart shows a comparison of sales with the average value distribution by clusters. The y axis is sales retail and the x axis is sales units. The z bubble size shows the average value for the selected attribute for the cluster.

Figure 4-34 Attribute Value Dist. by Cluster

Sales Retail vs. Sales Units

The chart shows a comparison of sales retail and sales units. The y axis is sales retail and the x axis is sales units. The z bubble size shows the gross margin retail for a cluster.

Figure 4-35 Sales Retail vs. Sales Units

Cluster Average Comparison

The chart overlays the centroid of the cluster and provides a comparison of cluster averages so that you can rename the clusters based on the information in the graph.

Figure 4-36 Cluster Average Comparison

Cluster Comparison

The chart shows the stacked contribution of each attribute by percentage for each cluster. You can use this chart to determine which cluster contributes the most for an attribute by viewing the clusters side by side.

Figure 4-37 Cluster Comparison

Scenario Compare

Multiple scenarios are compared using optimality and scenario rank. This chart is displayed when multiple scenarios for the criteria are available. The following properties are used by the chart: minimum number of clusters, maximum number of clusters, optimal number of clusters, and the rank of the scenarios, from top to bottom.

Figure 4-38 Scenario Compare

Attribute Analysis

Store and product attributes are analyzed and compared to identify the most prominent selling attributes within a cluster. This indicates how the store and product attributes are correlated and what each attribute in the specified cluster contributes to sales. You can make inferences about which attributes in the cluster contribute significantly to sales and which attributes should be considered for assortment planning in order to improve sales further. With both store and product attribute graphs, you can see which location attributes drive the product attribute sales.

Figure 4-39 Attribute Analysis

The following properties are displayed by the graph. Note that the store and product attributes are only displayed when they are configured as part of the Cluster by process.

Table 4-25 Properties

Axis	Description
x-axis	Sales Retail $ contribution, calculated using the sales revenue share of each attribute in the cluster with respect to the total cluster sales revenue.
y-axis	Sales Unit $ contribution, calculated using the sales unit share of each attribute in the cluster with respect to the total cluster sales units.
z-axis	Total Sales Retail $ of each attribute in the selected cluster, indicating, via the bubble, the magnitude of the sales contribution.
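
The three chart properties in Table 4-25 can be derived from per-attribute sales totals, as in the following sketch. The input layout, key names, and sample values are hypothetical. The shares are expressed as percentages of the cluster totals.

```python
def attribute_contributions(sales_by_attribute):
    """Input maps an attribute value to (sales_retail, sales_units) within one cluster."""
    total_retail = sum(retail for retail, _ in sales_by_attribute.values())
    total_units = sum(units for _, units in sales_by_attribute.values())
    return {
        attr: {
            "x_retail_share_pct": 100.0 * retail / total_retail,
            "y_unit_share_pct": 100.0 * units / total_units,
            "z_bubble_sales_retail": retail,
        }
        for attr, (retail, units) in sales_by_attribute.items()
    }


# Hypothetical sales for two brand attribute values in the selected cluster.
points = attribute_contributions({"Brand A": (600.0, 30), "Brand B": (400.0, 70)})
print(points)
```

In this sample, Brand A contributes 60 percent of revenue from only 30 percent of units, so its bubble sits far along the x-axis and low on the y-axis.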

Cluster Overrides

Both Cluster Insights and Manage Store Clusters can be used to override clusters manually. The following tasks, to be completed before the approval process, are available.

Rename Cluster

To complete this on a nested cluster, select the parent or child clusters in the tree. With the contextual information provided, you have the necessary details to understand the cluster across the hierarchy. This task can be performed using the Cluster Contextual menu.

Figure 4-40 Rename Cluster

Create New Manual Cluster

You can create a new manual cluster, describe it, and tag the cluster as Inactive, Flagship, or Manual. This task can be performed using the Scenario Contextual menu.

Inactive

Inactive stores are allocated to these types of clusters. Such stores are either closed, tagged as invalid, or under construction for a specific effective period.

Flagship

Special clusters in which only certain stores reside.

Manually Created

A user manually creates an empty cluster and allocates stores to the cluster using the drag and drop feature. The application automatically re-calculates the cluster centers after the cluster composition changes.

Figure 4-41 Create Manual Cluster

Delete Cluster

You can delete an empty cluster. Prior to deleting the cluster you must move the already allocated stores to another cluster. This can help you to merge clusters at the same level. The application automatically re-calculates the cluster centers after the cluster composition changes. This task can be performed using the Cluster Contextual menu.

Figure 4-42 Delete Cluster

Move Stores to Clusters

Stores can be moved from one cluster to another using drag and drop. The application automatically re-calculates the cluster centers after a cluster composition changes.
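
The effect of moving a store on the cluster centers can be illustrated with a small sketch. It assumes that a cluster center is the coordinate-wise mean of its stores. The data layout and names are hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def move_store(clusters, store, source, target):
    """Move one store between clusters and return recomputed centers for non-empty clusters."""
    clusters[target][store] = clusters[source].pop(store)
    return {
        name: centroid(list(members.values()))
        for name, members in clusters.items()
        if members
    }


# Hypothetical clusters: store name mapped to its attribute values.
clusters = {
    "North": {"S1": (1.0, 1.0), "S2": (3.0, 1.0)},
    "South": {"S3": (10.0, 5.0)},
}
centers = move_store(clusters, "S2", "North", "South")
print(centers)
```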

Cluster Review, Approve, and Adjust

The approval step is the last step, performed after the review and any manual overrides for the scenario are completed. The scenario results are a set of clusters that are effective for a merchandise, location, and calendar combination. The batch export process selects the most recently updated approved cluster and deploys it to the subscribing applications using an interface file. You can reject an already approved cluster and deploy the selected scenario in its place if its results are better. You are notified of any manual overrides in which approved clusters are modified. The modified clusters are automatically redeployed.

Figure 4-43 Cluster Review, Approve, and Adjust

Nested vs. Mixed Attribute

This section describes nested attributes as compared to mixed attributes.

Nested

By default, every Cluster by option except mixed attributes can have nested hierarchies. Performance attributes can be further clustered by location attributes, which can be further clustered by location attributes. This approach facilitates dynamic hierarchies in clusters. Nesting can be enabled or disabled through configuration.

Figure 4-44 Nested Hierarchies

Nested clusters can be created by:

Dynamic hierarchy. Select the Cluster by for each level of a hierarchy while creating a cluster criteria.
Templates. Select a predefined Cluster by hierarchy while creating a cluster criteria.
Manual nesting. Create a single cluster criteria, review the results, and then determine whether or not to further cluster. The number of clusters using this approach is granular. The cluster results are hierarchical.

Figure 4-45 Nested Clusters

Mixed Attributes

The following mixed attributes are supported by default: performance, customer segment, location attributes, and product attributes. You can combine attributes from different Cluster by options. For example, you can combine attributes from the customer segment and performance Cluster by options and generate a cluster using sales revenue and customer segment distributions. The number of clusters generated using mixed attributes is usually limited, as compared to nested clusters. This approach generates flat clusters with no hierarchy, as attributes are scaled based on the weights you provide.
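
To show how weights might scale attributes from different Cluster by options into a single flat comparison, the following sketch computes a weighted Euclidean distance between two stores. The distance measure, the assumption that attribute values are already normalized to a common range, and all names and values are hypothetical. The application's actual scaling is not documented here.

```python
import math


def weighted_distance(store_a, store_b, weights_pct):
    """Euclidean distance with each attribute's squared difference scaled by its weight."""
    return math.sqrt(sum(
        (weights_pct[attr] / 100.0) * (store_a[attr] - store_b[attr]) ** 2
        for attr in weights_pct
    ))


# Hypothetical stores described by one performance and one customer segment attribute.
store_a = {"sales_revenue_index": 0.9, "segment_young_share": 0.2}
store_b = {"sales_revenue_index": 0.5, "segment_young_share": 0.5}
weights = {"sales_revenue_index": 75.0, "segment_young_share": 25.0}
print(weighted_distance(store_a, store_b, weights))
```

Because every attribute contributes to one distance value, the result is a single flat set of clusters rather than a hierarchy.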

Figure 4-46 Mixed Attribute Clusters

Figure 4-47 Create Cluster Using Mixed Attributes

Figure 4-48 Renaming Mixed Attribute Cluster Results
