Attribute Importance

Along with attribute summaries, the system generates an attribute importance index that summarizes the data quality, data distribution, and representation of each attribute in the data. See Figure 5-11.

Figure 5-11 Attribute Importance

The system indicates the relative importance of attributes, based on upper and lower threshold values.

Table 5-13 Attribute Importance Values

| Category | Range |
|---|---|
| Low Importance | Values below the lower threshold, indicated in red. |
| Average Importance | Values between the lower threshold and the upper threshold, indicated in yellow. |
| High Importance | Values above the upper threshold, indicated in green. These attributes are the best candidates for use as mining attributes. |
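
The threshold rule in Table 5-13 can be illustrated with a minimal sketch. It is not the system's implementation: the function name, the enum, and the threshold values 0.3 and 0.7 are hypothetical, and treating a value exactly equal to a threshold as Average Importance is an assumption, because the table does not say how boundary values are handled.

```python
from enum import Enum


class Importance(Enum):
    """Importance categories, with the indicator color as the value."""
    LOW = "red"
    AVERAGE = "yellow"
    HIGH = "green"


def classify_importance(index, lower_threshold, upper_threshold):
    """Map an attribute importance index to a category.

    Assumption: an index exactly equal to either threshold is
    classified as AVERAGE. The source table does not define this.
    """
    if lower_threshold > upper_threshold:
        raise ValueError("lower_threshold must not exceed upper_threshold")
    if index < lower_threshold:
        return Importance.LOW
    if index > upper_threshold:
        return Importance.HIGH
    return Importance.AVERAGE


# Hypothetical thresholds and index values, for illustration only.
for name, index in [("income", 0.82), ("ethnicity", 0.45), ("fax_number", 0.05)]:
    category = classify_importance(index, lower_threshold=0.3, upper_threshold=0.7)
    print(f"{name}: {category.name} ({category.value})")
```

Returning an enum rather than a color string keeps the category and its display color together, so a caller can filter for candidate mining attributes by category without comparing strings.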

The following attributes are displayed in Attribute Importance.

Table 5-14 Attributes

| Attribute Name | Description |
|---|---|
| Group Name | The name of the attribute group (for example, demographics, purchase behavior, or product profile). |
| Name | The name of the attribute (for example, income, ethnicity, or total sales retail). |
| Histogram | A spark chart that displays the attribute distribution. It corresponds to the current data distribution graph. |
| Importance Indicator | The attribute importance index, with an image that indicates whether the attribute has high, average, or low importance. |
| Percent Null | The percentage of data that is null for the attribute. |
| Distinct Value | The number of distinct values, applicable to discrete attributes. If no value is available, this is empty. |
| Mode | The most common value, applicable to discrete attributes. If no value is available, this is empty. |
| Average | The mean value, applicable to numeric attributes. If no value is available, this is empty. |
| Median | The median value, applicable to numeric attributes. If no value is available, this is empty. |
| Minimum Value | The minimum value, applicable to numeric attributes. If no value is available, this is empty. |
| Maximum Value | The maximum value, applicable to numeric attributes. If no value is available, this is empty. |
| Standard Deviation | The standard deviation, which indicates how far values typically deviate from the average, applicable to numeric attributes. If no value is available, this is empty. |
| Variance | The variance, which indicates the dispersion of values around the average (the square of the standard deviation), applicable to numeric attributes. If no value is available, this is empty. |
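
To make the statistics in Table 5-14 concrete, the following sketch computes them for a single attribute using only the Python standard library. It is an illustration under stated assumptions, not the system's method: the function name and dictionary keys are hypothetical, nulls are represented as `None`, Distinct Value is read as a count of distinct values, and population (rather than sample) standard deviation and variance are assumed because the source does not specify which is used.

```python
import statistics
from collections import Counter


def summarize_attribute(values, numeric):
    """Return the Table 5-14 statistics for one attribute.

    values  -- list of raw values, with None standing in for null
    numeric -- True for a numeric attribute, False for a discrete one
    Statistics that do not apply, or cannot be computed, stay None
    (shown as empty in the table).
    """
    total = len(values)
    present = [v for v in values if v is not None]
    summary = {
        "percent_null": 100.0 * (total - len(present)) / total if total else None,
        "distinct_values": None,
        "mode": None,
        "average": None,
        "median": None,
        "minimum": None,
        "maximum": None,
        "standard_deviation": None,
        "variance": None,
    }
    if not present:
        return summary
    if numeric:
        summary["average"] = statistics.fmean(present)
        summary["median"] = statistics.median(present)
        summary["minimum"] = min(present)
        summary["maximum"] = max(present)
        # Assumption: population statistics; the source does not say
        # whether population or sample formulas are used.
        summary["standard_deviation"] = statistics.pstdev(present)
        summary["variance"] = statistics.pvariance(present)
    else:
        counts = Counter(present)
        summary["distinct_values"] = len(counts)
        summary["mode"] = counts.most_common(1)[0][0]
    return summary


# Hypothetical sample data for illustration.
income = summarize_attribute([40, 50, None, 60, 50], numeric=True)
segment = summarize_attribute(["a", "b", "a", None], numeric=False)
print(income)
print(segment)
```

Nulls are excluded before any statistic is computed, so Percent Null reports data quality while the remaining fields describe only the values that are present.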