Attribute Importance
Along with attribute summaries, the system generates an attribute importance index that summarizes the data quality, data distribution, and representation of each attribute in the data. See Figure 5-11.
The system indicates the relative importance of attributes, based on upper and lower threshold values.
Table 5-13 Attribute Importance Values
Category | Range |
---|---|
Low Importance | Values from the minimum up to the lower threshold, indicated in red. |
Average Importance | Values between the lower threshold and the upper threshold, indicated in yellow. |
High Importance | Values above the upper threshold, indicated in green. These are the best candidates for mining attributes. |
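The threshold logic in Table 5-13 can be illustrated with a minimal Python sketch. The function name `classify_importance`, the sample threshold values, and the treatment of values that fall exactly on a threshold (counted here as Average Importance) are assumptions made for illustration; the system's actual thresholds and boundary rules are not stated in this section.

```python
def classify_importance(index, lower, upper):
    """Map an importance index to a (category, color) pair.

    Assumption: a value exactly equal to either threshold is treated
    as Average Importance. The real boundary rule is not documented here.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if index > upper:
        return ("High Importance", "green")
    if index >= lower:
        return ("Average Importance", "yellow")
    return ("Low Importance", "red")


# Hypothetical thresholds, chosen only for this example.
LOWER, UPPER = 0.3, 0.7

for value in (0.1, 0.5, 0.9):
    category, color = classify_importance(value, LOWER, UPPER)
    print(f"{value}: {category} ({color})")
```

Returning the category and color together keeps the indicator image and the label in sync when the thresholds change.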
The following attributes are displayed in Attribute Importance.
Table 5-14 Attributes
Attribute Name | Description |
---|---|
Group Name | The attribute group name (for example, demographics, purchase behavior, or product profile). |
Name | The name of the attribute (for example, income, ethnicity, or total sales retail). |
Histogram | The spark chart that displays the attribute distribution. This corresponds to the current data distribution graph. |
Importance Indicator | The attribute importance index, with an image that indicates whether the attribute has high, average, or low importance. |
Percent Null | The percentage of data that is null for the attribute. |
Distinct Value | The number of distinct values for discrete attributes. If no value is available, this is empty. |
Mode | The most common value for discrete attributes. If no value is available, this is empty. |
Average | The mean value for numeric attributes. If no value is available, this is empty. |
Median | The median value for numeric attributes. If no value is available, this is empty. |
Minimum Value | The minimum value for numeric attributes. If no value is available, this is empty. |
Maximum Value | The maximum value for numeric attributes. If no value is available, this is empty. |
Standard Deviation | The standard deviation, which indicates the deviation from the average for numeric attributes. If no value is available, this is empty. |
Variance | The variance, which indicates the dispersion from the average for numeric attributes. If no value is available, this is empty. |
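To make the summary statistics in Table 5-14 concrete, the following Python sketch computes them for a single attribute column using only the standard library. It is an illustration, not the system's implementation. The function names are hypothetical, nulls are represented as `None`, Distinct Value is assumed to mean the count of distinct non-null values, and the standard deviation and variance are assumed to be population statistics (`pstdev`, `pvariance`), since this section does not say whether the system uses the population or sample form.

```python
import statistics
from collections import Counter


def percent_null(values):
    """Percentage of entries that are None; None for an empty column."""
    if not values:
        return None
    return 100.0 * sum(v is None for v in values) / len(values)


def summarize_discrete(values):
    """Percent Null, Distinct Value, and Mode for a discrete attribute."""
    present = [v for v in values if v is not None]
    result = {"percent_null": percent_null(values), "distinct": None, "mode": None}
    if present:
        counts = Counter(present)
        result["distinct"] = len(counts)  # assumed: count of distinct values
        result["mode"] = counts.most_common(1)[0][0]
    return result


def summarize_numeric(values):
    """Percent Null plus the numeric statistics; None where unavailable."""
    present = [v for v in values if v is not None]
    names = ("average", "median", "minimum", "maximum", "std_dev", "variance")
    result = {"percent_null": percent_null(values), **dict.fromkeys(names)}
    if present:
        result.update(
            average=statistics.mean(present),
            median=statistics.median(present),
            minimum=min(present),
            maximum=max(present),
            std_dev=statistics.pstdev(present),     # population form (assumed)
            variance=statistics.pvariance(present),  # population form (assumed)
        )
    return result


# Hypothetical sample data for illustration only.
print(summarize_numeric([10, 20, None, 30]))
print(summarize_discrete(["a", "b", "a", None]))
```

Entries left as `None` correspond to the empty cells that the table describes when no value is available.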