12 Row Importance

Row importance is an unsupervised technique for preprocessing data before building models with other machine learning techniques.

12.1 About Row Importance

Row importance captures the influence of the rows or cases in a data set.

The row importance technique is used for dimensionality reduction of large data sets. It identifies the most influential rows of the data matrix and ranks them by importance score. A row's importance is determined by its statistical leverage score: rows with high leverage scores are considered important. In CUR matrix decomposition, row importance is often combined with column (attribute) importance. Row importance can serve as a data preprocessing step before building regression, classification, and clustering models.
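For orientation, one common textbook definition of a row leverage score is sketched below. It is an illustration, not necessarily the exact normalization used by any particular implementation. The symbols A (an m-by-n data matrix), k (the chosen rank), U_k (the first k left singular vectors), and l_i (the score of row i) are introduced here and do not appear in the original text.

```latex
A \approx U_k \Sigma_k V_k^{\top}, \qquad
\ell_i = \frac{1}{k} \sum_{j=1}^{k} \left( U_k \right)_{ij}^{2}, \qquad
\sum_{i=1}^{m} \ell_i = 1
```

Because the columns of U_k are orthonormal, the scores sum to 1 under this normalization, so each score can be read as the share of the dominant subspace attributable to one row.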

12.2 Selecting Important Rows

Rows are ranked by their importance scores. A row's importance is determined by its statistical leverage score: the higher the leverage score, the more important the row.

Important rows, that is, rows with high leverage scores, are reported with their names (as case_id), scores (as importance), and ranks (by importance).
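The shape of such a report can be sketched in a few lines of Python. This is only an illustration: the function name importance_report, the "rank" key, the sample scores, and the convention that rank 1 is the most important row are assumptions made here, not part of the product's interface.

```python
def importance_report(scores):
    """Build a ranked report from a mapping of case_id -> importance score.

    Assumption: rank 1 denotes the row with the highest importance score.
    """
    # Sort by score, highest first; ties keep their input order.
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        {"case_id": case_id, "importance": importance, "rank": position}
        for position, (case_id, importance) in enumerate(ordered, start=1)
    ]


# Hypothetical scores for three cases.
report = importance_report({"c1": 0.2, "c2": 0.7, "c3": 0.1})
for row in report:
    print(row["case_id"], row["importance"], row["rank"])
```

In this sample, case c2 has the highest score and therefore receives rank 1.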

12.3 Row Importance Algorithms

Oracle Machine Learning supports the CUR matrix decomposition algorithm for determining row and column (attribute) importance.

Popular algorithms for dimensionality reduction are Principal Component Analysis (PCA), Singular Value Decomposition (SVD), and CUR Matrix Decomposition. All these algorithms apply low-rank matrix decomposition.

In CUR matrix decomposition, the attributes include 2-dimensional numerical columns, levels of exploded 2D categorical columns, and attribute name or subname or value pairs for nested columns. To arrive at row importance or selection, the algorithm computes singular vectors, calculates leverage scores, and then selects rows. Row importance is performed only when users specify CURS_ROW_IMP_ENABLE for the CURS_ROW_IMPORTANCE parameter in the settings table and the case_id column is present. Row importance is not performed unless users explicitly enable it.
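The three steps named above (compute singular vectors, calculate leverage scores, select rows) can be illustrated with a minimal pure-Python sketch. It is not the product's implementation. As a simplification, it builds an orthonormal basis of the column space with Gram-Schmidt instead of a singular value decomposition; when all columns are kept and are linearly independent, this yields the same leverage scores as the full set of left singular vectors. It also leaves the scores unnormalized, so they sum to the number of columns. The function names, the sample matrix, and the case identifiers are hypothetical.

```python
import math


def leverage_scores(matrix):
    """Return one leverage score per row of a matrix with independent columns."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # columns[j][i] holds entry (i, j) of the matrix.
    columns = [[float(matrix[i][j]) for i in range(n_rows)] for j in range(n_cols)]
    basis = []  # orthonormal vectors spanning the column space
    for column in columns:
        v = column[:]
        # Modified Gram-Schmidt: remove the components along earlier basis vectors.
        for q in basis:
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        basis.append([a / norm for a in v])
    # Leverage of row i is the squared length of row i of the orthonormal basis.
    return [sum(q[i] ** 2 for q in basis) for i in range(n_rows)]


def select_rows(case_ids, matrix, top_n):
    """Return the top_n (case_id, score) pairs, highest leverage first."""
    scores = leverage_scores(matrix)
    ranked = sorted(zip(case_ids, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Hypothetical data: a constant column plus one numeric attribute.
case_ids = ["a", "b", "c", "d"]
data = [[1, 1], [1, 2], [1, 3], [1, 10]]
print(select_rows(case_ids, data, top_n=2))
```

In this sample, row d ranks first because its second value lies far from the others, which is the behavior leverage scores are meant to capture.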
