## 13 Row Importance

Use row importance as an unsupervised technique to preprocess data before model building with other machine learning techniques.

- About Row Importance

  Identify and rank influential rows in a data set using statistical leverage scores for dimensionality reduction.

- Row Importance Algorithms

  Oracle Machine Learning supports the CUR matrix decomposition algorithm to determine row and column (attribute) importance.


### 13.1 About Row Importance

Identify and rank influential rows in a data set using statistical leverage scores for dimensionality reduction.

Row importance captures the influence of the rows (cases) in a data set and is used for dimensionality reduction of large data sets. The technique identifies the most influential rows of the data matrix and ranks them by importance score. A row's importance is derived from its statistical leverage score: the higher the leverage score, the more important the row. In CUR matrix decomposition, row importance is often combined with column (attribute) importance. Row importance can serve as a data preprocessing step before building regression, classification, or clustering models.
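As a point of reference, the standard normalized definition of a row's statistical leverage score can be written as follows. The symbols are assumptions introduced here for illustration: $A$ is the $m \times n$ data matrix, $k$ is the chosen rank, and $u^{1}, \dots, u^{k}$ are the top-$k$ left singular vectors of $A$. The source does not state which exact formula or normalization Oracle Machine Learning uses.

$$
\ell_i = \frac{1}{k} \sum_{\xi=1}^{k} \left( u_i^{\xi} \right)^2, \qquad i = 1, \dots, m, \qquad \sum_{i=1}^{m} \ell_i = 1
$$

Because the singular vectors are orthonormal, the scores sum to 1, so each $\ell_i$ can be read as the share of the rank-$k$ structure that row $i$ accounts for.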


### 13.2 Row Importance Algorithms

Oracle Machine Learning supports the CUR matrix decomposition algorithm to determine row and column (attribute) importance.

Popular algorithms for dimensionality reduction are Principal Component Analysis (PCA), Singular Value Decomposition (SVD), and CUR Matrix Decomposition. All these algorithms apply low-rank matrix decomposition.
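The two low-rank forms can be contrasted in general notation. The symbols are assumptions introduced here for illustration: $A$ is an $m \times n$ data matrix, $k$ is the target rank, and $c$ and $r$ are the numbers of selected columns and rows.

$$
A \approx U_k \Sigma_k V_k^{\mathsf{T}} \quad \text{(truncated SVD; PCA applies the same idea to centered data)}
$$

$$
A \approx C\,U\,R, \qquad C \in \mathbb{R}^{m \times c}, \quad U \in \mathbb{R}^{c \times r}, \quad R \in \mathbb{R}^{r \times n}
$$

In the CUR form, $C$ consists of actual columns of $A$ and $R$ consists of actual rows of $A$, which is why the decomposition can be read directly as column (attribute) importance and row importance. The singular vectors in the SVD form are linear combinations of the original attributes and are harder to interpret.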

In CUR matrix decomposition, the attributes include 2-Dimensional numerical columns, levels of exploded 2D categorical columns, and attribute name or subname or value pairs for nested columns. To arrive at row importance or selection, the algorithm computes singular vectors, calculates leverage scores, and then selects rows. Row importance is performed when users specify `CURS_ROW_IMP_ENABLE` for the `CURS_ROW_IMPORTANCE` parameter in the settings table and the `case_id` column is present. Unless users explicitly enable it, row importance is not performed.
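The three steps named above (compute singular vectors, calculate leverage scores, select rows) can be sketched outside the database with NumPy. This is a minimal illustration, not Oracle's implementation: the function names `row_leverage_scores` and `select_rows`, the rank `k`, and the deterministic top-`n_rows` selection rule are all assumptions made for the example, and the in-database algorithm may select rows differently.

```python
import numpy as np


def row_leverage_scores(a, k):
    """Normalized leverage score of each row, from the top-k left singular vectors.

    Hypothetical helper for illustration; not part of any Oracle API.
    """
    # Step 1: compute singular vectors (thin SVD; columns of u are left singular vectors).
    u, _s, _vt = np.linalg.svd(a, full_matrices=False)
    u_k = u[:, :k]
    # Step 2: the leverage score is the squared row norm of u_k, divided by k
    # so that the scores sum to 1.
    return (u_k ** 2).sum(axis=1) / k


def select_rows(a, k, n_rows):
    """Step 3 (assumed rule): keep the n_rows rows with the highest scores."""
    scores = row_leverage_scores(a, k)
    order = np.argsort(-scores, kind="stable")[:n_rows]
    return order, scores[order]


# Toy data: three identical rows and one row that alone spans the second attribute.
data = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 5.0],
])

scores = row_leverage_scores(data, k=2)
top_idx, top_scores = select_rows(data, k=2, n_rows=1)
print("leverage scores:", scores)
print("most important row index:", int(top_idx[0]))
```

In this toy matrix the last row is the only one carrying information about the second attribute, so it receives half of the total leverage, while the three identical rows split the other half equally.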
