Statistical Leverage Score

Statistical leverage scores highlight the most representative columns or rows, aiding in the selection of important data points.

Leverage scores are statistics that determine which columns (or rows) are most representative with respect to a rank-k subspace of a matrix. The statistical leverage scores represent the column (or attribute) and row importance. For an m x n matrix, the normalized statistical leverage scores for all columns are computed from the top k right singular vectors as follows:

    \pi_j = \frac{1}{k} \sum_{\xi=1}^{k} (v_j^{\xi})^2

where v_j^{\xi} is the j-th entry of the \xi-th right singular vector, k is called the rank parameter, and j = 1,...,n. Given that \pi_j >= 0 and \sum_{j=1}^{n} \pi_j = 1, these scores form a probability distribution over the n columns.
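
As a rough illustration, the column scores can be sketched with NumPy. This assumes the usual definition, in which the j-th score is the sum of the squared j-th entries of the top k right singular vectors, divided by k. The function name column_leverage_scores and the random test matrix are hypothetical and chosen only for this example.

```python
import numpy as np

def column_leverage_scores(A, k):
    """Normalized leverage scores of the columns of A (hypothetical helper).

    Assumes 1 <= k <= min(A.shape).
    """
    # Thin SVD: the rows of Vt are the right singular vectors,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k, :]                       # top-k right singular vectors, shape (k, n)
    # Sum squared entries over the k vectors, then normalize by k.
    return np.sum(Vk ** 2, axis=0) / k

# Illustrative data only: a random 6 x 4 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
pi = column_leverage_scores(A, k=2)
```

Because each right singular vector has unit norm, the k squared norms add up to k, which is why dividing by k makes the scores sum to 1.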

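Similarly, the normalized statistical leverage scores for all rows are computed from the top k left singular vectors as:

    \ell_i = \frac{1}{k} \sum_{\xi=1}^{k} (u_i^{\xi})^2

where u_i^{\xi} is the i-th entry of the \xi-th left singular vector and i = 1,...,m.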
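
A minimal sketch of the row version follows, again with NumPy, together with one possible way of using the scores to pick representative rows. It assumes the row score is the sum of the squared i-th entries of the top k left singular vectors, divided by k. The helper name row_leverage_scores, the small example matrix, and the choice to keep the single highest-scoring row are all assumptions made for illustration.

```python
import numpy as np

def row_leverage_scores(A, k):
    """Normalized leverage scores of the rows of A (hypothetical helper).

    Assumes 1 <= k <= min(A.shape).
    """
    # Thin SVD: the columns of U are the left singular vectors,
    # ordered by decreasing singular value.
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    Uk = U[:, :k]                        # top-k left singular vectors, shape (m, k)
    # Sum squared entries over the k vectors, then normalize by k.
    return np.sum(Uk ** 2, axis=1) / k

# Illustrative 3 x 2 matrix whose dominant direction lies along the first row.
A = np.array([[3.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])
scores = row_leverage_scores(A, k=1)

# Index of the highest-scoring row (a deterministic selection rule,
# used here instead of sampling rows with probability equal to their score).
top_row = int(np.argmax(scores))
```

In this example the top left singular vector points along the first coordinate, so the first row carries all of the leverage for k = 1. Squaring the entries makes the result independent of the arbitrary sign of the singular vector.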