7.2.1 About the Exploratory Data Analysis Methods

OML4Py provides methods that enable you to perform exploratory data analysis.

The following table lists the methods of the OML4Py data type classes that perform common statistical operations and indicates which classes support each method.

Table 7-2 Data Exploration Methods Supported by Data Type Classes

| Method | Description | oml.Boolean | oml.Bytes | oml.Float | oml.String | oml.DataFrame |
|---|---|---|---|---|---|---|
| corr | Computes pairwise correlation between all columns in an oml.DataFrame where possible, given the type of coefficient. | No | No | No | No | Yes |
| count | Computes the number of elements that are not NULL in the series data object or in each column of an oml.DataFrame. | Yes | Yes | Yes | Yes | Yes |
| crosstab | Computes a cross-tabulation of two or more columns in an oml.DataFrame. | No | No | No | No | Yes |
| cumsum | Computes the cumulative sum after an oml.Float series data object is sorted, or of each float or Boolean column after an oml.DataFrame object is sorted. | No | No | Yes | No | Yes |
| describe | Computes descriptive statistics that summarize the central tendency, dispersion, and shape of an oml series data distribution, or of each column in an oml.DataFrame. | Yes | Yes | Yes | Yes | Yes |
| kurtosis | Computes the kurtosis of the values in an oml.Float series data object, or of each float column in an oml.DataFrame. | No | No | Yes | No | Yes |
| max | Returns the maximum value in a series data object or in each column of an oml.DataFrame. | Yes | Yes | Yes | Yes | Yes |
| mean | Computes the mean of the values in an oml.Float series data object, or of each float or Boolean column in an oml.DataFrame. | No | No | Yes | No | Yes |
| median | Computes the median of the values in an oml.Float series data object, or of each float column in an oml.DataFrame. | No | No | Yes | No | Yes |
| min | Returns the minimum value in a series data object or in each column of an oml.DataFrame. | Yes | Yes | Yes | Yes | Yes |
| nunique | Computes the number of unique values in a series data object or in each column of an oml.DataFrame. | Yes | Yes | Yes | Yes | Yes |
| pivot_table | Converts an oml.DataFrame to a spreadsheet-style pivot table. | No | No | No | No | Yes |
| sort_values | Sorts the values in a series data object or sorts the rows in an oml.DataFrame. | Yes | Yes | Yes | Yes | Yes |
| skew | Computes the skewness of the values in an oml.Float series data object, or of each float column in an oml.DataFrame. | No | No | Yes | No | Yes |
| std | Computes the standard deviation of the values in an oml.Float series data object, or of each float or Boolean column in an oml.DataFrame. | No | No | Yes | No | Yes |
| sum | Computes the sum of the values in an oml.Float series data object, or of each float or Boolean column in an oml.DataFrame. | No | No | Yes | No | Yes |
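
The computations that the table describes can be illustrated locally in plain Python, without a database connection. The following is a minimal sketch and is not OML4Py code: the sample data and the helper names count_not_null, nunique, sorted_cumsum, and crosstab are hypothetical, None stands in for NULL, and the sketch assumes that NULLs are excluded and that the standard deviation is the sample form. The actual OML4Py methods may handle NULLs, ordering, and defaults differently.

```python
from collections import Counter
from itertools import accumulate
from statistics import mean, median, stdev

# Hypothetical sample column; None plays the role of a database NULL.
values = [3.0, None, 1.0, 2.0, 2.0]


def count_not_null(column):
    """Number of elements that are not NULL."""
    return sum(1 for v in column if v is not None)


def nunique(column):
    """Number of distinct values (assumption: NULLs are excluded)."""
    return len({v for v in column if v is not None})


def sorted_cumsum(column):
    """Cumulative sum taken after sorting the non-NULL values ascending."""
    return list(accumulate(sorted(v for v in column if v is not None)))


def crosstab(rows, first, second):
    """Count how often each pair of values occurs in two named columns."""
    return Counter((row[first], row[second]) for row in rows)


# Descriptive statistics over the non-NULL values of the column.
present = [v for v in values if v is not None]
summary = {
    "count": count_not_null(values),
    "nunique": nunique(values),
    "min": min(present),
    "max": max(present),
    "sum": sum(present),
    "mean": mean(present),
    "median": median(present),
    "std": stdev(present),  # sample standard deviation (n - 1 denominator)
}

# Hypothetical two-column data set for the cross-tabulation.
rows = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "large"},
]
table = crosstab(rows, "color", "size")

print(summary)
print(sorted_cumsum(values))
print(dict(table))
```

With an OML4Py proxy object, the corresponding operations would be invoked as methods on the object itself and evaluated in the database; consult the API reference for the exact parameters of each method.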