4.2.1 About the Exploratory Data Analysis Methods

OML4Py provides methods that enable you to perform exploratory data analysis.

The following table lists the methods of OML4Py data type classes that perform common statistical operations and indicates which classes support each method.

Table 4-2 Data Exploration Methods Supported by Data Type Classes

Column headings omit the oml. prefix; for example, Boolean is oml.Boolean.

Method       Boolean  Bytes  Float  String  DataFrame  Datetime  Timedelta  Timezone  Integer
corr         No       No     No     No      Yes        No        No         No        No
count        Yes      Yes    Yes    Yes     Yes        Yes       Yes        Yes       Yes
crosstab     No       No     No     No      Yes        No        No         No        No
cumsum       No       No     Yes    No      Yes        No        No         No        Yes
describe     Yes      Yes    Yes    Yes     Yes        Yes       Yes        Yes       Yes
kurtosis     No       No     Yes    No      Yes        No        No         No        Yes
max          Yes      Yes    Yes    Yes     Yes        Yes       Yes        Yes       Yes
mean         No       No     Yes    No      Yes        No        No         No        Yes
median       No       No     Yes    No      Yes        No        No         No        Yes
min          Yes      Yes    Yes    Yes     Yes        Yes       Yes        Yes       Yes
nunique      Yes      Yes    Yes    Yes     Yes        Yes       Yes        Yes       Yes
pivot_table  No       No     No     No      Yes        No        No         No        No
sort_values  Yes      Yes    Yes    Yes     Yes        Yes       Yes        Yes       Yes
skew         No       No     Yes    No      Yes        No        No         No        Yes
std          No       No     Yes    No      Yes        No        No         No        Yes
sum          No       No     Yes    No      Yes        No        No         No        Yes

Method descriptions:

corr
    Computes pairwise correlation between all columns in an oml.DataFrame where possible, given the type of coefficient.

count
    Computes the number of elements that are not NULL in the series data object or in each column of an oml.DataFrame.

crosstab
    Computes a cross-tabulation of two or more columns in an oml.DataFrame.

cumsum
    Computes the cumulative sum of an oml.Float series data object after it is sorted, or of each float or Boolean column of an oml.DataFrame after it is sorted.

describe
    Computes descriptive statistics that summarize the central tendency, dispersion, and shape of an oml series data distribution, or of each column in an oml.DataFrame.

kurtosis
    Computes the kurtosis of the values in an oml.Float series data object, or of each float column in an oml.DataFrame.

max
    Returns the maximum value in a series data object or in each column in an oml.DataFrame.

mean
    Computes the mean of the values in an oml.Float series data object, or of each float or Boolean column in an oml.DataFrame.

median
    Computes the median of the values in an oml.Float series data object, or of each float column in an oml.DataFrame.

min
    Returns the minimum value in a series data object or in each column in an oml.DataFrame.

nunique
    Computes the number of unique values in a series data object or in each column of an oml.DataFrame.

pivot_table
    Converts an oml.DataFrame to a spreadsheet-style pivot table.

sort_values
    Sorts the values in a series data object or sorts the rows in an oml.DataFrame.

skew
    Computes the skewness of the values in an oml.Float series data object, or of each float column in an oml.DataFrame.

std
    Computes the standard deviation of the values in an oml.Float series data object, or of each float or Boolean column in an oml.DataFrame.

sum
    Computes the sum of the values in an oml.Float series data object, or of each float or Boolean column in an oml.DataFrame.
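
As a rough illustration of what several of the statistics in the table above compute, the following sketch uses only the Python standard library on a small local list. It is not OML4Py code and does not show how OML4Py computes its results in the database. The sample values, the use of None to stand in for NULL, the exclusion of NULL from the unique-value count, and the choice of the sample standard deviation are all assumptions made for illustration.

```python
import statistics
from itertools import accumulate

# Hypothetical sample column; None stands in for a database NULL.
values = [4.0, None, 1.0, 3.0, 3.0, None, 9.0]

# The count row describes counting only elements that are not NULL,
# so this sketch drops None before computing every statistic.
present = [v for v in values if v is not None]

summary = {
    "count": len(present),                  # number of non-NULL elements
    "nunique": len(set(present)),           # distinct values (NULL excluded: assumption)
    "min": min(present),
    "max": max(present),
    "sum": sum(present),
    "mean": statistics.mean(present),
    "median": statistics.median(present),
    "std": statistics.stdev(present),       # sample standard deviation (assumption)
}

# The cumsum row describes a running total taken after the values are sorted.
cumsum = list(accumulate(sorted(present)))

for name, value in summary.items():
    print(f"{name:8s} {value}")
print("cumsum  ", cumsum)
```

With an OML4Py proxy object, the corresponding operations are methods on the object itself, such as count, mean, or cumsum. Consult the API reference for each method's parameters and its handling of NULL values, which may differ from the assumptions in this sketch.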