6.3.3 Pull Data from the Database to a Local Python Session

Use the pull method of an oml proxy object to create a Python object in your local Python session.

The pull method of an oml proxy object returns a Python object of the corresponding type; for example, pulling an oml.DataFrame returns a pandas.DataFrame. The returned object contains a copy of the database data that the proxy object references. It exists in memory in your Python session, either in OML Notebooks or in your OML4Py client.

Note:

You can pull data to a local pandas.DataFrame only if the data fits in the memory of the local Python session. Even when very large data does fit in memory, there may not be enough memory left to run many, or any, Python functions on it in the local session.
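One way to act on this note is to check the row count in the database before pulling. The following is a minimal sketch, not part of OML4Py: pull_with_row_limit and its max_rows parameter are hypothetical names, and the sketch assumes only that the proxy object exposes shape, head(n), and pull(), as an oml.DataFrame proxy does. Because rows in a database table have no inherent order, the rows that head returns are not guaranteed to be the same on every call.

```python
def pull_with_row_limit(proxy, max_rows=100_000):
    """Pull a proxy object's data locally, capped at max_rows rows.

    Hypothetical helper: `proxy` is assumed to expose `shape`, `head(n)`,
    and `pull()`. Returns a tuple (local_object, truncated).
    """
    n_rows = proxy.shape[0]  # row count, obtained without pulling the data
    if n_rows <= max_rows:
        # Small enough: pull everything into local memory.
        return proxy.pull(), False
    # Too large: pull only the first max_rows rows.
    return proxy.head(max_rows).pull(), True
```

With the oml_iris proxy object from the example below, a call such as `iris_local, truncated = pull_with_row_limit(oml_iris, max_rows=1000)` would pull the whole table, because IRIS has only 150 rows. A suitable limit depends on the number and types of columns and on the memory available to the session.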

Example 6-9 Pulling Data into Local Memory

This example loads the iris data set and creates the IRIS database table and the oml_iris proxy object that references that table. It displays the type of the oml_iris object, then pulls the data from it to the iris object in local memory and displays its type.


import oml
from sklearn.datasets import load_iris
import pandas as pd

# Load the iris data set and create a pandas.DataFrame for it.
iris = load_iris()
x = pd.DataFrame(iris.data, columns = ['SEPAL_LENGTH', 'SEPAL_WIDTH', 'PETAL_LENGTH', 'PETAL_WIDTH'])
y = pd.DataFrame(list(map(lambda x: {0: 'setosa', 1: 'versicolor', 2: 'virginica'}[x], iris.target)), columns = ['SPECIES'])
iris_df = pd.concat([x, y], axis=1)

# Create the IRIS database table and the proxy object for the table.
oml_iris = oml.create(iris_df, table = 'IRIS')

# Display the data type of oml_iris.
type(oml_iris)

# Pull the data from oml_iris into local memory.
iris = oml_iris.pull()

# Display the data type of iris.
type(iris)

# Drop the IRIS database table.
oml.drop('IRIS')

Listing for This Example

>>> import oml
>>> from sklearn.datasets import load_iris
>>> import pandas as pd
>>>
>>> # Load the iris data set and create a pandas.DataFrame for it.
... iris = load_iris()
>>> x = pd.DataFrame(iris.data, columns = ['SEPAL_LENGTH', 'SEPAL_WIDTH', 'PETAL_LENGTH', 'PETAL_WIDTH'])
>>> y = pd.DataFrame(list(map(lambda x: {0: 'setosa', 1: 'versicolor', 2: 'virginica'}[x], iris.target)), columns = ['SPECIES'])
>>> iris_df = pd.concat([x, y], axis=1)
>>>
>>> # Create the IRIS database table and the proxy object for the table.
... oml_iris = oml.create(iris_df, table = 'IRIS')
>>>
>>> # Display the data type of oml_iris.
... type(oml_iris)
<class 'oml.core.frame.DataFrame'>
>>>
>>> # Pull the data from oml_iris into local memory.
... iris = oml_iris.pull()
>>>
>>> # Display the data type of iris.
... type(iris)
<class 'pandas.core.frame.DataFrame'>
>>>
>>> # Drop the IRIS database table.
... oml.drop('IRIS')