6.3.2 Push Local Python Data to the Database

Use the oml.push function to push data from your local Python session to a temporary table in your Oracle database schema.

The oml.push function creates a temporary table in the user’s database schema and inserts data into the table. It also creates and returns a corresponding proxy oml.DataFrame object that references the table in the Python session. The table exists as long as an oml object exists that references it, either in the Python session memory or in an OML4Py datastore.

The syntax of the oml.push function is the following:

oml.push(x, oranumber=True, dbtypes=None)

The x argument may be a pandas.DataFrame or a list of tuples of equal size that contain the data for the table. For a list of tuples, each tuple represents a row in the table and the column names are set to COL1, COL2, and so on.

The SQL data types of the columns are determined by the following:

  • OML4Py determines default column types by looking at 20 random rows sampled from the table. For tables with less than 20 rows, it uses all rows in determining the column type.

    If the values in a column are all None, or if a column has inconsistent data types that are not None in the sampled rows, then a default column type cannot be determined and a ValueError is raised unless a SQL type for the column is specified by the dbtypes argument.

  • For numeric columns, the oranumber argument, which is a bool, determines the SQL data type. If True (the default), then the SQL data type is NUMBER. If False, then the data type is BINARY_DOUBLE.

    If the data in x contains NaN values, then you should set oranumber to False.

  • For string columns, the default type is VARCHAR2(4000).

  • For bytes columns, the default type is BLOB.

With the dbtypes argument, you can specify the SQL data types for the table columns. The values of dbtypes may be either a dict that maps str to str values or a list of str values. For a dict, the keys are the names of the columns.

Example 6-8 Pushing Data to a Database Table

This example creates pd_df, a pandas.core.frame.DataFrame object with columns of various data types. It pushes pd_df to a temporary database table, which creates the oml_df object, which references the table. It then pulls the data from the oml_df object to the df object in local memory.

import oml
import pandas as pd

pd_df = pd.DataFrame({'numeric': [1, 1.4, -4, 3.145, 5, None],
                      'string' : [None, None, 'a', 'a', 'a', 'b'],
                      'bytes' : [b'a', b'b', b'c', b'c', b'd', b'e']})

# Push the data set to a database table with the specified dbtypes 
# for each column.
oml_df = oml.push(pd_df, dbtypes = {'numeric': 'BINARY_DOUBLE', 
                                    'string':'CHAR(1)', 
                                    'bytes':'RAW(1)'})

# Display the data type of oml_df.
type(oml_df)

# Pull the data from oml_df into local memory.
df = oml_df.pull()

# Display the data type of df.
type(df)

# Create a list of tuples. 
lst = [(1, None, b'a'), (1.4, None, b'b'), (-4, 'a', b'c'),
       (3.145, 'a', b'c'), (5, 'a', b'd'), (None, 'b', b'e')]
       
# Create an oml.DataFrame using the list.       
oml_df2 = oml.push(lst, dbtypes = ['BINARY_DOUBLE','CHAR(1)','RAW(1)'])

type(oml_df2)

Listing for This Example

>>> import oml
>>> import pandas as pd
>>>
>>> pd_df = pd.DataFrame({'numeric': [1, 1.4, -4, 3.145, 5, None],
...                       'string' : [None, None, 'a', 'a', 'a', 'b'],
...                       'bytes' : [b'a', b'b', b'c', b'c', b'd', b'e']})
>>> 
>>> # Push the data set to a database table with the specified dbtypes
... # for each column.
... oml_df = oml.push(pd_df, dbtypes = {'numeric': 'BINARY_DOUBLE',
...                                     'string':'CHAR(1)', 
...                                     'bytes':'RAW(1)'})
>>> 
>>> # Display the data type of oml_df.
... type(oml_df)
<class 'oml.core.frame.DataFrame'>
>>> 
>>> # Pull the data from oml_df into local memory.
... df = oml_df.pull()
>>> 
>>> # Display the data type of df.
... type(df)
<class 'pandas.core.frame.DataFrame'>
>>>
>>> # Create a list of tuples. 
... lst = [(1, None, b'a'), (1.4, None, b'b'), (-4, 'a', b'c'),
...        (3.145, 'a', b'c'), (5, 'a', b'd'), (None, 'b', b'e')]
>>> 
>>> # Create an oml.DataFrame using the list.       
... oml_df2 = oml.push(lst, dbtypes = ['BINARY_DOUBLE','CHAR(1)','RAW(1)'])
>>> 
>>> type(oml_df2)
<class 'oml.core.frame.DataFrame'>