10.2 Parallelism with OML4Py Embedded Python Execution

OML4Py embedded Python execution allows users to invoke user-defined functions from Python, SQL, and REST interfaces using Python engines spawned and controlled by the Oracle Autonomous Database environment.

The user-defined functions can be invoked in a data-parallel and task-parallel manner with multiple Python engines, with output formats including structured data, XML, JSON, and PNG images.

Oracle Autonomous Database provides different service levels to manage the load on the system by controlling the degree of parallelism jobs can use:

  • LOW - the default, with a maximum of 2 degrees of parallelism

  • MEDIUM - maximum of 4 degrees of parallelism, and allows greater concurrency for job processing

  • HIGH - maximum of 8 degrees of parallelism but significantly limits the number of concurrent jobs

Parallelism applies to:

  • oml.row_apply, oml.group_apply, and oml.index_apply using the Python API for embedded Python execution

  • pyqRowEval, pyqGroupEval, and pyqIndexEval using the SQL API for embedded Python execution

  • row-apply, group-apply, index-apply using the REST API for embedded Python execution

Note:

pyqIndexEval is available on Oracle Autonomous Database only.

Setting Parallelism Using Embedded Python Execution

For the ADB Python API for Embedded Python Execution:

The parallel parameter specifies the preferred degree of parallelism to use in the embedded Python execution job. The value may be one of the following:

  • A positive integer greater than or equal to 1 for a specific degree of parallelism
  • False, None, or 0 for no parallelism
  • True for the default data parallelism

Setting the argument parallel=True corresponds to the service level defined in the notebook interpreter. The argument parallel=x is capped by the service level. For instance, the MEDIUM service level allows at most 4 parallel engines, so selecting parallel=6 effectively results in parallel=4.
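The capping rule described above can be modeled in a few lines of plain Python. This is only an illustrative sketch and not part of OML4Py: the helper name effective_parallelism is hypothetical, the limits come from the service-level list in this section, and two points are assumptions made for illustration, namely that parallel=True resolves to the service-level maximum and that "no parallelism" means a single engine.

```python
# Maximum degree of parallelism per service level, as listed in this section.
SERVICE_LEVEL_MAX = {"LOW": 2, "MEDIUM": 4, "HIGH": 8}


def effective_parallelism(parallel, service_level="LOW"):
    """Hypothetical helper: model how `parallel` is capped by the service level."""
    cap = SERVICE_LEVEL_MAX[service_level.upper()]
    # Check True first: in Python, True == 1 and False == 0.
    if parallel is True:
        # Assumption: the service-level default is modeled as its maximum.
        return cap
    if parallel is None or parallel is False or parallel == 0:
        # Assumption: "no parallelism" is modeled as a single engine.
        return 1
    if not isinstance(parallel, int) or parallel < 1:
        raise ValueError("parallel must be a bool, None, or a non-negative integer")
    # A specific request is limited by the service-level cap.
    return min(parallel, cap)


print(effective_parallelism(6, "MEDIUM"))  # prints 4
```

With the MEDIUM service level, a request for 6 engines is reduced to 4, while a request for 2 engines is left unchanged.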

For the ADB SQL API for Embedded Python Execution:

The arguments oml_parallel_flag and oml_service_level are used together to enable data-parallelism and task-parallelism. For more information, see Special Control Arguments (Autonomous Database).
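As a rough illustration, the two control arguments can be assembled into a JSON parameter string. The sketch below assumes that the SQL API accepts control arguments as keys of a JSON parameter list, as described in Special Control Arguments (Autonomous Database). The helper name build_par_lst is hypothetical, and the set of valid service names is taken from the service-level list in this section.

```python
import json

# Service levels listed in this section.
VALID_SERVICE_LEVELS = {"LOW", "MEDIUM", "HIGH"}


def build_par_lst(service_level, parallel=True, **extra):
    """Hypothetical helper: build a JSON parameter string with both control arguments."""
    if service_level not in VALID_SERVICE_LEVELS:
        raise ValueError(f"unknown service level: {service_level!r}")
    params = dict(extra)  # any additional user-function arguments
    params["oml_parallel_flag"] = bool(parallel)
    params["oml_service_level"] = service_level
    return json.dumps(params)


par_lst = build_par_lst("MEDIUM")
print(par_lst)
```

The resulting string contains both arguments, reflecting that they are intended to be used together rather than individually.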

For the ADB REST API for Embedded Python Execution:

When executing a REST API Embedded Python Execution function, the service argument allows you to select the Autonomous Database service level to be used. For example, the following request data sets parallelFlag to true to use database parallelism with the MEDIUM service:

-d '{"parallelFlag":true,"service":"MEDIUM"}'

For more information, see Specify a Service Level.
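When the request is issued from a script rather than typed by hand, the same request data can be generated programmatically. The following is a minimal sketch using only the Python standard library. The helper name build_request_body is hypothetical, and no endpoint URL or authentication is shown because neither is specified in this section.

```python
import json

# Service levels listed in this section.
VALID_SERVICES = ("LOW", "MEDIUM", "HIGH")


def build_request_body(service="MEDIUM", parallel_flag=True):
    """Hypothetical helper: serialize the parallelFlag/service request data."""
    if service not in VALID_SERVICES:
        raise ValueError(f"unknown service level: {service!r}")
    # Compact separators reproduce the formatting of the -d example above.
    return json.dumps(
        {"parallelFlag": bool(parallel_flag), "service": service},
        separators=(",", ":"),
    )


body = build_request_body()
print(body)  # prints {"parallelFlag":true,"service":"MEDIUM"}
```

Serializing with json.dumps rather than hand-writing the string ensures that the Python value True is emitted as the JSON literal true.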