3 Common Installation Tasks for Oracle Monetization Suite AI Services
Learn about the common tasks required to deploy data, training, and prediction services for artificial intelligence (AI) and machine learning (ML) in Oracle Cloud Infrastructure (OCI).
Topics in this document:
Overview of Installation Tasks
To install and deploy the data service, training service, and prediction service, you must perform some common installation tasks. These tasks are required only if you plan to deploy the services in the OCI environment; they prepare the environment for deployment.
The following is the list of high-level installation tasks:
-
Create a bucket in OCI Object Storage to store configuration files and resources.
-
Create a private endpoint in OCI Data Flow.
-
Generate and download public and private API keys.
See "Generating API Keys".
-
Create and verify the archive.zip deployment file.
-
Create any required Python scripts to run in OCI Data Flow.
Note:
Complete this step only if you plan to use OCI Data Science for job creation.
-
Upload configuration and dependency files to OCI Object Storage.
-
Create a Data Flow application in OCI.
-
Create jobs in OCI Data Science as needed.
See "Setting Up a Data Science Job".
Note:
Complete this step only if you plan to use OCI Data Science for job creation.
-
Set up an Oracle Identity Cloud Service (IDCS) instance to manage authentication.
See "Federating with Identity Providers" in the OCI documentation.
Note:
Using IDCS is optional but highly recommended for handling authorization and authentication. It is configurable, and all the provided services are certified with it.
Creating Object Storage Buckets in OCI
You need to create buckets in OCI Object Storage to store various configuration files, API keys, scripts, and other resources.
To create a bucket in OCI Object Storage:
-
From the Navigation menu, select Storage.
The Storage window appears on the right pane.
-
Click Buckets.
The Buckets page appears.
-
Click Create Bucket.
The Create Bucket page appears.
-
Enter a name for your bucket and use the default values for other settings.
-
Click Create Bucket.
Creating Data Flow Private Endpoints
To create a private endpoint in OCI Data Flow:
-
From the Navigation menu, select Analytics & AI.
The Analytics & AI window appears on the right pane.
-
Select Data Flow.
The Data Flow page appears.
-
From the left navigation pane, select Private Endpoints.
The Private Endpoints page appears.
-
Click Create private endpoint.
The Create private endpoint page appears.
-
Enter a name in the Name field.
-
From the lists, select the VCN and Subnet if you plan to use a private network. For a public network, you can leave these fields empty.
-
In the DNS zones to resolve field, add the DNS zones whose servers need access.
Fill in additional fields as needed based on your requirements.
-
Click Create.
Generating API Keys
To generate and download API keys:
-
Click your profile in the OCI Console and select User settings.
The User settings page appears.
-
In the left panel, select API Keys.
The API Keys page appears.
-
Click Add API Keys.
-
Download both the Private Key and Public Key files.
Note:
The Private Key file is used for all subsequent authentication to OCI.
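The downloaded keys are typically referenced from an OCI configuration file (for example, ~/.oci/config). The following is a sketch with placeholder values; substitute your own OCIDs, fingerprint, key path, and region:

```ini
[DEFAULT]
user=ocid1.user.oc1..<unique_ID>
fingerprint=<your_key_fingerprint>
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..<unique_ID>
region=us-phoenix-1
```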
Creating the Archive.zip File
To prepare the archive.zip deployment package:
-
Pull the dependency packer Docker image:
docker pull phx.ocir.io/axmemlgtri2a/dataflow/dependency-packager-linux_arm64_v8:latest
Note:
You can also use Podman, based on your preference.
-
Create a requirements.txt file listing required Python dependencies. For this use case, include oci and pandas.
-
Place required JAR files and configuration files in the same directory as requirements.txt.
-
Run the following command (using your Python version) to create archive.zip:
docker run --platform linux/amd64 --rm -v $(pwd):/opt/dataflow --pull always -it phx.ocir.io/axmemlgtri2a/dataflow/dependency-packager-linux_x86_64:latest -p 3.11
-
Verify that archive.zip includes ojdbc11.jar and the configuration file.
-
Update the configuration file with your User OCID, Tenancy, Region, and other required values.
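The verification step above can also be scripted. The following is a minimal sketch using Python's standard zipfile module; the file names passed in are examples, not fixed names:

```python
import zipfile

def missing_from_archive(archive_path, required_names):
    """Return the required entries that are absent from the zip archive."""
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()
    # Entries may sit in subdirectories inside the archive, so match by suffix.
    return [r for r in required_names if not any(n.endswith(r) for n in names)]

# Example: missing_from_archive("archive.zip", ["ojdbc11.jar"]) returns []
# when ojdbc11.jar is present anywhere in the archive.
```

An empty result means every required file was found; any names returned are missing and must be added before you upload the archive.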
Uploading Files to Object Storage
To upload files to your Object Storage bucket:
-
In the OCI Console, search for Buckets and select Buckets from the Services results.
The Buckets page appears with a list of available buckets.
-
Select your bucket.
-
Click Upload objects.
-
In the upload wizard:
-
In the Select files section:
-
(Optional) In Object name prefix, enter a prefix to prepend to each uploaded file name.
To organize files in folders or subfolders, start the prefix with a slash (“/”). For example, /myfiles stores files in a myfiles folder.
-
Select Standard for the storage tier field.
-
Drag your file to the upload area or click to select it.
-
Click Next.
-
-
In the Review files section:
-
Confirm the details and files to upload.
-
Click Next.
-
-
In the Upload files section:
-
Click Upload objects.
-
Wait for the upload to complete.
-
Click Close when the process finishes.
-
Verify that your file appears in the Objects list for the bucket.
-
-
Creating a Data Flow Application
To create an application in OCI Data Flow, perform the following steps:
-
Go to the OCI Console.
-
From the Navigation menu, select Analytics & AI.
The Analytics & AI window appears on the right pane.
-
In the Analytics & AI window, select Data Flow.
The Data Flow page appears.
-
Click Create Application.
The Create application page appears.
-
From the Spark version list, select the required version.
Note:
Oracle recommends using the latest version of Spark for best results.
-
In the Driver Shape and Executor Shape fields, select the required shape or hardware template. It must have a minimum of 2 OCPUs and 32 GB of memory.
-
In the Number of Executors field, select 1 or more.
-
In the Language field, select Python.
-
In Select a file, select your bucket and the dataflow.py file.
Note:
You can locate the dataflow.py file in the artifacts provided with the package and you can configure this file as per your requirements.
-
In the Archive URI section, select your bucket and the archive.zip file.
See "Creating the Archive.zip File" for more information.
-
In Application log location, select your bucket for log storage.
-
Click Create.
The data flow application is created.
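The console fields above can be summarized as a plain mapping for reference. This is an illustrative sketch, not an SDK call; the shape names, Spark version, and URIs below are placeholders, not required values:

```python
# Illustrative summary of the Data Flow application settings entered in the console.
application = {
    "spark_version": "3.5.0",                # example; use the latest available version
    "driver_shape": "VM.Standard.E4.Flex",   # example shape with at least 2 OCPUs, 32 GB
    "executor_shape": "VM.Standard.E4.Flex",
    "num_executors": 1,                      # 1 or more
    "language": "PYTHON",
    "file_uri": "oci://<bucket>@<namespace>/dataflow.py",
    "archive_uri": "oci://<bucket>@<namespace>/archive.zip",
    "logs_bucket_uri": "oci://<bucket>@<namespace>/",
}
```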
Setting Up a Data Science Job
Note:
The steps for creating training jobs may vary. For details, see "Deploying the Training Service in OCI and Non-OCI Environments".
To create and set up a job in OCI Data Science, perform the following tasks:
-
Go to the OCI Console.
-
From the Navigation menu, select Analytics & AI.
The Analytics & AI window appears on the right pane.
-
Select Data Science.
The Data Science page appears.
-
Click Create project.
The Create project page appears.
-
Enter the required Name and Description for your project, then click Create.
-
Upload the oci_config, private_key, and data_flow_config files to your object storage bucket.
Note:
You can locate these files in the artifacts provided with the package and you can configure them as per your requirements.
For more information, see "Uploading Files to Object Storage".
-
Go to your project, select Jobs, and click Create job.
-
In the Basic Information section:
-
Select Single Node.
-
Choose the appropriate compartment.
-
Enter the required Name and Description for your job.
-
-
In the Configuration section, set the following environment variables:
-
JOB_RUN_ENTRYPOINT: controller.py
-
CONDA_ENV_SLUG: python_p312_any_x86_64_v1 (for the data service) or tensorflow216_p310_gpu_v1 (for Training Service)
-
CONDA_ENV_TYPE: service
-
-
In the Compute Shape section, select the required shape.
Note:
You must choose a compute shape with at least 1 CPU core and 16 GB of RAM.
-
In the Storage section, enter the required storage size.
Note:
You need at least 50 GB of storage.
-
In the Networking section, select the appropriate VCN and subnet configurations if you plan to use a private network. For a public network, you can leave these fields empty.
-
In the Upload job artifact section, select the collect_data_from_dataflow.zip file.
Note:
You can locate the collect_data_from_dataflow.zip file in the artifacts provided with the package.
-
In the Additional Configuration section, specify the log group and logs as required.
-
In the Storage Mount section, click + Add object storage mount.
-
Select the correct Compartment and Bucket from the list.
-
Leave the Object name prefix blank.
-
In the Destination path and directory field, enter /home.
-
-
Click Create.
The data science job is created.
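For reference, the environment variables from the Configuration step above can be collected as follows. The data-service values are shown; swap the conda slug for the training service:

```python
# Environment variables for the Data Science job, as set in the Configuration section.
job_environment = {
    "JOB_RUN_ENTRYPOINT": "controller.py",
    "CONDA_ENV_SLUG": "python_p312_any_x86_64_v1",  # tensorflow216_p310_gpu_v1 for the training service
    "CONDA_ENV_TYPE": "service",
}
```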