Deploying and Configuring the Training Service

Learn how to deploy and configure the training service for artificial intelligence (AI) and machine learning (ML) in Oracle Monetization Suite within Oracle Cloud Infrastructure (OCI) and non-OCI environments.

Deploying the Training Service in OCI and Non-OCI Environments

You can deploy the training service either in Oracle Cloud Infrastructure or in a non-OCI environment according to your requirements.

For OCI deployments:

Complete all common installation tasks described in "Overview of Installation Tasks".
When uploading files to Object Storage, also upload the following:
- OCI configuration file
- Private key file
- Preprocessing script (Optional)
Note:

A sample of each of these files is available in the artifacts provided with the deployment package.
When setting up data science jobs, create the following five jobs:
- Data Fetch Job
  
  Set JOB_RUN_ENTRYPOINT to entry_file.py
  
  Job artifact: data_fetch_from_os.zip
- Preprocess Job
  
  Set JOB_RUN_ENTRYPOINT to preprocess_entry.py
  
  Job artifact: preprocess.zip
- Train Job
  
  Set JOB_RUN_ENTRYPOINT to train_entry.py
  
  Job artifact: train.zip
- Artifact Job
  
  Set JOB_RUN_ENTRYPOINT to artifact_entry.py
  
  Job artifact: artifact.zip
- Deploy Job
  
  Set JOB_RUN_ENTRYPOINT to deployment_entry.py
  
  Job artifact: deploy.zip
Note:

Oracle provides all of these zip files in the artifacts included with the deployment package.
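The job settings listed above can be kept in one place and looked up when you create each job. The following is a minimal sketch: the job names, entry points, and artifact names come from the list above, while the dictionary layout and the job_settings function are hypothetical and not part of the deployment package.

```python
# Hypothetical helper: the dictionary layout and function name are illustrative only.
# Each entry maps a job name to (JOB_RUN_ENTRYPOINT value, job artifact).
JOBS = {
    "Data Fetch Job": ("entry_file.py", "data_fetch_from_os.zip"),
    "Preprocess Job": ("preprocess_entry.py", "preprocess.zip"),
    "Train Job": ("train_entry.py", "train.zip"),
    "Artifact Job": ("artifact_entry.py", "artifact.zip"),
    "Deploy Job": ("deployment_entry.py", "deploy.zip"),
}


def job_settings(name):
    """Return the environment variable and artifact to use for one job."""
    entrypoint, artifact = JOBS[name]
    return {
        "environment": {"JOB_RUN_ENTRYPOINT": entrypoint},
        "artifact": artifact,
    }


if __name__ == "__main__":
    # Print a checklist of the five jobs and their settings.
    for name in JOBS:
        settings = job_settings(name)
        entrypoint = settings["environment"]["JOB_RUN_ENTRYPOINT"]
        print(f"{name}: JOB_RUN_ENTRYPOINT={entrypoint}, artifact={settings['artifact']}")
```

Running the script prints one line per job, which you can use as a checklist while creating the jobs.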

For non-OCI deployments, you only need to install the Helm charts for the training service, as described below.

To install the training service using Helm charts:

Download the training utility service Helm charts from the deployment package.

For more information, see "Downloading Packages for the Cloud Native Helm Charts and Docker Files" and "Setting Up Prerequisite Software and Tools".

Generate the SSL certificate and private key:

openssl req -newkey rsa:2048 -nodes -keyout privateKeyName.key -x509 -days 365 -out certificateName.crt

Create a Kubernetes TLS Secret using your certificate and private key:

kubectl create secret tls secretName --cert=certificateName.crt --key=privateKeyName.key

Install the Ingress NGINX controller:

Add the ingress-nginx Helm repository:
```
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
```
Update the Helm repository:
```
helm repo update
```

Install the Ingress Controller:

helm install ingress-nginx ingress-nginx/ingress-nginx --namespace nginxNamespace --create-namespace

Attach the service to the NGINX Controller. Because the ingress-nginx release already exists from the previous step, use helm upgrade --install to apply the HTTPS settings:

helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx --namespace nginxNamespace \
  --set controller.service.enableHttp=false \
  --set controller.service.enableHttps=true \
  --set controller.service.ports.https=443 \
  --set controller.service.nodePorts.https=31231 \
  --set controller.config.ssl-redirect=true \
  --set controller.config.force-ssl-redirect=true \
  --set controller.ingressClassResource.name=ingressClassName \
  --set controller.ingressClass=ingressClassName \
  --create-namespace
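If you script the deployment, the same flags can be assembled from a dictionary so that the namespace, ingress class name, and node port are set in one place. This is a minimal sketch, not part of the deployment package: the ingress_helm_command function is hypothetical, and it assumes helm upgrade --install so that the command can be rerun against an existing release.

```python
import shlex


def ingress_helm_command(namespace, ingress_class, https_node_port):
    """Build the Helm argument list for the HTTPS-only ingress-nginx settings."""
    settings = {
        "controller.service.enableHttp": "false",
        "controller.service.enableHttps": "true",
        "controller.service.ports.https": "443",
        "controller.service.nodePorts.https": str(https_node_port),
        "controller.config.ssl-redirect": "true",
        "controller.config.force-ssl-redirect": "true",
        "controller.ingressClassResource.name": ingress_class,
        "controller.ingressClass": ingress_class,
    }
    args = [
        "helm", "upgrade", "--install",
        "ingress-nginx", "ingress-nginx/ingress-nginx",
        "--namespace", namespace, "--create-namespace",
    ]
    # Each setting becomes one "--set key=value" pair.
    for key, value in settings.items():
        args += ["--set", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder values; replace them with your own namespace, class name, and port.
    command = ingress_helm_command("nginxNamespace", "ingressClassName", 31231)
    print(shlex.join(command))
```

The script only prints the command. To run it, pass the returned list to subprocess.run.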

Update the values.yaml file for your deployment:
1. Set the following mount paths:
  - modelFilesPath: Mount path for model files. This is required for both OCI and non-OCI deployments.
  - dataFilesPath: Mount path for data files if storage_type is pvc.
  - logFilesPath: Mount path for log files. If not set, it defaults to modelFilesPath.
  - artifactsPath: Mount path for Python scripts and configuration files, or leave blank if these files are in the model files directory.
  - Create separate PV and PVC as needed for each path (train-artifacts-pvc, train-logs-pvc, data-storage-pvc).
    
    Note:
    These paths exist inside the container and do not need to match the host path.
  - Set folder permissions:
```
groupadd -g 10001 oracle
useradd -mr -u 10001 -g oracle oracle
chown 10001:10001 -R mountPath
```
2. Set the following configuration parameters:
  - imageRepository
  - IMAGE_TAG for both trainUtilityOrchestrator.image and trainUtilityPredictor.image
3. Under trainUtilityPredictor.configurableFiles, specify where each of the following files is located in the mount path:
  - preprocessScript: Path to the Python preprocessing script
  - ociConfigFile: Path to the configuration file with OCI connection details
  - datascienceConfig: Path to the configuration for data science jobs
  - labelFile: Path to the label file for DNN model training
  - logConfigFile: Path to the log configuration
  Note:
  
  A sample of each file is available in the artifacts provided with the package: sample_preprocess.py (for preprocessScript), oci_config (for ociConfigFile), config.json (for datascienceConfig), sample_label.csv (for labelFile), and logging_config.py (for logConfigFile). You can customize these files to suit your requirements.
4. Set the host value to your deployment host's name.
5. Set tlsSecretName to the value used when creating the Kubernetes TLS secret.
6. Set ingressClassName to your ingress class name.
7. Set identityURI, clientID, and clientSecret to the values from your Oracle Identity Cloud Service (IDCS) configuration. To disable IDCS, set trainUtilityOrchestrator.idcs.enabled to false.
8. Set serviceMonitor.serviceNamespace to the namespace where you have deployed the services.
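The rules in the preceding steps can be checked before you install the charts. The following is a minimal sketch under stated assumptions: it takes the settings as a flat dictionary rather than parsing values.yaml (the real file nests some of these keys), it covers only a subset of the settings, and the check_values function is hypothetical.

```python
def check_values(values, storage_type="pvc", idcs_enabled=True):
    """Return (resolved, missing) for a flat dict of values.yaml settings.

    Assumption: keys are flat names such as "modelFilesPath"; the actual
    values.yaml file may nest them under other keys.
    """
    resolved = dict(values)

    # logFilesPath defaults to modelFilesPath when it is not set.
    if not resolved.get("logFilesPath"):
        resolved["logFilesPath"] = resolved.get("modelFilesPath", "")

    required = ["modelFilesPath", "imageRepository", "host",
                "tlsSecretName", "ingressClassName"]
    # dataFilesPath applies only when storage_type is pvc.
    if storage_type == "pvc":
        required.append("dataFilesPath")
    # The IDCS settings apply only when IDCS is enabled.
    if idcs_enabled:
        required += ["identityURI", "clientID", "clientSecret"]

    missing = [key for key in required if not resolved.get(key)]
    return resolved, missing


if __name__ == "__main__":
    # Placeholder values for illustration only.
    resolved, missing = check_values(
        {"modelFilesPath": "/models", "imageRepository": "imageRepository",
         "host": "hostName", "tlsSecretName": "secretName",
         "ingressClassName": "ingressClassName"},
        idcs_enabled=False,
    )
    print("logFilesPath:", resolved["logFilesPath"])
    print("missing:", missing)
```

In this example, dataFilesPath is reported as missing because storage_type defaults to pvc in the sketch.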
Create the StorageClass (SC), PersistentVolume (PV), and PersistentVolumeClaim (PVC) resources in your namespace. Before you apply the files, update spec.hostPath.path in pv-template.yaml to the host system path that should be mounted:
```
kubectl apply -f helm/sc.yaml
kubectl apply -f helm/pv-template.yaml
kubectl apply -f helm/pvcTemplate.yaml
```
Update the helm/cgiutrainingutility/templates/deployment.yaml file as required, especially the spec.template.spec.volumes section.

Install the Helm charts:

helm upgrade --install training-utility-services helm/cgiutrainingutility/ --namespace=training-utility-service

Configuring and Using the Training Service

You interact with the training service through REST APIs, which enable integration with external systems.

For the training service, use the following REST API:

Train API (/utility/train)

Use this API to train a model based on the data acquired by the data service and the configured parameters.

For more information, see "About the REST APIs".
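As an illustration of calling the Train API from a client, the following minimal sketch builds a request to /utility/train with the Python standard library. Only the endpoint path comes from this document. The POST method, the JSON body, the payload fields, the bearer-token header, and the host name are assumptions, so check "About the REST APIs" for the actual request format.

```python
import json
import urllib.request


def build_train_request(host, payload, access_token=None):
    """Build (but do not send) a request to the Train API.

    Assumptions: HTTPS, POST, a JSON body, and an optional bearer token.
    """
    url = f"https://{host}/utility/train"
    headers = {"Content-Type": "application/json"}
    if access_token:
        headers["Authorization"] = f"Bearer {access_token}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host, payload field, and token, for illustration only.
    request = build_train_request(
        "training.example.com", {"modelName": "example-model"}, "accessToken"
    )
    print(request.get_method(), request.full_url)
    # To send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it lets you inspect the URL and headers before contacting the service.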