Creating Jobs

Before you begin: Ensure that you have created the necessary policies, authentication, and authorization for your jobs.

You can create and run jobs using the ADS SDK, OCI SDK, or the OCI Console.

When creating jobs, you can use fast launch enabled Compute shapes when they are available in your region. Fast launch starts the job in the fastest way possible.

Using the ADS SDK

The ADS SDK is a publicly available Python library that you can install with this command:

pip install oracle-ads

It provides a wrapper that makes it easy to create and run jobs from notebooks or your client machine.

You can use the ADS SDK to create and run jobs.
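
For example, this minimal sketch creates and runs a job with the ADS jobs API. The OCIDs, job name, script file, and shape are placeholders that you replace with your own values:

from ads.jobs import Job, DataScienceJob, ScriptRuntime

# Infrastructure: where the job runs and on what shape.
infrastructure = (
    DataScienceJob()
    .with_compartment_id("<compartment_ocid>")
    .with_project_id("<project_ocid>")
    .with_shape_name("VM.Standard2.1")
    .with_block_storage_size(50)  # in GB, between 50 and 10,240
)

# Runtime: the artifact to run and its environment variables.
runtime = (
    ScriptRuntime()
    .with_source("job_script.py")  # placeholder for your job artifact
    .with_environment_variable(SOME_ENV_KEY="some_env_value")
)

job = (
    Job(name="<job_name>")
    .with_infrastructure(infrastructure)
    .with_runtime(runtime)
)
job.create()         # creates the job
job_run = job.run()  # starts a job run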

Tip

You can use the ListFastLaunchJobConfigs API to retrieve the configurations that are fast launch capable.
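
For example, a minimal sketch using the OCI Python SDK, assuming a default configuration file at ~/.oci/config; the compartment OCID is a placeholder:

import oci

config = oci.config.from_file()  # reads the DEFAULT profile from ~/.oci/config
client = oci.data_science.DataScienceClient(config)

# List the Compute shape configurations that support fast launch.
response = client.list_fast_launch_job_configs(compartment_id="<compartment_ocid>")
for fast_launch_config in response.data:
    print(fast_launch_config.shape_name)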

Using the Console

  1. Log in to your tenancy using the Console with the necessary policies.
  2. Open the navigation menu and click Analytics & AI. Under Machine Learning, click Data Science.
  3. Select the compartment that contains the project you want to use.
  4. Click the name of a project.
  5. Click Jobs.
  6. Click Create job.
  7. (Optional) Select a different compartment for the job.
  8. (Optional) Enter a unique name and description for the job (limit of 255 characters). If you don't provide a name, a name is automatically generated for you.

    For example:

    job20210808222435
  9. Use one of the following options:

    Bring your own container

    Select this option to use your own container, and then specify the container image using custom environment variable keys.

    Upload job artifact

    Drag and drop your job artifact file into the box, or click select a file to navigate to it for selection.

    A container image or a job artifact is required for a job.

  10. (Optional) Create a default job configuration that is used when the job is run. Enter or select any of the following:

    Custom environment variable key

    Environment variables control the job. When you use a container image, you must specify CONTAINER_CUSTOM_IMAGE; the other options are optional.

    Note

    If you uploaded a ZIP or compressed tar file, add JOB_RUN_ENTRYPOINT as a custom environment variable that points to the entry file in the artifact (see the sketch after these steps).

    Value

    Value for your custom environment variable key.

    You can click Additional custom environment variables to specify more variables.

    Command line arguments

    The command line arguments that you want to use for running the job.

    Maximum runtime (in minutes)

    The maximum number of minutes that the job can run. The service cancels the job run if its runtime exceeds the specified value. The maximum runtime is 30 days. We recommend that you configure a maximum runtime on all job runs to prevent runaway job runs.

  11. (Optional) Click Select to return to the job creation page.
  12. Select a Compute shape by clicking Select.

    We recommend that you use the fast launch option, which automatically selects a Compute shape from a predefined pool of shapes so that your job can start as fast as possible. A shape must be available in the pool for this option to be selectable.

    Select one of the supported Compute shapes.

    1. Select Fast launch or Custom configuration.
    2. Select the shape that best suits how you want to use the resource.
    3. Click Submit.
  13. (Optional) To use logging, click Select, and then ensure that Enable logging is selected.
    1. Select a log group from the list. To use a log group in a different compartment from the job, change the compartment.
    2. Select one of the following to store all stdout and stderr messages:
      Enable automatic log creation

      Data Science automatically creates a log when the job starts.

      Select a log

      Select a log to use.

  14. Enter the Block Volume size you want to use, between 50 GB and 10,240 GB (10 TB). You can change the value in 1 GB increments. The default value is 1,024 GB.
  15. Select how you want to configure your network:
    • Default networking—The workload is attached using a secondary VNIC to a pre-configured, service-managed VCN and subnet. This provided subnet allows egress to the public internet through a NAT gateway, and access to other Oracle Cloud services through a service gateway.

      If you only need access to the public internet and OCI services, this is the fastest, easiest way to get started on the service. It doesn't require you to create your own networking resources or write policies for networking permissions.

    • Custom networking—Select the VCN and subnet that you want to use for the job.

      For egress access to the public internet, use a private subnet with a route to a NAT gateway.

      You can change the compartment if necessary by clicking Change Compartment, and then selecting the new compartment that has the VCN or subnet that you want to use.

  16. (Optional) Add tags to easily locate and track the resource by selecting a tag namespace, then entering the key and value. To add more than one tag, click +Additional Tags.

    Tagging describes the various tags that you can use to organize and find resources, including cost-tracking tags.

  17. Click Create.

    After the job is in an active state, you can use job runs to repeatedly run the job.
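
As noted in step 10, a ZIP or compressed tar artifact needs an entry point. A minimal sketch of the same configuration with the ADS SDK; the artifact and entry point file names are hypothetical:

from ads.jobs import Job, DataScienceJob, ScriptRuntime

job = (
    Job(name="<job_name>")
    .with_infrastructure(
        DataScienceJob()
        .with_compartment_id("<compartment_ocid>")
        .with_project_id("<project_ocid>")
        .with_shape_name("VM.Standard2.1")
    )
    .with_runtime(
        ScriptRuntime()
        # Entry point file in the ZIP, equivalent to setting JOB_RUN_ENTRYPOINT.
        .with_source("job_artifact.zip", entrypoint="job_artifact/main.py")
    )
)
job.create()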

Using the CLI

The environment variables that you set in the jobs configuration file control the job.

Tip

You can use the ListFastLaunchJobConfigs API to retrieve the configurations that are fast launch capable.

You can use the OCI CLI to create a job as in this example:

  1. Create a job with:
    oci data-science job create \
    --display-name <job_name> \
    --compartment-id <compartment_ocid> \
    --project-id <project_ocid> \
    --configuration-details file://<jobs_configuration_json_file> \
    --infrastructure-configuration-details file://<jobs_infrastructure_configuration_json_file> \
    --log-configuration-details file://<optional_jobs_logging_configuration_json_file>
  2. Use this jobs configuration JSON file:
    {
      "jobType": "DEFAULT",
      "maximumRuntimeInMinutes": 240,
      "commandLineArguments" : "test-arg",
      "environmentVariables": {
        "SOME_ENV_KEY": "some_env_value" 
      }
    }
  3. Use this jobs infrastructure configuration JSON file:
    {
      "jobInfrastructureType": "STANDALONE",
      "shapeName": "VM.Standard2.1",
      "blockStorageSizeInGBs": "50",
      "subnetId": "<subnet_ocid>"
    }
  4. (Optional) Use this jobs logging configuration JSON file:
    {
      "enableLogging": true,
      "enableAutoLogCreation": true,
      "logGroupId": "<log_group_ocid>"
    }
  5. Upload a job artifact file for the job you created with:
    oci data-science job create-job-artifact \
    --job-id <job_ocid> \
    --job-artifact-file <job_artifact_file_path> \
    --content-disposition "attachment; filename=<job_artifact_file_name>"
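
After the artifact is uploaded and the job is active, you can start job runs to repeatedly run the job. A minimal sketch using the OCI Python SDK, assuming a default configuration file at ~/.oci/config; the OCIDs and run name are placeholders:

import oci

config = oci.config.from_file()  # reads the DEFAULT profile from ~/.oci/config
client = oci.data_science.DataScienceClient(config)

# Start a run of the job; repeat this call to run the job again.
job_run = client.create_job_run(
    create_job_run_details=oci.data_science.models.CreateJobRunDetails(
        project_id="<project_ocid>",
        compartment_id="<compartment_ocid>",
        job_id="<job_ocid>",
        display_name="<job_run_name>",
    )
).data
print(job_run.lifecycle_state)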