Pipeline and Step Parameters

Add parameters to pipelines and steps.

ML Pipelines introduces parameterization for pipeline execution, enabling dynamic configuration of step inputs, infrastructure, and behavior without changing the underlying code or container images. Parameters give a pipeline a clear interface, much as a function signature defines a function's parameters. Pipelines become reusable components whose behavior can be controlled precisely at runtime without changing their implementation.

Two new parameter types are introduced:
Pipeline Parameters
Defined at pipeline creation and provided at pipeline run start, they can be referenced across steps.
Step Output Parameters
Produced by steps at runtime (written to JSON files) and consumed by later steps.

With pipeline parameters, you can:

  • Dynamically set environment variables, command line arguments, and infrastructure properties.
  • Pass values from one step to another without hard-coding.
  • Reference implicit parameters (for example, pipeline.id or pipeline.region_identifier) automatically provided by the service.
  • Reuse step implementations across workflows. Parameters make it easier to reuse existing step implementations without modification, letting you combine them into both linear and parallel flows and adapt them to new, unforeseen workflows.

Pipeline and Step Parameters

Pipeline parameters are name-value pairs. You can reference them from environment variables, command line arguments, and infrastructure configurations.

Parameter Types

Pipeline parameters
  • Declared in pipeline definition (parameters object).

  • Passed at pipeline run creation (parametersOverride).

  • Accessible in steps by using:

    {{pipeline.parameters.<name>}}
  • Support for implicit parameters such as:

    • pipeline.id

    • pipeline.region_identifier

    • pipeline.time_accepted
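
As a rough illustration of how pipeline parameter placeholders behave, the following sketch merges declared values with run-time overrides and substitutes them into a string. It is a local approximation for reasoning about a configuration, not the service's implementation. The function name resolve_pipeline_placeholders, the allowed characters in parameter names, and the rule that parametersOverride values take precedence over the values in parameters are assumptions made for this example.

```python
import re

# Matches {{pipeline.parameters.<name>}}; the allowed name characters are an assumption.
PIPELINE_PARAM = re.compile(r"\{\{pipeline\.parameters\.([A-Za-z0-9_]+)\}\}")


def resolve_pipeline_placeholders(text, parameters, parameters_override=None):
    """Substitute pipeline parameter placeholders in text (illustrative only)."""
    # Assumption: values supplied at run creation win over declared values.
    values = {**parameters, **(parameters_override or {})}

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"undeclared pipeline parameter: {name}")
        return str(values[name])

    return PIPELINE_PARAM.sub(substitute, text)


declared = {"greeting": "Hello", "name": "John"}
template = "--person {{pipeline.parameters.name}}"

without_override = resolve_pipeline_placeholders(template, declared)
with_override = resolve_pipeline_placeholders(template, declared, {"name": "Alex"})
print(without_override)
print(with_override)
```

Running the same template with and without an override shows why parameters work like a function signature: the step configuration stays fixed while the run supplies the values.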

Step output parameters
  • Declared in stepParameters with outputParameterType (JSON).

  • Produced by writing values to files inside the step.

  • Accessed in later steps as:

    {{stepParameters.<stepName>.output.<paramName>}}
  • A step can reference another step's output parameters only if it depends on that step, either directly (the producing step is listed in its dependsOn field) or transitively (through the dependencies of the steps it depends on). In other words, the producing step must be somewhere in the consuming step's dependency tree.
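
The dependency rule can be checked locally before a pipeline is submitted. The following sketch walks the dependsOn relationships to find every upstream step, then tests whether a producing step is among them. The helper names upstream_steps and can_reference and the three-step pipeline are hypothetical, and the check only mirrors the rule described here; it is not how the service validates a pipeline.

```python
def upstream_steps(step_name, depends_on):
    """Return every step reachable from step_name by following dependsOn."""
    seen = set()
    stack = list(depends_on.get(step_name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Follow the dependencies of the dependency (transitive case).
            stack.extend(depends_on.get(current, []))
    return seen


def can_reference(consumer, producer, depends_on):
    """True if consumer may use the output parameters of producer."""
    return producer in upstream_steps(consumer, depends_on)


# Hypothetical pipeline: train depends on prepare, report depends on train.
depends_on = {"prepare": [], "train": ["prepare"], "report": ["train"]}

direct = can_reference("train", "prepare", depends_on)
transitive = can_reference("report", "prepare", depends_on)
not_upstream = can_reference("prepare", "train", depends_on)
print(direct, transitive, not_upstream)
```

Here report never lists prepare in its own dependsOn, yet it can still use the output parameters of prepare because train sits between them.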

Parameterized Fields

Parameters can be referenced in:

  • Environment variables

  • Command line arguments

  • Infrastructure settings:

    • ocpusParameterized

    • memoryInGBsParameterized

    • blockStorageSizeInGBsParameterized

    • shapeName
  • Step run names

Limits

  • No parameterization of:

    • OCID fields (for example, subnetId).

    • Storage mounts
  • Maximum number of placeholders:

    • Five per environment variable value

    • Ten in command line arguments

    • One in other fields

  • Step output parameters can be referenced only by steps that depend on the producing step, either directly or transitively through the dependencies of the steps they depend on.
  • Step parameters aren't supported for Data Flow steps.
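
The placeholder limits lend themselves to a quick local check. The following sketch counts {{...}} placeholders in a step's command line arguments, environment variable values, and the two parameterized shape fields used in the example on this page. The function names, the regular expression, and the choice of fields to inspect are assumptions for illustration; the service remains the authority on what it accepts.

```python
import re

# Any {{...}} token is treated as one placeholder (an assumption of this sketch).
PLACEHOLDER = re.compile(r"\{\{[^{}]+\}\}")

MAX_PER_ENV_VALUE = 5
MAX_IN_COMMAND_LINE = 10
MAX_IN_OTHER_FIELD = 1


def count_placeholders(value):
    return len(PLACEHOLDER.findall(value))


def check_step_limits(step):
    """Return a list of limit violations found in one step definition."""
    problems = []
    config = step.get("stepConfigurationDetails", {})

    args = config.get("commandLineArguments") or ""
    if count_placeholders(args) > MAX_IN_COMMAND_LINE:
        problems.append("commandLineArguments: more than 10 placeholders")

    for key, value in (config.get("environmentVariables") or {}).items():
        if count_placeholders(str(value)) > MAX_PER_ENV_VALUE:
            problems.append(f"environmentVariables.{key}: more than 5 placeholders")

    infra = step.get("stepInfrastructureConfigurationDetails", {})
    shape = infra.get("shapeConfigDetails", {})
    for field in ("ocpusParameterized", "memoryInGBsParameterized"):
        if count_placeholders(str(shape.get(field, ""))) > MAX_IN_OTHER_FIELD:
            problems.append(f"{field}: more than 1 placeholder")
    return problems


# Hypothetical step with six placeholders in a single environment variable value.
step = {
    "stepName": "echo_message",
    "stepConfigurationDetails": {
        "commandLineArguments": "--msg {{stepParameters.say_hello.output.message}}",
        "environmentVariables": {"NAMES": "{{pipeline.parameters.name}}" * 6},
    },
}
problems = check_step_limits(step)
print(problems)
```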

Adding Parameters When Creating Pipelines

  • In the Console, follow these steps to add pipeline parameters and define a compute shape with parameters.

    1. On the Projects list page, select the project that contains the pipelines that you want to work with. If you need help finding the list page or the project, see Listing Projects.
    2. On the project details page, select Pipelines.
    3. Select Create pipeline.
    4. On the Create pipeline page, under Parameters, define pipeline parameters:
      Note

      The step must ensure that the specified output file (for example, /home/datascience/output.json) contains valid JSON that defines the declared parameters. For example:
      { "message":"Hello John!", "ocpu": 2, "memory": 10 }
      • Custom environment variable key (Optional): The environment variables for this pipeline step.
      • Value (Optional): The key's value.
    5. If you're creating a step, under Output parameters, define step output parameters:

      This applies only when you create a pipeline step. Output parameters are defined at the step level, not at the pipeline level.

      • Output parameter type: Select JSON.
      • Parameter name: Enter a parameter name.
      • Output file name: Select the output file name in which the step stores the output parameters. For example: /home/datascience/output.json.
    6. To define the compute shape with parameters, select Compute shape parameterized.
    7. Update other fields as needed.
      For field descriptions, see Creating a Pipeline.
    8. Select Create.
  • Use the OCI CLI with a JSON file to create the pipeline. Assuming a JSON file called create_pipeline.json, create the pipeline using:
    oci data-science pipeline create --from-json file://create_pipeline.json
  • Use the CreatePipeline operation to create a pipeline.

Example JSON File

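The following example, create_pipeline.json, defines a basic pipeline that takes two pipeline parameters, uses them in the first step to print a personalized greeting, and produces three output parameters that the second step consumes. The lines that begin with # are explanatory annotations. JSON doesn't support comments, so remove them before using the file: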
{
  "displayName": "Parameter Demo",
  "description": "Hello world pipeline with parameters",
  "compartmentId": "ocid1.compartment.oc1......",
  "projectId": "ocid1.datascienceproject.oc1......",
                            
  # define two parameters that users can set to influence the behavior of this pipeline
  "parameters": {
   "greeting": "Hello",
   "name": "John"
  },
  "infrastructureConfigurationDetails": {
   "blockStorageSizeInGBs": 50,
   "shapeConfigDetails": {
    "memoryInGBs": 16.0,
    "ocpus": 3.0
   },
   "shapeName": "VM.Standard.E5.Flex",
   "subnetId": null
  },
  "logConfigurationDetails": {
   "enableAutoLogCreation": true,
   "enableLogging": true,
   "logGroupId": "ocid1.loggroup......",
   "logId": null
  },
  "stepDetails": [
   {
    "stepName": "say_hello",
    "stepType": "CUSTOM_SCRIPT",
    "stepConfigurationDetails": {
     "type": "DEFAULT",
                
     # command line arguments and environment variables can refer to parameters (both pipeline and step output parameters)
     "commandLineArguments": "--person {{pipeline.parameters.name}}",
     "environmentVariables": {
      "GREETING": "{{pipeline.parameters.greeting}}"
     }
    },
    "stepParameters": {
     "parameterType": "DEFAULT",
     "output": {
      "outputParameterType": "JSON",
      "parameterNames": [
       "message",
       "ocpu",
       "memory"
      ],
      "outputFile": "/home/datascience/output.json"
     }
    }
   },
   {
    "stepName": "echo_message",
    "stepType": "CUSTOM_SCRIPT",
                            
    # to use output parameters of a step, the step must be included in the dependencies
    "dependsOn": [
     "say_hello"
    ],
    "stepConfigurationDetails": {
     "type": "DEFAULT",
                            
     # step output parameters can configure later steps with values produced on the fly during a particular pipeline run
     "commandLineArguments": "--msg {{stepParameters.say_hello.output.message}}"
    },
    "stepInfrastructureConfigurationDetails": {
     "blockStorageSizeInGBs": 50,
     "shapeName": "VM.Standard.E5.Flex",
     "shapeConfigDetails": {
      "ocpus": 1,
      "memoryInGBs": 4,
                            
      # even infrastructure can be configured on the fly by referring to parameters
      "ocpusParameterized": "{{stepParameters.say_hello.output.ocpu}}",
      "memoryInGBsParameterized": "{{stepParameters.say_hello.output.memory}}"
     }
    }
   }
  ]
 }
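
The pipeline definition above doesn't show what the say_hello step itself runs. The following is a minimal sketch of such a script, assuming the step is written in Python: it reads the --person argument and the GREETING environment variable, then writes the three declared output parameters as one JSON object. The function names are hypothetical, the ocpu and memory values are the sample values from the note earlier on this page, and the demo writes to a temporary file so that it runs anywhere. Inside the step, the target would be the declared outputFile, /home/datascience/output.json.

```python
import argparse
import json
import os
import tempfile


def read_inputs(argv, environ):
    """Read the values that the pipeline passes to the step."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--person", required=True)
    args = parser.parse_args(argv)
    return environ.get("GREETING", "Hello"), args.person


def build_outputs(greeting, person):
    """Build one value for each declared parameterName: message, ocpu, memory."""
    # ocpu and memory are sample values; a real step would compute them.
    return {"message": f"{greeting} {person}!", "ocpu": 2, "memory": 10}


def write_outputs(outputs, output_file):
    """Write the output parameters as a single JSON object."""
    with open(output_file, "w", encoding="utf-8") as handle:
        json.dump(outputs, handle)


# Demo with explicit inputs; in the step these come from sys.argv[1:] and os.environ.
greeting, person = read_inputs(["--person", "John"], {"GREETING": "Hello"})
outputs = build_outputs(greeting, person)
print(outputs["message"])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.json")  # stands in for /home/datascience/output.json
    write_outputs(outputs, path)
    with open(path, encoding="utf-8") as handle:
        saved = json.load(handle)
```

The keys written to the file match the parameterNames list in the step definition, which is what lets the second step refer to them as {{stepParameters.say_hello.output.message}} and so on.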

Create Pipeline Runs

    1. Follow the steps in Starting a Pipeline Run.
    2. While defining the run, set override values for the pipeline parameters as needed.
  • Use the OCI CLI with a JSON file to create the pipeline run. Assuming a JSON file called create_run.json, create the pipeline run using:
    oci data-science pipeline-run create --from-json file://create_run.json

Example JSON File

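The following example, create_run.json, starts a pipeline run by supplying values for the parameters defined in the pipeline. The line that begins with # is an explanatory annotation. JSON doesn't support comments, so remove it before using the file: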
{
 "displayName": "Hello Run",
 "compartmentId": "ocid1.compartment.oc1.......",
 "projectId": "ocid1.datascienceproject.oc1.....",
 "pipelineId": "ocid1.datasciencepipeline.oc1.....",
                            
 # parameters can be viewed as an interface that lets users control the behavior of the pipeline
 "parametersOverride": {
  "greeting": "Hey",
  "name": "Alex"
 }
}
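
Because the # annotation lines in the examples on this page aren't valid JSON, an annotated file needs a cleanup pass before it's used. The following sketch drops every line whose first non-blank character is # and then parses the rest to confirm that it's well-formed. The function name strip_annotations is hypothetical, and the approach assumes annotations always occupy whole lines, as they do in these examples.

```python
import json


def strip_annotations(text):
    """Remove whole-line # annotations, then parse the remainder as JSON."""
    kept = [line for line in text.splitlines() if not line.lstrip().startswith("#")]
    # json.loads raises json.JSONDecodeError if the cleaned text is still invalid.
    return json.loads("\n".join(kept))


annotated = """{
 "displayName": "Hello Run",

 # parameters supplied for this run
 "parametersOverride": {
  "greeting": "Hey",
  "name": "Alex"
 }
}"""

payload = strip_annotations(annotated)
print(json.dumps(payload, indent=1))
```

Writing the result of json.dumps to a file such as create_run.json gives a clean file to pass with the --from-json option.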