Pipeline and Step Parameters
Add parameters to pipelines and steps.
ML Pipelines supports parameterization of pipeline execution, enabling dynamic configuration of step inputs, infrastructure, and behavior without changing the underlying code or container images. Parameters give a pipeline a clear interface, much as a function signature defines the arguments a function accepts. Pipelines become reusable components whose behavior can be precisely controlled at runtime without changing their implementation.
Two kinds of parameters are available:
- Pipeline parameters: Defined at pipeline creation and provided at pipeline run start. They can be referenced across steps.
- Step output parameters: Produced by steps at runtime (written to JSON files) and consumed by later steps.
With pipeline parameters, you can:
- Dynamically set environment variables, command line arguments, and infrastructure properties.
- Pass values from one step to another without hard-coding.
- Reference implicit parameters (for example, pipeline.id or pipeline.region_identifier) that the service provides automatically.
- Reuse step implementations across workflows. Parameters make it easier to reuse existing step implementations without modification, letting you combine them into both linear and parallel flows. You can adapt steps to fit into new, unforeseen workflows.
Pipeline and Step Parameters
Working with pipeline and step output parameters is easy. You can think of pipeline parameters as name-value pairs. You can refer to them from environment variables, command line arguments, and infrastructure configurations.
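To make the name-value idea concrete, here is a minimal sketch of how a placeholder could be replaced by its value. It only illustrates the concept and isn't the service's implementation; the `resolve` function, the regular expression, and the flat `values` dictionary are assumptions made for this example.

```python
import re

# Matches {{dotted.path}} placeholders such as {{pipeline.parameters.name}}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_.]+)\s*\}\}")


def resolve(template, values):
    """Replace each placeholder with its value; raise if a name is unknown."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder {key!r}")
        return str(values[key])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical values for one pipeline run.
values = {
    "pipeline.parameters.greeting": "Hey",
    "pipeline.parameters.name": "Alex",
}

print(resolve("--person {{pipeline.parameters.name}}", values))
# prints: --person Alex
```

Raising on an unknown name is a design choice for this sketch: a missing value is reported instead of being silently passed through.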
Parameter Types
- Pipeline parameters
  - Declared in the pipeline definition (parameters object).
  - Passed at pipeline run creation (parametersOverride).
  - Accessible in steps by using: {{pipeline.parameters.<name>}}
  - Support for implicit parameters such as:
    - pipeline.id
    - pipeline.region_identifier
    - pipeline.time_accepted
- Step output parameters
  - Declared in stepParameters with outputParameterType (JSON).
  - Produced by writing values to files inside the step.
  - Accessed in later steps as: {{stepParameters.<stepName>.output.<paramName>}}
  - When a step refers to the output parameters of another step, the producing step must be a dependency of the referring step, either directly (listed in its dependsOn field) or transitively (through the dependencies of a step it depends on). In other words, a step can use the output parameters of any step in its dependency tree.
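As an illustration of how a step might produce output parameters, the following sketch reads a command line argument and an environment variable, then writes three values to a JSON file. The `run_step` and `build_outputs` names, the hard-coded `ocpu` and `memory` values, and the assumption that the output file is a flat JSON object keyed by parameter name are all hypothetical; check the service documentation for the exact file format it expects.

```python
import argparse
import json
from pathlib import Path


def build_outputs(person, greeting):
    """Compute the values this step exposes as output parameters."""
    return {
        "message": f"{greeting} {person}",
        "ocpu": 2,     # illustrative value, not a recommendation
        "memory": 8,   # illustrative value, not a recommendation
    }


def run_step(argv, env, output_file):
    """Parse inputs, print the greeting, and write the output parameter file."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--person", required=True)
    args = parser.parse_args(argv)

    outputs = build_outputs(args.person, env.get("GREETING", "Hello"))
    print(outputs["message"])

    # Assumption: one flat JSON object whose keys match the declared names.
    Path(output_file).write_text(json.dumps(outputs))
    return outputs
```

Inside a step, the entry point could call `run_step(sys.argv[1:], os.environ, "/home/datascience/output.json")`, where the path matches the output file declared for the step.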
Parameterized Fields
Parameters can be referenced in:
- Environment variables
- Command line arguments
- Infrastructure settings:
  - ocpusParameterized
  - memoryInGBsParameterized
  - blockStorageSizeInGBsParameterized
  - shapeName
- Step run names
Limits
- No parameterization of:
  - OCID fields (for example, subnetId)
  - Storage mounts
- Maximum number of placeholders:
  - Five per environment variable value
  - Ten in command line arguments
  - One in other fields
- Step output parameters can be referenced only in the configuration of steps that depend on the producing step, either directly or transitively (through the dependencies of a step they depend on). In other words, a step can use the output parameters of any step in its dependency tree.
- Step parameters aren't supported for Data Flow steps.
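Some of these limits can be checked locally before a pipeline is created. The sketch below is a hypothetical pre-flight check, not part of any OCI tool: it counts placeholders in command line arguments and environment variable values, and verifies that each referenced step output comes from a step in the dependency tree. It doesn't check infrastructure fields or step run names, and the function names are made up for this example.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_.]+)\s*\}\}")
STEP_REF = re.compile(r"^stepParameters\.([^.]+)\.output\.")

MAX_PER_ENV_VALUE = 5      # five per environment variable value
MAX_IN_COMMAND_LINE = 10   # ten in command line arguments


def upstream_steps(step_name, depends_on):
    """Return every step reachable from step_name through dependsOn."""
    seen, stack = set(), list(depends_on.get(step_name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, []))
    return seen


def check_step(step, depends_on):
    """Return a list of problems found in one step definition."""
    problems = []
    config = step.get("stepConfigurationDetails", {})
    command_line = config.get("commandLineArguments") or ""
    env = config.get("environmentVariables") or {}

    if len(PLACEHOLDER.findall(command_line)) > MAX_IN_COMMAND_LINE:
        problems.append("too many placeholders in commandLineArguments")
    for name, value in env.items():
        if len(PLACEHOLDER.findall(str(value))) > MAX_PER_ENV_VALUE:
            problems.append(f"too many placeholders in environment variable {name}")

    allowed = upstream_steps(step["stepName"], depends_on)
    for text in [command_line, *map(str, env.values())]:
        for key in PLACEHOLDER.findall(text):
            ref = STEP_REF.match(key)
            if ref and ref.group(1) not in allowed:
                problems.append(f"{key} refers to a step outside the dependency tree")
    return problems


def check_pipeline(pipeline):
    """Map each step name to the problems found in its configuration."""
    steps = pipeline.get("stepDetails", [])
    depends_on = {s["stepName"]: s.get("dependsOn", []) for s in steps}
    return {s["stepName"]: check_step(s, depends_on) for s in steps}
```

Calling `check_pipeline` on a parsed pipeline definition returns an empty list for each step that passes. A step that uses `{{stepParameters.say_hello.output.message}}` without `say_hello` anywhere in its dependency tree is reported.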
Adding Parameters When Creating Pipelines
Follow these steps to add pipeline parameters and define a compute shape with parameters.
Use a JSON file to create the pipeline. Assume a JSON file called create_pipeline.json. Create the pipeline using:
oci data-science pipeline create --from-json file://create_pipeline.json
Alternatively, use the CreatePipeline operation to create a pipeline.
Example JSON File
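The following file, create_pipeline.json, shows how to define a basic pipeline that takes two pipeline parameters, uses them to print a personalized greeting, and produces three output parameters that are used by the second step. The lines starting with # are annotations; remove them before using the file, because JSON doesn't support comments.
{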
"displayName": "Parameter Demo",
"description": "Hello world pipeline with parameters",
"compartmentId": "ocid1.compartment.oc1......",
"projectId": "ocid1.datascienceproject.oc1......",
# defining 2 parameters that users can use to influence behavior of this pipeline
"parameters": {
"greeting": "Hello",
"name": "John"
},
"infrastructureConfigurationDetails": {
"blockStorageSizeInGBs": 50,
"shapeConfigDetails": {
"memoryInGBs": 16.0,
"ocpus": 3.0
},
"shapeName": "VM.Standard.E5.Flex",
"subnetId": null
},
"logConfigurationDetails": {
"enableAutoLogCreation": true,
"enableLogging": true,
"logGroupId": "ocid1.loggroup......",
"logId": null
},
"stepDetails": [
{
"stepName": "say_hello",
"stepType": "CUSTOM_SCRIPT",
"stepConfigurationDetails": {
"type": "DEFAULT",
# command line arguments and environment variables can refer to parameters (both pipeline and step output parameters)
"commandLineArguments": "--person {{pipeline.parameters.name}}",
"environmentVariables": {
"GREETING": "{{pipeline.parameters.greeting}}"
}
},
"stepParameters": {
"parameterType": "DEFAULT",
"output": {
"outputParameterType": "JSON",
"parameterNames": [
"message",
"ocpu",
"memory"
],
"outputFile": "/home/datascience/output.json"
}
}
},
{
"stepName": "echo_message",
"stepType": "CUSTOM_SCRIPT",
# to use output parameters of a step, the step must be included in the dependencies
"dependsOn": [
"say_hello"
],
"stepConfigurationDetails": {
"type": "DEFAULT",
# step output parameters can configure later steps with values provided on the fly in a particular pipeline run
"commandLineArguments": "--msg {{stepParameters.say_hello.output.message}}"
},
"stepInfrastructureConfigurationDetails": {
"blockStorageSizeInGBs": 50,
"shapeName": "VM.Standard.E5.Flex",
"shapeConfigDetails": {
"ocpus": 1,
"memoryInGBs": 4,
# even infrastructure can be configured on the fly by referring to parameters
"ocpusParameterized": "{{stepParameters.say_hello.output.ocpu}}",
"memoryInGBsParameterized": "{{stepParameters.say_hello.output.memory}}"
}
}
}
]
}
Create Pipeline Runs
- Follow the steps in Starting a Pipeline Run.
- While defining the run, you can set override values for the pipeline parameters.
Use a JSON file to create the pipeline run. Assume a JSON file called create_run.json. Create the pipeline run using:
oci data-science pipeline-run create --from-json file://create_run.json
Example JSON File
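The following file, create_run.json, starts a pipeline run by supplying values for the parameters defined in the pipeline. Remove the line starting with # before using the file, because JSON doesn't support comments.
{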
"displayName": "Hello Run",
"compartmentId": "ocid1.compartment.oc1.......",
"projectId": "ocid1.datascienceproject.oc1.....",
"pipelineId": "ocid1.datasciencepipeline.oc1.....",
# parameters can be viewed as an interface that allows users to control the behavior of the pipeline
"parametersOverride": {
"greeting": "Hey",
"name": "Alex"
}
}
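The example files contain # annotation lines, which a strict JSON parser rejects. One way to turn such an annotated file into valid JSON is sketched below. The helper names are hypothetical, and the approach assumes that every comment sits on its own line, as in the examples here.

```python
import json


def strip_comment_lines(text):
    """Drop lines whose first non-blank character is '#'."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("#")]
    return "\n".join(kept)


def load_annotated_json(text):
    """Parse JSON text that may contain full-line '#' annotations."""
    return json.loads(strip_comment_lines(text))


annotated = """{
  "displayName": "Hello Run",
  # annotation line that must not reach the parser
  "parametersOverride": {
    "greeting": "Hey",
    "name": "Alex"
  }
}"""

run_details = load_annotated_json(annotated)
print(json.dumps(run_details, indent=2))
```

The cleaned output can be saved as create_run.json and passed to the CLI command shown earlier.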