19 Machine Learning (Preview)

Oracle AI Data Platform Workbench provides machine learning (ML) lifecycle management using MLflow concepts and APIs, specifically experiments, runs, and a model registry.

These capabilities are deeply integrated into AI Data Platform Workbench across multiple surfaces, including Workspaces, Experiments, and Catalog, so teams can track work where it happens and promote outcomes into shared, governed assets.

ML Lifecycle

An end-to-end ML lifecycle typically follows these steps:
  1. Data prep: Clean and format raw inputs
  2. Exploratory data analysis (EDA): Explore data to find patterns
  3. Feature engineering: Create variables for models
  4. Experiment: Iteratively train using multiple approaches (each iteration is a run)
  5. Validate and store: Identify the best run and register the model for reuse
  6. Run inferences: Use a registered model version for batch inference from notebooks
  7. Monitor: Track basic production performance and availability for deployed models

Core capabilities

Experiment tracking per team workspace

  • Experiments are scoped to a workspace to separate teams and organize work.
  • MLflow-compatible autologging captures parameters, metrics, and artifacts for each run, creating a reproducible record that supports reruns with controlled changes.

Run comparison and registration

  • Runs can be filtered and compared to identify a candidate model.
  • A run can be registered into the Master Catalog-backed Model Registry, carrying versioning, tags, and custom fields. Version management is handled by the platform when updated models are registered.

From registry to notebook inference

  • Models can be loaded in notebooks by latest or by explicit version, enabling consistent reuse.
  • Batch inference workflows can reference registry versions directly, reducing manual handling between experimentation and inference.

Lineage for auditability

  • Registered models link back to the originating experiment run, including run conditions such as hyperparameters, environment variables, metrics, and artifacts.
  • This supports review and audit by making the provenance of each model explicit.

Why Use MLflow?

AI Data Platform Workbench uses MLflow as the foundation for its MLOps framework because it provides an open, extensible, and framework-agnostic approach to managing the end-to-end machine learning lifecycle.

MLflow supports the core capabilities required for operationalizing machine learning at scale, including experiment tracking, model packaging, artifact management, model versioning, and registry-based governance. Its ability to capture parameters, metrics, artifacts, and run metadata in a consistent way makes it well suited to improving reproducibility, auditability, and collaboration across data science and engineering teams.

A key reason for selecting MLflow is its broad compatibility with popular machine learning frameworks such as TensorFlow, PyTorch, and scikit-learn. This allows AI Data Platform Workbench to support diverse model development patterns without forcing teams into a single framework or toolchain. MLflow’s plugin architecture and deployment flexibility also make it easier to extend the platform and integrate with existing enterprise infrastructure.

By standardizing on MLflow, AI Data Platform Workbench can provide a consistent MLOps experience across experimentation, model registration, and lifecycle management, while retaining the flexibility needed to evolve with different AI/ML use cases.