1 Overview of Oracle AI Data Platform

This chapter provides information and procedures necessary for setting up your AI Data Platform.

What is Oracle AI Data Platform Used For?

Oracle AI Data Platform provides streamlined, secure, and seamless data management, analysis, and collaboration.

Oracle AI Data Platform is designed for enterprises that need to:
  • Streamline Data Discovery and Governance: AI Data Platform provides a centralized metadata repository (Master Catalog) that enhances searchability and governance of structured and unstructured data.
  • Enable Secure Data Collaboration: Through role-based access control (RBAC), AI Data Platform allows different teams to work on shared datasets while maintaining strict security policies.
  • Accelerate Data Preparation and Processing: With built-in notebooks and workflow orchestration, users can clean, transform, and enrich data efficiently.
  • Support Advanced Analytics and AI/ML: AI Data Platform integrates with Apache Spark, allowing data scientists and analysts to run complex computations and model training directly within their data lake.
  • Ensure Seamless Integration Across Data Sources: AI Data Platform supports external catalogs from Autonomous Database (ADB), Object Storage (OS), and third-party data sources, enabling users to query and analyze data without duplication.

Managed Integration with Open Source

Oracle AI Data Platform leverages and extends open-source technologies to provide a powerful yet managed experience.

Some key integrations include:
  • Apache Spark: AI Data Platform's compute layer is powered by Spark, enabling scalable, distributed data processing.
  • Delta Lake Support: AI Data Platform uses Delta Lake to improve data reliability through ACID transactions and schema evolution.
  • Iceberg & Hudi Compatibility via Delta Uniform: Through Delta Uniform, AI Data Platform extends support for Apache Iceberg and Apache Hudi, enabling interoperability across different storage formats. This ensures users can adopt a unified table format strategy while maintaining efficient query execution and data governance.
  • JDBC Integration for BI Tools: AI Data Platform provides JDBC drivers, allowing seamless connectivity with external BI tools like Oracle Analytics Cloud (OAC) and third-party visualization platforms.

Personas for Oracle AI Data Platform Users

Oracle AI Data Platform serves a variety of users across different roles within an organization, each with unique needs and requirements.

The following key personas interact with AI Data Platform:
  • Data Engineers - Data engineers work with large-scale data pipelines, transforming raw data into usable formats for analysis. They rely on AI Data Platform’s robust capabilities to design and manage data workflows, ingest data from various sources, and ensure data quality. They are highly focused on automating processes, optimizing compute resources, and integrating different data systems seamlessly.
  • Data Analysts - Data analysts use AI Data Platform to discover, analyze, and generate insights from data. They require an intuitive interface and tools for querying and analyzing large datasets. AI Data Platform empowers them with interactive notebooks and seamless integration with business intelligence (BI) tools, helping them transform raw data into actionable insights for decision-makers.
  • Data Scientists - Data scientists leverage AI Data Platform’s scalable compute capabilities for machine learning and advanced analytics tasks. They need access to diverse datasets, powerful processing tools, and the ability to run complex models. AI Data Platform’s Spark-powered notebooks, AI/ML integration, and support for open-source libraries enable data scientists to build, test, and deploy models within the platform.
  • Data Stewards - These users ensure that all data is handled in compliance with industry regulations and organizational policies. They focus on maintaining data privacy, auditing access, and monitoring data usage across the organization. AI Data Platform helps them manage metadata, enforce role-based access controls (RBAC), and ensure proper governance through cataloging, lineage tracking, and security policies.

Common Use Cases for Oracle AI Data Platform

Oracle AI Data Platform serves a variety of use cases across industries and business functions.

Medallion Architecture

  • Implement a Medallion Architecture with bronze, silver, and gold layers.
  • Use Delta Uniform and Iceberg for efficient data storage and query optimization.
  • Enable zero-copy access to external data sources for seamless analytics.
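
The bronze, silver, and gold layering can be illustrated with a minimal sketch. It uses plain Python lists in place of tables, and the sample records and the function names to_silver and to_gold are hypothetical, not part of any AI Data Platform API. On the platform, each layer would typically be a table processed with Spark.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "region": "emea", "amount": "10.50"},
    {"order_id": "2", "region": "APAC", "amount": "7.25"},
    {"order_id": "2", "region": "APAC", "amount": "7.25"},  # duplicate row
    {"order_id": "3", "region": "emea", "amount": "oops"},  # malformed amount
]


def to_silver(rows):
    """Silver: validate, normalize types, and de-duplicate bronze rows."""
    seen = set()
    silver_rows = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        if row["order_id"] in seen:
            continue  # keep only the first occurrence of each order
        seen.add(row["order_id"])
        silver_rows.append(
            {
                "order_id": int(row["order_id"]),
                "region": row["region"].upper(),
                "amount": amount,
            }
        )
    return silver_rows


def to_gold(rows):
    """Gold: aggregate cleaned rows into a business-level summary."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer is derived only from the one before it, so the raw bronze data stays available for reprocessing if the cleaning rules change.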

ETL & Data Engineering

  • Use Spark-based workflows and notebooks to process, transform, and enrich raw data.
  • Automate data pipelines with low-code/no-code workflow orchestration.
  • Handle large-scale batch processing and real-time data ingestion.
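
Workflow orchestration generally means running tasks in dependency order. The sketch below shows that idea with the Python standard library only. The task names, the shared ctx dictionary, and run_pipeline are assumptions made for illustration and do not represent the AI Data Platform workflow interface.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def ingest(ctx):
    # Hypothetical raw input with stray whitespace and an empty value.
    ctx["raw"] = [" a ", "b", ""]


def clean(ctx):
    # Trim whitespace and drop empty values.
    ctx["clean"] = [s.strip() for s in ctx["raw"] if s.strip()]


def enrich(ctx):
    # Stand-in for an enrichment step.
    ctx["enriched"] = [s.upper() for s in ctx["clean"]]


tasks = {"ingest": ingest, "clean": clean, "enrich": enrich}

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {"clean": {"ingest"}, "enrich": {"clean"}}


def run_pipeline(tasks, dependencies):
    """Run every task once, in an order that respects the dependencies."""
    ctx = {}
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name](ctx)
    return order, ctx


order, ctx = run_pipeline(tasks, dependencies)
```

Declaring dependencies separately from the task code is what lets an orchestrator schedule, retry, or parallelize steps without changing the steps themselves.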

Machine Learning, AI and Data Science

  • Train and deploy machine learning models using Spark-powered notebooks.
  • Enable large-scale feature engineering and data transformation.
  • Provide managed execution environments for Python and PySpark workloads.
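
As a small illustration of feature engineering, the following sketch standardizes a numeric column and one-hot encodes a categorical column. It uses only the Python standard library on in-memory lists. The function names are hypothetical, and at scale the same transformations would usually be expressed in PySpark.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column carries no signal
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Encode each value as a 0/1 vector over the sorted distinct categories."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


scaled = standardize([1.0, 2.0, 3.0])
encoded = one_hot(["a", "b", "a"])
```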

Enterprise Data Catalog & Governance, Delta Sharing

  • Centralize metadata management for structured and unstructured data.
  • Apply role-based access control (RBAC) for secure data access and collaboration.
  • Integrate with external catalogs, including Autonomous Database (ADB) and Object Storage.
  • Use Delta Sharing for secure, real-time, and governed data sharing across organizational boundaries.
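
The idea behind role-based access control can be sketched in a few lines. Permissions are attached to roles, users are assigned roles, and an access check resolves a user's roles to the actions allowed on an object. The role names, users, object names, and is_allowed below are hypothetical and only illustrate the model. They are not the AI Data Platform permission API.

```python
# Role -> catalog object -> allowed actions (all names are hypothetical).
ROLE_PERMISSIONS = {
    "data_engineer": {"sales.orders": {"read", "write"}},
    "data_analyst": {"sales.orders": {"read"}},
    "data_steward": {"sales.orders": {"read", "grant"}},
}

# User -> assigned roles.
USER_ROLES = {
    "alice": {"data_engineer"},
    "bob": {"data_analyst"},
}


def is_allowed(user, action, obj):
    """Return True if any of the user's roles permits the action on obj."""
    for role in USER_ROLES.get(user, set()):
        allowed = ROLE_PERMISSIONS.get(role, {}).get(obj, set())
        if action in allowed:
            return True
    return False  # deny by default, including for unknown users
```

For example, is_allowed("bob", "write", "sales.orders") returns False because the data_analyst role grants only read access to that object.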

Analytics, Business Intelligence & Reporting

  • Connect OCI Oracle Analytics Cloud (OAC) and third-party BI tools, such as Tableau and Power BI, via JDBC.

Multi-Cloud & Hybrid Data Integration

  • Enable federated query execution across multiple OCI services.
  • Integrate with third-party cloud storage and databases for hybrid analytics.
  • Maintain data sovereignty and compliance across multiple environments.