About Using Oracle Modern Data Platform for Business Reporting and Forecasting

This design explores using Oracle Modern Data Platform for business reporting and forecasting.

Consider an enterprise that has a large amount of product sales data from internal and external sources. Both historical and recent sales data have value for the company, but day-to-day business mostly uses data from recent months or years. The consumers of this data are business users who rely on it for reporting, analysis, and forecasting. They want their reports and dashboards to return results quickly to maximize employee productivity.

Traditionally, businesses relied on data warehouses based on relational databases for reporting use cases such as this one. However, these implementations had several limitations, the most prominent being scalability and performance.

Architecture

A more modern approach is to use a data lakehouse architecture consisting of OCI Object Storage, big data technologies such as Hadoop and Spark, and a columnar or traditional database for reporting.

Oracle Cloud Infrastructure (OCI) has a wide array of tools and services that cater to all the aspects of a modern data platform. In this solution, we look at a small subset of OCI services that address the architectural requirements.

  1. OCI Object Storage: An internet-scale, high-performance storage platform that offers reliable and cost-efficient data durability. It can store an unlimited amount of unstructured data of any content type, including analytic data and rich content, like images and videos.
  2. Oracle Big Data Service: A managed Hadoop service that is designed for a diverse set of big data use cases and workloads, from short-lived clusters used to tackle specific tasks to long-lived clusters that can scale horizontally to meet an organization’s requirements at a low cost and with the highest levels of security.
  3. Oracle Autonomous Database: An easy-to-use, fully autonomous database that scales elastically and delivers fast query performance. As a service, Autonomous Database does not require database administration.
  4. Oracle Analytics Cloud: A scalable and secure public cloud service that empowers business analysts and consumers with modern, AI-powered, self-service analytics capabilities for data preparation, visualization, enterprise reporting, augmented analysis, and natural language processing.

The following image illustrates the architecture.
Description of the illustration oci-modern-data-reporting-arch.png

oci-modern-data-reporting-arch-oracle.zip

The architecture components listed above work together in the following manner:
  • The OCI Object Storage layer provides a reliable and cost-effective way to store huge quantities of data. By using Object Storage, you have a common persistent data store that can be used by multiple tools and services. This also ensures that the data processing layer can be scaled up or down independently of the storage.
  • The Oracle Big Data Service processing layer provides a platform for ingesting, transforming, and aggregating bulk quantities of data.
  • The database layer provides a fast and efficient way to serve curated data to client reporting tools. Only recent or otherwise pertinent data is persisted in this layer.
  • Oracle Analytics Cloud provides the ability to visualize data and make forecasts.
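
The division of labor between the layers can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not an Oracle API: the record layout, the function names `aggregate_monthly` and `split_by_recency`, and the cutoff date are all hypothetical. In a real deployment the aggregation would typically run as a Spark or Hadoop job on Oracle Big Data Service, and the "recent" rows would be loaded into the database layer.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw sales records, standing in for files kept in Object Storage.
RAW_SALES = [
    {"product": "widget", "sold_on": date(2021, 3, 14), "amount": 120.0},
    {"product": "widget", "sold_on": date(2024, 1, 5), "amount": 80.0},
    {"product": "widget", "sold_on": date(2024, 1, 20), "amount": 40.0},
    {"product": "gadget", "sold_on": date(2024, 2, 2), "amount": 200.0},
]


def aggregate_monthly(records):
    """Processing layer: roll raw records up to (product, year, month) totals."""
    totals = defaultdict(float)
    for rec in records:
        key = (rec["product"], rec["sold_on"].year, rec["sold_on"].month)
        totals[key] += rec["amount"]
    return dict(totals)


def split_by_recency(monthly_totals, cutoff):
    """Route aggregates: months starting on or after the cutoff are served
    from the database layer; older months stay in the storage layer only."""
    recent, historical = {}, {}
    for (product, year, month), total in monthly_totals.items():
        target = recent if date(year, month, 1) >= cutoff else historical
        target[(product, year, month)] = total
    return recent, historical


monthly = aggregate_monthly(RAW_SALES)
# Assumed cutoff; the source only says "recent months or years".
serving_rows, archive_rows = split_by_recency(monthly, cutoff=date(2024, 1, 1))
print("database layer:", serving_rows)
print("object storage only:", archive_rows)
```

Because the raw records never leave the storage layer, the cutoff can be changed and the aggregates rebuilt without touching the source data, which is the independence between storage and processing described above.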

This architecture supports the following components:

  • OCI Data Integration

    Oracle Cloud Infrastructure Data Integration is a fully managed, serverless, cloud-native service that extracts, loads, transforms, cleanses, and reshapes data from a variety of data sources into target Oracle Cloud Infrastructure services, such as Autonomous Data Warehouse and Oracle Cloud Infrastructure Object Storage. ETL (extract, transform, load) leverages fully managed scale-out processing on Spark, and ELT (extract, load, transform) leverages the full SQL push-down capabilities of Autonomous Data Warehouse to minimize data movement and improve the time to value for newly ingested data. Users design data integration processes using an intuitive, codeless user interface that optimizes integration flows to generate the most efficient engine and orchestration, automatically allocating and scaling the execution environment. Oracle Cloud Infrastructure Data Integration provides interactive exploration and data preparation, and helps data engineers protect against schema drift by defining rules to handle schema changes.

  • Streaming

    Oracle Cloud Infrastructure Streaming provides a fully managed, scalable, and durable storage solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. You can use Streaming for ingesting high-volume data, such as application logs, operational telemetry, and web click-stream data, or for other use cases in which data is produced and processed continually and sequentially in a publish-subscribe messaging model.
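
The publish-subscribe model mentioned above can be illustrated with a minimal in-memory sketch. This is not the OCI Streaming SDK: the `Stream` class, its `publish` and `consume` methods, and the group names are hypothetical, chosen only to show producers appending messages in order and independent consumer groups each reading the same sequence at their own pace.

```python
class Stream:
    """Toy append-only log; each consumer group tracks its own read position."""

    def __init__(self):
        self._messages = []  # messages in the order they were published
        self._offsets = {}   # consumer group name -> next offset to read

    def publish(self, message):
        """Append a message and return its offset in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def consume(self, group, limit=10):
        """Return up to `limit` unread messages for `group`, in publish order."""
        start = self._offsets.get(group, 0)
        batch = self._messages[start:start + limit]
        self._offsets[group] = start + len(batch)
        return batch


clicks = Stream()
for page in ("/home", "/products", "/checkout"):
    clicks.publish({"event": "click", "page": page})

# Two independent subscribers read the same sequence at different speeds.
first_batch = clicks.consume("reporting", limit=2)
second_batch = clicks.consume("reporting", limit=2)
audit_batch = clicks.consume("audit")
print([m["page"] for m in first_batch])   # ['/home', '/products']
print([m["page"] for m in second_batch])  # ['/checkout']
print([m["page"] for m in audit_batch])   # ['/home', '/products', '/checkout']
```

Keeping the read position per consumer group, rather than deleting messages on read, is what lets several downstream tools process the same stream sequentially without interfering with one another.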