Data Platform - Departmental Data Warehouse for Line of Business (LOB)
Use Oracle Autonomous Data Warehouse and Oracle Analytics Cloud to load and optimize data from Oracle E-Business Suite, spreadsheets, third-party, and other sources into a centralized data warehouse so that departments can analyze the data and can gain actionable insights.
This reference architecture positions the technology solution within the overall business context:
Lines of business typically don't have timely or efficient access to data and information. Analysts gather the data manually, work with it on an individual basis, and then share copies of files through email or file servers. The data is not centralized, so ensuring the accuracy and the security of the data is difficult. Analysis can take a long time and the results are not easily repeatable.
A departmental data warehouse is a simple form of a data warehouse that is focused on a single subject or functional area, such as sales, marketing, or finance and is often built and controlled by a single department within an organization. Given their single-subject focus, departmental data warehouses usually draw data from only a few sources. The sources could include internal systems, a central data warehouse, or external data.
Governed enterprise and departmental data warehouses can provide rich information to business users and more effectively deliver the organization's key performance indicators without relying heavily on IT resources and availability.
Architecture
This architecture uses Oracle Autonomous Data Warehouse to load and optimize data from multiple financial systems, spreadsheets, and other data sources into a centralized data warehouse and then uses Oracle Analytics Cloud to analyze the data to provide actionable insights.
This architecture supports multiple use cases. The most direct path provides a simple method for acquiring, transforming and loading data into the data warehouse using tools embedded within Oracle Autonomous Data Warehouse (ADW). Other paths support scenarios that use a data lake or scenarios that have more complex extract, transform, and load (ETL) requirements.
Description of the illustration data-platform-ebs.png
The architecture focuses on the following logical divisions:
- Ingest, Transform, Connect
Ingests and refines the data for use in each of the data layers in the architecture.
- Persist, Curate, Create
Facilitates access and navigation of the data to show the current business view. For relational technologies, data may be logical or physically structured in simple relational, longitudinal, dimensional or OLAP forms. For non-relational data, this layer contains one or more pools of data, either output from an analytical process or data optimized for a specific analytical task.
- Analyze, Learn, Predict
Abstracts the logical business view of the data for the consumers. This abstraction facilitates agile approaches to development, migration to the target architecture, and the provision of a single reporting layer from multiple federated sources.
The following diagram shows a mapping of the architecture to services provided on Oracle Cloud Infrastructure with basic security.
oci-adb-oac-arch-public-oracle.zip
The architecture features the following components:
- Autonomous Data
Warehouse
Oracle Autonomous Data Warehouse is a self-driving, self-securing, self-repairing database service that is optimized for data warehousing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.
- Analytics
Oracle Analytics Cloud is a scalable and secure public cloud service that provides a full set of capabilities to explore and perform collaborative analytics for you, for your workgroup, and for your enterprise.
With Oracle Analytics Cloud you also get flexible service management capabilities, including fast setup, easy scaling and patching, and automated lifecycle management.
- Data integration
Oracle Autonomous Data Warehouse includes tools to acquire, load, and transform data for many departmental scenarios.
For advanced use cases, you can use Oracle Cloud Infrastructure Data Integration, a fully managed, serverless, native cloud service that helps you with common extract, load, and transform (ETL) tasks such as ingesting data from different sources, cleansing, transforming, and reshaping that data, and then efficiently loading it to target data sources on Oracle Cloud Infrastructure.
- Region
An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).
- Compartment
Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.
- Identity and Access Management (IAM)
Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.
- Policy
An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.
- Availability domain
Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.
- Virtual cloud network (VCN) and subnet
A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.
- Security list
For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.
- Route table
Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.
- Internet gateway
The internet gateway allows traffic between the public subnets in a VCN and the public internet.
- Site-to-Site VPN
Site-to-Site VPN provides IPSec VPN connectivity between your on-premises network and VCNs in Oracle Cloud Infrastructure. The IPSec protocol suite encrypts IP traffic before the packets are transferred from the source to the destination and decrypts the traffic when it arrives.
- Dynamic routing gateway (DRG)
The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.
- Network address translation (NAT) gateway
A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.
- Service gateway
The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.
- Network security group (NSG)
Network security group (NSG) acts as a virtual firewall for your cloud resources. With the zero-trust security model of Oracle Cloud Infrastructure, all traffic is denied, and you can control the network traffic inside a VCN. An NSG consists of a set of ingress and egress security rules that apply to only a specified set of VNICs in a single VCN.
- Object storage
Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and seldom or rarely access.
Recommendations
Use the following recommendations as a starting point to load and optimize data from multiple flat-file sources into a centralized data warehouse location for analysis.
Your requirements might differ from the architecture described here.
- Data Refinery
Autonomous Database Tools are embedded in Oracle Autonomous Data Warehouse and provide the ability to load, transform, catalog, gain insights, and even develop business models in simple straightforward fashion. These tools can be found under the Database Actions menu.
- Oracle
Analytics Cloud
Before you connect Oracle Analytics Cloud to Oracle Autonomous Data Warehouse, have a database administrator add the IP address (or address range) for your Oracle Analytics Cloud instance to the list of allowed addresses. The database administrator must add a security rule that allows TCP/IP traffic from Oracle Analytics Cloud to the database.
Considerations
When loading and optimizing data from multiple flat-file sources into a centralized data warehouse location for analysis, consider these implementation options.
Guidance | Data Refinery | Data Persistence Platform | Access & Interpretation |
---|---|---|---|
Recommended | Oracle Autonomous Database Tools (Database Actions) | Oracle Autonomous Data Warehouse | Oracle Analytics Cloud |
Other Options | Oracle Cloud Infrastructure Data Integration | Oracle Exadata Database Service | |
Rationale | Oracle Autonomous Data Warehouse is an easy-to-use, fully autonomous database that offers direct access to and ingestion of a variety of data sources such as CSV files and relational data sources. | Oracle Autonomous Data Warehouse is an easy-to-use, fully autonomous database that scales elastically, delivers fast query performance and requires no database administration. It also offers direct access to the data from object storage via external tables. | Oracle Analytics Cloud is a fully managed and tightly integrated with the curated data layer (Oracle Autonomous Data Warehouse). |
Deploy
The Terraform code for this reference architecture is available in GitHub. You can pull the code into Oracle Cloud Infrastructure Resource Manager with a single click, create the stack, and deploy it. Alternatively, you can download the code from GitHub to your computer, customize the code, and deploy the architecture by using the Terraform command line interface (CLI).
- Deploy by using Oracle Cloud Infrastructure Resource
Manager:
- Click
If you aren't already signed in, enter the tenancy and user credentials.
- Review and accept the terms and conditions.
- Select the region where you want to deploy the stack.
- Follow the on-screen prompts and instructions to create the stack.
- After creating the stack, click Terraform Actions, and select Plan.
- Wait for the job to complete, then review the plan.
To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.
- If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
- Click
- Deploy using the Terraform code in GitHub:
- Go to GitHub.
- Clone or download the repository to your local computer.
- Follow the instructions in the
README
document.
Change Log
This log lists only the significant changes:
June 20, 2023 | Updated the Deploy section to remove the full solution and small solution deployment options. |
December 8, 2022 |
|
February 10, 2022 | Updated the Deploy section and included links for deploying full solution and small solution. |
November 12, 2021 | Added the option to download editable versions (.SVG and .DRAWIO) of the architecture diagram. |
March 17, 2021 |
|
December 11, 2020 | Updated the Deploy section to deploy the stack in GitHub directly to Oracle Cloud Infrastructure Resource Manager with a single click. |
September 21, 2020 | Updated the Deploy section to include a link to Oracle Cloud Infrastructure Resource Manager as a source for the Terraform stack. |
June 29, 2020 | Updated the GitHub link in the Deploy section to point to the specific use-case folder in the project folder. |