Departmental Data Warehousing/Data Marts - Consolidate Spreadsheets

Analysts need an efficient way to consolidate data from multiple spreadsheets and other flat-file data sources into a trusted, maintainable, and query-optimized source.

With Oracle Autonomous Data Warehouse and Oracle Analytics Cloud, you can load and optimize data from multiple flat-file sources into a centralized data warehouse so departments can analyze the data and gain actionable insights.

This reference architecture positions the technology solution within the overall business context:



Lines of business typically don't have timely or efficient access to data and information. Analysts gather the data manually, work with it on an individual basis, and then share copies of files through email or file servers. The data is not centralized, so ensuring the accuracy and the security of the data is difficult. Analysis can take a long time and the results are not easily repeatable.

A data mart is a simple form of a data warehouse that is focused on a single subject or functional area, such as sales, marketing, or finance and are often built and controlled by a single department within an organization. Given their single-subject focus, data marts usually draw data from only a few sources. The sources could include internal systems, a central data warehouse, or external data.

Governed data warehouses and data marts can provide rich information to business users and more effectively deliver the organization's key performance indicators without relying heavily on IT resources and availability.

At a conceptual level, the technology solution addresses the problem as follows:



Architecture

This architecture uses Oracle Autonomous Data Warehouse to load and optimize data from multiple flat-file sources into a centralized data warehouse and then uses Oracle Analytics Cloud to analyze the data to provide actionable insights.



The architecture focuses on the following logical divisions:

  • Data refinery

    Ingests and refines the data for use in each of the data layers in the architecture. The shape is intended to illustrate the differences in processing costs for storing and refining data at each level and for moving data between them.

  • Data persistence platform (curated information layer)

    Facilitates access and navigation of the data to show the current business view. For relational technologies, data may be logical or physically structured in simple relational, longitudinal, dimensional or OLAP forms. For non-relational data, this layer contains one or more pools of data, either output from an analytical process or data optimized for a specific analytical task.

  • Access and interpretation

    Abstracts the logical business view of the data for the consumers. This abstraction facilitates agile approaches to development, migration to the target architecture, and the provision of a single reporting layer from multiple federated sources.

The architecture has the following components:

  • Network access control lists (ACLs)

    You can control access to your Oracle Autonomous Data Warehouse by setting network access control lists (ACLs). Access control lists specify client IP addresses that are allowed to connect to an Oracle Autonomous Data Warehouse instance and block all other IP addresses.

  • Autonomous Data Warehouse

    Oracle Autonomous Data Warehouse is a fully managed, preconfigured database environment. You do not need to configure or manage any hardware, or install any software. After provisioning, you can scale the number of CPU cores or the storage capacity of the database at any time without impacting availability or performance.

  • Analytics

    Oracle Analytics Cloud is a scalable and secure public cloud service that provides a full set of capabilities to explore and perform collaborative analytics for you, your workgroup, and your enterprise.

    With Oracle Analytics Cloud you also get flexible service management capabilities, including fast setup, easy scaling and patching, and automated lifecycle management.

Recommendations

Use the following recommendations as a starting point to load and optimize data from multiple flat-file sources into a centralized data warehouse location for analysis.

Your requirements might differ from the architecture described here.

  • Oracle SQL Developer

    Use Oracle SQL Developer on your desktop or Oracle SQL Developer Web from a browser to run SQL statements and scripts in a worksheet and to perform other tasks with Oracle Autonomous Data Warehouse.

  • Oracle Analytics Cloud

    Before you connect Oracle Analytics Cloud to Oracle Autonomous Data Warehouse, have a database administrator whitelist the IP address (or address range) for your Oracle Analytics Cloud instance. The database administrator must add a security rule that allows TCP/IP traffic from Oracle Analytics Cloud to the database.

Considerations

When loading and optimizing data from multiple flat-file sources into a centralized data warehouse location for analysis, consider these implementation options.

Guidance Data Refinery Data Persistence Platform Access & Interpretation
Recommended Oracle Autonomous Data Warehouse Oracle Autonomous Data Warehouse Oracle Analytics Cloud
Other Options Oracle Data Integrator Oracle Database Exadata Cloud Service  
Rationale Oracle Autonomous Data Warehouse is an easy-to- use, fully autonomous database that offers direct access to and ingestion of a variety of Data Sources such as csv files and relational data sources. Oracle Autonomous Data Warehouse is an easy-to- use, fully autonomous database that scales elastically, delivers fast query performance and requires no database administration. It also offers direct access to the data from object storage via external tables. Oracle Analytics Cloud is a fully managed and tightly integrated with the Curated Data Layer (Oracle Autonomous Data Warehouse).

Deploy

The Terraform code for this reference architecture is available on as a sample stack in Oracle Cloud Infrastructure Resource Manager.

You can also download the code from GitHub, and customize it to suit your specific requirements.

  • Deploy using the sample stack in Oracle Cloud Infrastructure Resource Manager:
    1. Go to Oracle Cloud Infrastructure Resource Manager.

      If you aren't already signed in, enter the tenancy and user credentials.

    2. Select the region where you want to deploy the stack.
    3. Follow the on-screen prompts and instructions to create the stack.
    4. After creating the stack, click Terraform Actions, and select Plan.
    5. Wait for the job to be completed, and review the plan.

      To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.

    6. If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
  • Deploy using the Terraform code in GitHub:
    1. Go to GitHub.
    2. Clone or download the repository to your local computer.
    3. Follow the instructions in the README document.

Explore More

Learn more about related architectures and about the features of this architecture.

Change Log

This log lists only the significant changes: