Plan a Data Model

Before you start modeling your data, take some time to think about your business requirements and to understand data modeling concepts.

Topics:

Understand Data Model Requirements

Before you can begin to model data, you must first understand your data model requirements:

  • What kinds of business questions are you trying to answer?

  • What are the measures required to understand business performance?

  • What are all the dimensions under which the business operates? Or, in other words, what are the dimensions used to break down the measurements and provide headers for the reports?

  • Are there hierarchical elements in each dimension, and what types of relationships define each hierarchy?

After you have answered these questions, you can identify and define the elements of your business model.

Components of Data Models

Fact tables, dimension tables, joins, and hierarchies are key components you will come across when building your data model.

Component Description

Fact Tables

Fact tables contain measures (columns) that have aggregations built into their definitions.

Measures aggregated from facts must be defined in a fact table. Measures are typically calculated data such as dollar value or quantity sold, and they can be specified in terms of hierarchies. For example, you might want to determine the sum of dollars for a given product in a given market over a given time period.

Each measure has its own aggregation rule such as SUM, AVG, MIN, or MAX. A business might want to compare values of a measure and need a calculation to express the comparison.

Dimension Tables

A business uses facts to measure performance by well-established dimensions, for example, by time, product, and market. Every dimension has a set of descriptive attributes. Dimension tables contain attributes that describe business entities (like Customer Name, Region, Address, or Country).

Dimension table attributes provide context to numeric data, such as being able to categorize Service Requests. Attributes stored in this dimension might include Service Request Owner, Area, Account, or Priority.

Dimension tables in the data model are conformed. In other words, even if there are three different source instances of a particular Customer table, the data model only has one table. To achieve this, all three source instances of Customer are combined into one using database views.

Joins

Joins indicate relationships between fact tables and dimension tables in the data model. When you create joins, you specify the fact table, dimension table, fact column, and dimension column you want to join.

Joins allow queries to return rows where there is at least one match in both tables.

Tip: Analysts can use the option Include Null Values when building reports to return rows from one table where there’re no matching rows in another table.

Hierarchies

Hierarchies are sets of top-down relationships between dimension table attributes.

In hierarchies, levels roll up from lower levels to higher levels. For example, months can roll up into a year. These rollups occur over the hierarchy elements and span natural business relationships.

About Modeling Source Objects with Star Relationships

Star sources consist of one or more fact tables that reference any number of dimension tables. Because Data Modeler presents data in a star structure, working with star sources is the simplest modeling scenario. In star sources, dimensions are normalized with each dimension represented by a single table.

For example, assume that you have separate sources for Revenue Measures, Products, Customers, and Orders. In this scenario, you load data from each source to separate database tables. Then, you use Data Modeler to create a fact table (Revenue Measures) and dimension tables (Products, Customers, and Orders). Finally, you create joins between the dimension tables and the fact table.

When you create your fact and dimension tables, you can drag and drop the source objects into the data model, or you can use menu options to create the fact and dimension tables individually.

See Roadmap for Modeling Data for a full list of data modeling tasks.

About Modeling Source Objects with Snowflake Relationships

Snowflake sources are similar to star sources. In a snowflake structure, however, dimensions are normalized into multiple related tables rather than in single dimension tables.

For example, assume that you have separate sources for Revenue Measures, Products, Customers, and Orders. In addition, you have separate sources for Brands (joined to Products) and Customer Group (joined to Customers). The Brands and Customer Group tables are considered to be "snowflaked" off the core dimension tables Customers and Products.

In this scenario, you load data from each source to separate database tables. Next, you create database views that combine the multiple dimension tables into a single table. In this example, you create one view that combines Products and Brand, and another view that combines Customer and Customer Group.

Then, you use Data Modeler to create a fact table (Revenue Measures) and dimension tables (Products + Brand view, Customers + Customer Group view, and Orders). Finally, you create joins between the dimension tables and the fact table.

See Roadmap for Modeling Data for a full list of data modeling tasks.

About Modeling Denormalized Sources

Denormalized sources combine facts and dimensions as columns in one table (or flat file). With a denormalized flat source, one data file is loaded into one table. The data file consists of dimension attributes and measure columns.

In some cases, the data model might consist of a hybrid model that involves a combination of star, snowflake, and denormalized sources. For example, a denormalized source might include information about revenue measures, products, customers, and orders - but all in a single file rather than in separate source files.

In this scenario, you first load the denormalized file as a single database table. Then, you use the Add to Model wizard to partition columns into multiple fact and dimension tables. In this example, you drag and drop revenue measure columns to create a fact table, then drag and drop columns for products, customers, and orders to create three separate dimension tables. Finally, you create joins between the dimension tables and the fact table.

See Roadmap for Modeling Data for a full list of data modeling tasks.

About Modeling Normalized Sources

Normalized or transactional sources distribute data into multiple tables to minimize data storage redundancy and optimize data updates. In a normalized source, you have multiple data files that correspond to each of the transactional tables. Data from Oracle Cloud applications is likely partitioned into a normalized source.

Similar to snowflake sources, modeling normalized sources involves creating database views to combine columns from multiple source tables into individual fact and dimension tables. Some normalized sources are very complex, requiring a number of database views to organize the data into a star-type model.

For example, assume that you have source files for Products, Customers, Orders, and Order Items. Orders and Order Items both contain facts.

In this scenario, you first load the files as separate database tables. Next, you create a database view that combines the multiple fact columns into a single table. In this example, you create a view that combines columns from Orders and Order Items.

Then, you use Data Modeler to create a fact table (Orders + Order Items view) and dimension tables (Products and Customers). Finally, you create joins between the dimension tables and the fact table.

See Roadmap for Modeling Data for a full list of data modeling tasks.