Data Warehouse
A data warehouse is a database used primarily for reporting and analysis. It is a central repository for current and historical data from various operational applications. Data from transactional or operational applications is extracted, transformed, and loaded into the data warehouse, typically into a star schema.
The components of a typical data warehouse environment are:
Source Application: The operational application from which data is transferred to the data warehouse.
ETL/ELT: The Extract, Transform and Load (ETL) tools move the data from the source application to the target data warehouse.
Staging Area: The staging area is a database that stores the raw data extracted from the source application.
Data Presentation Area: The part of the data warehouse where data is stored for reporting, in a structure such as the star schema.
Source Application
The day-to-day transactions of the system are captured in the source application. Data in the source application is stored in a way that allows fast reads and updates. Queries are not expected to run in bulk; they typically touch individual rows. To support this access pattern, data is stored in normalized form in the source application. Generally, the source system does not store historical data.
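The access pattern described above can be sketched with a small in-memory SQLite database. This is only an illustration: the customers and orders tables, their columns, and the mark_shipped helper are hypothetical names chosen for the example, not part of any particular source application.

```python
import sqlite3


def build_source_db():
    """Create a tiny normalized source schema (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT,
            email       TEXT
        );
        -- Customer details live in one place; orders only reference them.
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(customer_id),
            status      TEXT
        );
        """
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Ann', 'ann@example.com')")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(100, 1, "NEW"), (101, 1, "NEW")],
    )
    conn.commit()
    return conn


def mark_shipped(conn, order_id):
    """Update a single order, located through its primary key."""
    cur = conn.execute(
        "UPDATE orders SET status = 'SHIPPED' WHERE order_id = ?",
        (order_id,),
    )
    conn.commit()
    return cur.rowcount  # number of rows touched
```

Note that the update overwrites the previous status in place. Nothing records that the order was ever in the NEW state, which is one reason historical data has to be kept in the warehouse instead.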
ETL/ELT
Extract, Transform, and Load (ETL) is the process of identifying the data to be transferred from the source, extracting it, transforming it, and loading it into the data warehouse. In the ELT variant, the raw data is loaded first and transformed inside the target database. Many ETL tools are available in the industry. Depending on the complexity of the ETL, you can either choose one of these tools or write your own ETL programs.
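For a simple case, a hand-written ETL program can be as small as three functions. The following is a minimal sketch, assuming a hypothetical source table named orders and a hypothetical target table named warehouse_orders; real ETL tools add scheduling, error handling, and incremental loads that are not shown here.

```python
import sqlite3


def extract(source_conn):
    """Extract: read raw rows from the (hypothetical) source table."""
    return source_conn.execute(
        "SELECT order_id, customer, amount_cents, order_date FROM orders"
    ).fetchall()


def transform(rows):
    """Transform: tidy customer names and convert cents to currency units."""
    return [
        (order_id, customer.strip().title(), amount_cents / 100, order_date)
        for order_id, customer, amount_cents, order_date in rows
    ]


def load(warehouse_conn, rows):
    """Load: write the transformed rows into the (hypothetical) target table."""
    warehouse_conn.execute(
        "CREATE TABLE IF NOT EXISTS warehouse_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount REAL, order_date TEXT)"
    )
    # INSERT OR REPLACE keeps the load repeatable for the same order_id.
    warehouse_conn.executemany(
        "INSERT OR REPLACE INTO warehouse_orders VALUES (?, ?, ?, ?)", rows
    )
    warehouse_conn.commit()


def run_etl(source_conn, warehouse_conn):
    """Run the three steps in order and return the number of rows loaded."""
    rows = transform(extract(source_conn))
    load(warehouse_conn, rows)
    return len(rows)
```

Keeping the three steps as separate functions makes it easy to swap one of them out, for example to replace the transform step when the cleansing rules change.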
Staging Area
The data warehouse stores data from various operational sources. The staging area is a database that stores the raw data extracted from the source application. The staging area is similar in structure to the source application. This area can be used for cleansing the data, resolving domain conflicts, merging or combining data from multiple source applications, removing duplicate data, and so on.
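The cleansing tasks listed above can be illustrated with a short sketch. It assumes two hypothetical source applications that both hold customer records and encode the country differently (a domain conflict). The COUNTRY_CODES mapping, the record fields, and the rule that the first occurrence of an email wins are all assumptions made for the example.

```python
# Hypothetical mapping that resolves a domain conflict: the sources
# spell the same country in different ways.
COUNTRY_CODES = {
    "usa": "US",
    "united states": "US",
    "us": "US",
    "uk": "GB",
    "gb": "GB",
}


def cleanse(record):
    """Return a cleaned copy of one staged customer record."""
    country = record["country"].strip().lower()
    return {
        "email": record["email"].strip().lower(),
        "name": record["name"].strip(),
        # Unknown values are passed through in upper case.
        "country": COUNTRY_CODES.get(country, country.upper()),
    }


def merge_sources(*sources):
    """Combine records from several sources, keeping one row per email."""
    merged = {}
    for source in sources:
        for record in source:
            clean = cleanse(record)
            # Deduplicate: the first occurrence of an email wins.
            merged.setdefault(clean["email"], clean)
    return list(merged.values())
```

A call such as merge_sources(crm_rows, billing_rows) returns a single cleaned list in which a customer present in both sources appears only once.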
Data Presentation Area
Once the data is cleansed in the staging area, it is ready to be moved to the presentation area. There, the data is organized so that it is readily available for reporting and analysis. The data is stored in a dimensional model, typically a star schema.
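To make the star schema concrete, the sketch below builds a very small one in an in-memory SQLite database: a central fact table surrounded by dimension tables that it references by key. The table names (fact_sales, dim_date, dim_product), the columns, and the sample rows are hypothetical and serve only as an illustration.

```python
import sqlite3

# One fact table (measures) that references two dimension tables (context).
SCHEMA = """
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    year      INTEGER,
    month     INTEGER
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    revenue     REAL
);
"""


def build_star_schema():
    """Create the schema and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
        [(20240105, "2024-01-05", 2024, 1), (20240210, "2024-02-10", 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"), (3, "Editor", "Software")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(20240105, 1, 2, 2000.0), (20240105, 2, 5, 100.0), (20240210, 3, 10, 500.0)],
    )
    conn.commit()
    return conn


def revenue_by_category(conn):
    """Typical reporting query: join the fact table to one dimension."""
    return conn.execute(
        "SELECT p.category, SUM(f.revenue) "
        "FROM fact_sales AS f "
        "JOIN dim_product AS p ON p.product_key = f.product_key "
        "GROUP BY p.category ORDER BY p.category"
    ).fetchall()
```

With the sample rows above, revenue_by_category(build_star_schema()) returns one total per product category. The same fact table can be grouped by year or month by joining dim_date instead.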