Performance Considerations for Frequent Data Refresh
Review and consider the following to ensure that frequent data refreshes work as expected.
Performance of frequent data refreshes depends on the following:
- Size of data.
- Nature of the data change, such as which data has changed and which pipeline it triggers.
- Number of extracted records, which can result in a very different number of published records. For example, in one scenario, 44 extracted records resulted in 1060 published records in 70 minutes, whereas 395 extracted records resulted in 55 published records in 35 minutes.
The frequent data refresh process isn't executed in the following scenarios:
- In the 180-minute window before the scheduled start of the daily incremental data refresh.
- If any release upgrade is in progress.
- While the previous frequent data refresh process is still running. The highest frequency you can set is once every hour. If a refresh takes more than 1 hour to complete, the next frequent data refresh process starts at the next hour.
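The three blocking scenarios above can be sketched as a simple pre-check. This is a minimal illustration, not the product's scheduler: the function names, parameters, and boundary handling (for example, whether the start of the 180-minute window is inclusive, and that hourly slots are counted from the first scheduled run) are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Window before the daily incremental refresh in which no frequent refresh runs.
BLACKOUT = timedelta(minutes=180)

def can_start(now, next_daily_start, upgrade_in_progress, previous_running):
    """Return True only if none of the three blocking scenarios applies.

    All parameter names are hypothetical. The window is assumed to include
    its start and exclude the daily refresh start itself.
    """
    in_blackout = next_daily_start - BLACKOUT <= now < next_daily_start
    return not (in_blackout or upgrade_in_progress or previous_running)

def next_slot(first_slot, finished_at, interval=timedelta(hours=1)):
    """Return the first hourly slot at or after the previous run finished."""
    if finished_at <= first_slot:
        return first_slot
    elapsed = finished_at - first_slot
    slots = -(-elapsed // interval)  # ceiling division on timedeltas
    return first_slot + slots * interval

if __name__ == "__main__":
    daily = datetime(2024, 1, 2, 2, 0)
    print(can_start(datetime(2024, 1, 1, 23, 30), daily, False, False))
    # A run that started at 01:00 and took 75 minutes finishes at 02:15.
    print(next_slot(datetime(2024, 1, 1, 1, 0), datetime(2024, 1, 1, 2, 15)))
```

In this sketch, a refresh that overruns its hour skips the slot it is still occupying and resumes at the following one, which mirrors the behavior described in the last bullet.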
Follow these guidelines when you configure frequent data refresh for warehouse tables, that is, the tables in datasets:
- Ensure that you know the exact names of the datasets to refresh.
- Specify no more than 20 datasets for each run.
- Determine any dependencies and include the applicable tables because dependencies aren't automatically included.
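Because dependencies aren't included automatically, it can help to expand a planned run and check it against the 20-dataset limit before configuring it. The following is a rough sketch of such a pre-flight check. The `plan_run` helper, the dependency mapping, and the dataset names are all hypothetical; the mapping would have to be maintained by hand from your own knowledge of the tables.

```python
MAX_DATASETS_PER_RUN = 20  # limit stated in the guidelines above

def plan_run(requested, dependencies):
    """Return the requested datasets plus their transitive dependencies.

    `dependencies` is a hypothetical, hand-maintained mapping of
    dataset name -> list of dataset names it depends on.
    Raises ValueError if the expanded list exceeds the per-run limit.
    """
    planned, seen = [], set()
    stack = list(reversed(requested))
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already planned; also guards against cycles
        seen.add(name)
        planned.append(name)
        stack.extend(reversed(dependencies.get(name, [])))
    if len(planned) > MAX_DATASETS_PER_RUN:
        raise ValueError(
            f"{len(planned)} datasets planned; limit is {MAX_DATASETS_PER_RUN} per run"
        )
    return planned

if __name__ == "__main__":
    # Made-up dataset names, for illustration only.
    deps = {"ORDERS_FACT": ["CUSTOMER_DIM"], "CUSTOMER_DIM": ["REGION_DIM"]}
    print(plan_run(["ORDERS_FACT", "INVOICE_FACT"], deps))
```

If the expanded list exceeds the limit, one option is to split it across several runs while keeping each dataset together with the tables it depends on.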