Types of Data You Can Refresh

After you add data, the data might change at its source, so you need to refresh the data set to pick up those changes.

Rather than refreshing a data set, you can replace it by loading a new data set with the same name as the existing one. However, replacing a data set can be destructive and is discouraged. Don’t replace a data set unless you understand the consequences:

  • Replacing a data set breaks projects that use the existing data set if the old column names and data types aren’t all present in the new data set.
  • Any data wrangling (modified and new columns added in the data stage) is lost and projects using the data set are likely to break.
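The first risk above can be checked before you replace a data set. The following is a minimal sketch, not part of the product: it assumes you can describe each data set as a mapping of column names to data type names, and the function name and sample schemas are hypothetical.

```python
def find_replacement_problems(old_schema, new_schema):
    """List the reasons a replacement data set could break existing projects.

    Both arguments are hypothetical dicts mapping column name -> data type name.
    A problem is reported when an old column is absent from the new data set
    or is present with a different data type.
    """
    problems = []
    for name, dtype in old_schema.items():
        if name not in new_schema:
            problems.append(f"missing column: {name}")
        elif new_schema[name] != dtype:
            problems.append(
                f"type changed for {name}: {dtype} -> {new_schema[name]}"
            )
    return problems


# Illustrative schemas only; these are not taken from any real data set.
old = {"order_id": "number", "region": "text", "revenue": "number"}
new = {"order_id": "number", "revenue": "text", "discount": "number"}

for problem in find_replacement_problems(old, new):
    print(problem)
```

An empty result only means that the old column names and data types are all present. It says nothing about data wrangling, which is lost on replacement regardless.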

Databases

For databases, the SQL statement is rerun and the data is refreshed.

CSV or TXT

To refresh a CSV or TXT file, you must ensure that it contains the same columns that are already matched with the data source. If the file that you reload is missing some columns, then you’ll see an error message that your data reload has failed due to one or more missing columns.

You can refresh a CSV or TXT file that contains new columns, but after refreshing, the new columns are marked as hidden and don’t display in the Data Panel for existing projects using the data set.
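The column rules for file refreshes can be checked ahead of time. The sketch below is an illustration under stated assumptions, not the product's own validation: it assumes the file has a header row with comma-separated column names, and the function name and sample data are hypothetical.

```python
import csv
import io


def compare_csv_columns(existing_columns, csv_text):
    """Compare a candidate refresh file's header with the existing columns.

    Returns a (missing, new) pair of lists:
      missing - existing columns absent from the file (the refresh would fail)
      new     - columns only in the file (they would be added as hidden)
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    missing = [col for col in existing_columns if col not in header]
    new = [col for col in header if col not in existing_columns]
    return missing, new


# Hypothetical column names and file contents, for illustration only.
existing = ["order_id", "region", "revenue"]
file_with_new_column = "order_id,region,revenue,discount\n1,EMEA,100,5\n"
file_missing_column = "order_id,revenue\n1,100\n"

print(compare_csv_columns(existing, file_with_new_column))
print(compare_csv_columns(existing, file_missing_column))
```

A non-empty `missing` list corresponds to the reload error described above, and a non-empty `new` list corresponds to columns that are hidden after the refresh.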

Excel

To refresh a Microsoft Excel file, you must ensure that the newer spreadsheet file contains a sheet with the same name as the original one. In addition, the sheet must contain the same columns that are already matched with the data source. If the Excel file that you reload is missing some columns, then you’ll see an error message that your data reload has failed due to one or more missing columns.

You can refresh an Excel file that contains new columns, but after refreshing, the new columns are marked as hidden and don’t display in the Data Panel for existing projects using the data set. To resolve this issue, use the Inspect option of the data set to show the new columns and make them available to existing projects.

Oracle Applications

You can reload data and metadata for Oracle Applications data sources. However, if the Oracle Applications data source uses logical SQL, reloading data only reruns the statement, and any new columns or refreshed data won’t be pulled into the project. Any new columns come into projects as hidden so that existing projects that use the data set aren’t affected. To use the new columns in projects, you must unhide them in the data set after you refresh. File-based data sources behave the same way.