Modifying Uploaded Data Sources

You can modify uploaded data sets to help you further curate data in projects. This process is sometimes called “data wrangling”.

You can add new columns, edit columns, delete columns, and hide and show columns for a data set. The column editing options depend on the column data type (date, string, or numeric). These options do the work for you by invoking a logical SQL function that edits the current column or creates a new one in the selected data set.

For example, you can select the Convert to Text option for the Population column (number data type). The option takes the formula of the Population column, wraps it in a logical SQL function that converts the data to text, and adds the converted text column to the data set. The original Population column is not altered.
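
As a rough illustration of that behavior (not the product's actual implementation, which uses a logical SQL function), the following Python sketch models a data set as a list of rows and adds a text version of a numeric column while leaving the original column untouched. The function name `convert_to_text`, the new column name, and the sample values are hypothetical.

```python
def convert_to_text(rows, column, new_column):
    """Return copies of the rows with `new_column` holding the text form of `column`.

    The input rows are not mutated, mirroring the fact that the
    original column is left unaltered.
    """
    return [{**row, new_column: str(row[column])} for row in rows]


# Made-up sample data for illustration only.
rows = [
    {"State": "A", "Population": 1000},
    {"State": "B", "Population": 2500},
]

converted = convert_to_text(rows, "Population", "Population (Text)")
```

Each row in `converted` carries both the numeric Population value and its text counterpart, while `rows` itself is unchanged.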

Modifying data sets can be very helpful in cases where you may not have been able to perform joins between data sources because of “dirty data”. You can create a column group or build your own logical SQL statement to create a new column that essentially enables you to scrub the data.
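
To make the “dirty data” problem concrete, here is a minimal Python sketch, assuming the mismatch is caused by stray whitespace and inconsistent letter case in the join column. The helper names `scrub_key` and `join_on_scrubbed_key` and the sample rows are hypothetical; in the product you would achieve the same effect with a column option or your own logical SQL expression.

```python
def scrub_key(value):
    """Normalize a join key: trim, collapse repeated spaces, and uppercase."""
    return " ".join(value.split()).upper()


def join_on_scrubbed_key(left, right, key):
    """Inner-join two lists of rows on the scrubbed form of `key`."""
    index = {scrub_key(row[key]): row for row in right}
    joined = []
    for row in left:
        match = index.get(scrub_key(row[key]))
        if match is not None:
            # Values from the left row win when both sides share a column name.
            joined.append({**match, **row})
    return joined


# Made-up sample data: the State values differ only in spacing and case.
sales = [{"State": " new  york ", "Sales": 10}]
regions = [{"State": "New York", "Region": "East"}]

# Comparing the raw values finds no match at all.
raw_matches = [s for s in sales for r in regions if s["State"] == r["State"]]

# Joining on the scrubbed key finds the expected match.
joined = join_on_scrubbed_key(sales, regions, "State")
```

The raw comparison finds no matching rows, whereas the scrubbed join pairs the sales row with its region.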

To modify uploaded data sets:
  1. On the project toolbar, click Stage.
  2. If the project contains more than one uploaded data set, select the one you want to work with. Only the first 100 records in the selected data set are displayed.
  3. Click Options for the column you want to work with, and then select an option.
    • Concatenate takes two columns and concatenates them to create a new column.
    • Edit Column edits the current column and can be used to reformat a source column without creating a second column and hiding the original column.
    • Hide hides the column in the Data Elements pane and in visualizations on the canvas. If you want to see hidden columns, click Hidden columns (ghost icon) on the page header. You can then unhide individual columns or unhide them all at once.
    • Group enables you to create your own custom groups. For example, you can group the values in the State column into custom Regions, or you can categorize dollar amounts into groups indicating small, medium, and large.
    • Replace enables you to replace part of the text in a column's values with a string you enter, and creates a new column with the result.
    • Split enables you to split a specific column value into parts. For example, you could split a column called Name into first and last name.
    • Uppercase creates a new column with the values in all capital letters, and Lowercase creates a new column with the values in all lowercase letters.
    Data wrangling doesn't modify the original columns in the data set. Instead, duplicate columns are created.
  4. Save your changes.
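
The column options listed in step 3 can be pictured with a short Python sketch. This is only an analogy under stated assumptions: the product performs these operations through logical SQL functions, whereas the sketch below uses plain Python, and the helper names (`add_column`, `size_group`), the new column names, the group thresholds, and the sample rows are all hypothetical.

```python
def add_column(rows, name, fn):
    """Return copies of the rows with one extra column; input rows stay unchanged."""
    return [{**row, name: fn(row)} for row in rows]


def size_group(amount):
    """Bucket a dollar amount using hypothetical thresholds."""
    if amount < 100:
        return "Small"
    if amount < 1000:
        return "Medium"
    return "Large"


# Made-up sample data for illustration only.
rows = [
    {"Name": "Ada Lovelace", "State": "ny", "Amount": 50},
    {"Name": "Alan Turing", "State": "ca", "Amount": 2500},
]

# Concatenate: combine two columns into a new one.
derived = add_column(rows, "Name and State", lambda r: r["Name"] + ", " + r["State"])
# Split: break Name into first and last name.
derived = add_column(derived, "First Name", lambda r: r["Name"].split(" ", 1)[0])
derived = add_column(derived, "Last Name", lambda r: r["Name"].split(" ", 1)[1])
# Uppercase: new column with the values in capital letters.
derived = add_column(derived, "State (Upper)", lambda r: r["State"].upper())
# Replace: substitute part of each value and store the result in a new column.
derived = add_column(derived, "Name (Replaced)", lambda r: r["Name"].replace("Ada", "A."))
# Group: categorize amounts as small, medium, or large.
derived = add_column(derived, "Amount Group", lambda r: size_group(r["Amount"]))
```

Every call returns new rows with an added column, so the source columns in `rows` keep their original values.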

Note:

When you edit a data set in this way, the change affects all projects that use the data set. For example, if another user has a project that uses the data set you modified, and they open the project after you change the data set, they see a message in their project indicating that the data set has been modified.