You can modify uploaded data sets to help you further curate (organize and integrate from various sources) data in projects. This is also sometimes referred to as data wrangling.
You can create new columns, edit columns, and hide and show columns for a data set. The column editing options depend on the column data type (date, string, or numeric). Selecting an option invokes a logical SQL function that edits the current column or creates a new one in the selected data set.
For example, you can select the Convert to Text option for the Population column (number data type). The option takes the formula of the Population column, wraps it in a logical SQL function that converts the data to text, and adds the newly converted text column to the data set. The original Population column isn’t altered.
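The non-destructive behavior described above can be sketched outside the product. The following Python snippet is only an illustration under stated assumptions: the in-memory row list, the sample values, the `add_text_column` helper, and the `Population (Text)` column name are all hypothetical and are not part of the product or of logical SQL.

```python
# Hypothetical in-memory data set: a list of row dictionaries with made-up values.
rows = [
    {"City": "Sample City A", "Population": 120000},
    {"City": "Sample City B", "Population": 45000},
]


def add_text_column(rows, source, target):
    """Return new rows in which `target` holds the text form of `source`.

    The input rows are not changed, mirroring the way the original
    column is left unaltered when a converted column is added.
    """
    return [{**row, target: str(row[source])} for row in rows]


# Hypothetical name for the newly added column.
converted = add_text_column(rows, "Population", "Population (Text)")
```

Each converted row carries both the numeric Population value and its text counterpart, while the original `rows` list has no new column.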
Modifying data sets can be very helpful when dirty data has prevented you from performing joins between data sources. You can create a column group or build your own logical SQL statement to create a new column that essentially scrubs the data (amends or removes data in the database that isn’t correct in some way).
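As a rough illustration of why a scrubbed column helps with joins, the sketch below normalizes a join key in a new column and then matches rows on it. Everything here is assumed for the example: the `orders` and `regions` rows, their values, the `scrub_key` helper, and the `Region (Clean)` column name are hypothetical, and the product performs the equivalent work with logical SQL rather than Python.

```python
def scrub_key(value):
    """Trim surrounding whitespace and normalize case so equivalent keys match."""
    return value.strip().upper()


# Hypothetical sources whose Region values don't match exactly (dirty data).
orders = [
    {"Region": " east ", "Sales": 100},
    {"Region": "West", "Sales": 250},
]
regions = [
    {"Region": "EAST", "Manager": "Manager A"},
    {"Region": "WEST", "Manager": "Manager B"},
]

# Add a scrubbed column next to the original instead of overwriting it.
orders_clean = [{**row, "Region (Clean)": scrub_key(row["Region"])} for row in orders]

# Join on the scrubbed key.
lookup = {scrub_key(r["Region"]): r for r in regions}
joined = [
    {**row, "Manager": lookup[row["Region (Clean)"]]["Manager"]}
    for row in orders_clean
    if row["Region (Clean)"] in lookup
]
```

A join on the raw Region values would match nothing in this sample, whereas the scrubbed column matches both rows.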
For a date or time column, create a year, quarter, month, or day column.
For an attribute column, convert a column to a number or convert it to a date. You can concatenate or replace the column. You can group or split the column. You can apply upper case, lower case, or sentence case to the data items in the column.
For a measure column, apply operators such as power, square root, or exponential.
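The three kinds of options listed above can be approximated in a few lines of Python. This is a minimal sketch for intuition only: the helper names `date_parts`, `sentence_case`, and `measure_ops` are hypothetical, and the product's own logical SQL functions may behave differently in edge cases (for example, how sentence case treats punctuation).

```python
import math
from datetime import date


def date_parts(d):
    """Derive year, quarter, month, and day values from a date."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "day": d.day,
    }


def sentence_case(text):
    """Upper-case the first character and lower-case the rest."""
    return text[:1].upper() + text[1:].lower()


def measure_ops(x):
    """Apply a few numeric operators to a measure value."""
    return {"power_2": x ** 2, "sqrt": math.sqrt(x), "exp": math.exp(x)}


parts = date_parts(date(2024, 8, 15))
label = sentence_case("hELLO world")
results = measure_ops(4.0)
```

Each helper returns new values rather than changing its input, in keeping with the way the column options add derived columns.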
- In the Project Editor, click the Prepare canvas.
- If there is more than one uploaded data set in the project, then go to the tabs at the bottom of the window and select the data set that you want to work with. The first 100 records in the selected data set are displayed.
- Click Options for the column that you want to work with, and then select an option to modify or convert the column. The list of options and the modifications you can perform depend on the type of column you’re working with. Data wrangling doesn't modify the original columns in the data set. Instead, it creates duplicate columns.
- Click Save.
Note: When you edit a data set in this way, it affects all projects that use the data set. For example, if another user has a project that uses the data set that you modified, and they open the project after you change the data set, they see a message in their project that indicates that the data set has been modified.