You can explore your imported data to evaluate its quality and usability. You can then prepare your data to improve its quality before you import it into the application. For example, you can remove white space, filter your data, delete duplicate rows or cells, find and replace data, concatenate columns, and so on. You can refine the data from different sources to standardize the format for all the data. You can also use data wrangling as a data discovery tool and to bypass traditional extraction, transformation, and loading (ETL) methods.
On the Wrangle Your Data page, you work with sample data (a subset of your actual data) to build a script of functions that will clean and prepare your data, also known as data wrangling.
To build the script:
Select data. This can be a data column or part of a cell.
Select a function to apply to the selected data. For example, you can split the selection into multiple columns or change the data type.
The function and data are added to the script.
Repeat the steps above until you have refined and improved your data.
You can view all the changes that you made to your data, by scrolling through the script. To undo a function and all the functions applied after it, click the function in the History Script area. Similarly, you can redo functions by clicking a function in the History Script area. All undone functions applied before the selected function are re-applied.
Save the completed script.
The script is saved with the dataset and therefore in the dashboard. The completed script is applied to your data when you publish the cube. You can also export the script, to save the functions in order. You can then import the saved script to apply to another set of data. Steps for both are included in the procedure.
You can:
Delete a column or row.
Remove duplicate cells or rows.
Extract data to create a new column, by specifying:
A fixed length of characters to extract
The character that indicates where to begin extracting data
The character that indicates where to stop extracting data
Filter data, so that you can focus on a refining smaller subsets of data. Rows are not deleted when they are filtered; they are only hidden from view.
Find and replace a cell or characters in cells.
Split the selected data (including the header) into separate columns or cells by specifying:
A fixed length of characters to split
The character to split the selection at
The character that indicates where to begin the split
The character that indicates where to end the split
Split the selected cell into separate rows by specifying the character to split the selection at.
Change the selection to title case (each word begins with an uppercase letter), uppercase, or lowercase.
Replace the codes in an HTML-encoded selection with the correct characters, so that the text becomes more readable. For example, you can refine encoded text using the Unescape HTML function.
Change the data type.
Copy selected data into blank space (Fill Down).
Blank out cells that contain repeated data (Blank Down).
Add characters before or after the text in cells (Pad). For example, if your Region column contains “South” and “North”, you can add “ern” using the Pad After function. The column then contains “Southern” and “Northern”.
Trim leading and trailing white space.
Remove blocks of white space.
Remove specific characters, or all text before or after specified characters. For example, you can remove all the dashes from the Phone Number column, so that the column contains only numbers. If you do not want your Revenue column to display cents (such as $12.43), you can remove all the characters after the period, then remove the period (resulting in $12).
Group similar data into a cluster and edit the entire cluster at once.
Duplicate the selection into another column.
Rename the selection.
Concatenate or combine two columns into a new column.
Swap columns and rows (Transpose).
If you republish your data, the data wrangling steps are applied to the new data. For example, if you changed a text column to all uppercase, the same text column will be displayed as uppercase in the republished data. For background information on republish and refreshing your data, see Republishing imported data.
You can refine data imported from your computer or network, or uploaded from a URL
This procedure assumes that you have already begun importing data, and are previewing your data. The Preview page is displayed after you select a data source to import from, configure the data source, and click Prepare Data. For a list of the data sources that you can import from, and links to steps to import data, see About importing data.
To refine your data before importing it:
On the Preview page, click Wrangle. The Wrangle Your Data page opens, displaying a sample of your dataset.
You can import a saved script of functions to apply to your data. To import a script, follow the steps below:
Click Import in the History Script area. The Import History Script dialog box opens.
Do one of the following:
To import a saved script, click Browse.
Copy a script into the interface.
Click Apply. The functions are applied to your data sample and you are returned to the Wrangle Your Data page.
If you do not want to make any additional changes to the data, click OK to apply the script, as displayed in the History Script area, to your actual data. You are returned to the Preview page.
Select a data column or part of a cell to apply a function to. A list of suggested functions is displayed, based on the selected data.
Select a function to apply to the selected data, from either the suggested functions or the Select Function drop-down list.
For a function that needs no further input, like facet, the function is automatically applied to your data.
If the function needs parameters, define the parameters, then click Apply to apply the function to your data.
If you change a column’s data type to date, type the date format of your data into the Input Date Format field, using the characters in Valid date and time forms for wrangling your data. For example, if your data displays 01022014 for January 2, 2014, type the date format MMddyyyy in the field. After the data is converted to a date, the dates are displayed in yyyy-MM-dd format. That is, January 2, 2014 displays as 2014-01-02.
The sample data is updated to preview the effect of the function.
To work with a smaller subset of data, filter the data using the following steps:
Select the data to work with.
Select one of the following from the Select Function drop-down list:
To filter text by selecting values from the data, select Text Facet. A list of the text values in the selected data is displayed. To filter the selection, click the text value to display in the selection. You can instead display all values except the selected value, by clicking Exclude. This option can be helpful when you do not know the exact values in your data.
To filter text by typing values, select Text Filter. Type the text to filter by.
To filter numeric data, select Numeric Facet. A graph is displayed of the numbers in the selected data. Move the start point and end point of the graph to filter the range of values to display in the selection.
To filter data consisting of dates, select Timeline Facet. A graph is displayed of the dates in the selected data. Move the start point and end point of the graph to filter the range of dates to display in the selection.
Rows are not deleted when you use a facet function; they are only hidden from the data preview.
You can apply functions to the filtered data, as described above.
To undo a specific filter, click the arrow icon in the title of the filter, and select Reset.
To undo all filters, click the arrow icon of any filter title, and select Reset ALL Facets.
To delete a specific filter, click the arrow icon in the title of the filter, and select Delete.
To delete all filters, click the arrow icon of any filter title, and select Delete ALL Facets.
To group similar data into a cluster and edit the entire cluster at once, complete the following steps:
Select the data to work with.
From the Select Function drop-down list, select Cluster and Edit. The Cluster and Edit dialog box opens. The selected data has been grouped into clusters of related data.
To change the algorithm that determines the clusters, select a different algorithm from the Algorithm drop-down list.
To replace all the cluster’s values with a new value, complete the following steps:
Select the check box of the cluster.
In the New Cell Value field, type the replacement value.
Repeat the step above for each cluster to update.
Do one of the following:
To replace the values of the selected clusters and recluster the data, click Merge and Recluster. Repeat the steps above as needed.
To replace the values of the selected clusters and continue refining data, click Merge and Close. You are returned to the Refine Your Data page.
To edit data directly in a cell, complete the following steps:
Hover your cursor over the cell to edit and click the Edit icon
Make your changes in the dialog box that opens.
You can choose to change the selected cell or the entire column. For example, if you select a cell that contains Books and change it to Book, you can choose to change only that cell. If you apply the change to all the cells, all the Books cells in the column are changed to Book. Select one of the following:
To make the changes to the selected cell only, click Apply.
To make the changes to all cells in the column of the selected cell, click Apply to All.
To undo a function, click the function in the History Script area. All functions applied after the selected function are also undone.
To redo a function, click the function in the History Script area. All undone functions applied before the selected function are also re-applied.
To undo all the actions in the script, click Reset to Initial Data in the History Script area. Your data is returned to its original state. The history script is not deleted until you select another function, which causes a new script to begin.
You can export the script, to save the functions in order. You can then import the saved script to apply to another set of data. To export the script, complete the following steps:
Click Export in the History Script area. The Extract History Script dialog box opens, displaying the full script.
Click Save to save the script into a file, to save the functions for later use.
Click Close to return to the Wrangle Your Data page.
When you have completed refining and cleaning your data, click OK to apply the script, as displayed in the History Script area, to your actual data. You are returned to the Preview page.
Previewing your data and specifying data import options
Best practices: Importing data
_____________________________
Copyright © 2019, Oracle and/or its affiliates. All rights reserved.
Legal Notices