Before you Begin

This 10-minute tutorial shows you how to use data quality insights to implement changes in datasets using Oracle Analytics.

Background

When you create a dataset from a file or connection, Oracle Analytics profiles the data and provides data quality insights for a representational sample of the data. In this tutorial, you create a dataset from two spreadsheet files and use a common column to create a join between the dataset tables.

Data quality insights provide histograms and frequency tiles as visual overviews of your data columns. Quality bars indicate the validity of the data in the columns, and can help you resolve hidden issues in your data.

In this tutorial, you learn how to fix hidden issues such as spelling errors, missing values, and non-standard values in your data.

What Do You Need?

Create a Dataset Table

In this section, you create a dataset from the web_orders_data and web_customers spreadsheet files.

  1. Sign in to Oracle Analytics.
  2. On the Home page, click Create, and then click Dataset.
  3. In Create Dataset, click Drop data file here or click to browse, select the web_orders_data.xlsx file, and then click Open.
  4. In Create Dataset Table from web_orders_data.xlsx, click OK.


    Description of web_orders_data.png follows
    Description of the illustration web_orders_data.png
  5. In the Connections Connections icon panel, click Add Add icon, and then click Add File.
  6. In File Upload, select web_customers.xlsx, and then click Open.
  7. In Create Dataset Table from web_customers.xlsx, click OK.
  8. In the Join Diagram, drag the web_customers table on top of the web_orders_data table.
  9. In Join web_orders_data - web_customers under the web_orders_data column, click the select a column list icon, select CUSTID.
  10. In Join web_orders_data - web_customers under the web_customers column, click the select a column list icon, select Custid, and then click outside of the dialog.
  11. Click Save Save icon. In Save Dataset as, enter web_orders in Name, and then click OK.


    Description of web_orders_dataset.png follows
    Description of the illustration web_orders_dataset.png

Update the Treat As Property in Columns

In this section, you change the column's treat as property from handling the numeric values as countable numbers (measures) to attributes.

  1. Click the web_orders_data tab, select the Order Line ID column, click Measure measure icon, and then select Attribute.
  2. Select the Order ID column, click Measure measure icon, and then select Attribute.


    Description of treat_as_attribute.png follows
    Description of the illustration treat_as_attribute.png
  3. Click Save Save icon.

Standardize Data Values

In this section, you standardize the values in the Order Priority column to use High, Medium, and Low.

  1. In the web_orders_data tab, hover the Order Priority column, right-click and select Replace Values List.
  2. In Replace List under the Original Value column, go to the Critical row. Enter High in the row under the Replace Value column.
  3. Under the Original Value column in the Not Specified row, enter Low in the Replace Value column.


    Description of replace_list_of_values.png follows
    Description of the illustration replace_list_of_values.png
  4. Click Add Step.


    Description of order_priority_after.png follows
    Description of the illustration order_priority_after.png
  5. Click Save Save icon.

Correct Spelling to Improve Quality

In this section, you can see that Phoenix is spelled incorrectly in some instances as Pheonix. Correcting the misspelled city name increases the percentage of valid entries in the column.

  1. In web_orders_data, hover over the Ship to City quality bar to view the percentage of valid and invalid data values.


    Description of invalid_cities.png follows
    Description of the illustration invalid_cities.png
  2. Click the Ship to City filter column filter icon and select Filter by Invalid or Missing.


    The misspelling of the City of Phoenix makes up the entire percentage of invalid values in the Ship to City list.

  3. In the Ship to City column, right-click and select Replace Value List.
  4. In the Pheonix row, enter Phoenix in Replace Value, and then click Add Step.


    Description of ship_to_cities_valid.png follows
    Description of the illustration ship_to_cities_valid.png
  5. Click Save Save icon.

Replace Missing Values

In this section, you add a value to replace the missing values in the Payment Type Name column.

  1. Click the web_customers tab.
  2. In the Payment Type Name column, hover over the quality bar to view the percentage of populated text values.


    Description of missing_values.png follows
    Description of the illustration missing_values.png
  3. Click the Payment Type Name filter column filter icon and select Filter by Invalid or Missing.
  4. In Replace Value, go to the row with Missing or Null, enter Gift Card in Replace Value row, and then click Add Step.


    Description of payment_type.png follows
    Description of the illustration payment_type.png
  5. Click Save Save icon.

Learn More