Using Explore

Explore provides an out-of-the-box, guided analytics experience. It configures the canvas with visualizations suited to explaining the data based on your goals, so that you can get a summary of the data and assess its quality.

[Image: the Explore button, highlighted]

Explore provides an attribute-focused visual summary of the data, summarizing value distributions, data quality gaps, and relationships.

Explore presents visualizations that give you the most insight into the data set, selecting the types most suitable to each attribute's data type and value distribution.

Visualizations are composed automatically to save you time and effort at this early stage in the process. Once you understand the data set better, you can compose your own visualizations on the data that you have identified as worth further analysis.

You can think of the Explore area as a guided tour of new data sets, freeing you from the need to manually query the data or configure your own visualizations. It helps you immediately extract meaning from the results to better understand what's inside the data set.

Here are some of the questions that Big Data Discovery helps you answer when you use Explore:
  • What fields are in my data? Do I understand what they are? Explore shows these fields as a series of visualizations.
  • How are values distributed? Explore uses descriptive statistics (mean, median, mode, quintiles) and visuals (histograms, box plots).
  • How dirty/messy is the data? Are there missing values or outliers? The bar underneath each attribute shows you at a glance whether any inconsistencies exist.
  • What relationships exist between fields? Explore uses numeric correlations and visuals.
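To make the questions above concrete, here is a minimal sketch in Python of the kind of per-attribute summary they describe: descriptive statistics, a count of missing values, outlier detection, and a numeric correlation between two fields. It is not how Explore computes its results. The function names `profile_numeric` and `pearson`, the use of `None` for a missing value, and the 1.5 × IQR outlier rule are all assumptions made for illustration.

```python
import math
import statistics
from typing import Dict, List, Optional, Sequence


def profile_numeric(values: Sequence[Optional[float]]) -> Dict[str, object]:
    """Summarize one numeric attribute (hypothetical helper, not a product API)."""
    # Assumption: a missing value is represented as None.
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # n=5 yields the four cut points that split the data into quintiles.
    quintiles = statistics.quantiles(present, n=5)

    # Assumption: flag outliers with the common 1.5 * IQR rule.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]

    return {
        "count": len(values),
        "missing": missing,
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "mode": statistics.mode(present),
        "quintiles": quintiles,
        "outliers": outliers,
    }


def pearson(xs: Sequence[Optional[float]], ys: Sequence[Optional[float]]) -> float:
    """Pearson correlation over the records where both fields are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mean_x = statistics.mean(x for x, _ in pairs)
    mean_y = statistics.mean(y for _, y in pairs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    return cov / (spread_x * spread_y)


# Made-up sample data: one missing value and one obvious outlier.
summary = profile_numeric([1, 2, 2, 3, 4, None, 100])
relationship = pearson([1, 2, 3], [2, 4, 6])
```

In this sketch, `summary["missing"]` and `summary["outliers"]` play roughly the role of the data quality bar under each attribute, while `relationship` stands in for a numeric correlation between two fields.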

In addition, Explore lets you add your own notes and tags to data sets. You can later sort or search on those tags, or review your notes for a quick summary without having to walk through the data a second time.

As a result, you come away with a working understanding of the data sets you have explored.

Here are some examples of Explore visualizations:

[Image: the Explore area in Studio, showing attributes with the highest information potential]

This image shows Explore for a data set with 57.1K records, with attributes sorted by name.

Note:

Even if you can explore a large data set at full scale, start by exploring a representative sample, and then confirm your hypotheses or expand your analysis at full data scale.
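The sample-first workflow in this note can be sketched outside the product as follows. This is an illustration only, and it assumes that a simple random sample is representative enough for a first pass. The function name `simple_random_sample`, the fixed seed, and the made-up record values are hypothetical and do not reflect how Big Data Discovery samples data.

```python
import random
import statistics
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def simple_random_sample(records: Sequence[T], k: int, seed: int = 0) -> List[T]:
    """Draw up to k records without replacement (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    k = min(k, len(records))   # never ask for more records than exist
    return rng.sample(records, k)


# Made-up "full" data set of 50,000 numeric records.
full_data = [i % 97 for i in range(50_000)]

# Step 1: explore a sample and form a hypothesis about the typical value.
sample = simple_random_sample(full_data, k=1_000, seed=42)
sample_mean = statistics.mean(sample)

# Step 2: confirm the hypothesis at full data scale.
full_mean = statistics.mean(full_data)
difference = abs(sample_mean - full_mean)
```

A small `difference` suggests that the sample was adequate for this particular statistic. A large one is a signal to draw a bigger sample or to check for skew before trusting conclusions drawn from the sample.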