Ways of using Big Data Discovery

In your organization, you can use BDD in three ways: as the center of your data lab, as a unified environment for navigating and exploring all of your data sources in Hadoop, and as a tool for creating projects and BDD applications.

Use BDD in the data lab

The data lab is a complete set of analytics solutions. It lets analysts collaborate and gives them access to many data sets. It serves as a sandbox for data experiments and supports many data projects. It is often connected to the production environment, so that it can incorporate feedback, iterate, and produce new solutions that can be taken into production.

When used in the data lab, Big Data Discovery lets you:
  • Define projects in BDD and share them with others in your research data lab.
  • Develop ideas, build prototypes and models, and invent ways of deriving value from data. For example, you can try out new approaches, quickly discard the ones that are not working, and move on to other ways of working with your data.
  • Shape data in your projects in multiple ways, from making single-row changes, such as trimming, editing, splitting, or null-filling, to advanced data shaping techniques, such as aggregations, joins, and custom transformations.
  • Let others in your organization consume models created in the data lab, and publish insights to decision-making groups inside your organization.
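
The difference between row-level changes and advanced data shaping can be illustrated with a minimal sketch. This is plain Python, not BDD's transformation functions or API; the sample records, field names, and helper functions are all hypothetical and exist only to show the kinds of operations the list above names.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are illustrative only.
orders = [
    {"order_id": 1, "customer": "  Ann Lee ", "region": None, "amount": 120.0},
    {"order_id": 2, "customer": "Bob Ray", "region": "EMEA", "amount": 80.0},
    {"order_id": 3, "customer": "Ann Lee", "region": "APAC", "amount": 50.0},
]
# Hypothetical lookup data set used for the join.
regions = {"EMEA": "Europe, Middle East, Africa", "APAC": "Asia-Pacific"}


def shape_row(row):
    """Row-level changes: trim, split, and null-fill a single record."""
    shaped = dict(row)
    shaped["customer"] = row["customer"].strip()                 # trimming
    first, _, last = shaped["customer"].partition(" ")           # splitting
    shaped["first_name"], shaped["last_name"] = first, last
    if shaped["region"] is None:                                 # null-filling
        shaped["region"] = "UNKNOWN"
    return shaped


def total_by_customer(rows):
    """Aggregation: sum the amount per customer across all rows."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["customer"]] += r["amount"]
    return dict(totals)


def join_region_names(rows, lookup):
    """Left join: attach a region name where the lookup has one, else None."""
    return [{**r, "region_name": lookup.get(r["region"])} for r in rows]


shaped = [shape_row(r) for r in orders]
totals = total_by_customer(shaped)
joined = join_region_names(shaped, regions)
print(totals)
```

Row-level changes look at one record at a time, while aggregations and joins need the whole data set (or a second one), which is why they are grouped under advanced shaping.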

Navigate and explore data sets in Hadoop

When used as the navigator on top of your data sources in Hadoop, BDD visually represents all data available to you in your environment. You see all the data in Studio's Catalog and can find interesting data sets quickly. You can filter data sets, edit their metadata, and create new data sets for others on your team.

Create BDD applications

BDD lets you create BDD applications that address well-established problems and suit your business needs. For example, you can:
  • Create BDD projects from scratch, improve them, and share them with wider groups of users in the organization.
  • Serve as the author of all discovery solution elements: data sets, information models, discovery applications, and transformation scripts.
  • Run discovery repeatedly, in a way that is transparent to others in your group, on different large-scale data sets that arrive periodically from various sources.