In your organization, you can use BDD as the center of your data lab, as a
unified environment for navigating and exploring all of your data sources in
Hadoop, and as a tool for creating projects and BDD applications.
Use BDD in the data lab
The data lab is a complete set of analytics solutions. It lets analysts
collaborate and gives them access to many data sets. It serves as a sandbox
for data experiments and supports many data projects. It is often connected to
the production environment, so that it can take in feedback, iterate, and
produce new solutions that can be taken into production.
When used in the data lab, Big Data Discovery lets you:
- Define projects in BDD and share them with others in your research data lab.
- Use BDD to develop ideas, build prototypes and models, and invent ways of
deriving value from data. For example, you can try out new approaches, quickly
discard the ones that are not working, and move on to try new ways of working
with your data.
- Shape data in your projects in multiple ways, from making single-row
changes, such as trimming, editing, splitting, or null-filling, to advanced
data shaping techniques, such as aggregations, joins, and custom
transformations.
- Let others in your organization consume models created in the data lab, and
publish insights to decision-making groups inside your organization.
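The data-shaping item in the list above separates single-row changes from set-level techniques. The following is a minimal sketch of that distinction in plain Python. It does not use BDD or its transformation language; the sample records, field names, and the helper functions `shape_row`, `total_by_customer`, and `join_region` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records; not taken from any BDD data set.
orders = [
    {"order_id": 1, "customer": "  alice ", "items": "pen;ink", "amount": 12.5},
    {"order_id": 2, "customer": "bob", "items": "paper", "amount": None},
    {"order_id": 3, "customer": "alice", "items": "pen", "amount": 4.0},
]
# Hypothetical lookup table used for the join.
regions = {"alice": "EMEA", "bob": "APAC"}


def shape_row(row):
    """Single-row changes: trim, split, and null-fill one record."""
    return {
        "order_id": row["order_id"],
        "customer": row["customer"].strip(),      # trimming
        "items": row["items"].split(";"),         # splitting
        # null-filling: replace a missing amount with 0.0
        "amount": 0.0 if row["amount"] is None else row["amount"],
    }


def total_by_customer(rows):
    """Aggregation: sum the amount per customer across all rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


def join_region(rows, lookup):
    """Join: attach a region from the lookup table (None if no match)."""
    return [{**row, "region": lookup.get(row["customer"])} for row in rows]


shaped = [shape_row(r) for r in orders]
totals = total_by_customer(shaped)
joined = join_region(shaped, regions)
```

Row-level changes look at one record at a time, so they can be applied independently to each row. Aggregations and joins need to see many rows or a second data set, which is why they are grouped with the more advanced techniques.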
Navigate and explore data sets in Hadoop
When used as the navigator on top of your data sources in Hadoop, BDD visually
represents all of the data available to you in your environment. You see all
the data in Studio's Catalog and can find interesting data sets quickly. You
can filter data sets, edit their metadata, and create new data sets for others
on your team.
Create BDD applications
BDD lets you create BDD applications that address well-established, known
problems and suit your business needs. For example, you can:
- Create BDD projects from scratch, improve them, and share them with wider
groups of users in the organization.
- Serve as the author of all discovery solution elements: data sets,
information models, discovery applications, and transformation scripts.
- Run discovery repeatedly, transparently to others in your group, on
different large-scale data sets that arrive periodically from various sources.