Creating a data set from a JDBC data source

If a Studio administrator has already created a data connection and added a JDBC data source, you can import data from that source into Studio and filter it during the import. After the import, the data is available as a data set in the Catalog. For information about creating data sources, see the Administrator's Guide.

To import JDBC data, you must have the Studio role of User, Power User, or Administrator.

To load data from a JDBC data source:

  1. Click the Add Data Set option on the Catalog.

    This option adds the new data set to the Catalog. You can also add a new data set from within a project.

  2. Click Create a data set from a database.
  3. On the Select data source page, select the row for the data source you want to import and click Next.
  4. Provide the user name and password of a database account that has access to the data, and click Continue.
  5. On the Preview & filter data page, you can edit attributes and limit the data before you upload it:
    1. To exclude an attribute from the data set, deselect its check box.
    2. To modify the name of an attribute as it appears in the data set, select the column header and edit the name of the attribute.
    3. To filter on an attribute value, select the funnel icon in the attribute header (this adds a filter to the Filter By pane), and then select a sample value to filter by. For example, if you have an attribute named Country_Name, you can select the funnel icon and then select United States. This reduces the data to the records where Country_Name matches United States.
    4. If you know the language of text data in your attributes, select the source language from Default search language. (This setting is used during data processing and then for value and keyword searches.)
  6. Click Next.
  7. On the Select data source page:
    1. Specify a name for the data set as it appears in the Catalog.
    2. Optionally, specify a description for the data set.
    3. Optionally, specify a Hive table name. By default, the Hive table name is the same as the data set name. If you create a data set with the same name as an existing data set, you must specify a different Hive table name. Studio maps the data set name to a unique table name.
  8. Click Create.
A new data set, based on the data source, is available in the Catalog.
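
The choices in step 5 (excluding attributes, renaming attributes, and filtering by an attribute value) can be easier to reason about as operations on records. The following Python sketch is illustrative only and is not how Studio implements the Preview & filter data page. The function name preview_and_filter, its parameters, and the sample records other than the Country_Name example are hypothetical.

```python
def preview_and_filter(records, exclude=(), rename=None, filters=None):
    """Apply exclude, rename, and filter choices to a list of dict records.

    Hypothetical helper for illustration; not part of Studio.
    Filters and exclusions refer to the original attribute names.
    """
    rename = rename or {}
    filters = filters or {}
    result = []
    for record in records:
        # Keep only records whose values match every selected filter value.
        if any(record.get(attr) != value for attr, value in filters.items()):
            continue
        # Drop excluded attributes and apply any new attribute names.
        result.append({
            rename.get(attr, attr): value
            for attr, value in record.items()
            if attr not in exclude
        })
    return result


# Made-up sample records standing in for rows from a JDBC data source.
rows = [
    {"Country_Name": "United States", "City": "Boston", "Internal_Id": 1},
    {"Country_Name": "Canada", "City": "Toronto", "Internal_Id": 2},
    {"Country_Name": "United States", "City": "Austin", "Internal_Id": 3},
]

data_set = preview_and_filter(
    rows,
    exclude={"Internal_Id"},                   # like deselecting a check box
    rename={"City": "City_Name"},              # like editing a column header
    filters={"Country_Name": "United States"}, # like choosing a filter value
)
```

In this sketch, only the two United States records remain, each without Internal_Id and with City renamed to City_Name.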