Work with Data Entities

A Data Entity is a tabular representation of a data structure.

It includes Database Tables or Views that can be used in a mapping as a source or target. They are simply the metadata for sources and targets. They are used in creating data flows.

You can add Data Entities to your newly created project in one of the following two ways:

All the newly created or imported Data Entities along with their details are displayed in the Data Entities page. The details include:

  • Name of the Data Entity
  • Connection for which the Data Entity was created
  • Schema to which the Data Entity is associated
  • Click the Actions icon (Actions icon) next to the selected Data Entity to perform the following operations:
    • Select Edit, to edit the existing details.
    • Select Preview, to preview the selected Data Entity. If the data entity belongs to an Oracle database, you can also view statistics of the table. See View Statistics of Data Entities for more details.
    • Select Delete, to delete the selected Data Entity.
  • To delete the Data Entities in bulk, in the Data Entities page, select the check boxes of the respective Data Entities and click Delete.
  • You can also search for the required Data Entity to know its details based on the following filters:
    • Name of the Data Entity
    • Connection for which the Data Entity was created
    • Schema to which the Data Entity is associated
    • Tag that is associated with the Data Entity

Import Data Entities

The easiest and most common way to create a Data Entity is by importing its structure from the connection type (for example, Oracle database).

To import existing Data Entities:

  1. From the left pane of the Home page, click the Data Entities tab.

    Data Entities page appears.

  2. Click Import Data Entities, to import the existing data entities.

    Import Data Entities page slides-in.

  3. Select the Connection followed by Schema and then finally select the Type of Objects you want to import. For Oracle Object Storage connections, the Schema drop-down lists the name of the bucket that you specified in the URL when you created the connection.
  4. [For Oracle Business Intelligence Cloud Connector (BICC) connections only] From the Offerings to import for collection, choose the offerings whose data stores you want to import. You must select at least one offering to import the BICC public view objects (PVO).
    Note

    The import of BICC PVOs can take a long time depending on the number of selected objects. To improve performance, Oracle recommends that you use a mask to filter and limit the number of PVOs that you want to import.
  5. Choose a Mask/filter if you don't want to import every object in the schema.

    Depending on the Connection Type, you will be presented with further options for importing.

    Note

    For Oracle Object Storage connections, this value is case-sensitive. If Batch similar files is set to True, all the files that match the mask and have the same structure are grouped together into a single data entity.
  6. [For Oracle Financials Cloud connections only] From the list in the Resources section, select the items that you want to import. When the import process completes, a table is created for each selected resource.
  7. [For REST server connections only] In the Resources section, do the following:
    • In the Resource URI field, enter the URL of the REST service you want to import resources from.
    • Click the + icon.
    • In the Name column enter an identifier for the resource.
    • In the Operation URI column enter the URI of the resource.
    • Click Test Resource to check whether the entries are valid.
  8. Click Start.

    A Job is created and the corresponding Job ID is displayed for you to track the session. Click the Job ID to view the details of the job.

    Upon successful execution of the job, all the selected Data Entities are imported. Click the Refresh icon Refresh icon present at the right corner of the Data Entities page, to see the new imported Data Entities.

Create Data Entities

You can manually create or update the Data Entities through the Oracle Data Transforms interface.

Data entities should possess the corresponding objects in the source connection to be used as a source in a data flow. Usually the import process makes sure that these objects are in coordination. However, whenever you manually create or update Data Entities always make sure to check if both the definitions are in coordination with each other.

When you use a Data Entity as a target then it doesn't have to exist previously in the target connection and can be created as a part of Data Flow execution.

To create a new Data Entity:
  1. From the left pane of the Home page, click the Data Entities tab.

    Data Entities page appears.

  2. Click Create Data Entity, to create a new data entity.

    Create Data Entity page appears.

  3. In the Name text box, enter the name of the new Data Entity that you are creating.
  4. From the Connection drop-down, select the required connection from which you wish to add the newly created Data Entity.
    Note

    Oracle Financials Cloud connections are not listed here because you cannot manually create data entities for such connections. You can only import data entities from Oracle Financials Cloud REST endpoints using the Import Data Entities page. See Import Data Entities.
  5. In the Schema drop-down, all schema corresponding to the selected connection are listed in two groups.
    • New Database Schema (ones that you've not imported from before) and
    • Existing Database Schema (ones that you've imported from before and are potentially replacing data entities).

    From the Schema drop-down, select the required schema.

    Note

    For Oracle Object Storage connections, the Schema drop-down lists the name of the bucket that you specified in the URL when you created the connection.
  6. From the Type drop-down, select the data entity type.
    • Table: To define the table structure for the newly created Data Entity, click the + icon to add columns. For each column, you can specify parameters such as Name, Data Type, Length, Scale, Not Null. Double click on the cell to configure the value.

      Click the 'x' icon, to delete a row. Click the Up and Down arrows to sort the table rows.

    • Inline View: To create the data entity using inline code, enter the Select statement in the Query tab. For example, SELECT * FROM CUSTOMER. Click Validate.

      The Columns tab displays a read-only list of the columns that the query returns. Click the Preview tab to see the column data.

  7. In the Tags text box, enter a tag of your choice. You can use tags to filter the Data Entities displayed in the Data Entity Page.
  8. For Oracle Object Storage connections, this page displays the following options:
    • Contents – Select the CSV file that contains the data you want to import. The metadata displayed in the preview table, such as the data type and length of columns, is based on the first row of the CSV file. Make sure that the CSV file has a header line. The header should contain only alphanumeric characters and no special characters.
    • Group Files – Select this check box if you want to group data from multiple CSV files into one data entity. For example, say you want to merge data from Employee_Data1.csv, Employee_Data2.csv, and Employee_Data3.csv into a single data entity.
    • Resource Name – Use this option along with Group Files. Enter the value you want to use to identify the files. The resource name should be a regular expression. You can use only an asterisk (*) as a wildcard character in the resource name. For example, Employee_Data*.csv.
  9. For Oracle database connections, you can mark the data entities as a feature group. Expand Advanced Options and click the Treat as Feature Group checkbox.
  10. Click Save.

    The new Data Entity is created.

Create Data Entities within the Data Flow editor

If you have already created or imported your target data entity, then you would drag the data entity onto the Design Canvas and complete the column mappings and options.

To create the definition of a new target table while in the Data Flow editor,

  1. Select the component at the end of your data flow.
  2. Click the Add Data Entity icon Add Data Entity icon present on the top right corner of the target component.
  3. Add Data Entity page appears allowing you to configure the following details of the target component:

    General tab

    • In the Name text box, enter the name of the newly created Data Entity.
    • The Alias text box is auto-populated with the name of the newly created Data Entity.
    • From the Connection Type drop-down, select the required connection from which you wish to add the newly created Data Entity.
    • It loads the server name coined at the time of connection creation. From the Server drop-down, select the required server name from which you wish to add the newly created Data Entity.
    • From the Schema drop-down, select the required schema.
    • Click Next.

    Columns tab

    It allows you to create, remove or edit the column definitions.

    • Click the Add Columns icon Add Columns icon, to add new columns to the newly created Data Entity.

      A new column is added to the displayed table.

    • The table displays the following columns:
      • Name
      • Data Type - Click the cell to configure the required Data Type.
      • Scale
      • Length
      • Actions - Click the cross icon to delete the created column.
    • To delete the columns in bulk, select the columns and click the Delete icon Delete icon.
    • To search for the required column details, in the Search text box enter the required column name and click enter. The details of the required column are displayed.
    • Click Next.

    Preview Data Entity tab

    It displays a preview of all the created columns and their configured details. If the data entity belongs to an Oracle database, you can also view statistics of the table. See View Statistics of Data Entities for more information.

  4. Click Save.

    The new target Data Entity is created.

  5. Expand the Properties Panel in the right pane to view the following settings of the created components:
    • General - Displays the Name of the component along with its Connection and Schema details.
    • Attributes - Displays the details of all the attributes associated to the component.
    • Column Mapping - Click Auto Map to map all the columns automatically.
    • Preview - Click to have a preview of the component.
    • Options - Change the options as appropriate.

View Statistics of Data Entities

The Preview tab displays detailed statistics of each data entity.

Note

This feature is available for Oracle database tables only.
You can view the statistics of a selected data entity in one of following ways:
  • In the Data Entities list, click the Actions icon (Actions icon) next to the Data Entity and click Preview. Select the Statistics tab to view the statistics of the selected data entity.
  • On any data flow click on any source or target data entity, and expand the properties panel in the right pane. Click Preview.

The statistical data is presented as follows:

  • The total number of rows and columns in the data entity is displayed at the top.
  • The statistics panel displays the thumbnail graphs for each column with information about the Min, Max, Distinct, and Null values.
  • Two types of thumbnail representations are displayed based on the histogram:
    • A bar chart represents data for frequency and top-frequency histograms. The bar chart show the first top 10 values for the number of rows in the table.
    • A table lists data for Hybrid and Height-Balanced histograms. The table displays the entire data and is scrollable. The table displays the range for the values and the percentage of rows in each range.
  • You can click each thumbnail to view the statistics of the column in a new browser tab.
  • The detailed view of each chart also shows the type of histogram.