Loading Data

You can load data from files on your local device, from remote databases, from cloud storage buckets, from directories, and from share providers.

The following topics describe the interfaces for these actions.

Use OCI Language Service Capabilities in Data Studio

You can use OCI Language service capabilities such as Sentiment Analysis and Key Phrase Extraction to analyze data without machine learning (ML) or artificial intelligence (AI) expertise.

For example, suppose a phone manufacturer has launched a new phone model and wants to know the customer sentiment toward the product. If a large percentage of the sentiment is negative, it could signal a potential fault with the camera that was not detected in quality control (QC).

Overview of Sentiment Analysis and Key Phrase Extraction

Sentiment Analysis and Key Phrase Extraction are currently supported when loading data from local files and when loading data from cloud storage.

Sentiment Analysis

Sentiment Analysis analyzes text to determine the sentiment expressed about a topic or product. The Language service sentiment analysis uses natural language processing (NLP). The Data Studio tool uses the Oracle Cloud Infrastructure (OCI) Language service to analyze and understand the input data, and dynamically adds a new column to your data load that contains the output of the OCI Language service. You can detect the sentiment of any column of the source data. For example, when searching through a column containing reviews for an application, assume that you want a general opinion about the application. The Data Studio tool performs sentiment analysis on the input data and creates a new expression column in the target table that contains the sentiment.

For more details, refer to Sentiment Analysis in OCI.
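Once the load completes, you can query the generated expression column like any other column. The following is a minimal sketch, assuming the reviews were loaded into a table named REVIEWS with the expression column CONTENT_SENTIMENT (both names are illustrative):

    -- Count the reviews in each detected sentiment category.
    SELECT content_sentiment, COUNT(*) AS review_count
    FROM reviews
    GROUP BY content_sentiment
    ORDER BY review_count DESC;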

Key Phrase Extraction

Key phrase extraction identifies the main concepts in a text. It is the automated process of extracting the most relevant words and expressions from the input text, which helps summarize the content and recognize the main topics. A word cloud can be generated with key phrases to help visualize key concepts in text comments or feedback. For example, a movie review site could generate a word cloud based on key phrases identified in viewer comments and might see that people comment most frequently about the direction, acting, and cinematography.

For more details, refer to Key Phrase Extraction in OCI.
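As a sketch, assuming a table MOVIE_REVIEWS with the generated expression column CONTENT_KEY_PHRASE_EXTRACTION (both names are illustrative), you could inspect the extracted phrases alongside the source text:

    -- Compare each review with the key phrases the service extracted from it.
    SELECT content, content_key_phrase_extraction
    FROM movie_reviews
    FETCH FIRST 10 ROWS ONLY;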

Before you start:
  • Load data from local files or cloud storage: Load the data you wish to analyze into your Oracle Autonomous Database from local files or cloud storage. Make sure the data is loaded into the data load cart without any errors.
  • The tool performs sentiment analysis or key phrase extraction while you specify settings for the data load job.

Parameters to Analyze Data

When you invoke Add Expression from the Settings tab, you configure the model using the following parameters.

Table 3-3 Parameters for Sentiment Analysis and Key Phrase Extraction

  • Expression Type: Select Sentiment Analysis or Key Phrase Extraction.
  • Input Column: Select the column you wish to analyze. The Input Column drop-down lists only columns that Sentiment Analysis or Key Phrase Extraction supports. For Sentiment Analysis, only VARCHAR2, NVARCHAR2, CLOB, or NCLOB columns are displayed in the drop-down.
  • Target Column: Enter the name of the new expression column defined in the target table.
    • For Sentiment Analysis, this column displays the sentiment of the input column. The sentiments the tool identifies are Positive, Neutral, Mixed, and Negative. If the tool cannot determine the sentiment of the input column, it returns NULL in the expression column.
    • For Key Phrase Extraction, this column displays the key phrases of the input column you select.

Perform Sentiment Analysis

To determine the sentiment of input data:

  1. After you load data into the Data Load cart from local files or cloud storage, you can view the file in the cart. Click the settings icon.

  2. Clicking the settings icon opens the Load Data from Local File wizard. In this example, the data is loaded from a local file.
  3. On the Settings tab of the wizard, click Add Expression under the Mapping section.

    This opens the Add Expression dialog box.



  4. On the Add Expression dialog, specify the following fields:
    • Expression Type: From the Expression Type drop-down, select Sentiment Analysis.
    • Input Column: Select the column you wish to analyze from the drop-down. For example, CONTENT.
    • Target Column: Enter the name of the new expression column. For example, CONTENT_SENTIMENT.

      Refer to Parameters to Analyze Data for more details.

  5. Click OK. You will see a new row added to the mapping grid. This row determines the output expression column generated by the OCI Language service.

  6. Click Close.
  7. Click Start in the data load cart to load data from local files. A confirmation message asks if you wish to start the load from local files.
  8. Click Run to confirm.

When the data load job completes, the Table and View Loads section of the Data Load page displays the details of the source table loaded into the tool, including the number of rows and columns and when the upload completed.

Output Data Generated from OCI Sentiment Analysis

When you analyze columns using the OCI Language service model, Data Studio generates a new expression column and saves the result in the updated table.

To locate the generated expression column, from the Database Actions Launchpad, navigate to Data Load. Select the table you loaded under the Table and View Loads section.

Click the three vertical dots beside the load name, click Table, and then select View.
Description of the illustration view-details.png

This opens the Preview tab of the data load, which displays the updated source file. For example, here is an output dataset from sentiment analysis of the Instagram application. Here, CONTENT is the input column, and CONTENT_SENTIMENT is the sentiment analysis of the input column. This column displays one of the values positive, neutral, mixed, or negative. It displays NULL when the tool is unable to determine the sentiment.
Description of the illustration content-analysis.png
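You can also query the loaded table directly to isolate rows of interest. A minimal sketch, assuming the target table is named REVIEWS:

    -- List rows classified as negative, plus rows where no sentiment was determined.
    SELECT content, content_sentiment
    FROM reviews
    WHERE content_sentiment = 'negative'
       OR content_sentiment IS NULL;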

Perform Key Phrase Extraction

To extract key phrase information from input data:

  1. After you load data into the Data Load cart from local files or cloud storage, you can view the file in the cart. Click the settings icon.

  2. Clicking the settings icon opens the Load Data from Local File wizard. In this example, the data is loaded from a local file.
  3. On the Settings tab of the wizard, click Add Expression under the Mapping section.

    This opens the Add Expression dialog box.



  4. On the Add Expression dialog, specify the following fields:
    • Expression Type: From the Expression Type drop-down, select Key Phrase Extraction.
    • Input Column: Select the column you wish to analyze from the drop-down. For example, CONTENT.
    • Target Column: Enter the name of the new expression column. For example, CONTENT_KEY_PHRASE_EXTRACTION.

    Refer to Parameters to Analyze Data for more details.

  5. Click OK. You will see a new row added to the mapping grid. This row determines the output expression column generated by the OCI Language service.

  6. Click Close.
  7. Click Start in the data load cart to load data from local files. A confirmation message asks if you wish to start the load from local files.
  8. Click Run to confirm.

When the data load job completes, the Table and View Loads section of the Data Load page displays the details of the source table loaded into the tool, including the number of rows and columns and when the upload completed.

Output Data Generated from OCI Key Phrase Extraction

When you analyze columns using the OCI Language service model, Data Studio generates a new expression column and saves the result in the updated table.

To locate the generated expression column, from the Database Actions Launchpad, navigate to Data Load. Select the table you loaded under the Table and View Loads section.

Click the three vertical dots beside the load name, click Table, and then select View.
Description of the illustration view-details.png

For example, here's an output dataset from key phrase extraction for the Instagram application. Here, CONTENT is the input column, and the CONTENT_KEY_PHRASE_EXTRACTION column displays the key phrases extracted from the input column.
Description of the illustration key-output.png

Use GeoJSON in Data Load

A GeoJSON object contains information about a specific geometry (for example, Point, Line, or Polygon) along with optional metadata (for example, an ID).

The extension for a GeoJSON file is *.geojson. You can load GeoJSON data into the Autonomous Database using the Data Load in Data Studio. If the source contains GeoJSON data, the data is loaded into a column of the SQL data type SDO_GEOMETRY, which projects the GeoJSON geometry from the document set.
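To illustrate the mapping, the following sketch converts a small GeoJSON Point document into an SDO_GEOMETRY value. It assumes the SDO_UTIL.FROM_GEOJSON function, which is available in recent database releases; the coordinates are illustrative:

    -- Convert a GeoJSON Point into an SDO_GEOMETRY value.
    SELECT SDO_UTIL.FROM_GEOJSON('{"type": "Point", "coordinates": [-47.93, -15.78]}') AS geometry
    FROM dual;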

Load a table with GeoJSON Data

When you load a table in Data Studio with GeoJSON data and review its settings, you will see that it creates the table BRAZIL_GEO, which has a column GEOMETRY containing the GeoJSON data.



After you load BRAZIL_GEO, you will see that the tool loads the GeoJSON objects into a new column, GEOMETRY, with the data type SDO_GEOMETRY.
Description of the illustration edittable-geo.png

You can also view the same information in the Data Definition tab when you view the details of the table load after it is run.
Description of the illustration datadefinition-geo.png
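As a quick verification, you can round-trip a loaded geometry back to GeoJSON text. A minimal sketch, assuming the BRAZIL_GEO table and its GEOMETRY column from the example above:

    -- Render one stored SDO_GEOMETRY value back as GeoJSON.
    SELECT SDO_UTIL.TO_GEOJSON(t.geometry) AS geojson
    FROM brazil_geo t
    FETCH FIRST 1 ROWS ONLY;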

Loading Data from AI Source

You can use Data Studio tools to load data from an AI source.

On the Data Load dashboard, click the Load Data card, and select AI Source.



You must set up your AI profile before using this feature. See Configure DBMS_CLOUD_AI Package for details on the setup of this feature.

Loading Data from File System

You can load files from file system directories to your Autonomous Database.

You can set filters on the data for a table to load only the specified data. For example, to limit the files to only those that are CSV files, enter *.CSV in the file extension filter.

Configure and run a data load job from the Load Cloud Object page. To open that page:
  1. On the Data Studio tab, select Data Load. The Data Load dashboard is displayed.
  2. Click LOAD DATA and select the File System option.

At the top of the page, select the directory from which you want to load the files. On the left side of the page is a navigator pane, where you choose the files in the directory containing the data. On the right side of the page is the data load cart, where you stage the files and folders for the data load job. You can set options for the data load job before running it.

The Autonomous Database comes with predefined CPU/IO shares assigned to different consumer groups. You can set the consumer group to low, medium, or high when executing a data load job, depending on your workload. To load files from a directory into your database, do the following:

  1. Prepare the Data Load Job: Refer to the Prepare the Data Load Job section for more details.
  2. Add Files or Folders for the Data Load Job: Refer to the Add Files or Folders for the Data Load Job section for more details.
  3. Enter Details for the Data Load Job: Refer to the Enter Details for the Data Load Job for more details.
  4. Run the Data Load Job: Refer to the Run the Data Load Job section for more details.
  5. View Details About the Data Load Job After It Is Run: Refer to the View Details About the Data Load Job After It Is Run section for more details.
  6. View the Table Resulting from the Data Load Job: Refer to the View the Table Resulting from the Data Load Job section for more details.

Create Directories in Database Actions

In the Autonomous Database, there is a preconfigured DATA_PUMP_DIR directory where you can store files. You can also create directories, drop directories, and attach network file systems.

For example, you can use the CREATE DIRECTORY command to create additional directories. Use the database DROP DIRECTORY command to drop directories and use DBMS_CLOUD.LIST_FILES to list the contents of a directory.
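A minimal sketch of this lifecycle (the directory name STAGE_DIR and the path are illustrative):

    -- Create a directory object; this also creates the file system directory.
    CREATE DIRECTORY stage_dir AS 'stage';

    -- List the files in the directory.
    SELECT object_name, bytes FROM DBMS_CLOUD.LIST_FILES('STAGE_DIR');

    -- Drop the directory object when it is no longer needed.
    DROP DIRECTORY stage_dir;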

Create a Directory

To add a directory, you must have the CREATE ANY DIRECTORY system privilege. The ADMIN user is granted the CREATE ANY DIRECTORY system privilege. The ADMIN user can grant CREATE ANY DIRECTORY system privileges to other users.
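For example, a minimal sketch of granting the privilege (DBUSER is a placeholder user name):

    GRANT CREATE ANY DIRECTORY TO dbuser;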

See CREATE DIRECTORY for more information.

Note:

  • CREATE DIRECTORY creates the database directory object in the database and also creates the file system directory. For example, the directory path could be:
    /u03/dbfs/7C149E35BB1000A45FD/data/stage
  • You can create a directory in the root file system to see all the files with the following commands:

    CREATE OR REPLACE DIRECTORY ROOT_DIR AS '';

    After you create the ROOT_DIR directory, use the following command to list all files:

    SELECT * FROM DBMS_CLOUD.list_files('ROOT_DIR');
  • To run DBMS_CLOUD.LIST_FILES as a user other than ADMIN, you must grant read privileges on the directory to that user, for example:
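
    -- DBUSER is a placeholder user name.
    GRANT READ ON DIRECTORY root_dir TO dbuser;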

Let us demonstrate how to create a directory and access it from Data Studio:

  • Create a directory in Database Actions:

    Log in to your Database Actions instance and select the SQL card under Development. You can view the SQL worksheet. Now create a directory and attach a file system name of your choice to the directory you create. In the following example, FSS_DIR is the name of the directory.

    Description of the illustration create-directory.png
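    The exact statement is shown in the illustration; a minimal sketch of an equivalent command (the directory name and path are illustrative):

    -- Create the directory object used in this example.
    CREATE OR REPLACE DIRECTORY FSS_DIR AS 'fss_dir';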

    Run the command. It gives the following output:

    PL/SQL procedure successfully completed.

  • Attach the file system
    Attach your file system with the name of your choice to the FSS_DIR directory via the DBMS_CLOUD_ADMIN.ATTACH_FILE_SYSTEM procedure.
    
    BEGIN
      DBMS_CLOUD_ADMIN.ATTACH_FILE_SYSTEM(
        file_system_name => '********',
        file_system_location => '*******.sub1********1.********.oraclevcn.com:/********',
        directory_name => 'FSS_DIR',
        description => 'attach OCI file system'
      );
    END;
    /
    
    

    Run the above command to view the following output:

    PL/SQL procedure successfully completed.

  • To view the attached file system, run the following command:

    SELECT file_system_name, file_system_location, directory_path FROM dba_cloud_file_systems;

    You will see the file system name, file system location, and directory path.

  • You can view the new directory along with the files attached to it by navigating to Load Cloud Object under the Data Load menu of the Data Studio tool. Click the Select Cloud Store location drop-down.
    Description of the illustration new-dir.png

You can view the newly created directory FSS_DIR. You can load data from file system directories to your Autonomous Database using the Data Load tool. See Loading Data from File System.

Loading Data from Share

You can select tables from a share. You must first subscribe to a share provider to access the provided data share.

To load tables from a share, click Load Data on the Data Load Dashboard. Click Share on the Load Data page. Click + Subscribe to Share Provider to subscribe to a Share Provider.

Subscribe to the Data Share Provider

To subscribe, you need to use the information contained in the uploaded JSON profile you received from the share provider.
  1. On the Subscribe to Data Share Provider dialog, enter the Provider Name.
    Description of the illustration subscribe_1.png
  2. Select Add New Provider JSON and drag and drop the JSON profile into the Delta Share Profile JSON field.
  3. Click Next to progress to the Add Shares tab.
  4. Select the level of network access you want to allow from your database to the host with the Share REST endpoint, and click Run. In this example, Allow access to Host Only is selected.
    Description of the illustration network_acl_script.png
  5. To register the shares available to you, move shares from Available Shares to Selected Shares and click Subscribe.

    The screenshot below shows the REVIEWS share moved from Available Shares to Selected Shares before clicking Subscribe.

    Description of the illustration subscribe_2.png
  6. Create external tables derived from tables selected from the data share.
    1. Drag and drop tables from the selected share. You can optionally click settings to view the table details.

      In this example, the only table selected is HOTEL_REVIEWS.

      Description of the illustration sel_tbls.png
    2. You can optionally change the name of your table and click Close.

      In this example, the name is changed from HOTEL_REVIEWS to HOTEL_REVIEWS_SHARE.

      Description of the illustration name_change.png
    3. Create the external table by clicking Start on the Select Share page, and then clicking Run on the Run Data Load Job dialog.
      Description of the illustration start_run.png
    4. When the external tables are created, the Complete message is displayed.
      Description of the illustration xt_complete.png
    5. Click the external table name to view the table details.
      Description of the illustration xt_details.png
  7. View the created tables from Database Actions.
    1. Click Database Actions in the breadcrumb to go back to the Database Actions launchpad.
    2. Click on the SQL tile.
    3. Select the external table and drag and drop it into the worksheet.

      The SQL SELECT statement for the table appears. This statement can be run to consume the shared data (see the sketch after these steps).

      Description of the illustration xt_sql.png
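A minimal sketch of consuming the shared data, assuming the external table name HOTEL_REVIEWS_SHARE from the example above:

    -- Query the external table created over the data share.
    SELECT *
    FROM hotel_reviews_share
    FETCH FIRST 10 ROWS ONLY;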

Create Live Feed from Data Load

The Data Load tool loads data from folders in cloud object stores and enables you to schedule repeated data loads in real time. This is the creation of a live feed from a data load job.

Once the data load is complete, you can create a Live Feed from a cart item that loaded an object store folder using the Create Table or Drop Table and Create New Table options.

To create a live feed from Data Load:
  1. Click Selector to display the navigation menu. Under Data Studio, select Data Load.
  2. Select the Load Data tile to load data from various sources such as local files, databases, cloud store, directories, and Shares.
  3. Click Cloud Store to load objects from URLs or Cloud store links.
  4. Select the cloud store location from the drop-down. If you are not able to view the cloud store location, select Create Cloud Store Location to create a new cloud store location. Follow the steps described in Create Oracle Cloud Infrastructure Native Credentials if you do not have a cloud location available.
  5. After you select the cloud store location, you can view the list of folders and files present in the cloud storage. Add files from the cloud store to the data load cart, where you can edit the details of the load job.

    Note:

    The Data load tool does not support the creation of a live feed from a loaded cart item that consists of a single file in CSV, XLS, XLSX, TSV, TXT, XML, JSON, and AVRO format or of a folder that contains a file in XLSX format.
  6. To add a folder, drag the folder from the file navigator on the left and drop it into the cart on the right. When you add the folder to the cart, a prompt asks whether you want to load all the objects from the multiple source files into a single target table. Click Yes to continue or No to cancel. You can add multiple folders to the cart; the data represented by each cart item is loaded into a separate table, but all the items in the cart are processed as part of the same data load job.
  7. Select Settings (pencil icon) from the data load cart to enter the details about the data load job.
  8. In the Settings tab of the Load Data from the Cloud Store Location, you can select Create Table or Drop Table and Create New Table from the Option drop-down.

    Note:

    The Live feed tool works with the Data load job only if you create a table and insert the data into a new table or drop the existing table and insert the data into a new table.
  9. Enter the other details for the data load job. For more information on entering the details, refer to the Enter Details for the Data Load Job topic.
  10. Once you have added data sources to the data load cart and entered details about the data load job, select Start to run the job.
  11. After the data load job runs, the data load cart displays a green check mark, which indicates that the data load job is complete.
  12. Click Create Live Table Feed on the data load cart to create a live table feed from the data load job. A message confirms that the live table feed was created successfully and asks if you wish to edit it. Click Yes to continue or No to cancel. Selecting Yes opens the Edit Live Table Feed wizard.
  13. On the Edit Live Table Feed wizard, you can view the Cloud Store Location of the source folder and the Object Filter applied to the data.

    Select any file whose data you want to preview from the Preview drop-down in the Data Source tab. The field shows the total number of files present in the cloud store folder you loaded. A preview of the data is displayed.



  14. Click Next to progress to the Table Settings tab.

    You can view the Properties and Mapping details of the data load job on the Table Settings tab.

    Note:

    You cannot select or edit any of the details of this tab.
  15. Click Next to progress to the Preview tab of the wizard.

    Select any file from the Preview drop-down to view the file. The Table Preview displays the preview of the file you select from the drop-down.



  16. Click Next to view the Live Feed Settings tab of the wizard. On the Live Feed Settings tab, specify the following values in the field:

    • Enable for Notification: Select this option so that new or changed data in the data source is loaded based on an Oracle Cloud Infrastructure notification. When you select this option, you can avoid delays that might occur when polling is initiated on a schedule (that is, if you selected the live table feed Scheduled option).

      When you select the Enable for Notification option, you must also configure your object store bucket to emit notifications.

    • Enable For Scheduling: Select this option to set up a schedule for the data feed. In the time interval fields, enter a number, and select a time type and the days on which to poll the bucket for new or changed files. For example, to poll every two hours on Monday, Wednesday, and Friday, enter 2, select Hours, and then select Monday, Wednesday, and Friday in the appropriate fields. You can select All Days, Monday to Friday, Sunday to Thursday, or Custom from the Week Days drop-down. The Custom field enables you to select Monday, Tuesday, Wednesday, Thursday, and Friday.

    Select a start and end date. If you don't select a start date, the current time and date will be used as the start date. The end date is optional. However, without an end date, the feed will continue to poll.

    The rest of the fields displayed in the wizard, such as the Live Table Feed Name, Target Table Name, and Consumer Group, are greyed out and disabled for selection or editing.

Click Save to save and create a Live Table Feed from a data load cart.