4.6 Import Data into Apache Hive

In the Data section of the Oracle Big Data Manager console, you can import .csv, Apache Avro, and Apache Parquet files from HDFS into Apache Hive through HiveServer2 on the cluster where Oracle Big Data Manager is running.

To import one of the supported files into Hive:
  1. Click the Data tab at the top of the page, and then click the Explorer tab on the left side of the page.
  2. From the Storage drop-down list in one of the panels, select HDFS storage (hdfs).

    The Apache Hive import might fail if Hive does not have read access to the file and all of its parent directories. In that case, copy or move the file to the /tmp directory in HDFS and import it from there.

  3. Navigate to the file you want to import, right-click it, select Import to Apache Hive, and select how to import it: Import as CSV, Import as Apache Avro, or Import as Apache Parquet.
  4. Provide import details.
    • For Import as CSV, provide values on each tab of the Create a new job wizard and then click Create.
    • For Import as Apache Avro and Import as Apache Parquet, specify the Hive table in the Table name field and select the Hive database from the Database name drop-down list. The table name defaults to the name of the file you selected to import, and the database defaults to the default Hive database; any other databases created in your Hive metastore also appear in the drop-down list. Click Import to Hive.
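For reference, the imports performed by the wizard correspond conceptually to Hive DDL along the following lines. This is an illustrative sketch only: the table names, database, column definitions, and HDFS paths below are placeholders, and the console generates the actual statements for you.

```sql
-- Illustrative only; names, columns, and paths are hypothetical.
-- A Parquet import creates a table over the file's data, e.g.:
CREATE EXTERNAL TABLE default.sales_data
STORED AS PARQUET
LOCATION '/tmp/sales_data';

-- An Avro import is analogous, with the schema derived from the file:
CREATE EXTERNAL TABLE default.events
STORED AS AVRO
LOCATION '/tmp/events';

-- A CSV import defines the columns and delimiter, then loads the file:
CREATE TABLE default.ratings (id INT, score DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/ratings.csv' INTO TABLE default.ratings;
```

This is also why the /tmp workaround in step 2 helps: Hive reads the data from its HDFS location, so the location must be readable by the Hive service.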