Integrate Oracle Big Data Cloud Service with Oracle Internet of Things Cloud Service

Integrate an Oracle Internet of Things Cloud Service application with Oracle Big Data Cloud Service to send device data to Oracle Big Data Cloud Service. The data is first sent to Oracle Storage Cloud Service and then imported into Oracle Big Data Cloud Service.

These are the prerequisites for the successful completion of this procedure:
  • An active Oracle Big Data Cloud Service instance.

  • An active Oracle Internet of Things Cloud Service instance.

  • An active Oracle Storage Cloud Service instance.

  • An Oracle Internet of Things Cloud Service application.

  • A device model associated with the Oracle Internet of Things Cloud Service application.

  • A message format that is associated with both the device model and a stream.

  • The WinSCP open source SFTP and FTP client for Microsoft Windows (or a similar file transfer client).

  • The PuTTY SSH and Telnet client (or a similar SSH client).

  • A device simulator.

  1. Open the Oracle Internet of Things Cloud Service Management Console.
  2. Click the Menu (Menu icon) icon adjacent to the Oracle Internet of Things Cloud Service title on the Management Console.
  3. Click Applications.
  4. Click an application.
  5. Click Integration.
  6. Select one of these options:
    • If you have not previously created an integration, click Create Integration and then select Big Data Cloud Service.
    • If you have previously created an integration, click the Add (Add Icon) icon and then select Big Data Cloud Service.
  7. Complete these fields:
    • Name: Enter a unique name for the Oracle Big Data Cloud Service integration.
    • Description: Enter an optional description for the Oracle Big Data Cloud Service integration.
    • URL: Enter the URL for the Oracle Storage Cloud Service instance.
    • Identity Domain: Enter the identity domain for the Oracle Cloud Infrastructure Object Storage Classic instance.
    • Username: Enter the user name used to access the Oracle Cloud Infrastructure Object Storage Classic instance.
    • Password: Enter the password used to access the Oracle Cloud Infrastructure Object Storage Classic instance.
    • Container Name: Accept the default or enter a unique name for the Oracle Cloud Infrastructure Object Storage Classic container that stores Oracle Internet of Things Cloud Service data.
  8. (Optional) Click Verify Connectivity to test the connection between Oracle IoT Cloud Service and Oracle Storage Cloud Service.
    Common checks include the following; a command-line sketch for reproducing them manually appears after the list:
    • DNS Resolution Test: Checks whether the server name resolves to an IP address.

    • Connectivity Test: Checks whether the IP address is reachable.

    • IP/Port Test: Checks whether the target URL is reachable on the specified port.

    • SSL Verification Test: Checks whether you have a secured, or trusted, connection to the target. Applies only if the target is an SSL endpoint.

    • Authentication Test: Checks whether you can exchange messages with the target.

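    If any of these checks fail, you can reproduce them manually from a machine with network access to the storage endpoint. Here is a minimal sketch; the host name, identity domain, user name, and password are placeholders for your own values:

      # DNS Resolution and Connectivity: confirm the host name resolves (placeholder host)
      nslookup myidentitydomain.storage.oraclecloud.com

      # IP/Port and SSL Verification: confirm the HTTPS endpoint answers on port 443
      openssl s_client -connect myidentitydomain.storage.oraclecloud.com:443 </dev/null

      # Authentication: request a Swift-style auth token (placeholder identity domain and credentials)
      curl -i -H "X-Storage-User: Storage-myidentitydomain:myuser" \
           -H "X-Storage-Pass: mypassword" \
           https://myidentitydomain.storage.oraclecloud.com/auth/v1.0
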
  9. Click Create.
  10. Add a stream to the integration:
    1. Select the integration in the Integrations list and then click the Edit (Edit icon) icon.
    2. Click the Streams tab.
    3. Click Create a Stream.
    4. Select a message format in the Message Format list.
      The same message format cannot be used by multiple streams. Select a unique message format for each stream you create.
    5. Enter an object base name in the Object Base Name field. The object base name is the name used to identify the data files in the Oracle Cloud Infrastructure Object Storage Classic container. The naming convention for the data files includes the object base name and the creation date and time. For example, if the object base name is HVAC, the data file name is HVAC-2016-11-21-11-36-12.268.json.
    6. Select annotations in the Annotations list. Annotations identify the metadata sent to Oracle Cloud Infrastructure Object Storage Classic.

      If you previously created custom metadata keys for your devices and streams, then these also appear in the list of annotations.

      You can also type a new annotation name and select the data type for the annotation.

    7. Click the Add (Add icon) icon to add another stream or click Save to save the stream.
  11. Select Download BDCS Scripts to download a zip file containing scripts for importing data from Oracle Storage Cloud Service to Oracle Big Data Cloud Service. These scripts are customized for the streams configured with the integration and contain HiveQL and ODCP (Oracle Distributed Copy) commands for importing data:
    1. Select the integration in the Integrations list and then click the Edit (Edit icon) icon.
    2. Click the Streams tab.
    3. Click Download BDCS Scripts.
    4. Browse to a location where you want to save the BDCS scripts and then click Save.
  12. Send device data to the Oracle Cloud Infrastructure Object Storage Classic instance:
    1. Open a client device or simulator and send messages to Oracle Internet of Things Cloud Service.
    2. Select the integration in the Integrations list and then click the Edit (Edit icon) icon.
    3. Click Synchronize Now to send the message data to the Oracle Cloud Infrastructure Object Storage Classic instance.
  13. Configure Oracle Big Data Cloud Service for importing Oracle Cloud Infrastructure Object Storage Classic data:
    1. Identify the IP address of the BDCS Cluster node from the Big Data Cloud Service instance.
    2. Connect to the Big Data Cloud Service node as opc user by using PuTTY (on Windows) or SSH (on UNIX), as applicable.
    3. While connected to the Big Data Cloud Service node as opc user, use the bda-oss-admin add_swift_cred command to create an association between the Oracle Big Data Cloud Service instance and the Oracle Storage Cloud Service instance.

      The syntax for the bda-oss-admin command is shown below, followed by a sample invocation:

      bda-oss-admin                                   
      	--cm-url    [Cloudera Manager URL]        
      	--cm-admin  [Cloudera Manager username]   
      	--cm-passwd [Cloudera Manager password]   
      	add_swift_cred                            
      	--swift-username   [OSCS username]         
      	--swift-password   [OSCS password]
      	--swift-storageurl [OSCS URL] 
      	--swift-provider [provider name to create]
      	-N
      
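      For example, here is a hypothetical invocation; every value is a placeholder for your own Cloudera Manager and Oracle Storage Cloud Service details:

      # All values below are placeholders; substitute your own
      bda-oss-admin \
      	--cm-url https://cmhost.example.com:7183 \
      	--cm-admin admin \
      	--cm-passwd MyCmPassword \
      	add_swift_cred \
      	--swift-username myuser \
      	--swift-password MyStoragePassword \
      	--swift-storageurl https://myidentitydomain.storage.oraclecloud.com/auth/v1.0 \
      	--swift-provider myprovider \
      	-N
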
  14. Add operating system accounts on the BDCS instance so that user groups are accessible through the operating system. While connected to the Big Data Cloud Service node as opc user, run the following commands (a combined example follows the sub-steps):
    1. Run the sudo bash command to become the root user.
    2. Run the dcli -C 'groupadd group_name' command to create a user group.
    3. Run the dcli -C 'useradd -G group_name user_name' command to add a new user to the new group.
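    For example, assuming a hypothetical group name iotgroup and user my_user_name:

      # Become root, then create the group and user on all cluster nodes
      sudo bash
      dcli -C 'groupadd iotgroup'
      dcli -C 'useradd -G iotgroup my_user_name'
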
  15. Add the user name to the Kerberos KDC and to bdacloudservice.oracle.com by creating a principal. The user name must match the OS user created in step 14.
  16. Run the kadmin.local command to open the Kerberos administration console, followed by an addprinc user_name command to create the principal.

    In the following example, my_user_name is substituted for user_name. Here is sample output:

    bash-4.1# kadmin.local
    Authenticating as principal oracle/admin@BDACLOUDSERVICE.ORACLE.COM with password.
    kadmin.local:  addprinc my_user_name
    WARNING: no policy specified for my_user_name@BDACLOUDSERVICE.ORACLE.COM; defaulting to no policy
    Enter password for principal "my_user_name@BDACLOUDSERVICE.ORACLE.COM":
    Re-enter password for principal "my_user_name@BDACLOUDSERVICE.ORACLE.COM":
    Principal "my_user_name@BDACLOUDSERVICE.ORACLE.COM" created.
    1. Verify that the principal was created before exiting the kadmin.local console. Use the listprincs command in the kadmin.local console and look for the principal in the output:
      kadmin.local:  listprincs
      ...(lots of data)
      my_user_name@BDACLOUDSERVICE.ORACLE.COM
      ...(more data)
      kadmin.local:  q
      bash-4.1#
    2. Exit the kadmin.local console before proceeding to the next step.
    3. Run the sudo -u user_name kinit command to obtain a Kerberos ticket while connected to the Big Data Cloud Service node as opc user (see the example after these sub-steps).
    4. Enter the password for your cluster when the password prompt appears.
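    For example, assuming the principal my_user_name created in step 16:

      # Obtain a Kerberos ticket for the new user, then verify it
      sudo -u my_user_name kinit
      sudo -u my_user_name klist
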
  17. In a browser window, open Hue and log in as an administrator.
  18. Create an Oracle IoT Cloud Service user in Hue by entering these details:
    1. First and last names
    2. Email ID.
    3. User name as specified in step 14.
  19. Use the Cloudera Manager to add a jar file to the classpath:
    1. Open Cloudera Manager as an administrator.
    2. Click Hive.
    3. Click Configuration and enter aux in the search box.
    4. Note the value of Hive Auxiliary JARs directory.
    5. While connected to the Big Data Cloud Service node using PuTTY or SSH, search for the hive-hcatalog-core.jar file. Copy this file, or create a soft link to it, within the Hive Auxiliary JARs directory noted in the previous step (a command-line sketch appears after the final sub-step).

      Note:

      Back in Cloudera Manager, it may be necessary to add the location of hive-hcatalog-core.jar to the three options below the Hive Auxiliary JARs Directory option. These options are:
      • Hive Service Advanced Snippet (Safety Valve) for hive-site.xml

      • Gateway Client Environment Advanced Configuration Snippet (Safety Valve) for hive-env.sh

      • Hive Advanced Configuration Snippet (Safety Valve) for hive-site.xml

    6. Restart the HiveServer2, Hive, and Hue services.
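    Here is a command-line sketch of sub-step 5; both paths below are placeholders, so substitute the location reported by find and the Hive Auxiliary JARs directory noted from Cloudera Manager:

      # Locate the jar on the node
      find / -name hive-hcatalog-core.jar 2>/dev/null

      # Create a soft link in the Hive Auxiliary JARs directory (placeholder paths)
      ln -s /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar \
            /usr/local/hive-aux-jars/hive-hcatalog-core.jar
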
  20. In a browser window, open Hue and log in using the user name.
    1. Open File Browser.
    2. Create a destination folder path for incoming data. For example: /user/user_name/data.
    3. Create additional folder paths as required (an equivalent command-line sketch follows these sub-steps).
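    Alternatively, while connected to a cluster node, you can create the same folder from the command line. A sketch assuming the example path above and a valid Kerberos ticket for my_user_name:

      # Create the destination folder in HDFS and confirm it exists
      sudo -u my_user_name hdfs dfs -mkdir -p /user/my_user_name/data
      sudo -u my_user_name hdfs dfs -ls /user/my_user_name
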
  21. Using the WinSCP open source SFTP and FTP client for Microsoft Windows (or a similar tool), copy the zip file containing the BDCS scripts to a directory on a Big Data Cloud Service node.
  22. While connected to the Big Data Cloud Service node as opc user, extract the zip file.
  23. Inspect the ODCP script file, which ends in *_odcp-script.sh, and make any required changes before running the script.
  24. Run the *_odcp-script.sh script as opc user to manually move data from Oracle Storage Cloud Service to Oracle Big Data Cloud Service. Provide the following details when running the *_odcp-script.sh script command:

    • user_name: The user name specified in the bda-oss-admin command.

    • provider_name: The provider name specified in the bda-oss-admin command.

    • data_directory: The data directory created earlier using Hue.

    Example: MyApp_MyBDCS_odcp-script.sh my_user_name myprovider /user/my_user_name/data

  25. In a browser window, open Hue and log in as user_name.
    1. Open File Browser.
    2. Navigate to the destination folder path for incoming data.
    3. Verify that the data is present in the folder.
    4. Note the folder name for later use.
  26. Create an alias for a beeline command to be used by the oracle user to query the Oracle Internet of Things Cloud Service data imported into Oracle Big Data Cloud Service.
    1. Connect to a Big Data Cloud Service node as oracle user.
    2. Run this command to determine the value of TRUSTSTOREPATH for your instance:
      bdacli getinfo cluster_https_truststore_path
    3. Run this command to determine the value of TRUSTSTOREPWD for your instance:
      bdacli getinfo cluster_https_truststore_password
    4. Determine the value of HIVE_NODE for the Oracle Big Data Cloud Service instance.
    5. Determine the value of KERBEROS_REALM for the Oracle Big Data Cloud Service instance.
    6. Create the alias command by replacing the appropriate values in this code snippet (a filled-in example follows):
      alias bee="beeline -u \"jdbc:hive2://<HIVE_NODE>:10000/default;principal=hive/<HIVE_NODE>@<KERBEROS_REALM>;ssl=true;sslTrustStore=<TRUSTSTOREPATH>;trustStorePassword=<TRUSTSTOREPWD>\" -d org.apache.hive.jdbc.HiveDriver"
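      For example, with hypothetical values substituted; the Hive node, truststore path, and truststore password below are placeholders for the values determined in the previous sub-steps:

      # Placeholder host, truststore path, and password shown below
      alias bee="beeline -u \"jdbc:hive2://node01.example.com:10000/default;principal=hive/node01.example.com@BDACLOUDSERVICE.ORACLE.COM;ssl=true;sslTrustStore=/opt/cloudera/security/jks/node.truststore;trustStorePassword=MyTrustPwd\" -d org.apache.hive.jdbc.HiveDriver"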
  27. Obtain a Kerberos ticket as oracle user by invoking the kinit command.
    Among the downloaded BDCS scripts is a script named *_hive-create.sql. This script contains customized HiveQL commands for creating the tables that hold imported Oracle Internet of Things Cloud Service data. Inspect this file to see how the tables will be created and note the table names.
  28. Run the bee -f filename command to execute the *_hive-create.sql file and create the tables. For example, bee -f MyApp_MyBDCS_hive-create.sql.
  29. In a browser window, open Hue and log in as user_name to access Hive.
  30. Run this HiveQL command to associate the table created with the folder location of the imported Oracle Internet of Things Cloud Service data:
    ALTER TABLE <table name> SET LOCATION '<folder location>';
    For example, ALTER TABLE myappcontainer_readings SET LOCATION '/user/my_user_name/data';

    Now that the table and folder location are associated, you can use additional HiveQL commands to query the synchronized data.
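    For example, a hypothetical query run through the bee alias created in step 26, using the table name from the sample scripts:

      # Query the first ten imported records
      bee -e "SELECT * FROM myappcontainer_readings LIMIT 10;"
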

  31. Go to the Big Data Cloud Service Integration page in Oracle IoT Cloud Service to set up automatic synchronization.
    1. Select the Synchronization tab.
    2. Enable Auto Sync and choose a time interval.
  32. Connect to the Big Data Cloud Service node as opc user by using PuTTY (on Windows) or SSH (on UNIX), as applicable, to set up a cron job within Big Data Cloud Service.
  33. Create a script file with the following contents:
    1. The export and initialization of the environment variable HADOOP_CLASSPATH.
    2. The invocation of the *_odcp-script.sh script.

      Here is a sample runsync.sh file:

      #!/bin/bash
       
      # HADOOP_CLASSPATH may not be set in the cron environment, so it is set here.
      # The value was obtained by "env | grep HADOOP_CLASSPATH".
      export HADOOP_CLASSPATH=/opt/oracle/oraloader-3.7.0-h2/jlib/avro-1.7.3.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/avro-mapred-1.7.3-hadoop2.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/commons-math-2.2.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/jackson-core-asl-1.8.8.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/jackson-mapper-asl-1.8.8.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/ojdbc6.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/oraclepki.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/orai18n.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/oraloader-examples.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/oraloader.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/osdt_cert.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/osdt_core.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/ora-hadoop-common.jar:/opt/oracle/oraloader-3.7.0-h2/jlib/orabalancer.jar:/opt/oracle/bda/bdcs/bdcs-rest-api-app/current/lib-hadoop/*
      # Optionally print the effective Hadoop classpath (useful when debugging the cron environment)
      hadoop classpath
       
      LOG_DIR=/tmp/bdcs-iot-sync-logs
      if [ ! -d "$LOG_DIR" ] ; then
        mkdir $LOG_DIR
      fi
      outputFile=$LOG_DIR/bdcs-iot-sync-`date +%Y%m%d-%H%M`
       
      /home/my_user_name/scripts/MyApp_MyBDCS_odcp-script.sh my_user_name myprovider /user/my_user_name/data > $outputFile 2>&1
  34. Run the crontab -e command and set the frequency for the runsync.sh utility. For example, this crontab entry runs the runsync.sh utility at the 45th minute of every hour:
    45 * * * * /home/opc/scripts/runsync.sh
  35. In a browser window, open Hue and log in as user_name.
  36. Open File Browser and navigate to the destination folder path for incoming data.
  37. Verify that the data is present in the folder.