Use OCI Anomaly Detection on Externally Stored IoT Data

If you are using the OCI Anomaly Detection AI Service, you can configure anomaly detection for externally stored IoT data.

OCI Anomaly Detection uses an Oracle-patented multivariate time-series anomaly detection algorithm originally developed by Oracle Labs. OCI Anomaly Detection helps avoid business disruptions through the early detection of multivariate anomalies.

To use OCI Anomaly Detection with IoT data, you must be using Oracle Autonomous Database. You can convert the sensor data stored in Oracle Autonomous Database into the format expected by the OCI Anomaly Detection service, and use this data to create and train the anomaly model.

See the following topics for more information on using Oracle Autonomous Database to externally store your IoT data:

OCI Anomaly Detection expects training data in the following format:

TimeStamp, [SensorColumn1], [SensorColumn2]…[SensorColumnN]

The following image displays a sample data segment in the desired format:

Sample OCI Anomaly Detection Training Data

Extract Distinct Entity Attribute Pairs from External IoT Data

To begin preparing the IoT data stored in Oracle Autonomous Database for OCI Anomaly Detection, we run a query to extract distinct entity-attribute pairs. These entity-attribute pairs will become our column headings in the training data.

Use a database client tool to make a connection with your Oracle Autonomous Database. We use the following query to filter out distinct entity-attribute pairs for the specified entity type and organization.

-- Obtains a list of entity/attribute instances which will become individual
--  column names in the MSET2 data view.
-- Note:
--  Replace <ENTITY_TYPE_ID> with the actual entity type id value in each WHERE clause to filter by entity type id. Remove if not needed.
--  Replace <ORG_ID> with the actual organization id value in each WHERE clause to filter by organization id. Remove if not needed.
--  Replace <ENTITIES_TABLE> with name of Data Externalization ENTITIES table name.
--  Replace <ATTRIBUTE_VALUES_TABLE> with name of Data Externalization ATTRIBUTE_VALUES table name.
--  Replace <ENTITY_TYPE_ATTR_TABLE> with name of Data Externalization ENTITY_TYPE_ATTR table name.
 
SELECT
    "ENTITY_ID",
    "ENTITY_NAME",
    "ATTRIBUTE_ID",
    "ATTRIBUTE_NAME"
FROM
    (
        SELECT DISTINCT
            "ENTITY_ID",
            "ATTRIBUTE_ID",
            "ENTITY_NAME"
        FROM
            (
                SELECT
                    "ENTITY_ID",
                    "ATTRIBUTE_ID"
                FROM
                    "<ATTRIBUTE_VALUES_TABLE>" inputdataset_0
                WHERE
                    inputdataset_0.ENTITY_TYPE_ID = '<ENTITY_TYPE_ID>' AND inputdataset_0.ORG_ID = '<ORG_ID>'
            ) selectcolumns_0
            INNER JOIN (
                SELECT
                    "ID",
                    "NAME" entity_name
                FROM
                    "<ENTITIES_TABLE>" inputdataset_1
                WHERE
                    inputdataset_1.ENTITY_TYPE_ID = '<ENTITY_TYPE_ID>' AND inputdataset_1.ORG_ID = '<ORG_ID>'
            ) selectcolumns_1 ON selectcolumns_0."ENTITY_ID" = selectcolumns_1."ID"
    ) selectcolumns_1
    INNER JOIN (
        SELECT
            "ID",
            "NAME" "ATTRIBUTE_NAME"
        FROM
            "<ENTITY_TYPE_ATTR_TABLE>" inputdataset_2
        WHERE
            inputdataset_2.ENTITY_TYPE_ID = '<ENTITY_TYPE_ID>'
    ) selectcolumns_2 ON selectcolumns_1."ATTRIBUTE_ID" = selectcolumns_2."ID"

Here is an example query for the preceding script:

-- Obtains a list of entity/attribute instances which will become individual
--  column names in the MSET2 data view.
-- Note:
--  Replace <ENTITY_TYPE_ID> with the actual entity type id value in each WHERE clause to filter by entity type id. Remove if not needed.
--  Replace <ORG_ID> with the actual organization id value in each WHERE clause to filter by organization id. Remove if not needed.
--  Replace <ENTITIES_TABLE> with name of Data Externalization ENTITIES table name.
--  Replace <ATTRIBUTE_VALUES_TABLE> with name of Data Externalization ATTRIBUTE_VALUES table name.
--  Replace <ENTITY_TYPE_ATTR_TABLE> with name of Data Externalization ENTITY_TYPE_ATTR table name.
 
SELECT
    "ENTITY_ID",
    "ENTITY_NAME",
    "ATTRIBUTE_ID",
    "ATTRIBUTE_NAME"
FROM
    (
        SELECT DISTINCT
            "ENTITY_ID",
            "ATTRIBUTE_ID",
            "ENTITY_NAME"
        FROM
            (
                SELECT
                    "ENTITY_ID",
                    "ATTRIBUTE_ID"
                FROM
                    "AI_ATTRIBUTE_VALUES" inputdataset_0
                WHERE
                    inputdataset_0.ENTITY_TYPE_ID = '3JB55AY42T90' AND inputdataset_0.ORG_ID = 'ORA_DEFAULT_ORG'
            ) selectcolumns_0
            INNER JOIN (
                SELECT
                    "ID",
                    "NAME" entity_name
                FROM
                    "AI_ENTITIES" inputdataset_1
                WHERE
                    inputdataset_1.ENTITY_TYPE_ID = '3JB55AY42T90' AND inputdataset_1.ORG_ID = 'ORA_DEFAULT_ORG'
            ) selectcolumns_1 ON selectcolumns_0."ENTITY_ID" = selectcolumns_1."ID"
    ) selectcolumns_1
    INNER JOIN (
        SELECT
            "ID",
            "NAME" "ATTRIBUTE_NAME"
        FROM
            "AI_ENTITY_TYPE_ATTR" inputdataset_2
        WHERE
            inputdataset_2.ENTITY_TYPE_ID = '3JB55AY42T90'
    ) selectcolumns_2 ON selectcolumns_1."ATTRIBUTE_ID" = selectcolumns_2."ID"

Here's some sample data returned by the query:

You can use the data returned by the query to define a naming convention for the column names in your training data. For example, you can choose between the following combinations:

ENTITY_ID_ATTRIBUTE_ID, such as 3jB5943W2T9G_3jB55B6W2T90.
ENTITY_ID_ATTRIBUTE_NAME, such as 3jB5943W2T9G_Humidity.
ENTITY_NAME_ATTRIBUTE_ID, such as Detector1_3jB55B6W2T90.
ENTITY_NAME_ATTRIBUTE_NAME, such as Detector1_Humidity.

When defining columns, choose column names that are unique and contain valid characters. Use a combination of ENTITY_ID/ENTITY_NAME and ATTRIBUTE_ID/ATTRIBUTE_NAME, making sure to remove any invalid characters. The column names should follow the Oracle rules for Database Object Names and Qualifiers.

Create a Database View Containing Formatted Data

Build the SQL command that creates a database view containing data in the format expected by the OCI Anomaly Detection service

Note that the number of MAX statements in the SQL is dynamic, as it depends on the number of Entity-Attribute pairs that you have.

-- Creates a database VIEW of MSET2 data out of Data Externalization data
--
-- Note: 
--  Replace <ENTITY_TYPE_ID> with the actual entity type id value in each WHERE clause to filter by entity type id. Remove if not needed.
--  Replace <ORG_ID> with the actual organization id value in each WHERE clause to filter by organization id. Remove if not needed. --  Replace <ENTITY_ID> with the actual entity id value.
--  Replace <ATTRIBUTE_ID> with the actual attribute id value
--  Replace <VIEW_NAME> with the actual name of the database VIEW
--  Replace <TIMEZONE_CODE> with the actual code for database timezone -- --  Replace <ENTITIES_TABLE> with name of Data Externalization ENTITIES table name.
--  Replace <ATTRIBUTE_VALUES_TABLE> with name of Data Externalization ATTRIBUTE_VALUES table name.
--  Replace <ENTITY_TYPE_ATTR_TABLE> with name of Data Externalization ENTITY_TYPE_ATTR table name.
 
 
CREATE VIEW <VIEW_NAME> AS
SELECT
    TO_CHAR(
        EVENT_TIME AT TIME ZONE '<TIMEZONE_CODE>',
        'YYYY-MM-DD"T"HH24:MI:SS"Z"'
     ) as TIMESTAMP,
 -- Create a MAX statement for each column-name as described in Step 4.
 ---- Begin MAX statement
    MAX(
        CASE
            WHEN("ATTRIBUTE_ID" = '<ATTRIBUTE_ID>'
                 AND "ENTITY_ID" = '<ENTITY_ID>') THEN
                "NUMERIC_VALUE"
        END
    ) "<ENTITY_NAME>_<ATTRIBUTE_NAME>",
 ---- End MAX statement.  Note: Comma is not needed in final MAX statement
FROM
    (
SELECT
    "ENTITY_ID",
    "ENTITY_NAME",
    "NUMERIC_VALUE",
    "EVENT_TIME",
    "ATTRIBUTE_ID",
    "ATTRIBUTE_NAME"
FROM
         (
        SELECT
            "ENTITY_ID",
            "ENTITY_NAME",
            "NUMERIC_VALUE",
            "EVENT_TIME",
            "ATTRIBUTE_ID"
        FROM
                 (
                SELECT
                    "ENTITY_ID",
                    "NUMERIC_VALUE",
                    "EVENT_TIME",
                    "ATTRIBUTE_ID"
                FROM
                    "<ATTRIBUTE_VALUES_TABLE>" inputdataset_0
                WHERE
                    inputdataset_0.ENTITY_TYPE_ID = '<ENTITY_TYPE_ID>' AND inputdataset_0.ORG_ID = '<ORG_ID>'
                        AND inputdataset_0.EVENT_TIME >= to_utc_timestamp_tz('YYYY-MM-DD"T"HH24:MI:SS"Z"')
                        AND inputdataset_0.EVENT_TIME <= to_utc_timestamp_tz('YYYY-MM-DD"T"HH24:MI:SS"Z"')            
            ) selectcolumns_0
            INNER JOIN (
                SELECT
                    "ID",
                    "NAME" entity_name
                FROM
                    "<ENTITIES_TABLE>" inputdataset_1
                WHERE
                    inputdataset_1.ENTITY_TYPE_ID = '<ENTITY_TYPE_ID>' AND inputdataset_1.ORG_ID = '<ORG_ID>'
            ) selectcolumns_1 ON selectcolumns_0."ENTITY_ID" = selectcolumns_1."ID"
    ) selectcolumns_2
    INNER JOIN (
        SELECT
            "NAME" attribute_name,
            "ID"
        FROM
            "<ENTITY_TYPE_ATTR_TABLE>" inputdataset_2
        WHERE
            inputdataset_2.ENTITY_TYPE_ID = '<ENTITY_TYPE_ID>'
    ) selectcolumns_3 ON selectcolumns_2."ATTRIBUTE_ID" = selectcolumns_3."ID"    )
GROUP BY
    "EVENT_TIME"

The following example helps better understand the script above.

-- Creates a database VIEW of MSET2 data out of Data Externalization data
--
-- Note: 
--  Replace <ENTITY_TYPE_ID> with the actual entity type id value in each WHERE clause to filter by entity type id. Remove if not needed.
--  Replace <ORG_ID> with the actual organization id value in each WHERE clause to filter by organization id. Remove if not needed. --  Replace <ENTITY_ID> with the actual entity id value.
--  Replace <ATTRIBUTE_ID> with the actual attribute id value
--  Replace <VIEW_NAME> with the actual name of the database VIEW
--  Replace <TIMEZONE_CODE> with the actual code for database timezone -- 
--  Replace <ENTITIES_TABLE> with name of Data Externalization ENTITIES table name.
--  Replace <ATTRIBUTE_VALUES_TABLE> with name of Data Externalization ATTRIBUTE_VALUES table name.
--  Replace <ENTITY_TYPE_ATTR_TABLE> with name of Data Externalization ENTITY_TYPE_ATTR table name.
 
 
CREATE VIEW DATA_VIEW AS
SELECT
    TO_CHAR(
        EVENT_TIME AT TIME ZONE 'UTC',
        'YYYY-MM-DD"T"HH24:MI:SS"Z"'
     ) as TIMESTAMP,
 -- Create a MAX statement for each column-name as described in Step 4.
 ---- Begin MAX statement
    MAX(
        CASE
            WHEN("ATTRIBUTE_ID" = '3JB55B6W2T90'
                 AND "ENTITY_ID" = '3JB5943W2T9G') THEN
                "NUMERIC_VALUE"
        END
    ) "Detector1_Humidity",
 ---- End MAX statement.  Note: Comma is not needed in final MAX statement
---- Begin MAX statement
    MAX(
        CASE
            WHEN("ATTRIBUTE_ID" = '3JB55B982T90'
                 AND "ENTITY_ID" = '3JB5943W2T9G') THEN
                "NUMERIC_VALUE"
        END
    ) "Detector1_Pressure",
 ---- End MAX statement.  Note: Comma is not needed in final MAX statement
---- Begin MAX statement
    MAX(
        CASE
            WHEN("ATTRIBUTE_ID" = '3JB55B4W2T90'
                 AND "ENTITY_ID" = '3JB5943W2T9G') THEN
                "NUMERIC_VALUE"
        END
    ) "Detector1_Temperature"
 ---- End MAX statement.  Note: Comma is not needed in final MAX statement
FROM
    (
SELECT
    "ENTITY_ID",
    "ENTITY_NAME",
    "NUMERIC_VALUE",
    "EVENT_TIME",
    "ATTRIBUTE_ID",
    "ATTRIBUTE_NAME"
FROM
         (
        SELECT
            "ENTITY_ID",
            "ENTITY_NAME",
            "NUMERIC_VALUE",
            "EVENT_TIME",
            "ATTRIBUTE_ID"
        FROM
                 (
                SELECT
                    "ENTITY_ID",
                    "NUMERIC_VALUE",
                    "EVENT_TIME",
                    "ATTRIBUTE_ID"
                FROM
                    "AI_ATTRIBUTE_VALUES" inputdataset_0
                WHERE
                    inputdataset_0.ENTITY_TYPE_ID = '3JB55AY42T90' AND inputdataset_0.ORG_ID = 'ORA_DEFAULT_ORG'
                        AND inputdataset_0.EVENT_TIME >= to_utc_timestamp_tz('2022-04-21T01:00:00Z')
                        AND inputdataset_0.EVENT_TIME <= to_utc_timestamp_tz('2022-04-21T10:00:00Z')            
            ) selectcolumns_0
            INNER JOIN (
                SELECT
                    "ID",
                    "NAME" entity_name
                FROM
                    "AI_ENTITIES" inputdataset_1
                WHERE
                    inputdataset_1.ENTITY_TYPE_ID = '3JB55AY42T90' AND inputdataset_1.ORG_ID = 'ORA_DEFAULT_ORG'
            ) selectcolumns_1 ON selectcolumns_0."ENTITY_ID" = selectcolumns_1."ID"
    ) selectcolumns_2
    INNER JOIN (
        SELECT
            "NAME" attribute_name,
            "ID"
        FROM
            "AI_ENTITY_TYPE_ATTR" inputdataset_2
        WHERE
            inputdataset_2.ENTITY_TYPE_ID = '3JB55AY42T90'
    ) selectcolumns_3 ON selectcolumns_2."ATTRIBUTE_ID" = selectcolumns_3."ID"    )
GROUP BY
    "EVENT_TIME"

The following image displays sample data from the view that we created:

Create Vault and Secrets to Store Your Database Credentials and Connection Details

Use the OCI vault to store your Oracle Autonomous Database credentials and connection details. We use these secrets to configure the data asset for OCI anomaly detection.

Create your OCI Vault.
After logging in to Oracle Cloud, navigate to Menu > Identity & Security > Vault.
Create a master encryption key in the vault.
The master encryption key is used to encrypt the secrets that you store in the vault.
Create the secrets to store your database user credentials and wallet password.
Create the secrets for the database wallet files.
You can download the wallet from the Database Connection page of your Oracle Autonomous Database console. Create secrets for the database wallet files (cwallet.sso, ewallet.p12, keystore.jks, ojdbc.properties, tnsnames.ora, truststore.jks). The wallet files' content should be base64-encoded.

Create and Train the Anomaly Detection Model

Create an anomaly detection project and add a data asset corresponding to your Oracle Autonomous Database view. Next, use the data asset to create and train the anomaly detection model.

For detailed information on OCI Anomaly Detection, refer to the OCI documentation:

Anomaly Detection Documentation

Create a new anomaly detection project.
To create an anomaly detection project in Oracle Cloud, navigate to Menu > Analytics & AI > Anomaly Detection.

Projects are collaborative workspaces for organizing data assets, models, and detection portals.
In the project, create a data asset of the Oracle Autonomous Transaction Processing type.
Click Data Assets > Create Data Asset and complete the requisite fields.

When you select Oracle Autonomous Transaction Processing, as the data asset Type, you get additional fields for specifying the credentials and vault secrets containing the database connection details.

Use the view name containing formatted data for the database Table Name.
Create and train the anomaly detection model.

Use the data asset that you created to create and train the model.

Export Data for Anomaly Detection

Select data for the desired time period, and export it into a csv file for anomaly detection.

The following sample script exports data from the DATA_VIEW view. The view contains the timestamp column and various attribute columns. The script extracts data for the specified time period.

set feedback off
set markup csv on quote off
set heading off
spool "output.csv"
select 'timestamp,Detector1_Humidity,Detector1_Pressure,Detector1_Temperature' from DUAL;
select mv.timestamp as "timestamp", "Detector1_Humidity", "Detector1_Pressure", "Detector1_Temperature" from (select timestamp, to_timestamp(timestamp, 'YYYY-MM-DD"T"HH24:MI:SS"Z"') as ts, "Detector1_Humidity", "Detector1_Pressure", "Detector1_Temperature" from DATA_VIEW) mv
where mv.ts > to_timestamp('2022-04-21 00:00:00', 'YYYY-mm-DD HH24:MI:SS') and mv.ts < to_timestamp('2022-04-21 23:55:43', 'YYYY-mm-DD HH24:MI:SS') order by mv.ts;
spool off;
quit;

Detect Anomalies Using OCI Anomaly Detection

Use the exported csv file to detect anomalies using a previously created model.

To start the process of anomaly detection, click Detect Anomalies on the anomaly model page.

Select an exported csv data file to detect anomalies.

For detailed information on OCI Anomaly Detection, refer to the OCI documentation:

Anomaly Detection Documentation