Load Oracle Identity Role Intelligence Data Using Data Ingestion (Day 0)
Introduction
Data ingestion is the process of importing data into the Oracle Identity Role Intelligence (OIRI) database. OIRI uses a Command Line Interface (CLI) tool to fetch and load data from external sources such as the Oracle Identity Governance (OIG) database or flat files.
As part of the data ingestion process, data is loaded into the following tables of the OIRI database:
- USERS
- APPLICATIONS
- ACCOUNTS
- ENTITLEMENTS
- ASSIGNED_ENTS
- ROLES
- ROLE_USER_MSHIP
- ROLE_ENT_COMPOSI
- ROLE_HIERARCHY
- ORGANIZATIONS
You can run the data import process in the following modes:
-
Full
You run data import in full mode when you first install OIRI and want to load all the data from the source to the OIRI database. This is called Day 0 data import. If you run data import in full mode after Day 0, which is referred to as Day N data import, when the OIRI database already contains entity data, the existing data is truncated and new data from the sources is loaded.
-
Incremental
You run data import in incremental mode on Day N, which matches the data in OIRI and the source, and loads only the data that has been added, updated, or deleted in the source after the last data import run.
When you install OIRI, you set the default data ingestion sources. These can include:
- OIG Database
- Flat Files
- OIG Database and Flat Files
Objectives
On completion of this lab, you should be able to load data into the OIRI database using the data ingestion method.
- Update data ingestion configuration
- Perform dry run of the data ingestion and validate the input database
- Perform the data ingestion process
- Review the results
Prerequisites
Before starting this tutorial you must have:
- A running Oracle Identity Role Intelligence container. Instructions on how to download and install the OIRI container can be found in Installing and Configuring Oracle Identity Role Intelligence
- Flat files containing data for loading into the OIRI database. Sample data used in this tutorial can be reviewed in the oiri_ding_etl_samples.zip
- NFS mount location exists that can be used for creating a persistent volume which will be used across the Kubernetes cluster nodes. For the purposes of this tutorial this is assumed to be
/nfs
. - The following directories exist (with the same or similar naming):
- DING :
/nfs/ding
- OIRI :
/nfs/oiri
- LOCAL :
/local/k8s/
- DING :
Configure the Data Ingestion Process
Before running the data ingestion process you should complete the following configuration tasks.
-
Copy the Kubernetes Certificate Authority certificate to the data ingestion directory,
/nfs/ding
.cp /etc/kubernetes/pki/ca.crt /nfs/ding/
-
Run the
oiri-ding
container. This allows you to issue data ingestion commands and configure data ingestion specific parameters.docker run -d --name ding-cli \ -v /nfs/ding/:/app/ \ -v /nfs/oiri/:/app/oiri \ -v /local/k8s/:/app/k8s \ oiri-ding-12.2.1.4.<TAG> \ tail -f /dev/null
Confirm that the container is running.
docker ps -a | grep ding-cli
Should return output similar to:
968676d7db91 oiri-ding-12.2.1.4.<TAG> "tail -f /dev/null" 21 seconds ago Up 14 seconds ding-cli
-
Connect to the
ding-cli
container and verify parameters.Connect to the
ding-cli
container.docker exec -it ding-cli bash
Run the
ding-cli
command to verify the data ingestion configuration.ding-cli --config=/app/data/conf/config.yaml data-ingestion verify /app/data/conf/data-ingestion-config.yaml
You will see output similar to the following.
###### ### # # ##### ##### # ### # # # ## # # # # # # # # # # # # # # # # # # # # # # # # #### ##### # # # # # # # # # # # # # # # # # # ## # # # # # # ###### ### # # ##### ##### ####### ### Verifying Data Ingestion Config. Wallet and DB connections will be validated SUCCESS: Data Ingestion Config is valid.
-
Verify that flat files are enabled as a data source.
Connect to the
ding-cli
container.docker exec -it ding-cli bash
View the
data-ingestion-config.yaml
file.view /app/data/conf/data-ingestion-config.yaml
Locate the
sources
section and confirm thatflatFile
is enabled and that OIGDB is disabled.sources: oracleIdentityGovernance: enabled: false version: 12.2.1.3 format: jdbc option: url: jdbc:oracle:thin:@oighost@example.com:1522/oimpdb.example.com driver: oracle.jdbc.driver.OracleDriver queryTimeout: 300 fetchSize: 50 flatFile: enabled: true version: 1.0 format: csv option: sep: ',' timestampFormat: yyyy-MM-dd
If flat files are not enabled or OIGDB is enabled as a data source then reset them using the
updateDataIngestionConfig.sh
script:./updateDataIngestionConfig.sh --useoigdbforetl false --useflatfileforetl true
You will see output similar to the following.
INFO: OIG DB as source for ETL is true INFO: Setting up /app/data/conf/config.yaml INFO: Setting up /app/data/conf/data-ingestion-config.yaml INFO: Setting up /app/data/conf/custom-attributes.yaml INFO: Setting up /app/oiri/data/conf/application.yaml INFO: Setting up /app/oiri/data/conf/authenticationConf.yaml INFO: Setting up /app/data/conf/dbconfig.yaml
-
Download and copy the sample data to your OIRI environment.
Download the oiri_ding_etl_samples.zip file and copy to a staging directory. Unzip the file.
cd /staging/ unzip oiri_ding_etl_samples.zip
You will see output similar to the following.
Archive: oiri_ding_etl_samples.zip creating: oiri_ding_etl_samples/ inflating: oiri_ding_etl_samples/accounts.csv inflating: oiri_ding_etl_samples/applications.csv inflating: oiri_ding_etl_samples/assignedEntitlements.csv inflating: oiri_ding_etl_samples/entitlements.csv inflating: oiri_ding_etl_samples/file_list_data_to_be_deleted.properties inflating: oiri_ding_etl_samples/oig_list_data_to_be_deleted.properties inflating: oiri_ding_etl_samples/roleEntitlementComposition.csv inflating: oiri_ding_etl_samples/roles.csv inflating: oiri_ding_etl_samples/roleUserMembership.csv inflating: oiri_ding_etl_samples/role_hierarchy.csv inflating: oiri_ding_etl_samples/users.csv inflating: oiri_ding_etl_samples/users_with_ca.csv
-
Copy sample files to the relevant location for loading into OIRI.
When you install OIRI fixed container locations are created into which you can put flat files containing entity data to be loaded. These locations are as follows.
- /app/data/input/users/
- /app/data/input/applications/
- /app/data/input/entitlements/
- /app/data/input/assignedentitlements/
- /app/data/input/roles/
- /app/data/input/rolehierarchy/
- /app/data/input/roleusermemberships/
- /app/data/input/roleentitlementcompositions/
- /app/data/input/accounts/
Recall that the host directory
/nfs/ding
is mapped to the container directory/app/data/
. Copy the sample files into the location on your host as directed in the table below, for example:cp /staging/oiri_ding_etl_samples/users.csv /nfs/ding/input/users/
Location Sample File /nfs/ding/input/users/ users.csv /nfs/ding/input/applications/ applications.csv /nfs/ding/input/entitlements/ entitlements.csv /nfs/ding/input/assignedentitlements/ assignedEntitlements.csv /nfs/ding/input/roles/ roles.csv /nfs/ding/input/rolehierarchy/ role_hierarchy.csv /nfs/ding/input/roleusermemberships/ roleUserMembership.csv /nfs/ding/input/roleentitlementcompositions/ roleEntitlementComposition.csv /nfs/ding/input/accounts/ accounts.csv Confirm that the sample files can be accessed from within the
oiri-ding
container.docker exec -it ding-cli bash
ls -ltr /app/data/input/*
You will see output similar to the following.
/app/data/input/users: total 40 -rwxrwxrwx 1 root root 38909 Apr 21 12:59 users.csv /app/data/input/applications: total 4 -rwxrwxrwx 1 root root 788 Apr 21 13:00 applications.csv /app/data/input/entitlements: total 8 -rwxrwxrwx 1 root root 8055 Apr 21 13:01 entitlements.csv /app/data/input/assignedentitlements: total 104 -rwxrwxrwx 1 root root 105099 Apr 21 13:02 assignedEntitlements.csv /app/data/input/roles: total 4 -rwxrwxrwx 1 root root 3064 Apr 21 13:02 roles.csv /app/data/input/rolehierarchy: total 4 -rwxrwxrwx 1 root root 266 Apr 21 13:03 role_hierarchy.csv /app/data/input/roleusermemberships: total 4 -rwxrwxrwx 1 root root 2631 Apr 21 13:04 roleUserMembership.csv /app/data/input/roleentitlementcompositions: total 12 -rwxrwxrwx 1 root root 10822 Apr 21 13:05 roleEntitlementComposition.csv /app/data/input/accounts: total 12 -rwxrwxrwx 1 root root 10898 Apr 21 13:05 accounts.csv
Perform a Dry Run of the Data Ingestion Process
Before running data ingestion you can perform a dry run to validate if the source data against the OIRI database. This process fetches data from the source, in this case flat files, and validates it against the metadata of the OIRI database. The dry run process will pick up any issues with the input data, for example, user name cannot be more than 50 characters, or duplicate entries are not permitted. A dry run can be executed with the following steps.
-
Connect to the
ding-cli
container and invoke the dry run.docker exec -it ding-cli bash
ding-cli --config=/app/data/conf/config.yaml data-ingestion dry-run /app/data/conf/data-ingestion-config.yaml
This will produce a long output, the top and tail of which are shown below.
ding-cli --config=/app/data/conf/config.yaml data-ingestion dry-run /app/data/conf/data-ingestion-config.yaml ###### ### # # ##### ##### # ### # # # ## # # # # # # # # # # # # # # # # # # # # # # # # #### ##### # # # # # # # # # # # # # # # # # # ## # # # # # # ###### ### # # ##### ##### ####### ### Extract and Validate Identity data from given Identity source to OIRI Database. Found driver jar file at : /ding-cli/driver/spark-apps.jar applicationJarLocation /ding-cli/driver/spark-apps.jar Apr 22, 2021 12:18:20 PM org.apache.spark.launcher.OutputRedirector redirect ... Successfully Initiated Extract and Validate Identity data from given Identity source to OIRI Database. INFO: Job submitted successfully. Please check the Spark driver logs for more details
This command picks each entity, fetches the data from the flat files, and validates it with the metadata defined in the OIRI database.
-
Review the Dry Run Data Import.
Sign into Identity Role Intelligence UI
In a browser, navigate to the following URL:
http://HOST_NAME:PORT/oiri/ui/v1/console
For example:
https://oirihost.example.com:30305/oiri/ui/v1/console
The OIRI login screen is displayed.
Enter the credentials for your environment and click Sign in .
In the OIRI UI Home screen select Data Import.
In the Manage Data Import screen you can search for the dry run import task that you want to review. When you have identified the task, select View Task or click on the task name.
The View Results screen is displayed which shows the imports for each of the entities specified in the flat files.
From here you can drill down into the details of the specific entities, as with the example below which shows the details for the User entity.
If you have errors in the source data, for example duplicate entries, then these will be flagged in the dry run results. In the example below, you can see that the Users entity has a duplicate entry for
EXT_USER_ID 70
.If the issue is not corrected in the source data, the subsequent data load will fail. In the case of the duplicate user entry, you will see a failed Spark pod as shown below. Interrogation of the pod logs shows that the duplicate causes a unique constraint error in the OIRI database.
Note: to access the pod information and logs you need to connect to the
oiri-cli
container which haskubectl
configured.docker exec -it oiri-cli bash
kubectl get pods -n ding
You will see output similar to the following.
NAME READY STATUS RESTARTS AGE oiri-ding-8337ee78f9845138-driver 0/1 Completed 0 25h oiri-ding-b70d9f78febf73ea-driver 0/1 Error 0 85m oiri-ding-baa5ce78fa1f4b2f-driver 0/1 Completed 0 22h oiri-ding-c9a54178fecedd1a-driver 0/1 Completed 0 68m spark-history-server-74799bd496-fznrr 1/1 Running 0 8d
For the driver pod that is in error, return the logs.
[oiri@oiri-cli scripts] kubectl logs oiri-ding-b70d9f78febf73ea-driver -n ding ... Caused by: java.sql.BatchUpdateException: ORA-00001: unique constraint (OII_OIRI.USERS_PK) violated at oracle.jdbc.driver.OraclePreparedStatement.executeLargeBatch(OraclePreparedStatement.java:9711) at oracle.jdbc.driver.T4CPreparedStatement.executeLargeBatch(T4CPreparedStatement.java:1447) at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:9487) at oracle.jdbc.driver.OracleStatementWrapper.executeBatch(OracleStatementWrapper.java:237) at oracle.oiri.apps.tasks.dal.impl.UserDataAccess.lambda$loadDataset$eefa0bc6$1(UserDataAccess.java:847) ... 13 more 21/04/23 12:43:17 INFO ShutdownHookManager: Shutdown hook called
Perform the Data Ingestion Process
Once you have reviewed the dry run, and corrected any errors, you can run the data import using the following steps.
-
Connect to the
oiri-ding
container and invoke the data ingestion process.docker exec -it ding-cli bash
ding-cli --config=/app/data/conf/config.yaml data-ingestion start /app/data/conf/data-ingestion-config.yaml
This will produce a long output, the top and tail of which are shown below.
ding-cli --config=/app/data/conf/config.yaml data-ingestion start /app/data/conf/data-ingestion-config.yaml ###### ### # # ##### ##### # ### # # # ## # # # # # # # # # # # # # # # # # # # # # # # # #### ##### # # # # # # # # # # # # # # # # # # ## # # # # # # ###### ### # # ##### ##### ####### ### Extract Transform Load Identity data from given Identity source to OIRI Database. Found driver jar file at : /ding-cli/driver/spark-apps.jar applicationJarLocation /ding-cli/driver/spark-apps.jar ... INFO: 21/04/22 15:17:52 INFO ShutdownHookManager: Deleting directory /tmp/spark-212ade80-69d0-47fe-9e23-2b80a50bfc03 INFO: Job submitted successfully. Please check the Spark driver logs for more details
-
Review the Data Import.
When you sign into Identity Role Intelligence UI, you will notice that the number of data loads has incremented by 1.
Select the data load and review the import in the same way as you did for the dry run.
The row count shows that 208 users have been added to the OIRI database. This can be confirmed by connecting to the OIRI database and selecting information from the
users
table. Notice that 208 rows are returned. Also, that the value of theUSER_ID
column has been prefixed with āFā to indicate that it was loaded from a flat file. OIG loaded data will be prefixed with āGā.
Related Links
Provide links to additional resources. This section is optional; delete if not needed.
Acknowledgements
- Authors - Mike Howlett
- Contributors - Anup Gautam, Gaurav Kumar, Rishi Agarwal, Tanmay Garg
More Learning Resources
Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.
For product documentation, visit Oracle Help Center.
Load Oracle Identity Role Intelligence Data Using Data Ingestion (Day 0)
F41671-02
May 2021
Copyright © 2021, Oracle and/or its affiliates.
This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.