Load Oracle Identity Role Intelligence Data Using Data Ingestion (Day 0)

Introduction

Data ingestion is the process of importing data into the Oracle Identity Role Intelligence (OIRI) database. OIRI uses a Command Line Interface (CLI) tool to fetch and load data from external sources such as the Oracle Identity Governance (OIG) database or flat files.

As part of the data ingestion process, data is loaded into the entity tables of the OIRI database, covering users, accounts, applications, entitlements, assigned entitlements, roles, role hierarchies, role user memberships, and role entitlement compositions.

You can run the data import process in two modes: a dry run, which fetches and validates the source data without loading it, and a full run, which loads the data into the OIRI database. Both modes are covered in this lab.

When you install OIRI, you set the default data ingestion sources. These can include the OIG database and flat (CSV) files; this lab uses flat files.

Objectives

On completion of this lab, you should be able to load data into the OIRI database using the data ingestion method.

Prerequisites

Before starting this tutorial, you must have a working OIRI installation, access to the Docker host on which the oiri-cli and oiri-ding container images are available, and a running Kubernetes cluster in which the ding namespace has been configured.

Configure the Data Ingestion Process

Before running the data ingestion process, complete the following configuration tasks.

  1. Copy the Kubernetes Certificate Authority certificate to the data ingestion directory, /nfs/ding.

    cp /etc/kubernetes/pki/ca.crt /nfs/ding/
    
  2. Run the oiri-ding container. This container mounts the /nfs/ding, /nfs/oiri, and Kubernetes configuration directories, and allows you to issue data ingestion commands and configure data ingestion-specific parameters.

    docker run -d --name ding-cli \
    -v /nfs/ding/:/app/ \
    -v /nfs/oiri/:/app/oiri \
    -v /local/k8s/:/app/k8s \
    oiri-ding-12.2.1.4.<TAG> \
    tail -f /dev/null
    

    Confirm that the container is running.

    docker ps -a | grep ding-cli
    

    You will see output similar to the following.

    968676d7db91     oiri-ding-12.2.1.4.<TAG>      "tail -f /dev/null"      21 seconds ago      Up 14 seconds                                                               ding-cli
    
  3. Connect to the ding-cli container and verify parameters.

    Connect to the ding-cli container.

    docker exec -it ding-cli bash
    

    Run the ding-cli command to verify the data ingestion configuration.

    ding-cli --config=/app/data/conf/config.yaml data-ingestion verify /app/data/conf/data-ingestion-config.yaml
    

    You will see output similar to the following.

    ######  ### #     #  #####         #####  #       ###
    #     #  #  ##    # #     #       #     # #        #
    #     #  #  # #   # #             #       #        #
    #     #  #  #  #  # #  #### ##### #       #        #
    #     #  #  #   # # #     #       #       #        #
    #     #  #  #    ## #     #       #     # #        #
    ######  ### #     #  #####         #####  ####### ###
    Verifying Data Ingestion Config. Wallet and DB connections will be validated
    SUCCESS: Data Ingestion Config is valid.
    
  4. Verify that flat files are enabled as a data source.

    Connect to the ding-cli container.

    docker exec -it ding-cli bash
    

    View the data-ingestion-config.yaml file.

    view /app/data/conf/data-ingestion-config.yaml
    

    Locate the sources section and confirm that flatFile is enabled and that oracleIdentityGovernance (the OIG database) is disabled.

    sources:
      oracleIdentityGovernance:
        enabled: false
        version: 12.2.1.3
        format: jdbc
        option:
          url: jdbc:oracle:thin:@oighost.example.com:1522/oimpdb.example.com
          driver: oracle.jdbc.driver.OracleDriver
          queryTimeout: 300
          fetchSize: 50
      flatFile:
        enabled: true
        version: 1.0
        format: csv
        option:
          sep: ','
          timestampFormat: yyyy-MM-dd
    

    If flat files are not enabled, or the OIG database is enabled as a data source, reset them using the updateDataIngestionConfig.sh script:

    ./updateDataIngestionConfig.sh --useoigdbforetl false --useflatfileforetl true
    

    You will see output similar to the following.

    INFO: OIG DB as source for ETL is false
    INFO: Setting up /app/data/conf/config.yaml
    INFO: Setting up /app/data/conf/data-ingestion-config.yaml
    INFO: Setting up /app/data/conf/custom-attributes.yaml
    INFO: Setting up /app/oiri/data/conf/application.yaml
    INFO: Setting up /app/oiri/data/conf/authenticationConf.yaml
    INFO: Setting up /app/data/conf/dbconfig.yaml
    
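
    After updating the configuration, re-run the verification from step 3 to confirm that the new source settings are valid:

    ding-cli --config=/app/data/conf/config.yaml data-ingestion verify /app/data/conf/data-ingestion-config.yaml
    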
  5. Download and copy the sample data to your OIRI environment.

    Download the oiri_ding_etl_samples.zip file, copy it to a staging directory, and unzip it.

    cd /staging/
    unzip oiri_ding_etl_samples.zip
    

    You will see output similar to the following.

    Archive:  oiri_ding_etl_samples.zip
    creating: oiri_ding_etl_samples/
    inflating: oiri_ding_etl_samples/accounts.csv
    inflating: oiri_ding_etl_samples/applications.csv
    inflating: oiri_ding_etl_samples/assignedEntitlements.csv
    inflating: oiri_ding_etl_samples/entitlements.csv
    inflating: oiri_ding_etl_samples/file_list_data_to_be_deleted.properties
    inflating: oiri_ding_etl_samples/oig_list_data_to_be_deleted.properties
    inflating: oiri_ding_etl_samples/roleEntitlementComposition.csv
    inflating: oiri_ding_etl_samples/roles.csv
    inflating: oiri_ding_etl_samples/roleUserMembership.csv
    inflating: oiri_ding_etl_samples/role_hierarchy.csv
    inflating: oiri_ding_etl_samples/users.csv
    inflating: oiri_ding_etl_samples/users_with_ca.csv
    
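
    Before copying the files into place in the next step, you can inspect the expected column headers directly from the samples, for example:

    # Show the header row and the first two data rows of the sample users file.
    head -3 /staging/oiri_ding_etl_samples/users.csv
    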
  6. Copy sample files to the relevant location for loading into OIRI.

    When you install OIRI, fixed container locations are created into which you can place the flat files containing the entity data to be loaded. These locations are as follows.

    • /app/data/input/users/
    • /app/data/input/applications/
    • /app/data/input/entitlements/
    • /app/data/input/assignedentitlements/
    • /app/data/input/roles/
    • /app/data/input/rolehierarchy/
    • /app/data/input/roleusermemberships/
    • /app/data/input/roleentitlementcompositions/
    • /app/data/input/accounts/

    Recall that the host directory /nfs/ding is mapped to the container directory /app/data/. Copy each sample file into the corresponding host location shown in the table below, for example:

    cp /staging/oiri_ding_etl_samples/users.csv /nfs/ding/input/users/
    
    Location                                       Sample File
    /nfs/ding/input/users/                         users.csv
    /nfs/ding/input/applications/                  applications.csv
    /nfs/ding/input/entitlements/                  entitlements.csv
    /nfs/ding/input/assignedentitlements/          assignedEntitlements.csv
    /nfs/ding/input/roles/                         roles.csv
    /nfs/ding/input/rolehierarchy/                 role_hierarchy.csv
    /nfs/ding/input/roleusermemberships/           roleUserMembership.csv
    /nfs/ding/input/roleentitlementcompositions/   roleEntitlementComposition.csv
    /nfs/ding/input/accounts/                      accounts.csv
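
    To copy all of the sample files in one pass, you can use a short shell loop along the following lines (a sketch assuming bash 4 or later and the staging path used above):

    # Map each input directory under /nfs/ding to its sample file, then copy.
    declare -A files=(
      [users]=users.csv
      [applications]=applications.csv
      [entitlements]=entitlements.csv
      [assignedentitlements]=assignedEntitlements.csv
      [roles]=roles.csv
      [rolehierarchy]=role_hierarchy.csv
      [roleusermemberships]=roleUserMembership.csv
      [roleentitlementcompositions]=roleEntitlementComposition.csv
      [accounts]=accounts.csv
    )
    for dir in "${!files[@]}"; do
      cp "/staging/oiri_ding_etl_samples/${files[$dir]}" "/nfs/ding/input/$dir/"
    done
    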

    Confirm that the sample files can be accessed from within the oiri-ding container.

    docker exec -it ding-cli bash
    
    ls -ltr /app/data/input/*
    

    You will see output similar to the following.

    /app/data/input/users:
    total 40
    -rwxrwxrwx 1 root root 38909 Apr 21 12:59 users.csv
    
    /app/data/input/applications:
    total 4
    -rwxrwxrwx 1 root root 788 Apr 21 13:00 applications.csv
    
    /app/data/input/entitlements:
    total 8
    -rwxrwxrwx 1 root root 8055 Apr 21 13:01 entitlements.csv
    
    /app/data/input/assignedentitlements:
    total 104
    -rwxrwxrwx 1 root root 105099 Apr 21 13:02 assignedEntitlements.csv
    
    /app/data/input/roles:
    total 4
    -rwxrwxrwx 1 root root 3064 Apr 21 13:02 roles.csv
    
    /app/data/input/rolehierarchy:
    total 4
    -rwxrwxrwx 1 root root 266 Apr 21 13:03 role_hierarchy.csv
    
    /app/data/input/roleusermemberships:
    total 4
    -rwxrwxrwx 1 root root 2631 Apr 21 13:04 roleUserMembership.csv
    
    /app/data/input/roleentitlementcompositions:
    total 12
    -rwxrwxrwx 1 root root 10822 Apr 21 13:05 roleEntitlementComposition.csv
    
    /app/data/input/accounts:
    total 12
    -rwxrwxrwx 1 root root 10898 Apr 21 13:05 accounts.csv
    

Perform a Dry Run of the Data Ingestion Process

Before running data ingestion, you can perform a dry run to validate the source data against the OIRI database. This process fetches data from the source, in this case flat files, and validates it against the metadata of the OIRI database. The dry run picks up issues with the input data, for example, a user name longer than 50 characters or a duplicate entry. An optional pre-check of the source files is sketched below; the dry run itself is executed with the following steps.
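
For example, a quick way to spot duplicate identifiers before submitting the dry run is to scan the first column of the users file (this sketch assumes the first CSV column holds the user identifier and that the file has a header row; adjust to match your data):

    # Skip the header row, extract the first (identifier) column,
    # and print any value that occurs more than once.
    tail -n +2 /nfs/ding/input/users/users.csv | cut -d',' -f1 | sort | uniq -d
    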

  1. Connect to the ding-cli container and invoke the dry run.

    docker exec -it ding-cli bash
    
    ding-cli --config=/app/data/conf/config.yaml data-ingestion dry-run /app/data/conf/data-ingestion-config.yaml
    

    This will produce a long output, the top and tail of which are shown below.

    ding-cli --config=/app/data/conf/config.yaml data-ingestion dry-run /app/data/conf/data-ingestion-config.yaml
    ######  ### #     #  #####         #####  #       ###
    #     #  #  ##    # #     #       #     # #        #
    #     #  #  # #   # #             #       #        #
    #     #  #  #  #  # #  #### ##### #       #        #
    #     #  #  #   # # #     #       #       #        #
    #     #  #  #    ## #     #       #     # #        #
    ######  ### #     #  #####         #####  ####### ###
    Extract and Validate Identity data from given Identity source to OIRI Database.
    Found driver jar file at : /ding-cli/driver/spark-apps.jar
    applicationJarLocation  /ding-cli/driver/spark-apps.jar
    Apr 22, 2021 12:18:20 PM org.apache.spark.launcher.OutputRedirector redirect
    ...
    Successfully Initiated Extract and Validate Identity data from given Identity source to OIRI Database.
    INFO: Job submitted successfully. Please check the Spark driver logs for more details
    

    This command picks up each entity, fetches the data from the flat files, and validates it against the metadata defined in the OIRI database.

  2. Review the Dry Run Data Import.

    Sign in to the Oracle Identity Role Intelligence UI.

    In a browser, navigate to the following URL:

    https://HOST_NAME:PORT/oiri/ui/v1/console

    For example:

    https://oirihost.example.com:30305/oiri/ui/v1/console

    The OIRI login screen is displayed.

    OIRI Login Screen

    Enter the credentials for your environment and click Sign in.

    Sign In

    In the OIRI UI Home screen select Data Import.

    Data Import

    In the Manage Data Import screen you can search for the dry run import task that you want to review. When you have identified the task, select View Task or click on the task name.

    View Task

    The View Results screen is displayed which shows the imports for each of the entities specified in the flat files.

    View Task

    From here you can drill down into the details of specific entities, as in the example below, which shows the details for the User entity.

    Users Entity Detail

    If you have errors in the source data, for example duplicate entries, then these will be flagged in the dry run results. In the example below, you can see that the Users entity has a duplicate entry for EXT_USER_ID 70.

    Users Entity Detail

    If the issue is not corrected in the source data, the subsequent data load will fail. In the case of the duplicate user entry, you will see a failed Spark pod as shown below. Inspecting the pod logs shows that the duplicate causes a unique constraint error in the OIRI database.

    Note: To access the pod information and logs, connect to the oiri-cli container, which has kubectl configured.

    docker exec -it oiri-cli bash
    
    kubectl get pods -n ding
    

    You will see output similar to the following.

    NAME                                    READY   STATUS      RESTARTS   AGE
    oiri-ding-8337ee78f9845138-driver       0/1     Completed   0          25h
    oiri-ding-b70d9f78febf73ea-driver       0/1     Error       0          85m
    oiri-ding-baa5ce78fa1f4b2f-driver       0/1     Completed   0          22h
    oiri-ding-c9a54178fecedd1a-driver       0/1     Completed   0          68m
    spark-history-server-74799bd496-fznrr   1/1     Running     0          8d
    
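
    If many ingestion jobs have run, you can narrow the listing to pods that failed (a driver container that exits with an error leaves its pod in the Failed phase):

    kubectl get pods -n ding --field-selector=status.phase=Failed
    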

    For the driver pod that is in error, retrieve the logs.

    [oiri@oiri-cli scripts] kubectl logs oiri-ding-b70d9f78febf73ea-driver -n ding
    ...
    Caused by: java.sql.BatchUpdateException: ORA-00001: unique constraint (OII_OIRI.USERS_PK) violated
    
         at oracle.jdbc.driver.OraclePreparedStatement.executeLargeBatch(OraclePreparedStatement.java:9711)
         at oracle.jdbc.driver.T4CPreparedStatement.executeLargeBatch(T4CPreparedStatement.java:1447)
         at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:9487)
         at oracle.jdbc.driver.OracleStatementWrapper.executeBatch(OracleStatementWrapper.java:237)
         at oracle.oiri.apps.tasks.dal.impl.UserDataAccess.lambda$loadDataset$eefa0bc6$1(UserDataAccess.java:847)
         ... 13 more
    21/04/23 12:43:17 INFO ShutdownHookManager: Shutdown hook called
    
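
    The duplicate row must be removed from the source file before the load is re-run. One illustrative way to do this (assuming the user identifier is the first CSV column) is to keep the header row plus only the first occurrence of each identifier:

    # Keep line 1 (the header) and the first row seen for each identifier.
    awk -F',' 'NR==1 || !seen[$1]++' users.csv > users_dedup.csv
    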

Perform the Data Ingestion Process

Once you have reviewed the dry run and corrected any errors, you can run the data import using the following steps.

  1. Connect to the ding-cli container and invoke the data ingestion process.

    docker exec -it ding-cli bash
    
    ding-cli --config=/app/data/conf/config.yaml data-ingestion start /app/data/conf/data-ingestion-config.yaml
    

    This will produce a long output, the top and tail of which are shown below.

    ding-cli --config=/app/data/conf/config.yaml data-ingestion start /app/data/conf/data-ingestion-config.yaml
    ######  ### #     #  #####         #####  #       ###
    #     #  #  ##    # #     #       #     # #        #
    #     #  #  # #   # #             #       #        #
    #     #  #  #  #  # #  #### ##### #       #        #
    #     #  #  #   # # #     #       #       #        #
    #     #  #  #    ## #     #       #     # #        #
    ######  ### #     #  #####         #####  ####### ###
    Extract Transform Load Identity data from given Identity source to OIRI Database.
    Found driver jar file at : /ding-cli/driver/spark-apps.jar
    applicationJarLocation  /ding-cli/driver/spark-apps.jar
    ...
    INFO: 21/04/22 15:17:52 INFO ShutdownHookManager: Deleting directory /tmp/spark-212ade80-69d0-47fe-9e23-2b80a50bfc03
    INFO: Job submitted successfully. Please check the Spark driver logs for more details
    
  2. Review the Data Import.

    When you sign in to the Oracle Identity Role Intelligence UI, you will notice that the number of data loads has increased by one.

    Data Import

    Select the data load and review the import in the same way as you did for the dry run.

    Users Entity Detail

    The row count shows that 208 users have been added to the OIRI database. You can confirm this by connecting to the OIRI database and querying the users table, as sketched below: 208 rows are returned. Notice also that the USER_ID column values are prefixed with 'F' to indicate that they were loaded from a flat file; data loaded from OIG is prefixed with 'G'.

    Users Entity Detail
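
    A minimal query of this kind might look as follows (the SQL*Plus connect string is a placeholder for your environment; the OII_OIRI schema name is taken from the unique constraint shown in the driver log earlier):

    # Count the users loaded from flat files; their USER_ID values begin with 'F'.
    sqlplus oiri_db_user@oiridb <<'EOF'
    SELECT COUNT(*) FROM oii_oiri.users WHERE user_id LIKE 'F%';
    EOF
    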

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.