Installation

After you download the installer kit, you can begin the installation, that is, the deployment of the images and containers in the Kubernetes environment. Perform the following steps to complete the installation:

·        Extract the Installer Kit

·        Place Files in the Installation Directories

·        Generate an Encrypted Password

·        Generate the Public and Private Keys

·        Apply GDPR and Redaction Changes for FCC Studio

·        Configure the Preferred Services

·        Configure the studio-env.yml File

·        Configure the Extract Transfer and Load (ETL) Process

·        Verify the Resource Allocation for FCC Studio Services

·        Deploy FCC Studio on the K8s Cluster

Extract the Installer Kit

After downloading the installer archive, follow these steps to extract its contents:

1.     Extract the contents of the installer archive file in the download directory using the following commands:

unzip <FCC_Studio_Installer_Archive_File>_1of2.zip

unzip <FCC_Studio_Installer_Archive_File>_2of2.zip

 

Both FCC Studio installer files are extracted to the same directory, which creates the OFS_FCCM_STUDIO directory. This directory is referred to as <Studio_Installation_Path> throughout this guide.

 

WARNING

Do not rename the application installer directory after extracting it from the archive.

 

2.     Navigate to the download directory where the installer archive is extracted and assign execute permission to the installer directory using the following command:

chmod -R 0755 OFS_FCCM_STUDIO

 

Place Files in the Installation Directories

To place the required jars and Kerberos files in the required locations, follow these steps:

1.     To place the additional jar files, follow these steps:

a.     Navigate to the <Studio_Installation_Path>/batchservice/user/lib directory.

b.     Place the following additional jar files:

§       hive-exec-*.jar. For example, hive-exec-1.1.0.jar.

§       HiveJDBC4.jar

§       hive-metastore-*.jar. For example, hive-metastore-1.1.0.jar.

§       hive-service-*.jar. For example, hive-service-1.1.0.jar.

 

NOTE

·        The jar versions are client- or user-specific. You can obtain these jars from the existing jars of the Cloudera installation.

·        The HiveJDBC4.jar file is not available in the Cloudera setup. You must download it from the Cloudera website.

 

2.     To place the Kerberos files, follow these steps:    

a.     Navigate to the <Studio_Installation_Path>/batchservice/user/conf directory.

b.     Place the following Kerberos files:

§       krb5.conf

§       The keytab file, with the file name that is mentioned in the config.sh file.
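
A minimal sketch of both placement steps, assuming the jars and Kerberos files were already downloaded to the current working directory (the keytab file name, fccstudio.keytab, is a placeholder; use the name from your config.sh):

cp hive-exec-1.1.0.jar HiveJDBC4.jar hive-metastore-1.1.0.jar hive-service-1.1.0.jar <Studio_Installation_Path>/batchservice/user/lib/

cp krb5.conf fccstudio.keytab <Studio_Installation_Path>/batchservice/user/conf/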

Generate an Encrypted Password

To generate an encrypted password, follow these steps:

1.     Export the FIC_DB_HOME path, pointing to the <Studio_Installation_Path>/ficdb directory.

2.     Run the echo $FIC_DB_HOME command to verify that the path is set.

3.     Go to the <Studio_Installation_Path>/ficdb/bin directory and run the ./FCCM_Studio_Base64Encoder.sh <password to be encrypted> command.
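
For example, the full sequence can look like this (the password shown is a placeholder; the command prints the Base64-encoded value):

export FIC_DB_HOME=<Studio_Installation_Path>/ficdb

echo $FIC_DB_HOME

cd <Studio_Installation_Path>/ficdb/bin

./FCCM_Studio_Base64Encoder.sh MyPassw0rd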

Generate the Public and Private Keys

The public and private keys are used for the JSON Web Tokens (JWT) that are generated for PGX authentication from FCC Studio.

To generate the keys, follow these steps:

 

NOTE

The following steps are mandatory for a first-time FCC Studio installation.

 

1.     Navigate to the <Studio_Installation_Path>/ficdb/bin directory.

2.     Run the FCCM_Studio_JWT_Keygen.sh shell script from this directory.

The Public and Private Keys are generated and available in the <Studio_Installation_Path>/secrets/keys directory.

3.     Copy the private.key and public.key files to the <Studio_Installation_Path>/ficdb/conf directory.
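
A consolidated sketch of these steps, run from the ficdb/bin directory (the relative paths resolve to the secrets/keys and ficdb/conf directories named above):

cd <Studio_Installation_Path>/ficdb/bin

./FCCM_Studio_JWT_Keygen.sh

cp ../../secrets/keys/private.key ../../secrets/keys/public.key ../conf/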


Apply GDPR and Redaction Changes for FCC Studio

The General Data Protection Regulation (GDPR) is a regulation in EU law on data protection and privacy in the European Union and the European Economic Area. You can apply the GDPR changes that are required for FCC Studio.

To apply GDPR and Redaction, you must configure the following:

·        Generate the Key Store File for Secure Batch Service

·        Add the Batch Service (SSL) to PGX Trust Store

Generate the Key Store File for Secure Batch Service

Generating the key store file for the secure Batch Service involves generating the key store parameters and switching the Batch Service from the HTTP to the HTTPS protocol.

To configure the Key Store file for Secure Batch Service, follow these steps:

1.     Run the keytool -genkey -alias batchservice -keyalg RSA -keysize 2048 -keystore <Studio_Installation_Path>/OFS_FCCM_STUDIO/batchservice/conf/<Keystore file name>.jks command in the Studio Server.

When generating the key store, ensure that you provide the hostname as the first name. For example:

Question: What is your first and last name?

Answer: Provide the batch service name.

2.     Specify the keystore password. The <Keystore file name>.jks file is created in the <Studio_Installation_Path>/OFS_FCCM_STUDIO/batchservice/conf directory.

3.     Specify the following parameters in the config.sh file.

§       export KEYSTORE_FILE_NAME=<Keystore file name>.jks

§       export KEYSTORE_PASS=<Keystore password>
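
Put together, a hedged end-to-end sketch of this procedure, using batchservice-keystore.jks and MyKeyst0rePass as example values:

keytool -genkey -alias batchservice -keyalg RSA -keysize 2048 -keystore <Studio_Installation_Path>/OFS_FCCM_STUDIO/batchservice/conf/batchservice-keystore.jks

# When prompted "What is your first and last name?", enter the batch service name.

# Then, in config.sh:

export KEYSTORE_FILE_NAME=batchservice-keystore.jks

export KEYSTORE_PASS=MyKeyst0rePass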

Add the Batch Service (SSL) to PGX Trust Store

Adding the Batch Service (SSL) to the PGX trust store enables you to apply redaction on the graph in the Batch Service and connect with PGX.

To add the Batch Service to PGX Trust Store, follow these steps:

1.     Copy the <Keystore file name>.jks file to the <PGX Server path>/server/conf directory.

2.     Navigate to the <PGX Server path>/server/bin directory.

3.     Open the start-server file in the <PGX Server path>/server/bin directory and add the following options to export JAVA_OPTS:

§       -Djavax.net.ssl.trustStore=<PGX Server path>/conf/<Keystore file name>.jks

§       -Djavax.net.ssl.trustStorePassword=<Keystore file password>

The code snippet shows an example of the file when the code is added:

#!/bin/bash

export HADOOP_EXTRA_CLASSPATH="$APP_HOME/hdfs-libs/*:$APP_HOME/conf/hadoop_cluster"

export CLASSPATH="$APP_HOME/shared-lib/common/*:$APP_HOME/shared-lib/server/*:$APP_HOME/shared-lib/embedded/*:$APP_HOME/shared-lib/third-party/*:$APP_HOME/conf:$APP_HOME/shared-memory/server/*:$APP_HOME/shared-memory/common/*:$APP_HOME/shared-memory/third-party/*:$HADOOP_EXTRA_CLASSPATH"

export JAVA_OPTS="-Dpgx.max_off_heap_size=$PGX_SERVER_OFF_HEAP_MB -Xmx${PGX_SERVER_ON_HEAP_MB}m -Xms${PGX_SERVER_ON_HEAP_MB}m -XX:MaxNewSize=${PGX_SERVER_YOUNG_SPACE_MB}m -XX:NewSize=${PGX_SERVER_YOUNG_SPACE_MB}m -Dsun.security.krb5.debug=false -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=$APP_HOME/conf/kerberos/krb5.conf -Dpgx_conf=$APP_HOME/conf/pgx.conf -Djavax.net.ssl.trustStore=/scratch/fccstudio/OFS_FCCM_STUDIO/pgx/server/conf/keystore.jks -Djavax.net.ssl.trustStorePassword=password"

java -cp "$CLASSPATH" -Dfile.encoding=UTF-8 $JAVA_OPTS oracle.pgx.server.Main $APP_HOME/shared-memory/server/pgx-webapp-*.war $APP_HOME/conf/server.conf

After generating the key store file and adding the Batch Service to the PGX trust store, you must configure the user mapping in the database for the changes made. For more information about how to configure user mapping, see the FCC Studio Administration Guide.

Configure the Preferred Services

To configure the preferred services to be deployed during deployment of FCC Studio, follow these steps:

1.     Navigate to the <Studio_Installation_Path>/bin/ directory.

2.     Set the deployment parameter in the serviceMapping.sh file, depending on the services that you want to deploy. Set the deployment parameter to All to deploy all services, or to Custom to choose specific services to deploy.

If the deployment parameter is set to Custom, set the values of the desired services to true and the values of the undesired services to false.

 

A sample serviceMapping.sh file is as follows.

 

Figure 4: Sample serviceMapping.sh File
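
Because the figure is not reproduced here, the following is an illustrative sketch of the shape such a file can take; the exact variable names in the shipped serviceMapping.sh may differ, so treat these as placeholders:

#!/bin/bash

# All = deploy every service; Custom = deploy only the services set to true below.

export DEPLOYMENT=Custom

# Per-service flags, read only when DEPLOYMENT=Custom. The server service must always remain true.

export SERVER=true

export AUTHSERVICE=true

export BATCHSERVICE=true

export METASERVICE=true

export SESSIONSERVICE=true

export PGX_SERVER=true

export FCC_PYTHON_INTERPRETER=false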

 

NOTE

Do not set the server service to false.

 

Configure the studio-env.yml File

To configure the studio-env.yml file for installing FCC Studio, follow these steps:

1.     Log in to the server as a non-root user.

2.     Navigate to the <Studio_Installation_Path>/secrets/ directory.

3.     Configure the studio-env.yml file as shown in the following table.

A sample studio-env.yml file is as follows.

 

Figure 5: Sample studio-env.yml File
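
Because the figure is not reproduced here, the following abridged sketch shows the overall shape of the file, built from the parameters in Table 12; all values are placeholders:

apiVersion: v1

kind: Secret

metadata:

  name: studio-env

stringData:

  NON_OFSAA: "false"

  REALM: com.oracle.ofss.fccm.studio.datastudio.auth.FCCMRealm

  STUDIO_DB_HOSTNAME: dbhost.example.com

  STUDIO_DB_PORT: "1521"

  STUDIO_DB_SERVICE_NAME: studiodb

  STUDIO_DB_USERNAME: studio_user

  STUDIO_DB_PASSWORD: MyPassw0rd

  PGX_SERVER_URL: http://pgxhost.example.com:7007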

 

 

NOTE

·        Do not alter the parameter values that are already set in the studio-env.yml file.

·        Retain the existing placeholder values for the parameters that are not shown as mandatory in the following table.

·        You must manually set the parameter values in the studio-env.yml file. If a value is not applicable, enter NA and ensure that the value is not entered as NULL.

·        Depending on the installation architecture, ensure that you provide the correct hostname for the URL of the PGX service in the studio-env.yml file.

·        When upgrading FCC Studio with OFSAA, ensure that you provide the same BD database, Studio schema, Hive schema, and wallet-related information that you used during the installation of the existing instance of FCC Studio.

·        When upgrading FCC Studio without OFSAA, ensure that you provide the same Studio schema and wallet-related information that you used during the installation of the existing instance of FCC Studio.

 

Table 12:   studio-env.yml Parameters

 

Parameter

Significance

Installing with OFSAA (Mandatory)

Upgrading with OFSAA (Mandatory)

Installing without OFSAA (Mandatory)

apiVersion

Indicates the current API version.

For example: v1

Yes

Yes

Yes

kind

Indicates the object in which the file information is stored.

For example: Secret

Yes

Yes

Yes

metadata

 

 

 

 

name

Indicates the file name.

For example: studio-env

Yes

Yes

Yes

stringData

 

 

 

 

NON_OFSAA

    Indicates the type of installation.

    ·        To install FCC Studio with OFSAA on the Kubernetes cluster, enter false.

    ·        To install FCC Studio without OFSAA on the Kubernetes cluster, enter true.

    Enter false

    Enter false

    Enter true

    REALM

    Realm indicates functional grouping of database schemas and roles that must be secured for an application. Realms protect data from access through system privileges; realms do not give additional privileges to its owner or participants.

    FCC Studio uses realm based authorization and authentication for its users. For more information, see the Realm Based Authorization for FCC Studio section in the OFS Crime and Compliance Studio Administration Guide.

    The FCC Studio application can be accessed

    using the following realms:

    ·        FCCMRealm

    Value=com.oracle.ofss.fccm.studio.datastudio.auth.FCCMRealm

    ·        IdcsRealm

    Value=oracle.datastudio.realm.idcs.IdcsRealm

    ·        DemoRealm

    Value=com.oracle.ofss.fccm.studio.datastudio.auth.DemoRealm

     

    NOTE:

    The DemoRealm is used only for demo purposes and is not recommended for use.

    Yes

    Yes

    Yes

    OFSAA_SERVICE_URL

    Indicates the URL of the OFSAA instance that is used for the installation.

    For example: https://<HostName>:<PortNo>/<ContextName>/rest-api

    Yes

    Yes

    No

    LIVY_HOST_URL

    Indicates the URL of the Livy application.

    Example:

    http://<HostName>:<PortNo>

    NOTE:

    This parameter is applicable only if fcc-spark-sql, fcc-spark-scala, and (or) fcc-pyspark interpreters are to be used.

    No

    No

    No

    DB Details for Studio

    Schema

     

     

     

     

    STUDIO_DB_HOSTNAME

    Indicates the hostname of the database where Studio schema is created.

    Yes

    Yes

    Yes

    STUDIO_DB_PORT

    Indicates the port number where Studio schema is created.

    Yes

    Yes

    Yes

    STUDIO_DB_SERVICE_NAME

    Indicates the service name of the database where Studio schema is created.

    Yes

    Yes

    Yes

    STUDIO_DB_SID

    Indicates the SID of the database where Studio schema is created.

    Yes

    Yes

    Yes

    STUDIO_DB_USERNAME

    Indicates the username of the Studio Schema (newly created Oracle Schema).

    Yes

    Yes

    Yes

    STUDIO_DB_PASSWORD

    Indicates the password for the newly created schema. This value must not be blank.

    Yes

    Yes

    Yes

    STUDIO_DB_ENCRYPTED_PASSWORD

    Indicates the encrypted password that is provided for the Studio schema.

    For example, cGFzc3dvcmQ.

    Yes

    Yes

    Yes

    Studio DB Wallet Details

    For more information on creating wallet, see  Appendix - Setting Up Password Stores with Oracle Wallet .

     

     

     

     

    STUDIO_ALIAS_NAME

    Indicates the Studio alias name.

    NOTE:

    Enter the alias name that was created during wallet creation.

    Yes

    Yes

    Yes

    STUDIO_WALLET_LOCATION

    Indicates the Studio wallet location.

    Yes

    Yes

    Yes

     STUDIO_TNS_ADMIN_PATH

    Indicates the path of the tnsnames.ora file where an entry for the STUDIO_ALIAS_NAME is present.

    Yes

    Yes

    Yes

    Hadoop Connection Details

     

     

     

     

    STUDIO_HADOOP_CREDENTIAL_ALIAS

    Indicates the alias of the password saved in Hadoop.

    For example, studio.password.alias

    Yes

    Yes

    Yes

    STUDIO_HADOOP_CREDENTIAL_PATH

    Indicates the credentials path.

    For example, <Studio Installed Path>/oracle.password.jceks

    Yes

    Yes

    Yes

    DB Details for BD Config Schema

     

     

     

     

    BD_CONFIG_HOSTNAME

    Indicates the hostname of the database where BD or ECM config schema is installed.

    Yes

    Yes

    Enter NA

    BD_CONFIG_PORT

    Indicates the port of the database where BD or ECM config schema is installed.

    Yes

    Yes

    Enter NA

    BD_CONFIG_SERVICE_NAME

    Indicates the service name of the database where BD or ECM config schema is installed.

    Yes

    Yes

    Enter NA

    BD_CONFIG_SID

    Indicates the SID of the database where BD or ECM config schema is installed.

    Yes

    Yes

    Enter NA

    BD_CONFIG_USERNAME

    Indicates the username for the BD or ECM config schema.

    Yes

    Yes

    Enter NA

    BD_CONFIG_PASSWORD

    Indicates the password for the BD or ECM config schema. This value must not be blank.

    Yes

    Yes

    Enter NA

    BD Config Wallet Details

    For more information on creating wallet, see  Appendix - Setting Up Password Stores with Oracle Wallet .

     

     

     

     

    BD_CONFIG_ALIAS_NAME

    Indicates the BD or ECM config alias name.

    NOTE:

    Enter the alias name that was created during wallet creation.

    Yes

    Yes

    Enter NA

    BD_CONFIG_WALLET_LOCATION

    Indicates the BD or ECM config wallet location.

    Yes

    Yes

    Enter NA

    BD_CONFIG_TNS_ADMIN_PATH

    Indicates the path of the tnsnames.ora file where an entry for the BD_CONFIG_ALIAS_NAME is present.

    Yes

    Yes

    Enter NA

    DB Details for BD Atomic Schema

     

     

     

     

    BD_ATOMIC_HOSTNAME

    Indicates the BD or ECM atomic schema hostname.

    Yes

    Yes

    Enter NA

    BD_ATOMIC_PORT

    Indicates the BD or ECM atomic schema port number.

    Yes

    Yes

    Enter NA

    BD_ATOMIC_SERVICE_NAME

    Indicates the BD or ECM atomic schema service name.

    Yes

    Yes

    Enter NA

    BD_ATOMIC_SID

    Indicates the BD or ECM atomic schema SID.

    Yes

    Yes

    Enter NA

    BD_ATOMIC_USERNAME

    Indicates the username of the BD or ECM atomic schema.

    Yes

    Yes

    Enter NA

    BD_ATOMIC_PASSWORD

    Indicates the password of the BD or ECM atomic schema. This value must not be blank.

    Yes

    Yes

    Enter NA

    BD Atomic Wallet Details.

    For more information on creating wallet, see  Appendix - Setting Up Password Stores with Oracle Wallet .

     

     

     

     

    BD_ATOMIC_ALIAS_NAME

    Indicates the BD or ECM atomic alias name.

    NOTE:

    Enter the alias name that was created during wallet creation.

    Yes

    Yes

    Enter NA

    BD_ATOMIC_WALLET_LOCATION

    Indicates the BD or ECM atomic wallet location.

    Yes

    Yes

    Enter NA

    BD_ATOMIC_TNS_ADMIN_PATH

    Indicates the path of the tnsnames.ora file where an entry for the BD_ATOMIC_ALIAS_NAME is present.

    Yes

    Yes

    Enter NA

    SQL Scripts

     

     

     

     

    FSINFODOM

    Indicates the name of the BD or ECM Infodom.

    Yes

    Yes

    Enter NA

    FSSEGMENT

    Indicates the name of the BD or ECM segment.

    Yes

    Yes

    Enter NA

    DATAMOVEMENT_LINK_NAME

    ·        If the newly created schema is in a different database host, you must create a DB link and provide the same link in this parameter. Alternatively, you can provide the source schema name.

    If no DB link is present, provide the <SCHEMA_NAME> in this parameter.

    ·        If the newly created schema is in the same database host, the value for this parameter is the user name of the BD or ECM atomic schema.

    Yes

    Yes

    Enter NA

    DATAMOVEMENT_LINK_TYPE

    If the DB link is used, enter DBLINK in this field. If the DB link is not used, enter SCHEMA in this field.

    Yes

    Yes

    Enter NA

    Cloudera Setup Details

    Contact System

    Administrator to

    obtain the required

    details for these parameters.

     

     

     

     

    HADOOP_CREDENTIAL_PROVIDER_PATH

    Indicates the path where Hadoop credential is stored.

    Yes

    Yes

    Enter NA

    HADOOP_PASSWORD_ALIAS

    Indicates the Hadoop alias given when creating the Hadoop credentials.

    NOTE:

    Enter the alias name that was created during wallet creation.

    For information on how to create a credential keystore, see Creating the Credential Keystore

    Yes

    Yes

    Enter NA

    Hive_Host_Name

    Indicates the Hive hostname.

    Yes

    Yes

    Enter NA

    Hive_Port_number

    Indicates the Hive port number.

    Contact your System Administrator to obtain the port number.

    Yes

    Yes

    Enter NA

    HIVE_PRINCIPAL

    Indicates the Hive Principal.

    Contact your System Administrator to obtain this value.

    Yes

    Yes

    Enter NA

    HIVE_SCHEMA

    Indicates the new Hive schema name.

    Yes

    Yes

    Enter NA

    JAAS_CONF_FILE_PATH

    Created for future use.

    No

    No

    No

    Krb_Host_FQDN_Name

    Indicates the Kerberos host FQDN name.

    Yes

    Yes

    Enter NA

    Krb_Realm_Name

    Indicates the Kerberos realm name.

    Yes

    Yes

    Enter NA

    Krb_Service_Name

    Indicates the Kerberos service name.

    Example: Hive

    Yes

    Yes

    Enter NA

    KRB5_CONF_FILE_PATH

    Created for future use.

    No

    No

    No

    security_krb5_kdc_server

    Created for future use.

    No

    No

    No

    security_krb5_realm

    Created for future use.

    No

    No

    No

    server_kerberos_keytab_file

    Created for future use.

    Yes

    Yes

    Enter NA

    server_kerberos_krb5_conf_file

    Created for future use.

    Yes

    Yes

    Enter NA

    server_kerberos_principal

    Created for future use.

    Yes

    Yes

    Enter NA

    SQOOP_HOSTMACHINE_USER_NAME

    Indicates the user name of the Big Data server where SQOOP will run.

    Yes

    Yes

    Enter NA

    SQOOP_PARAMFILE_PATH

    1.     Create a file with the name sqoop.properties in the Big Data server and add the following entry to the same:

    oracle.jdbc.mapDateToTimestamp=false

    2.     Enter the location of the sqoop.properties file as the value for this parameter.

    Example: /scratch/ofsaa/

    NOTE:

    Ensure that the location name ends with a '/'.

    Yes

    Yes

    Enter NA

    SQOOP_PARTITION_COL

    Indicates the column in which the HIVE table is partitioned.

    The value must be SNAPSHOT_DT

    Yes

    Yes

    Enter NA

    SQOOP_TRG_HOSTNAME

    Indicates the hostname of the Big Data server where SQOOP will run.

     

    Yes

    Yes

    Enter NA

    SQOOP_TRG_PASSWORD

    Indicates the password of the user of the Big Data server where SQOOP will run. This value must not be blank.

    Yes

    Yes

    Enter NA

    SQOOP_WORKDIR_HDFS

    Indicates the SQOOP working directory in HDFS.

    Example: /user/ofsaa

    Yes

    Yes

    Enter NA

    Internal Services

     

     

     

     

    AUTH_SERVICE_URL

    Indicates the AUTH service URL that gets activated after the fcc-studio.sh file runs.

    Example:

    http://<HostName>:7041/authservice

    Yes

    Yes

    No

    BATCH_SERVICE_URL

    Indicates the Batch service URL that gets activated after the fcc-studio.sh file runs.

    Example:

    http://<HostName>:7043/batchservice

    Yes

    Yes

    Yes

    META_SERVICE_URL

    Indicates the META service URL that gets activated after the fcc-studio.sh file runs.

    Example:

    http://<HostName>:7045/metaservice

    Yes

    Yes

    Yes

    SESSION_SERVICE_URL

    Indicates the Session service URL that gets activated after the fcc-studio.sh file runs.

    Example:

    http://<HostName>:7047/sessionservice

    Yes

    Yes

    Yes

    PGX_SERVER_URL

    Indicates the URL of the PGX server.

    Example:

    http://<HostName>:<PortNo>

    The value for PortNo must be 7007.

    Yes

    Yes

    Yes

    ORE Interpreter Settings

    NOTE:

    This section is applicable only if ORE interpreter is used.

     

     

     

     

    RSERVE_USERNAME

    Indicates the RServe username.

    Contact your System Administrator for the username.

    No

    No

    No

    RSERVE_PASSWORD

    Indicates the RServe password.

    Contact your System Administrator for the username.

    No

    No

    No

    HTTP_PROXY

    Indicates the proxy for the host where FCC Studio is deployed.

    No

    No

    No

    HTTPS_PROXY

    Indicates the proxy for the host where FCC Studio is deployed.

    No

    No

    No

    REPO_CRAN_URL

    Indicates the URL from where the R packages are obtained.

    The format for the REPO_CRAN_URL is as follows:

    https://cran.r-project.org/

    No

    No

    No

    USERS_LIB_PATH

    Indicates the path where the R packages are installed.

    Default value: /usr/lib64/R/library

    No

    No

    No

    RSERVE_CONF_PATH

    Indicates the path where the Rserve.conf file is present.

    Default value: /var/ore-interpreter/rserve

    No

    No

    No

    ElasticSearch Cluster details

     

     

     

     

    ELASTIC_SEARCH_HOSTNAME

    Indicates the hostname of the database where the elastic search service is installed.

    Yes

    Yes

    Yes

    ELASTIC_SEARCH_PORT

    Indicates the port number where the elastic search service is installed.

    Yes

    Yes

    Yes

    Matching Service

     

     

     

     

    EXECUTOR_THREADS

    Indicates the number of threads to run in parallel during one scroll.

    For example: 10

    Yes

    Yes

    Yes

    SCROLL_TIME

    Indicates the duration for which the scroll_size output is active.

    For example: 5

    Yes

    Yes

    Yes

    SCROLL_SIZE

    Indicates the amount of data that must be obtained in one attempt when a query is fired on an index in the elastic search service.

    For example: 1000

    Yes

    Yes

    Yes

    ELASTICRESPONSE_BUFFERLIMIT_BYTE

    Indicates the buffer size of the response obtained from the elastic search service.

    For example: 200

    Yes

    Yes

    Yes

    MATCHING_SERVICE_HOSTNAME

    Indicates the hostname of the database where matching service is installed.

    Yes

    Yes

    Yes

    MATCHING_SERVICE_PORT

    Indicates the port number where matching service is installed.

    Yes

    Yes

    Yes

    ER_SERVICE_URL

    Indicates the URL of the entity resolution service.

    Yes

    Yes

    Yes

    ER_SERVICE_PORT

    Indicates the port number where the entity resolution service is installed.

    Default value: 7051

    Yes

    Yes

    Yes

    Graphs

     

     

     

     

    HDFS_GRAPH_FILES_PATH

    Indicates the file path in the HDFS where the graph.json file is formed.

    Yes

    Yes

    Yes

    GRAPH_FILES_PATH

    Indicates the directory in the Big Data server for graph files.

    Yes

    Yes

    Yes

    GRAPH_NAME

    Indicates the name you want to assign to the global graph at the end of ETL.

    Yes

    Yes

    Yes

    ETL

     

     

     

     

    HDFS_GRAPH_FILES_PATH

    Indicates the filepath in the HDFS where the graph.json is formed.

    Yes

    Yes

    No

    GRAPH_FILES_PATH

    Indicates the directory in the Big Data server for graph files.

    Yes

    Yes

    No

    GRAPH_NAME

    Indicates the name you want to assign to the global graph at the end of ETL.

    Yes

    Yes

    No

    ETL_PROCESSING_RANGE

    Indicates the duration for which the data would be moved from Oracle to Hive.

    For example: If the ETL_PROCESSING_RANGE = 2Y, 3M, 10D, that is, 2 years, 3 months, and 10 days, and the present date is 20200814, then the data movement occurs for the range 20180504 to 20200814.

    Yes

    Yes

    No

    OLD_GRAPH_SESSION_DURATION

    Indicates that sessions older than this specified duration will be removed from the PGX server. If unsure, you can set this value to a week (7D).

    Yes

    Yes

    No

    REMOVE_TRNXS_EDGE_AFTER_DURATION

    Indicates the date range for which transaction edges will be maintained in the graph. For example: 6Y, 3M, 10D, which means 6 years, 3 months, and 10 days.

    Yes

    Yes

    No

    CONNECTOR_CHANGESET_SIZE

    Indicates the number of nodes or edges you want to process during an update of graph. If unsure, you can set it to 10000.

    Yes

    Yes

    No

    PGX_SERVER_URLS

    Indicates the comma ‘,’ separated values of PGX URLs. If you have only one PGX URL, then the value is http://<k8s master hostname FQDN>:7007.

    Yes

    Yes

    No

    Quantifind Details

    For Quantifind, the generated Quantifind token must be encoded. Use the <Fic_DB_path>/FCCM_Studio_Base64Encoder.sh file for encoding the Quantifind token.

     

     

     

     

    QUANTIFIND_URL

    Indicates the Quantifind URL.

    For example, https://api-test.quantifind.com

    Yes

    Yes

    Yes

    ENCRYPTED_QUANTIFIND_TOKEN

    Indicates the token that is generated when integrating with Quantifind.

    For example, c2FtcGxlX2VuY3J5cHRlZF9xdWFudGlmaW5kX3Rva2Vu.

    Yes

    Yes

    Yes

    QUANTIFIND_APPNAME

    Indicates the Quantifind App Name.

    For example, OracleIntegrationTest

    Yes

    Yes

    Yes

    QUANTIFIND_ENABLED

    Indicates that Quantifind is enabled. Options are True or False.

    Yes

    Yes

    Yes

    HTTPS_PROXY_HOST

    Indicates the proxy host that is used.

    For example, www-proxy-idc.in.oracle.com

    Yes

    Yes

    Yes

    HTTPS_PROXY_PORT

    Indicates the proxy port that is used.

    For example, 80

    Yes

    Yes

    Yes

    HTTPS_PROXY_USERNAME

    Indicates the proxy username used if there is any.

    For example, ##HTTP_PROXY_USERNAME##

    Yes

    Yes

    Yes

    HTTPS_PROXY_PASSWORD

    Indicates the proxy password used if there is any.

    For example, ##HTTP_PROXY_PASSWORD##

    Yes

    Yes

    Yes

    SAML

    The SAML related parameters are applicable only if SAMLRealm is used in the Realm parameter.

    1.     For SAML Realm, the certificate from IDP (key.cert file) is required.

    2.     The certificate that is obtained from the IDP must be renamed to key.cert and placed in the <Studio_Installation_Path>/OFS_FCCM_STUDIO/datastudio/server/conf directory.

    3.     This certificate is used to identify the trust of the SAML response from the Identity Provider.

    4.     Specify the Role Attribute name from IDP, in which the User Roles are present in the SAML response.

     

     

     

    SAML_ISSUER

    Indicates the SAML entity ID (Studio URL) configured in the IDP.

    Yes

    Yes

    Yes

    SAML_DESTINATION

    Indicates the SAML IDP URL that is provided by the Identity Provider after creating the SAML Application.

    Yes

    Yes

    Yes

    SAML_ASSERTION

    Indicates the SAML consume URL (Studio/URL/saml/consume) that is configured in IDP.

    Yes

    Yes

    Yes

    SAML_ROLE_ATTRIBUTE

    Indicates the SAML client identifier provided by the SAML Administrator for the Role and Attributes information, while creating the SAML application for FCC Studio.

    Yes

    Yes

    Yes

    SAML_LOGOUT_URL

    Indicates the SAML client identifier provided by the SAML Administrator for the Logout URL information when creating the SAML application for FCC Studio.

    Yes

    Yes

    Yes

    SAML_COOKIE_DOMAIN

    Indicates the SAML cookie domain provided by the SAML Administrator when creating the SAML application for FCC Studio.

    Yes

    Yes

    Yes

    API_USER

    Indicates the API users.

    Yes

    Yes

    Yes

    IDCS

    NOTE

    The IDCS related parameters are applicable only if IdcsRealm is used in the Realm parameter. 

     

     

     

     

    IDCS_HOST

    Indicates the server of the Oracle Identity Cloud Service (IDCS) instance.

    Yes

    Yes

    Yes

    IDCS_PORT

    Indicates the port number of the IDCS instance.

    Yes

    Yes

    Yes

    IDCS_SSL_ENABLED

    Indicates if SSL is enabled for the IDCS application.

    Default value: true

    Yes

    Yes

    Yes

    LOGOUT_URL

    Indicates the URL to redirect after logout from FCC Studio.

    Yes

    Yes

    Yes

    IDCS_TENANT

    Indicates the IDCS tenant provided by the IDCS Administrator while creating the IDCS application for FCC Studio.

    Yes

    Yes

    Yes

    IDCS_CLIENT_ID

    Indicates the IDCS client identifier provided by the IDCS Administrator while creating the IDCS application for FCC Studio.

    Yes

    Yes

    Yes

     IDCS_CLIENT_SECRET

    Indicates the IDCS client secret provided by the IDCS Administrator while creating the IDCS application for FCC Studio.

    Yes

    Yes

    Yes

    FCDM_SOURCE

    Indicates the source database for FCC Studio.

    The available options are ECM and BD.

    NOTE:

    ·        FCC Studio can use either the BD or ECM schema as the source of FCDM data for the graph.

    ·        Ensure that you enter the value as ECM whenever ECM integration is required with Investigation Hub.

    Here, ECM schema is used as the source of the FCDM data to load the case information into the graph.

    Enter BD or ECM

    Enter BD or ECM

    Enter NA

    Graph Settings

     

     

     

     

    CB_CONFIGURED

    Indicates the setting of the graph edges. When the corresponding edges of the graph are required, set the value to true.

    Enter true or false

    Enter true or false

    Enter NA

    Keystore file and pass details for batch service

     

     

     

     

    KEYSTORE_FILE_NAME

    Indicates the keystore file name used for secure batch service.

    For example: keystore.jks

    Yes

    Yes

    Yes

    KEYSTORE_PASS

    Indicates the keystore password used for the secure batch service.

    Yes

    Yes

    Yes

    KEYSTORE_ALIAS

    Indicates the keystore alias name used for the secure batch service.

    For example: batchservice

    Yes

    Yes

    Yes

     

    Configure the Extract Transfer and Load (ETL) Process

    Extract Transfer and Load (ETL) is the procedure of copying data from one or more sources into a destination system which represents the data differently from the source or in a different context than the source. Data movement and graph loading is performed using ETL.

     

    NOTE

    If you have 8.0.7.4.0 installed and the Spark cluster has both the batchservice-8.0.7.*.0.jar and elasticsearch-spark-20_2.11-7.* jar files installed, you must remove them from the Spark classpath.

     

    To configure the Data Movement and Graph Load, copy the FCCM_Studio_SqoopJob.sh, FCCM_Studio_ETL_Graph.sh, FCCM_Studio_ETL_Connector.sh, and FCCM_Studio_ETL_BulkSimilarityEdgeGeneration.sh files from the <Studio_Installed_Path>/out/ficdb/bin directory and place them in the <FIC_HOME of OFSAA_Installed_Path>/ficdb/bin directory. For information on performing Data Movement and Graph Load, see the Data Movement and Graph Loading for Big Data Environment section in the OFS Crime and Compliance Studio Administration Guide.
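
    A minimal sketch of the copy step (the paths follow the placeholders used above):

    cd <Studio_Installed_Path>/out/ficdb/bin

    cp FCCM_Studio_SqoopJob.sh FCCM_Studio_ETL_Graph.sh FCCM_Studio_ETL_Connector.sh FCCM_Studio_ETL_BulkSimilarityEdgeGeneration.sh <FIC_HOME of OFSAA_Installed_Path>/ficdb/bin/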

     

    NOTE

    Before you run the sqoop job, ensure that the serverconfig.properties file in the <Studio_Installed_Path>/batchservice/conf directory has the correct values.

     

    Configure the Extract Transfer and Load (ETL) Services

    To configure the ETL services, follow these steps:

    1.     Place the Hadoop Cluster files in the <Studio_Installation_Path>/configmaps/spark directory. For more information on the file structure, see Required File Structure.

    2.     Place the Kerberos files in the <Studio_Installation_Path>/configmaps/batchservice/user/conf/ directory. For more information on the file structure, see Required File Structure.

    3.     Place the following jars in the <Studio_Installation_Path>/docker/user/batchservice/lib/ directory:

    §       hive-exec-1.1.0.jar

    §       HiveJDBC4.jar

    §       hive-metastore-1.1.0.jar

    §       hive-service-1.1.0.jar

     

    NOTE

    ·        The jar versions are client- or user-specific. These jars can be obtained from the existing jars of the Cloudera installation.

    ·        The HiveJDBC4.jar file is not available in the Cloudera setup. You must download the same from the Cloudera website.

     

    4.     Configure the config.sh file in the <Studio_Installation_Path>/bin directory to replace the placeholder values in the applicable files in the configmaps directory, as described in the following table:

     

    NOTE

    Do not alter the parameter values that are already set in the config.sh file.

     

    Table 13:   Configuring config.sh File

     

    Parameter

    Description

    Deployment Configuration

     

    NAMESPACE

    Enter a value to create a namespace with the specified value.

    For example: fccs

    PGX Service

     

    PGX_SERVER_NUM_REPLICAS

    Indicates the number of replicas of the PGX server.

    For example: 1

    PGX_GLOBAL_GRAPH_NAME

    Indicates the name that the pre-loaded global graph is published with and the FCC Studio users can use to reference the global graph.

    For example: GlobalGraphIH

    URL_GLOBAL_GRAPH_CONFIG_JSON

    Indicates the HDFS URL where the PGX graph configuration .json file is stored at the end of the ETL. The location can be either a local or an HDFS path.

    For example: hdfs:///user/fccstudio/graph.json

    HDFS_GRAPH_FILES_PATH

    Indicates the filepath in the HDFS where the graph.json is formed.

    Quantifind Details

     

    QUANTIFIND_URL

    Indicates the Quantifind URL.

    For example, https://api-test.quantifind.com

    ENCRYPTED_QUANTIFIND_TOKEN

    Indicates the token that is generated when integrating with Quantifind.

    For example, c2FtcGxlX2VuY3J5cHRlZF9xdWFudGlmaW5kX3Rva2Vu.

    QUANTIFIND_APPNAME

    Indicates the Quantifind App Name.

    For example, OracleIntegrationTest.

    QUANTIFIND_ENABLED

    Indicates that Quantifind is enabled. Options are True or False.
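
    For reference, a hypothetical excerpt of the resulting config.sh entries, using the example values from Table 13 (whether the file uses plain assignments or export statements depends on the shipped kit):

    NAMESPACE=fccs

    PGX_SERVER_NUM_REPLICAS=1

    PGX_GLOBAL_GRAPH_NAME=GlobalGraphIH

    URL_GLOBAL_GRAPH_CONFIG_JSON=hdfs:///user/fccstudio/graph.json

    HDFS_GRAPH_FILES_PATH=/user/fccstudio

    QUANTIFIND_URL=https://api-test.quantifind.com

    ENCRYPTED_QUANTIFIND_TOKEN=c2FtcGxlX2VuY3J5cHRlZF9xdWFudGlmaW5kX3Rva2Vu

    QUANTIFIND_APPNAME=OracleIntegrationTest

    QUANTIFIND_ENABLED=True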

     

    5.     Grant execute permission to the installation scripts in the <Studio_Installation_Path>/bin directory using the following command:

    chmod 755 install.sh config.sh

    6.     Run the following command:

    ./install.sh

     

    NOTE

    ·        Execution of the install.sh command does not generate any log file.

    ·        The values for the <URL_GLOBAL_GRAPH_CONFIG_JSON> and <PGX_GLOBAL_GRAPH_NAME> parameters in the <Studio_Installation_Path>/configmaps/pgx-server/pgx.conf file are auto-populated with the values that are configured in the <Studio_Installation_Path>/bin/config.sh file.

     

    7.     Navigate to the <Studio_Installation_Path>/configmaps/pgx-server/ directory and modify the pgx.conf file as follows:

    Comment out the following preload graph section:

    <!--

    "preload_graphs": [

       {

         "path": "<URL_GLOBAL_GRAPH_CONFIG_JSON>",

         "name": "<PGX_GLOBAL_GRAPH_NAME>"

       }

     ]

    -->

     

     

    Verify the Resource Allocation for FCC Studio Services

    The required resources must be allocated to the FCC Studio services as per the architecture.

    Topics:

    ·        Resource Limits

    ·        Resource Types

    ·        Resource Parameters in FCC Studio

    Resource Limits

    For FCC Studio to run reliably, the available resources of the Kubernetes cluster must be allocated accordingly. The components are memory-intensive and therefore it is recommended to set memory constraints for each component.

    Resource Types

    Each container requires a memory request and memory limit size as defined by the Kubernetes API. In short, containers specify a request, which is the amount of that resource that the system guarantees to the container, and a limit, which is the maximum amount that the system allows the container to use. For more information, see Managing Compute Resources for Containers.

    Some components require additional resource limits which are set as environment variables.
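
    For example, a container's memory request and limit are declared in its Kubernetes spec as follows (the container name and sizes are illustrative only):

    spec:

      containers:

        - name: batchservice

          resources:

            requests:

              memory: "4Gi"    # amount guaranteed to the container

            limits:

              memory: "8Gi"    # hard ceiling enforced by Kubernetes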

    Resource Parameters in FCC Studio

    After extracting the FCC Studio application installer software, the resource limits must be adjusted for each component. The configuration files can be found in the <Studio_Installation_Path> directory.

     

    NOTE

    ·        The sizing recommendations are preliminary. In the case of deployment failures, a manual configuration of the sizing parameters is required.

    ·        Depending on the use case, the recommended value changes.

    ·        The default value in the following table is the value that is already set in the file.

     

    Table 14:   Resource Parameters in FCC Studio

     

    Configuration File/Container

    Parameter type

    Parameter Name

    Description

    Recommendation

    server.yml / server

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the FCC server (web application) component.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the FCC server (web application) component.

    default

    agent.yml / agent

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the Agent (manages all interpreters) component.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the Agent (manages all interpreters) component.

    default

    pgx-server.yml / pgx-server

    k8

    spec.containers[].resources.requests.memory

       Memory request size for the PGX server (manages graph processing) component.

    Slightly less than the memory of the PGX server as calculated in the sizing guide.

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the PGX server (manages graph processing) component.

    The same as the request size above.

     

    ENV VAR (JAVA_OPTS)

    -Xmx

    -Xms

    The maximum and minimum heap memory size (mainly used for storing graphs' string properties) for the Java process of PGX.

    58% of the container's memory limit size above.

     

    For a better understanding of this sizing parameter, see the PGX 20.0.2 Memory Consumption documentation.

     

    ENV VAR (JAVA_OPTS)

    -Dpgx.max_off_heap_size

    The maximum off-heap memory size in megabytes (mainly used for storing graphs except for their string properties) that PGX tries to respect.

    42% of the container's memory limit size above.

     

    For a better understanding of this sizing parameter, see the PGX 20.0.2 Memory Consumption documentation.

    fcc-pgx.yml / pgx-interpreter

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the PGX interpreter.

    4Gi

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the PGX interpreter.

    16Gi

     

    Sizing should depend on the number and behavior (memory requirements of sessions) of concurrent users.

    authservice.yml / authservice

    k8

    spec.containers[].resources.requests.memory

       Memory request size for the authservice (used for getting roles of a user from DB) component.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the authservice (used for getting roles of a user from DB) component.

    default

    metaservice.yml / metaservice

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the metaservice (used for custom interpreter APIs such as loaddataset, listdataset in the Scala interpreter, and so on) component.

    default

     

    k8

    spec.containers[].resources.limits.memory

       Memory limit size for the metaservice (used for custom interpreter APIs such as loaddataset, listdataset in Scala interpreter and so on) component.

    default

    sessionservice.yml / sessionservice

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the sessionservice (used for managing session between PGX and Scala interpreter) component.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the sessionservice (used for managing session between PGX and Scala interpreter) component.

    default

    batchservice.yml / batchservice

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the batchservice (used for managing batches like sqoopjob, graph load, notebook execution and so on) component.

    default

    Depends on volume of data processed in ETL.

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the batchservice (used for managing batches like sqoopjob, graph load, notebook execution and so on) component.

    default

    Depends on volume of data processed in ETL.

    entity-resolution.yml/entity resolution

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the Entity Resolution component.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the Entity Resolution component.

    default

    matching-service.yml/ matching service

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the Matching Service component.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the Matching Service component.

    default

    spark.yml/spark and pyspark Interpreter

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the Spark interpreter.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the Spark interpreter.

    default

    fcc-jdbc.yml / fcc-jdbc

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the JDBC connection.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the JDBC connection.

    default

    fcc-livy.yml / fcc-spark-scala, fcc-spark-sql, and fcc-pyspark interpreters

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the livy connection to big data Spark cluster.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the livy connection to Big data Spark cluster.

    default

    fcc-markdown.yml / markdown-interpreter

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the Markdown interpreter.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the Markdown interpreter.

    default

    fcc-ore.yml / ore-interpreter

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the ORE connection.

    default

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the ORE connection.

    default

    fcc-python.yml / python-interpreter

    k8

    spec.containers[].resources.requests.memory

    Memory request size for the Python interpreter.

    depending on use case

     

    k8

    spec.containers[].resources.limits.memory

    Memory limit size for the Python interpreter.

    depending on use case
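
    As a worked example of the PGX sizing rule above, a hypothetical pgx-server container with a 100 GiB memory limit would get a 58 GiB heap (58%) and a 42 GiB off-heap budget (42%, expressed in megabytes):

    # 58% of 100 GiB -> -Xmx58g -Xms58g; 42% of 100 GiB = 42 * 1024 = 43008 MB

    export JAVA_OPTS="$JAVA_OPTS -Xmx58g -Xms58g -Dpgx.max_off_heap_size=43008"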

    Deploy FCC Studio on the K8s Cluster

    To deploy FCC Studio on the K8s cluster, follow these steps:

    1.     Navigate to the <Studio_Installation_Path> directory.

    2.     Execute the following command:

    ./fcc-studio.sh --registry <registry URL>:<registry port>
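
    For example, with a hypothetical private registry:

    ./fcc-studio.sh --registry registry.example.com:5000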

     

    NOTE

    Execute the ./fcc-studio.sh -h command for usage instructions.

     

    Congratulations! Your installation is complete.

    After successful completion, the script displays a URL that can be used to access the FCC Studio Application. For more information, see Access the FCC Studio Application.