2 Install
Planning Your Installation
To plan the installation of GoldenGate Stream Analytics (GGSA) 19.1.0.0.* efficiently, ensure that you have the required hardware and software, and complete the prerequisite procedures before starting the installation process. The prerequisites are:
- Oracle JDK 8 Update 131 or a higher JDK 8 update
- A repository database:
  - Oracle Database version 12.2.0.1 or higher, 12.1.0.1 or higher, or 11.2.0.4 or higher
  - Alternatively, MySQL version 5.6 or 5.7
- A running Kafka cluster:
  - Version 0.10.2 to 2.2.1 for releases 19.1.0.0.1 to 19.1.0.0.7
  - Version 0.10.2 to 3.4.0 for release 19.1.0.0.8
- Locally installed Spark libraries. GGSA does not package the Spark client libraries, so you also need a local Spark installation:
  - For releases 19.1.0.0.0 to 19.1.0.0.7:
    - Spark release: 2.4.3
    - Package type: Pre-built for Apache Hadoop 2.7 and later
    - Download: spark-2.4.3-bin-hadoop2.7.tgz
  - For release 19.1.0.0.8:
    - Spark release: 3.4.0
    - Package type: Pre-built for Apache Hadoop 3.3 and later
    - Download: spark-3.4.0-bin-hadoop3.3.tgz
  Note: Install Spark and the JDK on the same node on which you plan to install Oracle Stream Analytics. See Installing GoldenGate Stream Analytics.
- Google Chrome browser version 6.0 or higher
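The JDK and Kafka requirements above are minimum-version and version-range checks. As a sketch, a small POSIX-shell helper (assuming GNU coreutils for `sort -V`) can compare dotted version strings in a pre-install check; the version values below are illustrative:

```shell
#!/bin/sh
# ver_ge A B: succeeds when dotted version A is >= version B.
# Relies on GNU coreutils' version sort (sort -V).
ver_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Illustrative checks against the minimums stated above:
ver_ge "1.8.0_201" "1.8.0_131" && echo "JDK update level OK"
ver_ge "2.2.1" "0.10.2" && echo "Kafka version within supported range"
```

The same helper can be reused against the output of `java -version` or the Kafka broker version in your environment.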
Installing GoldenGate Stream Analytics
Configuring the Metadata Store
Follow the steps below to configure your metadata store.
- Configure your data source in OSA-19.1.0.0.*/osa-base/etc/jetty-osa-datasource.xml as per the instructions below. This step is essential for creating OSA's database schema. The OSA database user referred to in this document is created by the installation process.
- Uncomment and edit one of the two data source configurations, either for Oracle Database or MySQL, depending on the database you want to use as the metadata store. The uncommented fragment for Oracle Database is shown below:
  a.
  <New id="osads" class="org.eclipse.jetty.plus.jndi.Resource">
    <Arg>
      <Ref refid="wac"/>
    </Arg>
    <Arg>jdbc/OSADataSource</Arg>
    <Arg>
      <New class="oracle.jdbc.pool.OracleDataSource">
        <Set name="URL">jdbc:oracle:thin:@myhost.example.com:1521:OSADB</Set>
        <Set name="User">OSA_USER</Set>
        <Set name="Password">
          <Call class="org.eclipse.jetty.util.security.Password" name="deobfuscate">
            <Arg>OBF:OBFUSCATED_PASSWORD</Arg>
          </Call>
        </Set>
        <Set name="connectionCachingEnabled">true</Set>
        <Set name="connectionCacheProperties">
          <New class="java.util.Properties">
            <Call name="setProperty"><Arg>MinLimit</Arg><Arg>1</Arg></Call>
            <Call name="setProperty"><Arg>MaxLimit</Arg><Arg>15</Arg></Call>
            <Call name="setProperty"><Arg>InitialLimit</Arg><Arg>1</Arg></Call>
          </New>
        </Set>
      </New>
    </Arg>
  </New>
- Decide on an OSA schema user name and a plain-text password. For illustration, say osa as the schema user name and alphago as the password. Change directory to the top-level folder OSA-19.1.0.0.* and execute the following command:
java -cp ./lib/jetty-util-9.4.17.v20190418.jar org.eclipse.jetty.util.security.Password osa <your password>
For example:
java -cp ./lib/jetty-util-9.4.17.v20190418.jar org.eclipse.jetty.util.security.Password osa alphago
You should see output like the following on the console:
2019-06-18 14:14:45.114:INFO::main: Logging initialized @1168ms to org.eclipse.jetty.util.log.StdErrLog
alphago
OBF:<obfuscated password>
MD5:34d0a556209df571d311b3f41c8200f3
CRYPT:osX/8jafUvLwA
- Note down the obfuscated password string that is displayed (the string beginning with OBF:), by copying it to a clipboard or notepad.
- Change the database host, port, SID, OSA schema user name, and OSA schema password fields in the code in Step 2a.
Example - jdbc:oracle:thin:@myhost.example.com:1521:ORCL
SAMPLE JETTY-OSA-DATASOURCE.XML
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_3.dtd">
<!-- =============================================================== -->
<!-- Configure jdbc/OSADataSource data source -->
<!-- =============================================================== -->
<Configure id="Server" class="org.eclipse.jetty.server.Server">
<!-- SAMPLE OSA DATASOURCE CONFIGURATION FOR ORACLE-->
<New id="osads" class="org.eclipse.jetty.plus.jndi.Resource">
<Arg>
<Ref refid="wac"/>
</Arg>
<Arg>jdbc/OSADataSource</Arg>
<Arg>
<New class="oracle.jdbc.pool.OracleDataSource">
<Set name="URL">jdbc:oracle:thin:@myhost.example.com:1521:OSADB</Set>
<Set name="User">osa_prod</Set>
<Set name="Password">
<Call class="org.eclipse.jetty.util.security.Password" name="deobfuscate">
<Arg>OBF:1ggz1j1u1k8q1leq1v2h1w8v1v1x1lcs1k5g1iz01gez</Arg>
</Call>
</Set>
<Set name="connectionCachingEnabled">true</Set>
<Set name="connectionCacheProperties">
<New class="java.util.Properties">
<Call name="setProperty"><Arg>MinLimit</Arg><Arg>1</Arg></Call>
<Call name="setProperty"><Arg>MaxLimit</Arg><Arg>15</Arg></Call>
<Call name="setProperty"><Arg>InitialLimit</Arg><Arg>1</Arg></Call>
</New>
</Set>
</New>
</Arg>
</New>
<!-- SAMPLE OSA DATASOURCE CONFIGURATION FOR ADW-->
<!--
<New id="osads" class="org.eclipse.jetty.plus.jndi.Resource">
<Arg>
<Ref refid="wac"/>
</Arg>
<Arg>jdbc/OSADataSource</Arg>
<Arg>
<New class="oracle.jdbc.pool.OracleDataSource" type="adw">
<Set name="URL">jdbc:oracle:thin:@oracletestdb_high?TNS_ADMIN=/scratch/oracletest/Wallet_oracletestdb</Set>
<Set name="User">{OSA_USER}</Set>
<Set name="Password">
<Call class="org.eclipse.jetty.util.security.Password" name="deobfuscate">
<Arg>{OBF:OBFUSCATE_PASSWORD}</Arg>
</Call>
</Set>
<Set name="connectionCachingEnabled">true</Set>
<Set name="connectionCacheProperties">
<New class="java.util.Properties">
<Call name="setProperty"><Arg>MinLimit</Arg><Arg>1</Arg></Call>
<Call name="setProperty"><Arg>MaxLimit</Arg><Arg>15</Arg></Call>
<Call name="setProperty"><Arg>InitialLimit</Arg><Arg>1</Arg></Call>
</New>
</Set>
</New>
</Arg>
</New>
-->
<!-- SAMPLE OSA DATASOURCE CONFIGURATION FOR MYSQL-->
<!--
<New id="osads" class="org.eclipse.jetty.plus.jndi.Resource">
<Arg>
<Ref refid="wac"/>
</Arg>
<Arg>jdbc/OSADataSource</Arg>
<Arg>
<New class="com.mysql.cj.jdbc.MysqlConnectionPoolDataSource">
<Set name="URL">jdbc:mysql://examplehost.com:3306/OSADB</Set>
<Set name="User">{OSA_USER}</Set>
<Set name="Password">
<Call class="org.eclipse.jetty.util.security.Password" name="deobfuscate">
<Arg>{OBF:OBFUSCATE_PASSWORD}</Arg>
</Call>
</Set>
</New>
</Arg>
</New>
-->
</Configure>
Note:
Do not use a hyphen in the OSA metadata username in jetty-osa-datasource.xml.
Configuring ATP/ADW as Metadata Store
GoldenGate Stream Analytics creates the metadata schema, as part of the initial configuration of the system, using the script: ${OSA_HOME}/osa-base/bin/configure.sh dbroot=<sys user of database> dbroot_password=<sys user password of the database>
However, before running the above script, you must configure the datasource in the datasource configuration file at ${OSA_HOME}/osa-base/etc/jetty-osa-datasource.xml.
To configure ATP/ADW as the metadata store, first comment out the Oracle and MySQL sections, and uncomment the ATP/ADW section in the jetty-osa-datasource.xml file.
Below is the template for the datasource configuration for an ATP/ADW database:
jetty-osa-datasource.xml
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_3.dtd">
<Configure id="Server" class="org.eclipse.jetty.server.Server">
<New id="osads" class="org.eclipse.jetty.plus.jndi.Resource">
<Arg>
<Ref refid="wac"/>
</Arg>
<Arg>jdbc/OSADataSource</Arg>
<Arg>
<New class="oracle.jdbc.pool.OracleDataSource" type="adw">
<Set name="URL">jdbc:oracle:thin:@{service_name}?TNS_ADMIN={wallet_absolute_path}</Set>
<Set name="User">{osa_db_user}</Set>
<Set name="Password">
<Call class="org.eclipse.jetty.util.security.Password" name="deobfuscate">
<Arg>{obfuscated_password}</Arg>
</Call>
</Set>
<Set name="connectionCachingEnabled">true</Set>
<Set name="connectionCacheProperties">
<New class="java.util.Properties">
<Call name="setProperty"><Arg>MinLimit</Arg><Arg>1</Arg></Call>
<Call name="setProperty"><Arg>MaxLimit</Arg><Arg>15</Arg></Call>
<Call name="setProperty"><Arg>InitialLimit</Arg><Arg>1</Arg></Call>
</New>
</Set>
</New>
</Arg>
</New>
</Configure>
Note:
In the above template, replace the variables in {} as follows:
- {service_name} - one of the service names listed in the tnsnames.ora file inside the wallet
- {wallet_absolute_path} - the absolute path of the wallet folder on the machine where OSA is installed
- {osa_db_user} - the user name under which the OSA metadata schema is created. This user and schema are created by the 'dbroot' user provided to the script above.
- {obfuscated_password} - the obfuscated password for {osa_db_user}
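Filling in the {} variables can be scripted. The sketch below substitutes illustrative values with sed into a scratch copy of the template, so it is self-contained; in a real install you would edit ${OSA_HOME}/osa-base/etc/jetty-osa-datasource.xml directly, and the obfuscated password would come from the jetty-util Password tool described earlier:

```shell
#!/bin/sh
# Sketch: substitute the {} placeholders from the ATP/ADW template.
# Values are illustrative; a scratch copy keeps the example self-contained.
DS_FILE=$(mktemp)
cat > "$DS_FILE" <<'EOF'
<Set name="URL">jdbc:oracle:thin:@{service_name}?TNS_ADMIN={wallet_absolute_path}</Set>
<Set name="User">{osa_db_user}</Set>
EOF

# Replace each placeholder in turn (| used as the sed delimiter because
# the wallet path contains slashes).
sed -i \
  -e 's|{service_name}|oracletestdb_high|' \
  -e 's|{wallet_absolute_path}|/scratch/oracletest/Wallet_oracletestdb|' \
  -e 's|{osa_db_user}|osa|' \
  "$DS_FILE"

grep 'Set name' "$DS_FILE"
```

The same substitution applies to the {obfuscated_password} placeholder once the OBF: string has been generated.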
Initializing Metadata Store
This topic applies only to Oracle user-managed services.
Jetty Properties File
Use the Jetty properties file available at OSA-19.1.0.0.*/osa-base/etc/jetty.properties to modify certain security features.
Note:
It is recommended that you configure these properties at the installation stage; if they are configured at a later stage, you must restart your server.
- jetty.session.timeout
Sets the timeout for the OSA web session. By default, the timeout is set to 30 minutes. The value can be changed to any integer greater than 1.
- host.headers.whitelist
You can restrict the host header values to the values defined with this property.
Example: host.headers.whitelist=www.oracle.com, www.microsoft.com, localhost:9080
Here, the value of the host header can only be one of the three domains listed. Commenting out this property with a # allows all values for the header.
Note:
If you do not explicitly specify the host header in your request, the default value is host-server:port, where the OSA Jetty server is running. Hence you must specify the port number along with the server address.
- xforwarded.host.headers.whitelist
You can restrict the x-forwarded-host header values to the values defined with this property.
Example: xforwarded.host.headers.whitelist=www.oracle.com, www.microsoft.com, localhost
Here, the value of the x-forwarded-host header can only be one of the three domains listed. Commenting out this property with a # allows all values for the header. If no domain is entered, that is, if the value of the property is empty, then this header is not supported.
- response.headers.list
A comma-separated list of response headers, sent along with the response for every request.
Example: response.headers.list="x-frame-options: sameorigin, X-Content-Type-Options: nosniff"
By default, the above two response headers are set. x-frame-options: sameorigin prevents clickjacking attacks. X-Content-Type-Options: nosniff prevents browsers from sniffing the response content.
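Putting the properties above together, a jetty.properties fragment might look like the following (host names are illustrative):

```properties
# OSA web session timeout, in minutes (default 30)
jetty.session.timeout=30

# Allowed values for the host header (include the Jetty port)
host.headers.whitelist=www.example.com:9080, localhost:9080

# Allowed values for the x-forwarded-host header
xforwarded.host.headers.whitelist=www.example.com, localhost

# Response headers added to every response
response.headers.list="x-frame-options: sameorigin, X-Content-Type-Options: nosniff"
```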
Adjusting Jetty Threadpool
Edit OSA-19.1.0.0.*/etc/jetty-threadpool.xml to change the minimum and maximum thread configuration to 100 and 2000, respectively. A sample is shown below.
<New id="threadPool"
class="org.eclipse.jetty.util.thread.QueuedThreadPool">
<Set name="minThreads" type="int"><Property name="jetty.threadPool.minThreads"
deprecated="threads.min" default="100"/></Set>
<Set name="maxThreads" type="int"><Property name="jetty.threadPool.maxThreads"
deprecated="threads.max" default="2000"/></Set>
<Set name="reservedThreads" type="int"><Property
name="jetty.threadPool.reservedThreads" default="-1"/></Set>
<Set name="idleTimeout" type="int"><Property
name="jetty.threadPool.idleTimeout" deprecated="threads.timeout"
default="60000"/></Set>
<Set name="detailedDump" type="boolean"><Property
name="jetty.threadPool.detailedDump" default="false"/></Set>
</New>
Integrating Stream Analytics with Oracle GoldenGate
Follow the steps below to integrate Oracle GoldenGate with Stream Analytics:
- Download and install Oracle GoldenGate Big Data. For a compatible version of Oracle GoldenGate Big Data, see the latest certification matrix.
Note:
Install Oracle GoldenGate Big Data on the same machine and with the same user as OSA.
- Set the following environment variables:
  - KAFKA_HOME – set this variable to the path where Kafka is installed. Example: export KAFKA_HOME=/u01/app/kafka
  - LD_LIBRARY_PATH – set this variable to the directory path that contains the JVM shared library. Example: export LD_LIBRARY_PATH=/u01/app/java/jre/lib/amd64/server:$LD_LIBRARY_PATH
  - GGBD_HOME – set this variable to the path where GoldenGate for Big Data is installed. Example: export GGBD_HOME=/u01/app/OGG_BigData_Linux_x64_19.1.0.0.0
- Start the manager process on port 7801.
For installation steps, see Installing GoldenGate for Big Data.
Maven Setting for GoldenGate Big Data Handlers
Set the Maven Home Path
To configure the Maven home:
Update the OSA-19.1.0.0.*/osa-base/bin/configure-osa.sh with the correct M2_HOME path, as below:
Change the path from
OSA_HOME="$( cd "$(dirname "../../")" >/dev/null 2>&1 ; pwd -P )"
to
OSA_HOME="$( cd "$(dirname "../../../")" >/dev/null 2>&1 ; pwd -P )"
Note:
Update the Maven home path before initializing the metadata store, or you will have to restart GGSA after this update.
Configure Maven Proxy Settings
If your GGSA installation is behind a proxy, then to use the GGBD handlers you must update the settings.xml that comes with the Maven distribution, at <OSA_INSTALLATION_PATH>/apache-maven-3.6.3/conf/settings.xml, with the correct proxy entries in the <proxies> </proxies> section, as shown below:
<proxy>
<id>optional</id>
<active>true</active>
<protocol>http</protocol>
<username>proxyuser</username>
<password>proxypass</password>
<host>proxy.host.net</host>
<port>80</port>
<nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>
Note:
The username and password fields are required only if the proxy is protected.
Note:
Update the settings.xml before initializing the metadata store, or you will have to restart GGSA after this update.
GoldenGate Stream Analytics Hardware Requirements for Enterprise Deployment
This chapter provides the hardware requirements for GoldenGate Stream Analytics Design and Data tiers.
Design Tier
GoldenGate Stream Analytics' Design-tier is a multi-user environment that allows users to implement and test dataflow pipelines. The design-tier also serves dashboards for streaming data. Multiple users can build, test, and deploy pipelines based on the capacity of the Runtime-tier (YARN/Spark cluster) simultaneously.
GGSA uses Jetty as the web-server with support for HA. For production deployments of GGSA Design-tier, you require the minimum hardware configuration listed below:
- Web server – Jetty with High Availability (HA) support
- 2 nodes with 4+ cores and 32+ GB of RAM for running two instances of Jetty.
- 1 node with 4+ cores and 16+ GB of RAM for running MySQL or Oracle meta-store.
- 2 nodes with 4+ cores and 16+ GB of RAM for running two instances of Kafka and 3 instances of ZooKeeper. Note that this is a separate Kafka cluster for GGSA's internal use and for interactively designing pipelines. The ZooKeeper endpoint of this Kafka cluster must be specified in GGSA's system settings UI.
Note:
You can avoid the two-node Kafka cluster if you already have a Kafka cluster in place and are fine with OSA leveraging that cluster for its internal usage.
Based on the above estimates, the total core count for the design-tier is 20, and the approximate memory is 112 GB of RAM. Jetty instances can be independently scaled as the number of users increases. The diagram below illustrates GGSA's Design-tier topology.
Data Tier
The deployed pipelines are run on the YARN or Spark cluster. You can use existing YARN/Spark clusters if you have sufficient spare capacity.
Sizing Guidelines
- 2 nodes with 4+ cores, 16+ GB RAM, and 500 GB of local disk to run the HDFS cluster, with two instances of HDFS name and data nodes.
The capacity required of the Data tier depends on:
- The number of pipelines that will run simultaneously
- The logic in each pipeline
- The desired degree of parallelism
For each streaming pipeline, the number of cores and the amount of memory are computed based on the required degree of parallelism. As an example, consider a pipeline ingesting data from a customer's Kafka topic T with 3 partitions, using direct ingestion (direct ingestion uses no Spark Receivers). In this case, the minimum number of processes needed for optimal performance is 1 Spark driver process plus 3 executor processes, one for each Kafka topic partition. Each executor process needs a minimum of 2 cores.
The number of cores for a pipeline can be computed as:
--executor-cores = 1 + Number of Executors * 2
In the case of receiver-based ingestion, as with JMS, it is computed as:
--executor-cores = 2 + Number of Executors * 2
These are rough estimates for environments where fine-grained scheduling is not available. In environments like Kubernetes, more fine-grained scheduling is possible.
The formula for sizing memory is
(Number of Windows * Average Window Range * Event Rate * Event Size) + (Number of Lookup/Reference Objects being cached * Size of Lookup Object).
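The core formulas above can be checked with shell arithmetic for the 3-partition, direct-ingestion example (the numbers come from the example above, not a sizing recommendation):

```shell
#!/bin/sh
# Direct ingestion: one executor per Kafka partition, 2 cores each,
# plus 1 core for the Spark driver (per the formula above).
PARTITIONS=3
EXECUTORS=$PARTITIONS
CORES=$((1 + EXECUTORS * 2))
echo "executor-cores for direct ingestion: $CORES"

# Receiver-based ingestion (e.g. JMS) reserves one extra core for the
# receiver, hence the constant 2 in the second formula.
CORES_RECEIVER=$((2 + EXECUTORS * 2))
echo "executor-cores for receiver-based ingestion: $CORES_RECEIVER"
```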
Diagram below illustrates GGSA’s Data-tier topology.
If you are considering GGSA for POCs and not production, then you can use the following configuration:
Design Tier
- An instance of Jetty running on a 4+ core node with 32+ GB of RAM.
- An instance of MySQL/Oracle for the metadata store on a 4+ core node with 16+ GB of RAM.
- A node of the Kafka cluster running on a 4+ core node with 16+ GB of RAM.
Note:
This is a separate Kafka cluster for GGSA's internal use and for interactively designing pipelines.
Data Tier
- A Hadoop Distributed File System (HDFS) cluster node running on a 4+ core physical node with 16+ GB of RAM.
- 2 nodes of the YARN/Spark cluster, each running on a 4+ core physical node with 16+ GB of RAM.
Development Mode Configurations
Design Tier
- 1 node with 4+ cores and 16+ GB of RAM for 1 instance of Jetty, 1 instance of MySQL DB, and 1 instance of Kafka+ZooKeeper
Data Tier
- 1 node with 4+ cores and 16+ GB of RAM for 1 instance of HDFS and 1 instance of YARN/Spark.
Retaining https and Disabling http
By default, the GGSA web application is available on both http (port 9080) and https (port 9443). Follow the procedure below if you intend to disable http:
- Edit the file osa-base/start.d/http.ini.
- Comment out the http module as follows: ##--module=http
- Start the GGSA web server by running osa-base/bin/start-osa.sh.
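The comment-out step can also be scripted. The sketch below works on a scratch copy of http.ini so it is self-contained; in a real install the file is osa-base/start.d/http.ini:

```shell
#!/bin/sh
# Work on a scratch copy of http.ini (illustrative content); in a real
# install you would edit osa-base/start.d/http.ini in place.
HTTP_INI=$(mktemp)
printf '%s\n' '--module=http' 'jetty.http.port=9080' > "$HTTP_INI"

# Disable http by commenting out the module line; the ## prefix matches
# the convention shown in the procedure above.
sed -i 's/^--module=http/##--module=http/' "$HTTP_INI"
cat "$HTTP_INI"
```

Restart the web server with osa-base/bin/start-osa.sh after the edit, as described above.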
Setting up Runtime for GoldenGate Stream Analytics Server
Before you start using GoldenGate Stream Analytics, you need to specify the runtime server, environment, and node details. You must do this procedure right after you launch GoldenGate Stream Analytics (GGSA) for the first time.
- Change directory to OSA-19.1.0.0.*/osa-base/bin and run ./start-osa.sh. You should see the following message on the console:
Supported OSA schema versions are: [18.4.3, 18.1.0.1.0, 18.1.0.1.1, 19.1.0.0.0, 19.1.0.0.1, 19.1.0.0.2, 19.1.0.0.3, 19.1.0.0.5, 19.1.0.0.6, 19.1.0.0.7, 19.1.0.0.8]
The schema is preconfigured and current. No changes or updates are required.
If you do not see the above message, check the log file in the OSA-19.1.0.0.*/osa-base/logs folder.
- In the Chrome browser, enter localhost:9080/osa to access the Oracle Stream Analytics login page, and log in using your credentials.
Note:
The password is a plain-text password.
- Click the user name at the top right corner of the screen.
- Click System Settings.
- Click Environment.
- Select the Runtime Server. See the sections below for Yarn and Spark Standalone runtime configuration details.
Yarn Configuration
- YARN Resource Manager URL: Enter the URL where the YARN Resource Manager is configured.
- Storage: Select the storage type for pipelines. To submit a GGSA pipeline to Spark, the pipeline has to be copied to a storage location that is accessible by all Spark nodes.
  - If the storage type is WebHDFS:
    - Path: Enter the WebHDFS directory (hostname:port/path), where the generated Spark pipeline will be copied to and then submitted from. This location must be accessible by all Spark nodes. The user specified in the authentication section below must have read-write access to this directory.
    - HA Namenodes: Set the HA namenodes. If the hostname in the above URL refers to a logical HA cluster, specify the actual namenodes here, in the format: Hostname1:Port, Hostname2:Port.
  - If the storage type is HDFS:
    - Path: The path could be <HostOrIPOfNameNode><HDFS Path>. For example, xxx.xxx.xxx.xxx/user/oracle/ggsapipelines. The Hadoop user must have Write permissions. The folder is created automatically if it does not exist.
    - HA Namenodes: If the hostname in the above URL refers to a logical HA cluster, specify the actual namenodes here, in the format: Hostname1:Port, Hostname2:Port.
  - If the storage type is NFS:
    - Path: The path could be /oracle/spark-deploy.
      Note: /oracle should exist, and spark-deploy is created automatically if it does not exist. You need Write permissions on the /oracle directory.
- Hadoop Authentication:
  - Simple authentication credentials:
    - Protection Policy: Select a protection policy from the drop-down list. This value should match the value on the cluster.
    - Username: Enter the user account to use for submitting Spark pipelines. This user must have read-write access to the Path specified above.
  - Kerberos authentication credentials:
    - Protection Policy: Select a protection policy from the drop-down list. This value should match the value on the cluster.
    - Kerberos Realm: Enter the domain on which Kerberos authenticates a user, host, or service. This value is in the krb5.conf file.
    - Kerberos KDC: Enter the server on which the Key Distribution Center is running. This value is in the krb5.conf file.
    - Principal: Enter the GGSA service principal that is used to authenticate the GGSA web application against the Hadoop cluster, for application deployment. This user should be the owner of the folder used to deploy the GGSA application in HDFS. You must also create this user on the YARN node manager.
    - Keytab: Enter the keytab pertaining to the GGSA service principal.
    - Yarn Resource Manager Principal: Enter the YARN principal. When a Hadoop cluster is configured with Kerberos, principals for Hadoop services like hdfs, https, and yarn are created as well.
- Yarn master console port: Enter the port on which the Yarn master console runs. The default port is 8088.
- Click Save.
Spark Standalone
- Select the Runtime Server as Spark Standalone, and enter the following details:
  - Spark REST URL: Enter the Spark standalone REST URL. If Spark standalone is HA enabled, you can enter a comma-separated list of active and stand-by nodes.
  - Storage: Select the storage type for pipelines. To submit a GGSA pipeline to Spark, the pipeline has to be copied to a storage location that is accessible by all Spark nodes.
    - If the storage type is WebHDFS:
      - Path: Enter the WebHDFS directory (hostname:port/path), where the generated Spark pipeline will be copied to and then submitted from. This location must be accessible by all Spark nodes. The user specified in the authentication section below must have read-write access to this directory.
      - HA Namenodes: If the hostname in the above URL refers to a logical HA cluster, specify the actual namenodes here, in the format: Hostname1:Port, Hostname2:Port.
    - If the storage type is HDFS:
      - Path: The path could be <HostOrIPOfNameNode><HDFS Path>. For example, xxx.xxx.xxx.xxx/user/oracle/ggsapipelines. The Hadoop user must have Write permissions. The folder is created automatically if it does not exist.
      - HA Namenodes: If the hostname in the above URL refers to a logical HA cluster, specify the actual namenodes here, in the format: Hostname1:Port, Hostname2:Port. This field is applicable only when the storage type is HDFS.
  - Hadoop Authentication for the WebHDFS and HDFS Storage Types:
    - Simple authentication credentials:
      - Protection Policy: Select a protection policy from the drop-down list.
      - Username: Enter the user account to use for submitting Spark pipelines. This user must have read-write access to the Path specified above.
    - Kerberos authentication credentials:
      - Protection Policy: Select a protection policy from the drop-down list.
      - Kerberos Realm: Enter the domain on which Kerberos authenticates a user, host, or service. This value is in the krb5.conf file.
      - Kerberos KDC: Enter the server on which the Key Distribution Center is running. This value is in the krb5.conf file.
      - Principal: Enter the GGSA service principal that is used to authenticate the GGSA web application against the Hadoop cluster, for application deployment. This user should be the owner of the folder used to deploy the GGSA application in HDFS. You must also create this user on the YARN node manager.
      - Keytab: Enter the keytab pertaining to the GGSA service principal.
      - Yarn Resource Manager Principal: Enter the YARN principal. When a Hadoop cluster is configured with Kerberos, principals for Hadoop services like hdfs, https, and yarn are created as well.
  - If the storage type is NFS:
    - Path: The path could be /oracle/spark-deploy.
      Note: /oracle should exist, and spark-deploy is created automatically if it does not exist. You need Write permissions on the /oracle directory.
  - Spark standalone master console port: Enter the port on which the Spark standalone console runs. The default port is 8080.
    Note: The order of the comma-separated ports should match the order of the comma-separated Spark REST URLs mentioned in the Path.
  - Spark master username: Enter your Spark standalone server username.
  - Spark master password: Click Change Password to change your Spark standalone server password.
    Note: You can change your Spark standalone server username and password in this screen. The username and password fields are left blank, by default.
  - Click Save.
Validating Data Flow to GoldenGate Stream Analytics
After you have configured GoldenGate Stream Analytics (GGSA) with the runtime details, you need to ensure that sample data is being detected and correctly read by GGSA.
Terminating GoldenGate Stream Analytics
You can terminate GoldenGate Stream Analytics by running a simple command.
Use the following command to terminate GoldenGate Stream Analytics:
./stop-osa.sh (run from the OSA-19.1.0.0.*/osa-base/bin folder)
Upgrading GoldenGate Stream Analytics
To upgrade from an existing version of GGSA to a newer version:
- Back up your metadata store using any Oracle or MySQL backup tool. The backup is required to restore the metadata store in case the upgrade fails.
- Run the ./stop-osa.sh command to stop the GGSA server.
- Create an OSA-19 folder and download the OSA-19.1.0.0.*.zip file into the newly created folder.
- Unzip to extract the contents of the OSA-19.1.0.0.*.zip file.
- Copy <YourVersion>/osa-base/etc/osa-env.sh to OSA-19.1.0.0.*/osa-base/etc.
Note: You can skip this step if you are upgrading to GGSA version 19.1.0.0.8.
- Copy <YourVersion>/osa-base/etc/jetty-osa-datasource.xml to OSA-19.1.0.0.*/osa-base/etc.
- Run the ./start-osa.sh command to start the OSA server. This performs the schema migration.
- To validate the upgrade, see Validating your Installation.