6 Creating a Production Cluster for GGSA

This chapter describes how to create a production cluster for GGSA on OCI, using an example production cluster of six compute instances.

A production cluster should have a minimum of one web-tier compute instance, two Spark master compute instances, and a variable number of worker instances for Spark workers and Kafka. At least two worker instances are required to run Spark workers and Kafka.

Figure: GGSA MA Cluster

The example cluster is configured as follows:

  1. A single compute instance running the GGSA web-tier. The minimum shape is VM 2.4. This web-tier instance also acts as a bastion host for the cluster and should be created in a public regional subnet.
  2. Two Spark master compute instances running Spark master processes. The minimum shape is VM 2.2. Both instances can be of the same shape and should be created in a private regional subnet.
  3. Three worker instances running Spark worker, Kafka broker, and ZooKeeper processes. The minimum shape is VM 2.2. All three instances can be of the same shape and should be created in a private regional subnet.
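
The layout above can be written down as data and checked against the stated minimums before any instances are provisioned. The following is a minimal sketch only: the role labels, the InstanceGroup type, and the validate function are hypothetical names introduced for illustration and are not part of GGSA or OCI. The shapes are copied as written in this chapter.

```python
from dataclasses import dataclass

# Minimum instance counts per role, as stated in this chapter.
MIN_COUNTS = {"web-tier": 1, "spark-master": 2, "worker": 2}


@dataclass(frozen=True)
class InstanceGroup:
    role: str    # hypothetical role label, not a GGSA or OCI identifier
    count: int   # number of compute instances in this group
    shape: str   # shape as written in this chapter
    subnet: str  # "public" or "private" regional subnet


# The six-instance example cluster described above.
EXAMPLE_CLUSTER = [
    InstanceGroup("web-tier", 1, "VM 2.4", "public"),
    InstanceGroup("spark-master", 2, "VM 2.2", "private"),
    InstanceGroup("worker", 3, "VM 2.2", "private"),
]


def validate(cluster):
    """Return a list of problems; an empty list means the plan meets the minimums."""
    problems = []

    # Total the instance count per role.
    counts = {}
    for group in cluster:
        counts[group.role] = counts.get(group.role, 0) + group.count

    # Check each role against its minimum count.
    for role, minimum in MIN_COUNTS.items():
        actual = counts.get(role, 0)
        if actual < minimum:
            problems.append(f"{role}: need at least {minimum}, got {actual}")

    # The web-tier doubles as the bastion host, so it sits in the public subnet;
    # every other role belongs in a private subnet.
    for group in cluster:
        expected = "public" if group.role == "web-tier" else "private"
        if group.subnet != expected:
            problems.append(
                f"{group.role}: expected a {expected} subnet, got {group.subnet}"
            )
    return problems


if __name__ == "__main__":
    print(validate(EXAMPLE_CLUSTER))
```

A plan with only one worker, or with a Spark master placed in the public subnet, would produce one message per violated rule.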

All instances use the GGSA image, which comes packaged with GGSA, GGBD, Spark, Kafka, MySQL, and Nginx.

To provision a GGSA production cluster, you must first provision GGSA compute instances using the custom GGSA VM image. The GGSA VM image contains binaries for GGSA, GGBD, Spark, Kafka, MySQL, OSA, Nginx, and other components, so you do not need any software beyond the image itself. The image packages the following scripts:
  • init-kafka-zk.sh: Bash script to initialize ZooKeeper and Kafka on worker VMs
  • init-spark-master.sh: Bash script to initialize Spark master on Spark master VMs
  • init-spark-slave.sh: Bash script to initialize Spark worker on worker VMs
  • init-web-tier.sh: Bash script to initialize the GGSA web-tier on the web-tier VM
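
As a quick reference, the relationship between the instance roles and the scripts listed above can be expressed as a lookup table. This is an illustrative sketch: the role labels and the scripts_for helper are hypothetical, the order of the two worker scripts is an assumption, and the arguments each script takes are not described in this section, so none are shown.

```python
# Which packaged scripts apply to which kind of VM, per the list above.
# Role labels are hypothetical; script names are taken from this chapter.
INIT_SCRIPTS = {
    "web-tier": ["init-web-tier.sh"],
    "spark-master": ["init-spark-master.sh"],
    # Worker VMs host both ZooKeeper/Kafka and a Spark worker.
    # The order shown here is an assumption, not taken from the source.
    "worker": ["init-kafka-zk.sh", "init-spark-slave.sh"],
}


def scripts_for(role):
    """Return the init scripts that apply to a role, or raise for an unknown role."""
    try:
        return list(INIT_SCRIPTS[role])
    except KeyError:
        raise ValueError(f"unknown role: {role!r}") from None


if __name__ == "__main__":
    for role in INIT_SCRIPTS:
        print(role, "->", ", ".join(scripts_for(role)))
```

The table makes one point explicit: only the worker VMs run more than one script, because they carry both the Kafka/ZooKeeper and the Spark worker processes.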