**OKE Advanced Terraform Root Module: File: ~/oke_advanced_module_orm/README.md**   
**Authors: Mahamat H. Guiagoussou, Payal Sharma, and Matthew McDaniel**
**Copyright (c) 2025 Oracle**

# README.md

# Deploying an OCI OKE Cluster with a Bastion Host Using Oracle Resource Manager (ORM)

This project provisions an Oracle Cloud Infrastructure (OCI) environment for a Kubernetes cluster, following a proven, modular approach. The architecture is designed with a clear separation of concerns, placing networking and container resources in different compartments.

## Architecture Overview

The root module orchestrates the creation of three main components:

1. **VCN Module (`modules/vcn`):** Provisions a Virtual Cloud Network (VCN) with subnets, gateways, and security resources. This module also creates a separate compartment for networking and containers.
2. **OKE Module (`modules/oke`):** Deploys an Oracle Kubernetes Engine (OKE) cluster, including the control plane and a worker node pool, into the designated container compartment. It leverages the subnets created by the VCN module.
3. **Bastion Module (`modules/bastion`):** Deploys a bastion host instance into a public subnet to provide secure access to private resources.

Each module is self-contained and reusable, with its own variables and outputs.

## Prerequisites

* **OCI Account:** You must have an active Oracle Cloud Infrastructure account.
* **OCI CLI:** The OCI CLI must be installed and configured with the necessary credentials.
* **Terraform:** You need Terraform CLI installed on your local machine (version `~> 1.0`).
* **IAM Policies:** Make sure the required policies and permissions are in place; see the [OKE policy configuration documentation](https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengpolicyconfig.htm).
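
The scripts later in this README also call `jq`, `zip`, and `unzip`, so it can help to confirm that every tool is on your `PATH` before you start. The following is a minimal sketch and not part of the module; the function name `missing_tools` is hypothetical:

```bash
#!/bin/bash
# Print the name of every listed tool that is not found on PATH.
# Prints nothing when all tools are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}
```

For example, `missing_tools oci terraform jq zip unzip` prints nothing when all five tools are installed, and otherwise prints one missing tool per line.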

## Setup and Usage

1. **Clone this repository or copy the files into a new directory.**

---

# Deploying OCI OKE Cluster with Bastion Host via Oracle Resource Manager (ORM)

This guide shows how to automate the deployment of an Oracle Kubernetes Engine (OKE) cluster with a Bastion host using **Oracle Resource Manager (ORM)**. ORM is a managed, cloud-native service for infrastructure provisioning, well suited to enterprise environments.

## Prerequisites

- **OCI Account:** Active Oracle Cloud Infrastructure account.
- **OCI CLI:** Installed and configured with necessary credentials.
- **ORM Stack Configuration:** Download the pre-configured ORM module:  
  [`oke_advanced_module_orm.zip`](./files/oke_advanced_module_orm.zip).

## Workflow Overview

1. **Source Configuration:** Define the origin of the IaC configuration.
2. **Stack Creation:** ORM uses the Terraform template to create a stack.
3. **Plan Job:** Generates a plan for infrastructure changes.
4. **Apply Job:** Provisions the resources based on the plan.
5. **Destroy Job:** Cleans up resources when no longer needed.

## Steps

### 1. Prepare ORM Module

#### **Variables**
- Update `variables.tf` with environment-specific details. Key variables and their default values are listed below.
- Use the provided scripts in `oke_advanced_module_orm/scripts` to automate the process.

### **Key Variables and Defaults**

Below are the key variables from `variables.tf` with their default values. Replace placeholders with your specific values.

| Variable Name                         | Default Value                          | Description                                                |
|---------------------------------------|----------------------------------------|------------------------------------------------------------|
| `tenancy_ocid`                        | `REPLACE_WITH_YOUR_TENANCY_OCID`       | Tenancy ID where resources will be created.                |
| `region`                              | `us-ashburn-1`                         | OCI region where resources will be created.                |
| `compartment_id`                      | `REPLACE_WITH_DEFAULT_COMPARTMENT_OCID`| Compartment OCID for OKE cluster and node pools.           |
| `networking_compartment_id`           | `REPLACE_WITH_NETWORK_COMPARTMENT_OCID`| Compartment OCID for networking resources.                 |
| `is_networking_compartment_separate`  | `false`                                | Flag to use separate compartments for networking and OKE.  |
| `vcn_cidr_block`                      | `10.0.0.0/16`                          | CIDR block for the VCN.                                    |
| `display_name_prefix`                 | `REPLACE_DISPLAY_NAME_PREFIX`          | Prefix for resource display names.                         |
| `host_name_prefix`                    | `avcn`                                 | Prefix for resource hostnames.                             |
| `control_plane_kubernetes_version`    | `v1.32.1`                              | Kubernetes version for the OKE control plane.              |
| `worker_nodes_kubernetes_version`     | `v1.32.1`                              | Kubernetes version for worker nodes.                       |
| `cni_type`                            | `FLANNEL_OVERLAY`                      | CNI type for the cluster (alternative: `VCN-NATIVE`).      |
| `cluster_type`                        | `BASIC_CLUSTER`                        | Type of cluster (e.g., `ENHANCED_CLUSTER`).                |
| `worker_node_pools`                   | (See default map)                      | Configuration for worker node pools.                       |
| `ssh_public_key_path`                 | `oke_node_key.pub`                     | Path to SSH public key.                                    |
| `linux_images`                        | (See default map)                      | Map of Linux image OCIDs by region and version.            |
| `bastion_params`                      | (See default map)                      | Configuration for the Bastion host.                        |
| `is_vcn_created`                      | `true`                                 | Flag to control VCN creation.                              |
| `is_k8cluster_created`                | `true`                                 | Flag to control OKE cluster creation.                      |
| `is_nodepool_created`                 | `true`                                 | Flag to control worker node pool creation.                 |
| `is_bastion_created`                  | `true`                                 | Flag to control Bastion host creation.                     |

#### **Alternatively, Configure `terraform.tfvars`:**
Create a `terraform.tfvars` file in the root directory and populate it with your specific values. Refer to the `variables.tf` file for a complete list of variables and their descriptions.

**Example `terraform.tfvars`:**
```hcl
# Network compartment if OKE and Network resources are in different compartments.
networking_compartment_id = "ocid1.compartment.oc1..xxxx_networking_compartment"

# Default compartment for all OKE deployment resources in the same compartment
container_compartment_id = "ocid1.compartment.oc1..xxxx_container_compartment"

# Set this flag to false if you want to keep both network and container in the 
# same compartment (default is compartment_id)
is_networking_compartment_separate = true # use separate compartments

display_name_prefix = "my-k8s-project" # make sure the name is not too long
host_name_prefix = "avcn"  # choose a name that complies with hostname size limit

# VCN and subnet configuration
vcn_cidr_block = "10.0.0.0/16"
# ...other VCN variables...

# OKE configuration
is_oke_created = true
is_node_pool_created = true
# ...other OKE variables...

# Specify the OKE Configuration
worker_nodes_kubernetes_version = "v1.32.1"
control_plane_kubernetes_version = "v1.32.1"
cni_type = "FLANNEL_OVERLAY" # ALTERNATIVE "VCN-NATIVE"
cluster_type = "ENHANCED_CLUSTER"

# Update the worker_node_pools map fields
shape = "VM.Standard.E5.Flex"
shape_config = {
  memory = 16
  ocpus = 1
}
operating_system = "Oracle-Linux"
kubernetes_version = "v1.32.1"
availability_domains = ["AD-1", "AD-2", "AD-3"]
number_of_nodes = 1
# ...other worker node pool variables...

# Bastion configuration
is_bastion_created = true

# Replace with your own SSH KEYS 
ssh_public_key_path = "oke_node_key.pub"
ssh_private_key_path = "oke_node_key"

# Update bastion host parameters map (bastion_map) fields:
shape = "VM.Standard.E5.Flex"
shape_config = {
  memory = 12
  ocpus = 1
}
version = "ol8_1_25_4"
# ...other Bastion variables...

# Make sure to populate the bastion host image map
linux_images = {
    us-ashburn-1 = {
        ol8_1_25_4 = "ocid1.image.oc1.iad.xxxxxxx_image1"
        ol8_1_24_1 = "ocid1.image.oc1.iad.xxxxxxx_image2"
    }
    us-phoenix-1 = {
        ol8_1_25_4 = "ocid1.image.oc1.phx.xxxxxxx_image1"
        ol8_1_24_1 = "ocid1.image.oc1.phx.xxxxxxx_image2"
    }
}

# ...other relevant variables...
```
**Notes**:     
  - A sample file with generic variable values, `variables_sample.tf`, is provided.
  - You must rename this file (for example, to `variables.tf`), and replace all generic values with your own deployment values.
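
Because the sample defaults use `REPLACE_...` placeholders, a quick search can confirm that none were left behind before you package the configuration. This is a minimal sketch, assuming every placeholder starts with `REPLACE_` as in the defaults listed above; `find_placeholders` is a hypothetical helper name:

```bash
#!/bin/bash
# List every line of a file that still contains a REPLACE_ placeholder.
# Output format is "line_number:line". Returns non-zero if any remain.
find_placeholders() {
  local file="$1"
  if grep -n 'REPLACE_' "$file"; then
    return 1
  fi
  return 0
}
```

Running `find_placeholders variables.tf` before creating the stack prints the lines that still need editing and returns a non-zero status if any are found.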


### 2. Bash Shell and OCI CLI Automation

#### Execute Scripts

From your working directory (`~/oke_advanced_module_orm/scripts`), run the following scripts in sequence:

1. **Create Source Configuration**: `create_new_oke_stack_source.sh`
   ```bash
      #!/bin/bash
      set -e

      # Define the source directory for the stack
      src_dir="./../oke_app_src"

      # Recreate the zip archive one level above the source directory.
      # The old archive is removed first so stale entries are not kept.
      cd "$src_dir"
      rm -f "../stackconfig.zip"
      zip -r "../stackconfig.zip" * modules/

      # List the contents of the zip file for verification
      unzip -l "../stackconfig.zip"
   ```

2. **Create ORM Stack**: `create_new_oke_stack.sh`
   ```bash
      #!/bin/bash

      # This script automates the creation of an OCI Resource Manager (ORM) stack.
      # It uses a pre-existing ZIP file containing the Terraform configuration for deployment.

      # Exit immediately if a command exits with a non-zero status.
      set -e

      # Load environment variables (e.g., COMPARTMENT_ID, STACK_NAME, STACK_DESC)
      source "./env-vars"

      # Create the Oracle Resource Manager stack and capture the CLI output.
      # Assumes the `oci` and `jq` executables are on your PATH.
      stack_output=$(oci resource-manager stack create \
        --compartment-id "$COMPARTMENT_ID" --display-name "$STACK_NAME" \
        --description "$STACK_DESC" --config-source "$CONFIG_SOURCE")

      # Extract the OCID of the newly created stack and display it
      # (jq prints the literal string "null" when the field is missing)
      STACK_OCID=$(echo "$stack_output" | jq -r '.data.id')

      if [ -z "$STACK_OCID" ] || [ "$STACK_OCID" = "null" ]; then
        echo "Error: Failed to retrieve Stack OCID from output." >&2
        exit 1
      fi

      echo "Stack OCID: $STACK_OCID"

      # Add the Stack OCID to the environment file
      echo "" >> "./env-vars" # Start on a new line
      echo "export STACK_OCID=\"$STACK_OCID\"" >> "./env-vars"
   ```

3. **Create Plan Job**: `create_oke_stack_plan_job.sh`
   ```bash
      #!/bin/bash

      # Exit immediately if a command exits with a non-zero status.
      set -e

      # Load environment variables (e.g., COMPARTMENT_ID, STACK_NAME, etc.)
      source "./env-vars"

      # Create a plan job for the specified stack.
      # Assumes the `oci` and `jq` executables are on your PATH.
      plan_job_output=$(oci resource-manager job create-plan-job \
        --stack-id "$STACK_OCID")

      # Extract the OCID of the plan job and check for errors
      # (jq prints the literal string "null" when the field is missing)
      PLAN_JOB_OCID=$(echo "$plan_job_output" | jq -r '.data.id')

      if [[ -z "$PLAN_JOB_OCID" || "$PLAN_JOB_OCID" == "null" ]]; then
        echo "Error: Failed to retrieve plan job OCID." >&2
        exit 1
      fi

      echo "Plan job OCID: $PLAN_JOB_OCID"

      # Add the Plan Job OCID to the environment file
      echo "" >> "./env-vars" # Start on a new line
      echo "export PLAN_JOB_OCID=\"$PLAN_JOB_OCID\"" >> "./env-vars"
   ```

4. **Create Apply Job**: `create_oke_stack_apply_job.sh`  
   ```bash
      #!/bin/bash

      # Exit immediately if a command exits with a non-zero status.
      set -e

      # Load environment variables (e.g., STACK_OCID, EXEC_PLAN_STRATEGY)
      source "./env-vars"

      # Create an apply job for the specified stack.
      # Assumes the `oci` and `jq` executables are on your PATH.
      apply_job_output=$(oci resource-manager job create-apply-job \
        --stack-id "$STACK_OCID" \
        --execution-plan-strategy "$EXEC_PLAN_STRATEGY")

      # Extract the OCID of the apply job and check for errors
      # (jq prints the literal string "null" when the field is missing)
      APPLY_JOB_OCID=$(echo "$apply_job_output" | jq -r '.data.id')

      if [[ -z "$APPLY_JOB_OCID" || "$APPLY_JOB_OCID" == "null" ]]; then
        echo "Error: Failed to retrieve apply job OCID." >&2
        exit 1
      fi

      echo "Apply job OCID: $APPLY_JOB_OCID"

      # Add the Apply Job OCID to the environment file
      echo "" >> "./env-vars" # Start on a new line
      echo "export APPLY_JOB_OCID=\"$APPLY_JOB_OCID\"" >> "./env-vars"
   ```

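5. **Run All Scripts in One Command**: `run_all.sh`

Alternatively, use the `run_all.sh` script to run the steps in sequence. The plan step is commented out in this script; the apply job uses the strategy set in `EXEC_PLAN_STRATEGY`:  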
```bash
   #!/bin/bash
   set -e

   # The scripts use bash features (`source`, `[[ ]]`), so invoke them with bash
   echo "Creating Source Code Zip ......"
   bash create_new_oke_stack_source.sh

   echo "Creating New ORM Stack ......"
   bash create_new_oke_stack.sh

   #echo "Creating Stack Plan Job  ......"
   #bash create_oke_stack_plan_job.sh

   echo "Creating Stack Apply Job  ......"
   bash create_oke_stack_apply_job.sh
```

6. **Cleanup - Destroy All Resources**: `create_oke_stack_destroy_job.sh`

To destroy the resources, run:  
```bash
   #!/bin/bash

   set -e

   # Load environment variables (e.g., STACK_OCID, EXEC_PLAN_STRATEGY)
   source "./env-vars"

   # Create a destroy job for the specified stack.
   # Assumes the `oci` and `jq` executables are on your PATH.
   destroy_job_output=$(oci resource-manager job create-destroy-job \
     --stack-id "$STACK_OCID" \
     --execution-plan-strategy "$EXEC_PLAN_STRATEGY")

   # Extract the OCID of the destroy job and check for errors
   # (jq prints the literal string "null" when the field is missing)
   DESTROY_JOB_OCID=$(echo "$destroy_job_output" | jq -r '.data.id')

   if [[ -z "$DESTROY_JOB_OCID" || "$DESTROY_JOB_OCID" == "null" ]]; then
     echo "Error: Failed to retrieve destroy job OCID." >&2
     exit 1
   fi

   echo "Destroy job OCID: $DESTROY_JOB_OCID"

   # Export OCID for subsequent scripts
   echo "" >> "./env-vars"
   echo "export DESTROY_JOB_OCID=\"$DESTROY_JOB_OCID\"" >> "./env-vars"
```
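
The job scripts above only record each job's OCID; ORM jobs run asynchronously, so a returned OCID does not mean the job has finished. The following is a minimal sketch of polling a job until it reaches a final state. It assumes `oci` is on your `PATH`; the function names `is_terminal_state` and `wait_for_job` are hypothetical and not part of the provided scripts:

```bash
#!/bin/bash
# Return success when the given ORM job lifecycle state is final.
is_terminal_state() {
  case "$1" in
    SUCCEEDED|FAILED|CANCELED) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll a job until it reaches a final state, then print that state.
# Arguments: job OCID, optional polling interval in seconds (default 30).
wait_for_job() {
  local job_id="$1"
  local interval="${2:-30}"
  local state
  while true; do
    # --query takes a JMESPath expression; --raw-output strips the JSON quotes
    state=$(oci resource-manager job get --job-id "$job_id" \
      --query 'data."lifecycle-state"' --raw-output)
    if is_terminal_state "$state"; then
      echo "$state"
      return 0
    fi
    sleep "$interval"
  done
}
```

For example, after sourcing `./env-vars`, `wait_for_job "$APPLY_JOB_OCID"` blocks until the apply job ends and prints its final state, which you can check before moving on.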

#### Prerequisites and Script Execution

Before running the scripts, please follow these steps:

1. **Position the Source Code Properly:**

    - The source directory for the stack must be correctly referenced in `create_new_oke_stack_source.sh` via the variable `src_dir="./../oke_app_src"`.
    - Make sure your stack’s source code is present at the correct relative path (e.g., `./../oke_app_src`). If not, update `src_dir` in the script or move your source code accordingly.


2. **Environment Variables Setup**

All scripts in this repository read their configuration from the `env-vars` file in the `scripts` directory (each script runs `source "./env-vars"`) and append the OCIDs they create to it. Before running any script, make sure that file exports the following variables:

```sh
export COMPARTMENT_ID="<your-compartment-ocid>"   # Compartment where the OCI Stack will be created
export STACK_NAME="ORM-OKE-Module-Test"           # Prefix name for the ORM Stack (add a timestamp if needed)
export STACK_DESC="This is a test ORM Stack created using OCI CLI"  # Stack description (read by create_new_oke_stack.sh as $STACK_DESC)
# Location of the ZIP file to submit to ORM; $HOME is used because "~" is not expanded inside quotes
export CONFIG_SOURCE="$HOME/oke_advanced_module_orm/stackconfig.zip"
export EXEC_PLAN_STRATEGY="AUTO_APPROVED"
```
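
Since a missing variable only surfaces as a CLI error partway through a script, it may be worth checking the environment up front. This is a minimal sketch under that assumption; `missing_vars` is a hypothetical helper, not one of the provided scripts:

```bash
#!/bin/bash
# Print the name of each listed variable that is unset or empty.
# Returns non-zero if at least one variable is missing.
missing_vars() {
  local name
  local value
  local status=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}
```

For example, after `source ./env-vars`, running `missing_vars COMPARTMENT_ID STACK_NAME CONFIG_SOURCE EXEC_PLAN_STRATEGY` prints any names that still need a value.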

#### Script Execution Order:
Run the scripts in the following order for full stack lifecycle management:

```sh
# First (the scripts use bash features such as `source`, so run them with bash)
bash create_new_oke_stack_source.sh        # "Creating Source Code Zip ......"
```

```sh
# Second
bash create_new_oke_stack.sh               # "Creating New ORM Stack ......"
```

```sh
# Third
bash create_oke_stack_plan_job.sh          # "Creating Stack Plan Job  ......"
```

```sh
# Fourth
bash create_oke_stack_apply_job.sh         # "Creating Stack Apply Job  ......"
```

```sh
# Fifth - last, after your test is done
bash create_oke_stack_destroy_job.sh       # "Creating Stack Destroy Job  ......"
```

To run the source, stack, and apply steps in one go, you can execute `run_all.sh`. The plan step is commented out in that script, and the destroy step is never included:

```sh
bash run_all.sh
```

## Variable Reference

The main `variables.tf` file contains all the top-level variables used to configure the entire project. For detailed information on module-specific variables, please refer to the `variables.tf` file inside each module directory.

## Main
The root `main.tf` orchestrates the deployment of three key modules (`modules/vcn`, `modules/oke`, and `modules/bastion`) in sequence. Each module is invoked conditionally based on configuration flags (`is_vcn_created`, `is_oke_created`, `is_bastion_created`), which gives you control over what is deployed. Use the provided `main.tf` sample and customize it to meet your OKE deployment requirements. Module versioning is not implemented yet; it is planned for the next update.

### **Alternative 2: Using ORM Console for Testing**
You can also test the module directly from the ORM console:  
1. Navigate to the ORM service in the OCI Console.  
2. Create a new stack using the subdirectory `~/oke_advanced_module_orm/oke_app_src` as the source.  
3. Run a **Plan Job** to verify the configuration.  
4. If satisfied, run an **Apply Job** to deploy the resources.  
5. Use a **Destroy Job** to clean up when done.  

## Outputs
After successful deployment, ORM outputs key information:
- **VCN ID:** OCID of the created VCN.
- **OKE Cluster ID:** OCID of the OKE cluster.
- **Bastion Public IPs:** Map of bastion hostnames to public IPs.
- **Worker Node Pool Image:** Automatically computed optimal OKE image.

## Notes
- ORM securely manages the Terraform state file within OCI.
- Ideal for collaborative, enterprise environments with governance needs.

For detailed steps and scripts, refer to the `scripts` directory.

## 🛠️ Release Notes / Known Issues

**1. Corrupted Map Key/Value Pairs:** When you edit the `Worker_Node_Pool Map` in the OCI Resource Manager (ORM) Console, the console may replace the colon (`:`) inside map keys (e.g., `"AQob:US-ASHBURN-AD-1"`) with an equals sign (`=`). To work around this, always double-check the JSON syntax of the `Worker_Node_Pool Map` value in the ORM console and correct it manually. Ensure all key/value pairs are accurate.
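
If you copy the map value out of the console into a file, a pattern search can flag the corrupted names. This is a rough sketch only: it assumes availability-domain names follow the shape of the single example above (`prefix:REGION-AD-n`), and `find_corrupted_ad_names` is a hypothetical helper name:

```bash
#!/bin/bash
# Read text on stdin and print every availability-domain name whose ":"
# was replaced by "=" (for example AQob=US-ASHBURN-AD-1).
# Prints nothing when no corrupted name is found.
find_corrupted_ad_names() {
  grep -E -o '[A-Za-z0-9]+=[A-Z0-9-]+-AD-[0-9]+' || true
}
```

For example, `find_corrupted_ad_names < worker_node_pools.json` (a hypothetical file name) lists each name that needs its `=` changed back to `:`.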

**2. Unrecognized Field Changes (e.g., Tags):** The ORM Console sometimes fails to properly register changes for certain fields, such as tags or labels. As a result, the ORM plan may use the old, existing value instead of the new one. After editing any values, especially tags or labels, review the ORM stack variables screen to confirm the change has been recognized. Verify the console field's value matches the desired syntax before executing the plan.

### 🚨 Important Note

These issues are specific to the ORM Console's display/read layer and do not affect the integrity of the underlying Terraform configuration files. Always rely on and scrutinize the ORM Plan Output to ensure the infrastructure changes reflect your intentions before applying.



## Release Notes - 10/07/2025

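With the Flannel overlay CNI, pod IP addresses are allocated by the overlay network, so the cluster does not need to request IP addresses from the VCN.  
This is **not** the case with the OCI VCN-Native Pod Networking CNI, where pods receive private IP addresses from VCN subnets. OKE therefore requires an IAM policy that lets Kubernetes clusters request these IPs.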

### Required IAM Policy

Here is a sample policy that is required:

`Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }`

**Note:** This statement is sufficient only if all resources are deployed in the same compartment (container and network resources are co-located).

If the resources are spread across compartments, include policy statements similar to the following:

- `Allow any-user to manage instances in tenancy where all { request.principal.type = 'cluster' }`
- `Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }`
- `Allow any-user to use network-security-groups in tenancy where all { request.principal.type = 'cluster' }`

For more details, see the [OCI documentation](https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengpodnetworking_topic-OCI_CNI_plugin.htm).
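
The statements can also be created from the command line instead of the console. The following is a minimal sketch, assuming `oci` is on your `PATH` and that you have permission to create policies in the root compartment; the function names and the default policy name `oke-vcn-native-cni-policy` are hypothetical:

```bash
#!/bin/bash
# Print the three cluster policy statements as a JSON array of strings.
cluster_policy_statements() {
  local suffix="in tenancy where all { request.principal.type = 'cluster' }"
  local action
  local first=1
  printf '['
  for action in "manage instances" "use private-ips" "use network-security-groups"; do
    if [ "$first" -eq 0 ]; then
      printf ','
    fi
    first=0
    printf '"Allow any-user to %s %s"' "$action" "$suffix"
  done
  printf ']\n'
}

# Create the policy in the root compartment, because the statements are
# scoped to the whole tenancy.
# Arguments: tenancy OCID, optional policy name (hypothetical default).
create_cluster_policy() {
  local tenancy_ocid="$1"
  local policy_name="${2:-oke-vcn-native-cni-policy}"
  oci iam policy create \
    --compartment-id "$tenancy_ocid" \
    --name "$policy_name" \
    --description "Allow OKE clusters to manage pod networking resources" \
    --statements "$(cluster_policy_statements)"
}
```

Run `cluster_policy_statements` on its own first to review the JSON, then call `create_cluster_policy "<your-tenancy-ocid>"` once the statements match what your security team expects.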


### Error Handling 

If the appropriate policies are not set, node pool creation may fail with an error such as:

    Error: Work Request error
    │ Service: Containerengine Node Pool
    │ Error Message: work request did not succeed, workId: ocid1.clustersworkrequest.oc1.iad.xxxx, entity: nodepool, action: CREATED. Message: 1 nodes(s) pod network configuration timeout.
    │ Resource OCID: ocid1.nodepool.oc1.iad.axxxxxxxq

The Work Request log shows the underlying 404 response:

    Pod network configuration timeout reason: ListPrivateIPsFailed, Message: (combined from similar events): Error returned by VirtualNetwork Service. Http Status Code: 404. Error Code: NotAuthorizedOrNotFound. Opc request id: 9e1fc9ea8b36f300058126d79eb0fb36/A057EDC74575C730B67D97455C3E8311/78E1B03ACB3429FFBB3E42D36CDD7A4A. Message: Authorization failed or requested resource not found.. Operation Name: ListPrivateIps Timestamp: 2025-09-29 23:41:28 +0000 GMT Client Version: Oracle-GoSDK/65.96.0 Request Endpoint: GET https://iaas.us-ashburn-1.oraclecloud.com/20160918/privateIps?vnicId=ocid1.vnic.oc1.iad.x

## Troubleshooting Tips

- See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_404__404_notauthorizedornotfound for more information about resolving this error.
- Also see https://docs.oracle.com/iaas/api/#/en/iaas/20160918/PrivateIp/ListPrivateIps for details on this operation's requirements.
- To get more information on the failing request, set the `OCI_GO_SDK_DEBUG` environment variable to `info` or higher to log request/response details.
- If you are unable to resolve the VirtualNetwork issue, contact Oracle Support and provide the full error message.
