**OKE Advanced Terraform Root Module: File: ~/oke_advanced_module/README.md**   
**Authors: Mahamat H. Guiagoussou, Payal Sharma, and Matthew McDaniel**  
**Copyright (c) 2025 Oracle**

# README.md

# Deploying an OCI OKE Cluster with a Bastion Host

This project provisions an Oracle Cloud Infrastructure (OCI) environment for a Kubernetes cluster, following a proven, modular approach. The architecture is designed with a clear separation of concerns, giving the user the option of placing networking and container resources either in different compartments or in the same one.

## Architecture Overview

The root module orchestrates the creation of three main components:

1. **VCN Module (`modules/vcn`):** Provisions a Virtual Cloud Network (VCN) with subnets, gateways, and security resources. This module also creates a separate compartment for networking and containers.
2. **OKE Module (`modules/oke`):** Deploys an Oracle Kubernetes Engine (OKE) cluster, including the control plane and a worker node pool, into the designated container compartment. It leverages the subnets created by the VCN module.
3. **Bastion Module (`modules/bastion`):** Deploys a bastion host instance into a public subnet to provide secure access to private resources.

Each module is self-contained and reusable, with its own variables and outputs.

## Prerequisites

* **OCI Account:** You must have an active Oracle Cloud Infrastructure account.
* **OCI CLI:** The OCI CLI must be installed and configured with the necessary credentials.
* **Terraform:** You need Terraform CLI installed on your local machine (version `~> 1.0`).
* **IAM Policies:** Review the [OKE policy configuration documentation](https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengpolicyconfig.htm) to make sure the required policies and permissions are properly set.
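
The Terraform version requirement above can be pinned in the configuration itself. The following is a minimal sketch, not the repository's actual file: the `region` variable and the `DEFAULT` profile name are assumptions, and the provider is assumed to read credentials from the OCI CLI configuration file.

```hcl
terraform {
  # "~> 1.0" accepts any 1.x release and rejects 2.0 and later.
  required_version = "~> 1.0"

  required_providers {
    oci = {
      source = "oracle/oci"
    }
  }
}

# Hypothetical variable; the repository may name or source the region differently.
variable "region" {
  description = "OCI region identifier, for example us-ashburn-1."
  type        = string
}

provider "oci" {
  region              = var.region
  config_file_profile = "DEFAULT" # assumed profile in the OCI CLI config file
}
```

With a block like this in place, `terraform init` fails early on an unsupported Terraform version instead of partway through a plan.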


## Setup and Usage

1. **Clone this repository or copy the files into a new directory.**
Make sure to rename `terraform_sample.tfvars` to `terraform.tfvars` or `input.auto.tfvars` so that Terraform loads it automatically.

2. **Configure `terraform.tfvars`:** Update the `terraform.tfvars` file in the root directory by populating it with your own values. Refer to the `variables.tf` file for a complete list of variables and their descriptions.

    **Example `terraform.tfvars`:**
    ```hcl
    # Network compartment if OKE and Network resources are in different compartments.
    networking_compartment_id = "ocid1.compartment.oc1..xxxx_networking_compartment"

    # Default compartment for all OKE deployment resources in the same compartment
    container_compartment_id = "ocid1.compartment.oc1..xxxx_container_compartment"

    # Set this flag to false if you want to keep both network and container in the 
    # same compartment (default is compartment_id)
    is_networking_compartment_separate = true # use separate compartments
    
    display_name_prefix = "my-k8s-project" # make sure the name is not too long
    host_name_prefix = "avcn"  # choose a name that complies with hostname size limit
    
    # VCN and subnet configuration
    vcn_cidr_block = "10.0.0.0/16"
    # ...other VCN variables...
    
    # OKE configuration
    is_oke_created = true
    is_node_pool_created = true
    # ...other OKE variables...

    # Specify the OKE Configuration
    worker_nodes_kubernetes_version = "v1.32.1"
    control_plane_kubernetes_version = "v1.32.1"
    cni_type = "FLANNEL_OVERLAY"       # ALTERNATIVE "OCI_VCN_IP_NATIVE"
    cluster_type = "ENHANCED_CLUSTER"  # ALTERNATIVE "BASIC"  
    # Note: If cni_type is "OCI_VCN_IP_NATIVE", the flag is_cni_type_native is turned on.
    # A tfvars file accepts only literal values, so the expression is shown here for reference only:
    #   is_cni_type_native = (var.cni_type == "OCI_VCN_IP_NATIVE")
    # Make sure to set required IAM policies to allow K8s clusters to request OCI IPs (see release notes at the end)  
    
    # Update the worker_node_pools map fields
    shape = "VM.Standard.E5.Flex"
    shape_config = {
      memory = 16
      ocpus = 1
    }
    operating_system = "Oracle-Linux"
    kubernetes_version = "v1.32.1"
    availability_domains = ["AD-1", "AD-2", "AD-3"]
    number_of_nodes = 1
    # ...other worker node pool variables...

    # Bastion configuration
    is_bastion_created = true
    
    # Replace with your own SSH KEYS 
    ssh_public_key_path = "oke_node_key.pub"
    ssh_private_key_path = "oke_node_key"

    # Update bastion host parameters map (bastion_map) fields:
    shape = "VM.Standard.E5.Flex"
    shape_config = {
      memory = 12
      ocpus = 1
    }
    version = "ol8_1_25_4"
    # ...other Bastion variables...

    # Make sure to populate the bastion host image map
    linux_images = {
        us-ashburn-1 = {
            ol8_1_25_4 = "ocid1.image.oc1.iad.xxxxxxx_image1"
            ol8_1_24_1 = "ocid1.image.oc1.iad.xxxxxxx_image2"
        }
        us-phoenix-1 = {
            ol8_1_25_4 = "ocid1.image.oc1.phx.xxxxxxx_image1"
            ol8_1_24_1 = "ocid1.image.oc1.phx.xxxxxxx_image2"
        }
    }

    # ...other relevant variables...
    ```

3. **Initialize Terraform:**
    ```sh
    terraform init
    ```

4. **Review the plan:**
    ```sh
    terraform plan
    ```

5. **Apply the changes:**
    ```sh
    terraform apply
    ```

## main.tf
The root `main.tf` orchestrates the deployment of three key modules (`modules/vcn`, `modules/oke`, and `modules/bastion`) in sequence. Each module is conditionally invoked based on a configuration flag (`is_vcn_created`, `is_oke_created`, `is_bastion_created`), which provides flexibility in deployment. Use the provided `main.tf` sample and customize it to meet your specific OKE deployment requirements. Module versioning is not implemented in this release; it is planned for the next update.
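
One common way to make module calls conditional is the `count` meta-argument. The sketch below illustrates that pattern under stated assumptions and is not the repository's actual `main.tf`: the module input names (`compartment_id`, `vcn_id`) and the `vcn_id` module output are hypothetical, and the variables are assumed to be declared in `variables.tf`.

```hcl
module "vcn" {
  source = "./modules/vcn"
  count  = var.is_vcn_created ? 1 : 0 # zero instances when the flag is false

  # Hypothetical input names; check modules/vcn/variables.tf for the real ones.
  compartment_id = var.is_networking_compartment_separate ? var.networking_compartment_id : var.container_compartment_id
  vcn_cidr_block = var.vcn_cidr_block
}

module "oke" {
  source = "./modules/oke"
  count  = var.is_oke_created ? 1 : 0

  compartment_id = var.container_compartment_id
  # one() returns the single element of the splat, or null if the VCN module is disabled.
  vcn_id = one(module.vcn[*].vcn_id)
}

module "bastion" {
  source = "./modules/bastion"
  count  = var.is_bastion_created ? 1 : 0

  compartment_id = var.container_compartment_id
  vcn_id         = one(module.vcn[*].vcn_id)
}
```

Because `module.oke` and `module.bastion` reference an output of `module.vcn`, Terraform infers the ordering on its own, so no explicit `depends_on` is needed in this sketch.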

## Variable Reference

The main `variables.tf` file contains all the top-level variables used to configure the entire project. For detailed information on module-specific variables, please refer to the `variables.tf` file inside each module directory.
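
As an illustration of what such declarations can look like, here is a hedged sketch of a few top-level variables. The types and the validation rule are assumptions inferred from the example `terraform.tfvars`; `region` and `bastion_image_version` are hypothetical names introduced only for this example.

```hcl
variable "cni_type" {
  description = "Pod networking CNI: FLANNEL_OVERLAY or OCI_VCN_IP_NATIVE."
  type        = string

  validation {
    condition     = contains(["FLANNEL_OVERLAY", "OCI_VCN_IP_NATIVE"], var.cni_type)
    error_message = "The cni_type value must be FLANNEL_OVERLAY or OCI_VCN_IP_NATIVE."
  }
}

variable "linux_images" {
  description = "Bastion image OCIDs keyed by region, then by image version."
  type        = map(map(string))
}

# Hypothetical helper variables for the lookup below.
variable "region" {
  type = string
}

variable "bastion_image_version" {
  type = string # for example "ol8_1_25_4"
}

locals {
  # Derived flag: true only for the VCN-native CNI.
  is_cni_type_native = var.cni_type == "OCI_VCN_IP_NATIVE"

  # Two-level lookup into the image map: region first, then version.
  bastion_image_id = var.linux_images[var.region][var.bastion_image_version]
}
```

Deriving `is_cni_type_native` in a `locals` block keeps it consistent with `cni_type` without asking users to set both values.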

## Outputs

After a successful `terraform apply`, the root module will output key information about the deployed infrastructure:

* **`vcn_id`**: The OCID of the newly created VCN.
* **`oke_cluster_id`**: The OCID of the OKE cluster.
* **`bastion_public_ips`**: A map of bastion hostnames to their public IP addresses.
* **`worker_node_pools_image`**: The OCID of the worker node pool image. The module automatically selects the OKE image that matches the Kubernetes version and Oracle Linux.
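
For orientation, root-level outputs of this kind are typically thin wrappers around module outputs. The sketch below assumes the modules are declared with `count` and expose outputs named `vcn_id`, `cluster_id`, and `public_ips`; those module output names are hypothetical.

```hcl
output "vcn_id" {
  description = "The OCID of the newly created VCN."
  value       = one(module.vcn[*].vcn_id) # null when the VCN module is disabled
}

output "oke_cluster_id" {
  description = "The OCID of the OKE cluster."
  value       = one(module.oke[*].cluster_id)
}

output "bastion_public_ips" {
  description = "A map of bastion hostnames to their public IP addresses."
  value       = one(module.bastion[*].public_ips)
}
```

After `terraform apply`, a single value can be read with `terraform output vcn_id`.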


## Release Notes - 10/07/2025

With `FLANNEL_OVERLAY`, pod IPs are assigned by the Flannel overlay network, so the cluster does not need to request OCI IPs.  
This is **not** the case for the OCI VCN-native CNI (`OCI_VCN_IP_NATIVE`). In that mode, OKE requires an IAM policy so that Kubernetes clusters can request these IPs.

### Required IAM Policy

The following sample policy statement is required:

`Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }`

**Note:** This works only if all resources are deployed in the same compartment (container and networks are co-located).

If not, you need to add additional policies; see the [OCI documentation](https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengpodnetworking_topic-OCI_CNI_plugin.htm).

In cases where not all resources are in the same compartment, include policy statements similar to the following (in addition to the one above):

* `Allow any-user to manage instances in tenancy where all { request.principal.type = 'cluster' }`
* `Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }`
* `Allow any-user to use network-security-groups in tenancy where all { request.principal.type = 'cluster' }`

For more details, check the [OCI documentation](https://docs.oracle.com/en-us/iaas/Content/ContEng/Concepts/contengpodnetworking_topic-OCI_CNI_plugin.htm).
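
These statements can also be managed in Terraform with the `oci_identity_policy` resource. The following is a sketch under assumptions, not part of this module: `tenancy_ocid` is a hypothetical variable, the policy name is illustrative, and the policy is assumed to be attached to the root compartment because its statements are scoped `in tenancy`.

```hcl
locals {
  # Needed whenever cni_type is "OCI_VCN_IP_NATIVE".
  native_cni_base_statements = [
    "Allow any-user to use private-ips in tenancy where all { request.principal.type = 'cluster' }",
  ]

  # Added only when networking and container resources use different compartments.
  native_cni_cross_compartment_statements = [
    "Allow any-user to manage instances in tenancy where all { request.principal.type = 'cluster' }",
    "Allow any-user to use network-security-groups in tenancy where all { request.principal.type = 'cluster' }",
  ]
}

resource "oci_identity_policy" "oke_native_cni" {
  # Flannel does not request OCI IPs, so the policy is skipped in that mode.
  count = var.cni_type == "OCI_VCN_IP_NATIVE" ? 1 : 0

  compartment_id = var.tenancy_ocid # hypothetical variable holding the tenancy (root compartment) OCID
  name           = "${var.display_name_prefix}-oke-native-cni"
  description    = "Allows OKE clusters to request OCI private IPs for VCN-native pod networking."

  statements = concat(
    local.native_cni_base_statements,
    var.is_networking_compartment_separate ? local.native_cni_cross_compartment_statements : []
  )
}
```

Creating the policy requires permission to manage policies in the tenancy, so in many organizations an administrator applies it separately from the cluster deployment.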


### Error Handling 

If the appropriate policies are not set, you may encounter a 404 error such as:

    Error: Work Request error
    │ Service: Containerengine Node Pool
    │ Error Message: work request did not succeed, workId: ocid1.clustersworkrequest.oc1.iad.xxxx, entity: nodepool, action: CREATED. Message: 1 nodes(s) pod network configuration timeout.
    │ Resource OCID: ocid1.nodepool.oc1.iad.axxxxxxxq

The work request log provides a similar message:

Pod network configuration timeout reason: ListPrivateIPsFailed, Message: (combined from similar events): Error returned by VirtualNetwork Service. Http Status Code: 404. Error Code: NotAuthorizedOrNotFound. Opc request id: 9e1fc9ea8b36f300058126d79eb0fb36/A057EDC74575C730B67D97455C3E8311/78E1B03ACB3429FFBB3E42D36CDD7A4A. Message: Authorization failed or requested resource not found.. Operation Name: ListPrivateIps Timestamp: 2025-09-29 23:41:28 +0000 GMT Client Version: Oracle-GoSDK/65.96.0 Request Endpoint: GET https://iaas.us-ashburn-1.oraclecloud.com/20160918/privateIps?vnicId=ocid1.vnic.oc1.iad.x  

## Troubleshooting Tips 

- See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_404__404_notauthorizedornotfound for more information about resolving this error.
- Also see https://docs.oracle.com/iaas/api/#/en/iaas/20160918/PrivateIp/ListPrivateIps for details on this operation’s requirements.
- To get more information on the failing request, set the `OCI_GO_SDK_DEBUG` environment variable to `info` or higher to log request and response details.
- If you are unable to resolve the VirtualNetwork issue, please contact Oracle support and provide the full error message.


