Deploy the Blueprint

Make sure that you have access to an OCI tenancy with administrator privileges and sufficient service limits to provision the required CPU/GPU instances.

  1. Create an OKE cluster using the Terraform module below. The module creates an OKE cluster with a single node pool of six worker nodes.
    1. Click Deploy to Oracle Cloud below.

      Deploy to Oracle Cloud

      If you aren't already signed in, enter the tenancy and user credentials.

    2. Give your stack a name (for example, oke-stack).
    3. Select the compartment where you want OCI AI Blueprints deployed.
    4. Provide any additional parameters (such as node size, node count) according to your preferences.
    5. Click Next, then Create, and finally click Run apply to provision your cluster.
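
    Optionally, verify the cluster from a terminal before continuing. The following is a minimal sketch, assuming the OCI CLI and kubectl are installed and that you substitute the cluster OCID and region shown in the console for your new cluster:

      # Generate a kubeconfig entry for the new cluster (placeholder OCID/region).
      oci ce cluster create-kubeconfig \
        --cluster-id ocid1.cluster.oc1..<your-cluster-ocid> \
        --file ~/.kube/config \
        --region <your-region> \
        --token-version 2.0.0 \
        --kube-endpoint PUBLIC_ENDPOINT

      # All six worker nodes should report a Ready status.
      kubectl get nodes
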
  2. Sign in to the Oracle Cloud Infrastructure console. From the home page, click Storage, and then under Object Storage, click Buckets. Create a bucket with a name that you will use in the OCI AI Blueprints deployment.

    Click the bucket that you created, and under Resources, click Pre-Authenticated Requests. Click Create Pre-Authenticated Request, give the PAR a name, change the Access type to Permit object reads and writes, and click Create Pre-Authenticated Request. When it is created, copy the PAR URL and save it, as you will need it when you deploy the CPU Inference blueprint.
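
    If you prefer the command line, the bucket and PAR can also be created with the OCI CLI. This is a sketch with placeholder names, assuming you substitute your own compartment OCID; the access-uri field in the response must be prefixed with your region's Object Storage endpoint to form the full PAR URL:

      # Create the bucket (placeholder name).
      oci os bucket create \
        --name ai-blueprints-bucket \
        --compartment-id ocid1.compartment.oc1..<your-compartment-ocid>

      # Create a bucket-level PAR permitting object reads and writes.
      oci os preauth-request create \
        --bucket-name ai-blueprints-bucket \
        --name ai-blueprints-par \
        --access-type AnyObjectReadWrite \
        --time-expires "2026-12-31T00:00:00Z"

      # Optional: verify the PAR by writing a test object through it.
      curl -X PUT --data-binary 'hello' "<full-PAR-URL>test.txt"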

  3. Install OCI AI Blueprints in the new OKE cluster. After successful deployment, you will have a URL to access the OCI AI Blueprints console.

    Deploy to Oracle Cloud
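
    Optionally, confirm from a terminal that the installation is progressing. A minimal check, assuming your kubeconfig still points at the new cluster (the namespaces and pod names used by the installer may vary):

      # Newly created OCI AI Blueprints pods should move to a Running status.
      kubectl get pods --all-namespaces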

  4. On the OCI AI Blueprints main page, several blueprints are available for deployment under Blueprint Binary. Scroll down and click Deploy under CPU Inference.
    1. Select CPU Inference with mistral and the VM.Standard.E4.Flex option.

      Use the default parameters, but replace the Pre-authenticated request URL with the PAR URL that you created above. Do not change any values in the Configure Parameters section.

      Please note the parameters:

      "recipe_container_env": [
      {
      
      "key": "MODEL_NAME",
      
      "value": "mistral"
      
      },
      
      {
      
      "key": "PROMPT",
      
      "value": "What is the capital of Spain?"
      
      }
      

      After the CPU Inference blueprint deploys successfully, the model's answer to the above prompt is displayed in the Kubernetes pod log.
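
      To retrieve that log from a terminal instead of the console, a sketch such as the following works, assuming your kubeconfig points at the cluster and that the inference pod's generated name contains "mistral" (the actual name is assigned at deployment):

        # Find the inference pod, then tail its log for the prompt output.
        kubectl get pods --all-namespaces | grep -i mistral
        kubectl logs -n <namespace> <pod-name> --tail=50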

    2. Click Deploy Blueprint. When the blueprint is successfully deployed, a deployment with a name such as cpu inference mistral E4Flex appears, showing a deployment status of Monitoring along with the creation date, the number of nodes, and the E4 shape.
    3. In the deployment list, click the cpu inference mistral E4Flex link to open the deployment details, which include the public endpoint. Click the public endpoint and you will see a message such as "ollama is running".
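
      You can also exercise the endpoint from a terminal. A sketch, assuming the blueprint exposes the standard Ollama HTTP API at the public endpoint (replace the placeholder URL with your own):

        # The root path returns "Ollama is running" when the server is healthy.
        curl https://<public-endpoint>/

        # Send the same prompt directly to the model.
        curl https://<public-endpoint>/api/generate \
          -d '{"model": "mistral", "prompt": "What is the capital of Spain?", "stream": false}'
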
    4. Return to the cpu inference mistral E4Flex page and scroll down to Pod Logs. Click View to display the log details. A page similar to the following is displayed, showing the output of the prompt.

      Image: an example of the blueprint pod log output.