Using NVIDIA GPU Cloud with Oracle Cloud Infrastructure
NVIDIA GPU Cloud (NGC) is a GPU-accelerated cloud platform optimized for deep learning and scientific computing. This topic provides an overview of how to use NGC with Oracle Cloud Infrastructure.
NVIDIA provides a customized Compute image on Oracle Cloud Infrastructure that is optimized for NVIDIA Tesla Volta and Pascal GPUs. Running NGC containers on an instance launched from this image is intended to give the best performance for deep learning jobs.
Before You Begin
Make sure that you have the following:
- An Oracle Cloud Infrastructure tenancy with a GPU quota. For more information about quotas, see Compute Quotas.
- A cloud network to launch the instance in. For information about setting up cloud networks, see Managing VCNs and Subnets in VCNs and Subnets.
- A key pair to use for connecting to the instance via SSH. For information about generating a key pair, see Managing Key Pairs on Linux Instances.
- A security group and policy configured for the File Storage service. For more information, see Managing Groups, Managing Identity Domains, and Details for the File Storage Service.
- An NGC API key for authenticating with the NGC service.
  To generate your NGC API key:
  - Sign in to the NGC website.
  - On the NGC Registry page, click Get API Key.
  - Click Generate API Key, and then click Confirm to generate the key. If you have an existing API key, it becomes invalid when you generate a new key.
Launching an Instance Based on the NGC Image
Using the Console
- Open the Console. For steps, see Signing In for the First Time.
- Open the navigation menu and select Compute. Under Compute, select Instances.
- Select a Compartment that you have permission to work in.
- Click Create instance.
- Enter a name for the instance. Avoid entering confidential information.
- In the Placement section, select the Availability Domain that you want to create the instance in.
- In the Image and shape section:
  - On the Shape card, click Change shape. Then do the following:
    - For Instance type, select Virtual machine or Bare metal machine.
    - Select a GPU shape for the instance. For more information about GPU shapes, see virtual machine GPU shapes and bare metal GPU shapes.
      Important: To access the GPU shapes, your tenancy must have a GPU quota. If your tenancy does not have a GPU quota, the GPU shapes do not appear in the shape list. See Before You Begin for more information.
    - Click Select shape.
  - To select the NGC image, on the Image card, click Change image. Then do the following:
    Important: To access the NVIDIA GPU Cloud images, your tenancy must have a GPU quota and you must select a GPU shape.
    - In the Image source list, select Oracle images.
    - Select the check box next to NVIDIA GPU Cloud Machine Image.
    - Review and accept the terms of use, and then click Select image.
- In the Networking section, leave Select existing virtual cloud network selected, and then select the virtual cloud network (VCN) compartment, VCN, subnet compartment, and subnet.
- In the Add SSH keys section, upload the public key portion of the key pair that you want to use for SSH access to the instance. Browse to the key file that you want to upload, or drag and drop the file into the box.
- Click Create.
You should now see the NGC instance with the state of Provisioning. After the state changes to Running, you can connect to the instance. For general information about launching compute instances, see Creating an Instance.
See the following topics for steps to access and work with the instance:
When you connect to the instance using SSH, you are prompted for the NGC API key. If you supply the API key at the prompt, the instance automatically logs you in to the NGC container registry so that you can run containers from the registry. You can choose not to supply the API key at the prompt and still log in to the instance. You can then log in later to the NGC container registry. See Logging in to the NGC Container Registry for more information.
Using the CLI
Oracle Cloud Infrastructure provides a Command Line Interface (CLI) you can use to complete tasks. For more information, see Quickstart and Configuring the CLI.
Use the launch command to create an instance, specifying image for sourceType and the image OCID ocid1.image.oc1..aaaaaaaaknl6phck7e3iuii4r4axpwhenw5qtnnsk3tqppajdjzb5nhoma3q in InstanceSourceDetails for LaunchInstanceDetails.
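As a rough sketch of the launch step, the following wraps `oci compute instance launch` in a shell function. The helper names `launch_ngc_instance` and `instance_state`, the display name `ngc-instance`, and every argument are placeholders chosen for illustration, not values from your tenancy. The `--source-details` JSON mirrors the sourceType and image OCID described above.

```shell
# Image OCID quoted in this topic for the NGC image.
NGC_IMAGE_OCID="ocid1.image.oc1..aaaaaaaaknl6phck7e3iuii4r4axpwhenw5qtnnsk3tqppajdjzb5nhoma3q"

# Hypothetical helper: launch a GPU instance from the NGC image.
# $1 compartment OCID   $2 availability domain name   $3 subnet OCID
# $4 GPU shape name     $5 path to the SSH public key file
launch_ngc_instance() {
  oci compute instance launch \
    --compartment-id "$1" \
    --availability-domain "$2" \
    --subnet-id "$3" \
    --shape "$4" \
    --display-name "ngc-instance" \
    --ssh-authorized-keys-file "$5" \
    --source-details "{\"sourceType\": \"image\", \"imageId\": \"$NGC_IMAGE_OCID\"}"
}

# Hypothetical helper: print the lifecycle state of an instance
# (for example PROVISIONING or RUNNING).
# $1 instance OCID
instance_state() {
  oci compute instance get \
    --instance-id "$1" \
    --query 'data."lifecycle-state"' \
    --raw-output
}
```

For example, `launch_ngc_instance "$COMPARTMENT_OCID" "$AVAILABILITY_DOMAIN" "$SUBNET_OCID" VM.GPU3.1 ~/.ssh/id_rsa.pub` would request a single-GPU virtual machine, assuming the three variables hold values from your tenancy and that shape is available under your GPU quota. Calling `instance_state` with the OCID returned by the launch lets you check when the instance reaches the Running state.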
Using the File Storage Service for Persistent Data Storage
You can use the File Storage service for data storage when working with NGC. For more information, see Overview of File Storage. See the following tasks for creating and working with the File Storage service:
Using the Block Volume Service for Persistent Data Storage
You can use the Block Volume service for data storage when working with NGC. For more information, see Overview of Block Volume. See the following tasks for creating and working with the Block Volume service:
You can also use the CLI to manage block volumes. See the volume commands.
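A minimal sketch of the CLI route, assuming you want a new volume attached to the NGC instance over iSCSI. The helper names, the display name `ngc-data`, and the choice of `iscsi` as the attachment type are illustrative assumptions; the OCIDs and size are values you supply.

```shell
# Hypothetical helper: create a block volume for training data.
# $1 compartment OCID
# $2 availability domain name (the same one the instance runs in)
# $3 volume size in GB
create_data_volume() {
  oci bv volume create \
    --compartment-id "$1" \
    --availability-domain "$2" \
    --size-in-gbs "$3" \
    --display-name "ngc-data"
}

# Hypothetical helper: attach an existing volume to an instance.
# $1 instance OCID   $2 volume OCID
attach_data_volume() {
  oci compute volume-attachment attach \
    --instance-id "$1" \
    --volume-id "$2" \
    --type iscsi
}
```

After an iSCSI attachment, the volume still has to be connected, formatted, and mounted from inside the instance before containers can use it.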
Using the Object Storage Service for Persistent Data Storage
You can use the Object Storage service for data storage when working with NGC. For more information, see Overview of Object Storage. See the following tasks for creating and working with the Object Storage service:
- Creating an Object Storage Bucket
- Ways to Access Object Storage
- Object Storage Objects
- Uploading an Object Storage Object to a Bucket
You can also use the CLI to manage Object Storage. See the os command.
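The bucket and upload tasks listed above could look roughly like this with the `oci os` commands. The helper names are hypothetical, and the bucket name, object name, and file paths are placeholders for your own data.

```shell
# Hypothetical helper: create a bucket.
# $1 compartment OCID   $2 bucket name
create_bucket() {
  oci os bucket create --compartment-id "$1" --name "$2"
}

# Hypothetical helper: upload a local file to a bucket.
# $1 bucket name   $2 path to the local file
upload_object() {
  oci os object put --bucket-name "$1" --file "$2"
}

# Hypothetical helper: download an object to a local file.
# $1 bucket name   $2 object name   $3 destination path
download_object() {
  oci os object get --bucket-name "$1" --name "$2" --file "$3"
}
```

A typical pattern would be to upload a dataset archive once with `upload_object`, then call `download_object` on the instance before starting a container.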
Examples of Running Containers
Logging in to the NGC Container Registry
Before you can run containers, you must log in to the NGC container registry. If you provided your API key when you connected to the instance via SSH, you are already logged in and can skip this step.
- Run the following Docker command:
  docker login nvcr.io
- When prompted for a username, enter $oauthtoken (type it literally, including the dollar sign).
- When prompted for a password, enter your NGC API key.
At this point you can run Docker commands and access the NGC container registry from the instance.
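For scripted setups, the same login can be done without prompts. The sketch below assumes the API key is stored in an environment variable named `NGC_API_KEY`; both that variable name and the helper name `ngc_login` are choices made here for illustration.

```shell
# Hypothetical helper: log in to nvcr.io without interactive prompts.
# Expects the NGC API key in the NGC_API_KEY environment variable.
ngc_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # The username is the literal string $oauthtoken, so it is single-quoted
  # to stop the shell from expanding it. The key is passed on standard input
  # so that it does not appear in the process list.
  printf '%s' "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Passing the key with `--password-stdin` rather than as a command-line argument keeps it out of the shell history.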
This sample demonstrates running the MNIST example under PyTorch. This example downloads the MNIST dataset from the web.
- Pull and run the PyTorch container with the following Docker commands:
  docker pull nvcr.io/nvidia/pytorch:17.10
  docker run --gpus all --rm -it nvcr.io/nvidia/pytorch:17.10
- Run the MNIST example with the following commands:
  cd /opt/pytorch/examples/mnist
  python main.py
This sample demonstrates running the MNIST example under TensorFlow. This example downloads the MNIST dataset from the web.
- Pull and run the TensorFlow container with the following Docker commands:
  docker pull nvcr.io/nvidia/tensorflow:17.10
  docker run --gpus all --rm -it nvcr.io/nvidia/tensorflow:17.10
- Run the MNIST_with_summaries example with the following commands:
  cd /opt/tensorflow/tensorflow/examples/tutorials/mnist
  python mnist_with_summaries.py
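The two samples can also be run non-interactively, which is convenient for batch jobs. This is a sketch only: the helper names are hypothetical, and it assumes `bash` is available inside the container images. The image tags and example paths are the ones used in the steps above.

```shell
# Hypothetical helper: run one command in an NGC container with all GPUs
# visible, and remove the container when the command exits.
# $1 image name   $2 shell command to run inside the container
run_in_ngc_container() {
  docker run --gpus all --rm "$1" bash -c "$2"
}

# Run the PyTorch MNIST example without an interactive shell.
run_pytorch_mnist() {
  run_in_ngc_container nvcr.io/nvidia/pytorch:17.10 \
    "cd /opt/pytorch/examples/mnist && python main.py"
}

# Run the TensorFlow MNIST_with_summaries example without an interactive shell.
run_tensorflow_mnist() {
  run_in_ngc_container nvcr.io/nvidia/tensorflow:17.10 \
    "cd /opt/tensorflow/tensorflow/examples/tutorials/mnist && python mnist_with_summaries.py"
}
```

Because `docker run` pulls an image that is not yet present locally, a separate `docker pull` is optional here. You still need to be logged in to nvcr.io first.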