Manually Configuring Your Tenancy for Data Science

Note

To deploy models with Oracle Functions, use Configuring Your Tenancy for Function Development.

Before you can use the Data Science service, you must complete these tasks to configure the service:

  1. Creating User Groups and Users for Data Science, if they don't already exist.

  2. Creating Compartments for Network Resources and Data Science Resources in a Tenancy, if they don't already exist.

  3. Creating the VCN and Subnets to Use with Data Science, if they don't already exist.

  4. Creating Policies to Control Access to Network and Data Science Related Resources, if they don't already exist.

  5. Creating Dynamic Groups and Policies.

Creating User Groups and Users for Data Science

Before you can start using Data Science, your tenancy administrator has to create OCI user accounts, create a group for these users to belong to, and then assign these user accounts to that group.

Define policies to give the user group access to your Data Science related resources. If a suitable group and user accounts already exist, you don't need to create a new group, but the Data Science policies must still be applied to that group.

  1. Open the navigation menu and click Identity & Security. Under Identity, click Groups.

    A list of the groups in your tenancy displays.

  2. Click Create Group.
  3. Enter a meaningful name. For example, acme-datascientists.
  4. Enter a description. Avoid entering confidential information.
  5. Click Create.

    The detail page for the group that you created opens.

  6. Click Add User to Group.

    A list of the users in your tenancy displays.

  7. Select a user to add, and then click Add.

    The selected user is added and appears in the group member list.

  8. Repeat the previous two steps until all of your data scientist users are added to the group you created.

Creating Compartments for Network Resources and Data Science Resources in a Tenancy

Before you can use Data Science, a tenancy administrator must create these resources:

  • A compartment to contain network resources: a VCN, a public or private subnet, and other resources such as an internet gateway or service gateway, a route table, and security lists.
  • A compartment to contain Data Science resources (projects, notebook sessions, models, and work requests). The compartment can contain the resources for multiple services, not just Data Science.

Note

The same compartment can own network resources, Data Science related resources, and the resources of other OCI services. Alternatively, you can have multiple compartments for network resources and Data Science related resources, structured in whatever way works best for your organization.

If you already have suitable compartments, there is no need to create new ones.

  1. Open the navigation menu and click Identity & Security. Under Identity, click Compartments.
    A list of the compartments in your tenancy displays.
  2. Click Create Compartment. For details, see To create a compartment.
  3. Enter a meaningful name. For example, acme-network or acme-datascience-compartment.
  4. Enter a description. Avoid entering confidential information.
  5. Click Create Compartment.

    After the compartment is created successfully, it is added to the compartments list.

Creating the VCN and Subnets to Use with Data Science

To create and use a Data Science notebook session, you must have a VCN that contains a subnet. Then, you can create the notebook session in that subnet.

All egress from a notebook session is routed through this subnet. To access data and install additional packages to use in the notebook session, you must configure the subnet with appropriate access.

Note

A private subnet has egress access to the internet only if it has a route to a NAT gateway, which gives instances in the private subnet access to the internet. For egress access to the public internet, we recommend that you use a private subnet with a route to a NAT gateway.

If you already have a suitable VCN and subnets, you don't need to create new ones. Otherwise, you can create an OCI VCN with these basic steps.

Each subnet in the VCN must have a CIDR block that provides at least one IP address for each concurrent notebook session. We recommend that you have a minimum of 12 free IP addresses for AD-specific subnets and a minimum of 32 free IP addresses for regional subnets.

Note

We strongly recommend that each subnet has a CIDR block that provides more than the minimum number of free IP addresses.
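The sizing guidance above can be checked before you create a subnet. The following is a minimal sketch that uses only the Python standard library. It assumes that three addresses in every subnet are reserved by OCI (the first two and the last one in the CIDR block); the function names and the `in_use` parameter are hypothetical and exist only for illustration.

```python
import ipaddress

# Assumption: OCI reserves three addresses in every subnet (the first two
# and the last one in the CIDR block), so notebook sessions cannot use them.
RESERVED_PER_SUBNET = 3

# Recommended minimum number of free IP addresses, taken from the text above.
RECOMMENDED_MINIMUM = {"ad-specific": 12, "regional": 32}


def usable_addresses(cidr: str) -> int:
    """Return how many addresses remain after the reserved ones are removed."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - RESERVED_PER_SUBNET, 0)


def meets_recommendation(cidr: str, subnet_type: str, in_use: int = 0) -> bool:
    """Check whether the free addresses reach the recommended minimum."""
    free = usable_addresses(cidr) - in_use
    return free >= RECOMMENDED_MINIMUM[subnet_type]


print(usable_addresses("10.0.1.0/24"))                      # prints 253
print(meets_recommendation("10.0.1.0/28", "ad-specific"))   # prints True
print(meets_recommendation("10.0.1.0/27", "regional"))      # prints False
```

Under these assumptions, a /27 regional subnet falls just short of the recommended minimum, so a /26 or larger block is the safer choice.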

Use these steps to create a simple VCN and subnet to access the public internet from a Data Science notebook session:

  1. Open the navigation menu and click Networking. Click Virtual Cloud Networks.
  2. Select the compartment that you want to create the VCN in.
  3. Click Start VCN Wizard.
  4. Use the VCN with Internet Connectivity default and click Start VCN Wizard.
  5. Choose the compartment to own the network resources.
  6. Enter the following:
    Important

    None of your CIDR blocks can overlap.
    • VCN Name: A meaningful name for the cloud network. For example, acme-datascience-vcn. The name doesn't have to be unique within your tenancy and it cannot be changed later through the Console. Do not enter confidential information.
    • VCN CIDR Block: The CIDR block (IP address range) for your VCN. For example, 10.0.0.0/16.
    • Public Subnet CIDR Block: The CIDR block (IP address range) for your public subnet. For example, 10.0.0.0/24.
    • Private Subnet CIDR Block: The CIDR block (IP address range) for your private subnet. For example, 10.0.1.0/24.
  7. Ensure that Use DNS Hostnames in this VCN is selected.
  8. Click Next.

    A review of the VCN configuration is displayed.

  9. Review your selections. To change any of them, click Previous.
  10. Click Create to create the VCN and the related resources, including the public subnet, the private subnet, and an internet gateway.

    Use this VCN and its private subnet when you launch a notebook session.

  11. (Optional) Click View Virtual Cloud Network to review your VCN and subnets.
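Before you enter CIDR blocks in the wizard, you can verify that they follow the rule stated in the steps above: each subnet must fall inside the VCN range, and no two subnets can overlap. The following is a minimal sketch that uses the Python standard library and the example values from this section. The function name `check_cidr_plan` and the subnet labels are hypothetical.

```python
import ipaddress
from itertools import combinations


def check_cidr_plan(vcn_cidr: str, subnet_cidrs: dict) -> list:
    """Return a list of problems found in the plan (empty when it is valid)."""
    vcn = ipaddress.ip_network(vcn_cidr)
    subnets = {name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    problems = []

    # Every subnet must lie inside the VCN CIDR block.
    for name, net in subnets.items():
        if not net.subnet_of(vcn):
            problems.append(f"{name} ({net}) is outside the VCN range {vcn}")

    # No two subnets may overlap each other.
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")

    return problems


# Example values from the steps above.
plan = {"public": "10.0.0.0/24", "private": "10.0.1.0/24"}
print(check_cidr_plan("10.0.0.0/16", plan))  # prints []
```

An empty list means that the plan is consistent. Each string in a non-empty list names the subnet that needs a different CIDR block.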

Creating the VCN and Subnets for Notebook Sessions Running on GPUs

For Tokyo (NRT), you can use the same VCN and subnet that you are using for notebook sessions on CPU shapes. See Creating the VCN and Subnets to Use with Data Science.

If you are working in the Frankfurt (FRA), Ashburn (IAD), or London (LHR) regions, you must create new subnets. GPU availability in those regions is limited to certain availability domains. You must create subnets that are availability domain-specific with access to the public internet.

Use these steps to create a simple VCN and an availability domain-specific subnet to access the public internet from a Data Science notebook session:

  1. Open or create a notebook session.
  2. Select the region-specific VCN that you created.
    Note

    If the VCN is not available in one of the GPU regions, then use Creating the VCN and Subnets to Use with Data Science to create it.

  3. Click Create Subnet.
  4. Enter the following:
    Important

    None of your CIDR blocks can overlap.
    • Name: A meaningful name for the subnet. We recommend that you put the GPU shape family in the name of the subnet to differentiate the subnets. The name doesn't have to be unique within your tenancy and it cannot be changed later through the Console. Do not enter confidential information.
    • Subnet Type: Select Availability Domain-Specific, and then select the proper availability domain for the GPU shape family that you want to use. Ask the OCI Data Science team for the mapping between shape families and availability domains.
    • Route Table: Select the route table associated with the NAT gateway for public internet egress access.
    • Subnet Access: Select Private Subnet.
    • Private Subnet CIDR Block: The CIDR block (IP address range) for your private subnet. For example, 10.0.1.0/24.
    • DNS Resolution: Select Use DNS Hostnames in this Subnet.
    • DHCP Options: Select Default DHCP Options for <VCN Name>.
    • Security Lists: Select the security list associated with the NAT Gateway you selected.
  5. Click Create Subnet.
  6. Repeat all of these steps for each availability domain in which you want to launch notebook sessions on GPUs.
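
Because you repeat these steps for each availability domain and none of the CIDR blocks can overlap, it can help to plan the blocks in advance. The following is a minimal sketch that uses the Python standard library to pick the next free block for each availability domain. The function name `allocate_ad_subnets`, the availability domain labels, and the /24 subnet size are assumptions for illustration only.

```python
import ipaddress


def allocate_ad_subnets(vcn_cidr, used_cidrs, availability_domains, new_prefix=24):
    """Assign the next free block of size /new_prefix to each availability domain."""
    vcn = ipaddress.ip_network(vcn_cidr)
    taken = [ipaddress.ip_network(cidr) for cidr in used_cidrs]
    # Generator over every block of the requested size inside the VCN, in order.
    candidates = vcn.subnets(new_prefix=new_prefix)
    plan = {}

    for ad in availability_domains:
        for candidate in candidates:
            if not any(candidate.overlaps(existing) for existing in taken):
                plan[ad] = str(candidate)
                taken.append(candidate)
                break
        else:
            # The generator ran out of blocks before this domain got one.
            raise ValueError(f"No free /{new_prefix} block left for {ad}")
    return plan


# Hypothetical availability domain labels; the two used blocks are the
# example public and private subnets from the earlier VCN steps.
gpu_plan = allocate_ad_subnets(
    "10.0.0.0/16",
    used_cidrs=["10.0.0.0/24", "10.0.1.0/24"],
    availability_domains=["AD-1", "AD-2"],
)
print(gpu_plan)  # prints {'AD-1': '10.0.2.0/24', 'AD-2': '10.0.3.0/24'}
```

Each resulting block can then be entered as the CIDR block when you create the subnet for that availability domain.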