Manually Configuring Your Tenancy for Data Science
Before you can use the Data Science service, you must complete these tasks to configure the service:
- Creating User Groups and Users for Data Science, if they don't already exist.
- Creating Compartments for Network Resources and Data Science Resources in a Tenancy, if they don't already exist.
- Creating the VCN and Subnets to Use with Data Science, if they don't already exist.
- Creating Policies to Control Access to Network and Data Science Related Resources, if they don't already exist.
Creating User Groups and Users for Data Science
Before you can start using Data Science, your tenancy administrator has to create OCI user accounts, create a group for these users to belong to, and then assign these user accounts to that group.
Define policies to give the user group access to your Data Science related resources. If a suitable group and user accounts already exist, you don't need to create new ones, but the Data Science policies must still be applied to that group.
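As an illustration of what such policies can look like, the following minimal sketch builds a few policy statements as plain strings. The group name `data-scientists` and compartment name `ds-compartment` are hypothetical placeholders, and the statements follow the general OCI policy syntax (`allow group ... to <verb> <resource-type> in compartment ...`). Treat them as an assumption to check against the Data Science policy reference, not as the required set for your tenancy.

```python
def data_science_policy_statements(group: str, compartment: str) -> list[str]:
    """Build example IAM policy statements for a Data Science user group.

    The statements are illustrative assumptions; verify the verbs and
    resource types against the OCI policy reference before using them.
    """
    return [
        # Let the group manage Data Science resources (projects, notebook sessions, models).
        f"allow group {group} to manage data-science-family in compartment {compartment}",
        # Let the group use the networking resources that notebook sessions attach to.
        f"allow group {group} to use virtual-network-family in compartment {compartment}",
        # Let the Data Science service itself use the networking resources.
        f"allow service datascience to use virtual-network-family in compartment {compartment}",
    ]


# Hypothetical group and compartment names, for illustration only.
for statement in data_science_policy_statements("data-scientists", "ds-compartment"):
    print(statement)
```

If network resources and Data Science resources live in different compartments, the networking statements would name the network compartment instead.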
Creating Compartments for Network Resources and Data Science Resources in a Tenancy
Before you can use Data Science, a tenancy administrator must create these resources:
- A compartment to contain network resources: a VCN, a public or private subnet, and other resources such as an internet gateway or service gateway, a route table, and security lists.
- A compartment to contain Data Science resources (projects, notebook sessions, models, and work requests). The compartment can contain the resources for multiple services, not just Data Science.
The same compartment can own network resources, Data Science related resources, and the resources of other OCI services. Alternatively, you can have multiple compartments for network resources and Data Science related resources, structured in whatever way is best for your organization.
If you already have suitable compartments, there is no need to create new ones.
Creating the VCN and Subnets to Use with Data Science
To create and use a Data Science notebook session, you must have a VCN that contains a subnet. Then, you can create the notebook session in that subnet.
All egress from a notebook session is routed through this subnet. To access data and install additional packages to use in the notebook session, you must configure the subnet with appropriate access.
A private subnet has egress access to the internet only if it has a route to a NAT gateway, which gives instances in the private subnet access to the internet. For egress access to the public internet, we recommend this setup: a private subnet with a route to a NAT gateway.
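The routing requirement can be expressed as a simple check. The sketch below is not based on the OCI SDK: it assumes route rules are represented as plain dictionaries with hypothetical `destination` and `target_type` keys, and tests whether a default route (`0.0.0.0/0`) points at a NAT gateway.

```python
def has_nat_default_route(route_rules: list[dict]) -> bool:
    """Return True if any rule sends all outbound traffic to a NAT gateway.

    The dictionary keys used here are hypothetical and chosen for
    illustration; they are not the field names of any OCI API model.
    """
    return any(
        rule.get("destination") == "0.0.0.0/0"
        and rule.get("target_type") == "nat_gateway"
        for rule in route_rules
    )


# A private subnet's route table with a default route to a NAT gateway.
private_subnet_rules = [
    {"destination": "0.0.0.0/0", "target_type": "nat_gateway"},
]
print(has_nat_default_route(private_subnet_rules))
```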
If you already have a suitable VCN and subnets, you don't need to create new ones. You can create an OCI VCN with these basic steps.
Each subnet in the VCN must have a CIDR block that provides at least one IP address for each concurrent notebook session. We recommend that you have a minimum of 12 free IP addresses for AD-specific subnets and a minimum of 32 free IP addresses for regional subnets.
We strongly recommend that each subnet has a CIDR block that provides more than the minimum number of free IP addresses.
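To check a candidate CIDR block against these recommendations, a short calculation with Python's standard `ipaddress` module can help. This sketch assumes that three addresses in every subnet are reserved by the platform (the first two and the last one), which matches general OCI networking behavior but should be confirmed for your environment. The function names and the `in_use` parameter are hypothetical.

```python
import ipaddress

# Assumption: the platform reserves the first two addresses and the last
# address of each subnet, so they are not available to notebook sessions.
RESERVED_PER_SUBNET = 3


def usable_addresses(cidr: str) -> int:
    """Number of addresses in the CIDR block left after the reserved ones."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - RESERVED_PER_SUBNET, 0)


def meets_recommendation(cidr: str, regional: bool, in_use: int = 0) -> bool:
    """Compare free addresses with the recommended minimum (32 regional, 12 AD-specific)."""
    minimum = 32 if regional else 12
    return usable_addresses(cidr) - in_use >= minimum


print(usable_addresses("10.0.1.0/28"))                      # 16 - 3 = 13
print(meets_recommendation("10.0.1.0/28", regional=False))  # 13 >= 12
print(meets_recommendation("10.0.0.0/27", regional=True))   # 29 < 32
print(meets_recommendation("10.0.0.0/26", regional=True))   # 61 >= 32
```

Under these assumptions a /28 block only just satisfies the AD-specific minimum, so a larger block leaves more room for concurrent notebook sessions.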
Use these steps to create a simple VCN and subnet to access the public internet from a Data Science notebook session:
Creating the VCN and Subnets for Notebook Sessions Running on GPUs
For Tokyo (NRT), you can use the same VCN and subnet that you are using for notebook sessions on CPU shapes. See Creating the VCN and Subnets to Use with Data Science.
If you are working in the Frankfurt (FRA), Ashburn (IAD), or London (LHR) regions, you must create new subnets. GPU availability in those regions is limited to certain availability domains. You must create subnets that are availability domain-specific with access to the public internet.
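The regional rules above can be summarized as a small lookup. This is only a sketch of the guidance in this section: the function name and return values are hypothetical, and regions not mentioned here are reported as unknown and not guessed.

```python
# Region keys taken from the guidance in this section.
REUSE_EXISTING_SUBNET = {"NRT"}
NEED_AD_SPECIFIC_SUBNET = {"FRA", "IAD", "LHR"}


def gpu_subnet_requirement(region_key: str) -> str:
    """Describe the subnet needed for GPU notebook sessions in a region."""
    key = region_key.upper()
    if key in REUSE_EXISTING_SUBNET:
        # The VCN and subnet used for CPU notebook sessions can be reused.
        return "reuse-existing"
    if key in NEED_AD_SPECIFIC_SUBNET:
        # GPUs are limited to certain availability domains in these regions.
        return "new-ad-specific"
    return "unknown"


for region in ("NRT", "FRA", "IAD", "LHR"):
    print(region, gpu_subnet_requirement(region))
```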
Use these steps to create a simple VCN and an availability domain-specific subnet to access the public internet from a Data Science notebook session: