Notebook Sessions

Troubleshoot your notebook sessions.

Launching Notebook Sessions

If you see an error when you launch a notebook session, the most likely cause is a misconfiguration made when the notebook session was set up. After you fix the configuration, the notebook session launches. Locate the failure reason on the notebook session details page in the Console, or in the lifecycleDetails attribute returned by the CLI or API.
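As a minimal sketch of reading the failure reason programmatically, the following Python helper returns the lifecycle details of a failed session. It assumes the OCI Python SDK, where the Data Science client exposes get_notebook_session and the returned object carries lifecycle_state and lifecycle_details. The helper name launch_failure_reason and the stub client are hypothetical and exist only so the example runs without credentials.

```python
from types import SimpleNamespace


def launch_failure_reason(client, notebook_session_id):
    """Return the lifecycle details of a FAILED session, or None otherwise.

    `client` is expected to behave like oci.data_science.DataScienceClient:
    get_notebook_session(id) returns a response whose `.data` holds the session.
    """
    session = client.get_notebook_session(notebook_session_id).data
    if session.lifecycle_state == "FAILED":
        return session.lifecycle_details
    return None


# With the real SDK (assumes the `oci` package and a configured ~/.oci/config):
#   import oci
#   client = oci.data_science.DataScienceClient(oci.config.from_file())
#   print(launch_failure_reason(client, "<notebook-session-ocid>"))


class StubClient:
    """Hypothetical stand-in for the SDK client, for illustration only."""

    def __init__(self, state, details):
        self._session = SimpleNamespace(
            lifecycle_state=state, lifecycle_details=details
        )

    def get_notebook_session(self, notebook_session_id):
        return SimpleNamespace(data=self._session)


if __name__ == "__main__":
    # The details string here is made up for the demo, not a real service message.
    stub = StubClient("FAILED", "example failure details")
    print(launch_failure_reason(stub, "example-id"))
```

Passing the client in as a parameter keeps the helper easy to test and leaves authentication choices to the caller.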

When a notebook session fails to launch, use the following checks to correct the notebook session configuration:

  1. Check the dynamic groups and policies:

    OCI uses dynamic groups and policies to associate roles with users and then authorize those roles to access resources. You must correctly configure dynamic groups and policies before you can launch and use notebook sessions.

  2. Check the network infrastructure configuration:

    A common reason for notebook session failure is the misconfiguration of network infrastructure.

    You can use the Data Science Resource Manager sample template, which builds all the basic configurations required to set up a notebook session. Use this template for your configuration when you don't need a custom configuration.

    Follow the Networking documentation to configure and validate your VCN, subnet, routing tables, and other networking configurations.

  3. Check the VCN and subnet of a deactivated notebook session:

    If you can't reactivate a deactivated notebook session, then ensure that the notebook session is being launched with the same VCN and subnet configuration that you used when you first created it.

    Switching from a regional subnet to an availability domain subnet can cause some resources to be left unreachable.

If these troubleshooting tips don't resolve the notebook session launch issue, get help and contact Support.

Sizing Notebook Sessions

For notebook sessions, we recommend a shape with memory equal to about three times the amount of data you want to process. For example, for a 10 GB dataset, a VM.Standard2.2 or VM.Standard2.4 is a good option.
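The three-times rule above can be sketched as a small Python helper. The memory figures in the table are assumptions for illustration and should be checked against the current Compute shapes documentation. The function names are hypothetical.

```python
# Memory per shape in GB. These values are ASSUMED for illustration;
# verify them against the Compute shapes documentation before relying on them.
ASSUMED_SHAPE_MEMORY_GB = {
    "VM.Standard2.1": 15,
    "VM.Standard2.2": 30,
    "VM.Standard2.4": 60,
    "VM.Standard2.8": 120,
    "VM.Standard2.16": 240,
    "VM.Standard2.24": 320,
}


def recommended_memory_gb(dataset_gb, factor=3):
    """Apply the rule of thumb: memory of about `factor` times the data size."""
    return dataset_gb * factor


def smallest_fitting_shape(dataset_gb, shapes=ASSUMED_SHAPE_MEMORY_GB):
    """Return the shape with the least memory that still meets the rule, or None."""
    needed = recommended_memory_gb(dataset_gb)
    candidates = [(mem, name) for name, mem in shapes.items() if mem >= needed]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    print(recommended_memory_gb(10))
    print(smallest_fitting_shape(10))
```

Under these assumed figures, a 10 GB dataset needs about 30 GB of memory, which makes VM.Standard2.2 the smallest fit and VM.Standard2.4 the next step up when you want headroom.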

Sizing is also a matter of budget and speed. Whenever you can, we recommend using GPUs. If you are using a VM.Standard2.16 or VM.Standard2.24, we recommend that you switch to GPUs. GPUs are more expensive, but the speed increase generally reduces the overall cost of large operations. For example, we trained an XGBoost model on 11M rows in about seven seconds on a GPU. The same model on a VM.Standard2.24 with all cores in use took more than twenty minutes.

Notebook Session Metrics Alerting

We recommend that you set up alarms and notifications on your notebook session metrics to alert data scientists when CPU or memory use reaches a specified threshold. Alerts are useful if you want to run long-running processes in a notebook session. The data scientists who are part of the user group that has access to the notebook session should be able to read metrics and set alarms and notifications. Before you can set up alarms on metrics, you must set up policies that allow a group to manage topics, subscriptions, and messages in the Notifications service.

After a group has access to the Notifications service, the next step is to ensure that data scientists can set their own alarms.