Monitor Compute

This section explains the different methods and metrics you can use to monitor compute in your AI Data Platform.

View Spark UI

You can view the Spark Web UI to monitor the status and resource consumption of your all-purpose compute clusters.

  1. Navigate to your workspace and click Compute.
  2. Click your cluster, then click the Spark UI tab.
  3. Optional: Click the pop-out button at the top right to view the Spark UI in a separate window.
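
If you work mostly in notebooks, you can also locate the Spark UI address programmatically. Below is a minimal sketch in Python, assuming the notebook is attached to the cluster and PySpark is available; on managed platforms the UI is typically proxied, so the address printed here may differ from the link shown in the workspace.

    # Print the Spark UI address of the cluster's Spark application.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    print(spark.sparkContext.uiWebUrl)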

View Driver and Worker Logs

You can view the driver and worker logs of your all-purpose compute clusters for troubleshooting or debugging.

  1. Navigate to your workspace and click Compute.
  2. Click your cluster, then click the Logs tab.
  3. Filter your logs to see more specific information.

    Log filters for driver and worker logs: Cluster node, Worker #, Log level, Time frame

  4. Click the Download icon to save a local copy of your filtered logs.
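
If the built-in filters are not enough, you can also post-process a downloaded log locally. The following is a minimal sketch in Python, assuming a log4j-style driver log saved as driver.log; the file name, log level, time frame, and timestamp format are all assumptions to adapt to your own logs.

    import re
    from datetime import datetime

    LOG_FILE = "driver.log"        # hypothetical local copy of a downloaded driver log
    LEVEL = "WARN"                 # keep only lines at this log level
    START = datetime(2024, 5, 1, 12, 0)
    END = datetime(2024, 5, 1, 13, 0)

    # Matches log4j-style lines such as "24/05/01 12:30:15 WARN SomeClass: message".
    LINE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)")

    with open(LOG_FILE) as f:
        for line in f:
            m = LINE.match(line)
            if not m:
                continue           # skip stack traces and other continuation lines
            ts = datetime.strptime(m.group(1), "%y/%m/%d %H:%M:%S")
            if m.group(2) == LEVEL and START <= ts <= END:
                print(line.rstrip())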

View Metrics

You can monitor the infrastructure metrics of your compute clusters for troubleshooting or for making any sizing adjustments.

You can view status and history for the following metrics:
  • CPU Utilization
  • Memory Utilization
  • Disk read
  • Disk write
  • File system utilization
  • Garbage Collector CPU utilization
  • Network received
  • Network transmitted
  • Active tasks
  • Total failed tasks
  • Total completed tasks
  • Total number of tasks
  • Total shuffle read bytes
  • Total shuffle write bytes
  • Total task duration in seconds
  • SQL: Peak concurrent queries
  • SQL: Peak concurrent connections

  1. Navigate to your workspace and click Compute.
  2. Click your cluster, then click the Metrics tab.

    Compute metrics tab open. The Interval dropdown for Memory Utilization is open with Auto selected.

  3. Select time frames using the Date filter to view metrics over a specific period.
  4. Select an option from the Interval dropdown to change the interval at which data points are shown for a specific metric.
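
Many of the task and shuffle metrics above are also exposed by Apache Spark's own monitoring REST API, which is served by the Spark UI. Below is a minimal sketch in Python, assuming the Spark UI address is reachable without additional authentication from where the code runs; on managed platforms the UI is often proxied, so this assumption may not hold.

    import json
    from urllib.request import urlopen

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    ui_url = spark.sparkContext.uiWebUrl          # for example, http://10.0.0.5:4040

    def get_json(path):
        # Call the monitoring REST API served by the Spark UI.
        with urlopen(f"{ui_url}/api/v1/{path}") as resp:
            return json.load(resp)

    app_id = get_json("applications")[0]["id"]

    # Per-executor task and shuffle counters, corresponding to several metrics listed above.
    for ex in get_json(f"applications/{app_id}/executors"):
        print(ex["id"],
              "active:", ex["activeTasks"],
              "failed:", ex["failedTasks"],
              "completed:", ex["completedTasks"],
              "shuffle read:", ex["totalShuffleRead"],
              "shuffle write:", ex["totalShuffleWrite"])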

View Event Logs

You can view the event logs to monitor cluster-related operations, such as cluster creation, cluster restarts, init script execution, and monthly maintenance updates.

AI Data Platform retains the last 14 days of event logs.

  1. Navigate to your workspace and click Compute.
  2. Click your cluster, then click the Event Logs tab.
  3. Filter your logs to see more specific information.

    Show event type dropdown open with all options displayed
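
If you export the event log for longer-term analysis (the export mechanism depends on your platform), you can filter it outside the UI as well. Below is a minimal sketch in Python; the file name, field names, and event type values are all assumptions to adapt to your export format.

    import json
    from datetime import datetime, timedelta, timezone

    EVENT_TYPES = {"CREATING", "RESTARTING", "INIT_SCRIPTS_FINISHED"}  # hypothetical event type names
    cutoff = datetime.now(timezone.utc) - timedelta(days=14)           # matches the 14-day retention window

    with open("cluster_events.json") as f:                             # hypothetical export file
        events = json.load(f)

    for event in events:
        # Assumes epoch-millisecond timestamps and a "type" field on each event.
        ts = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
        if event["type"] in EVENT_TYPES and ts >= cutoff:
            print(ts.isoformat(), event["type"], event.get("details", ""))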

View Notebooks

You can view all the notebooks attached to the current cluster. This view includes the notebook count and each notebook's status, and provides a quick way to navigate to a specific notebook.

  1. Navigate to your workspace and click Compute.
  2. Click your cluster, then click the Notebooks tab.

    Compute page open with the Notebooks tab highlighted

    The notebook state is Active if code is running from that notebook. The notebook state is Idle if no code is running from that notebook.

  3. Click the name of a notebook to go to it.
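
The Active and Idle states above are reported by the platform. From inside a notebook you can get a related, Spark-level signal by checking for running Spark jobs. Below is a minimal sketch in Python, assuming PySpark is available; it reflects job activity across the whole Spark application, not the platform's per-notebook state.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    tracker = spark.sparkContext.statusTracker()

    # getActiveJobsIds() lists the Spark jobs that are currently running.
    active_jobs = tracker.getActiveJobsIds()
    print("Active" if active_jobs else "Idle", "with", len(active_jobs), "running Spark job(s)")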