Using Node Doctor to Troubleshoot Worker Node Issues
On Private Cloud Appliance, Node Doctor is a script that is included in the latest OKE images.
If Private Cloud Appliance has worker nodes that were created before Node Doctor was included in OKE images, you can use node cycling to update older worker node images. See Node Cycling an OKE Node Pool.
If a cluster has a worker node that's in a state other than Active or Running, use the Node Doctor utility to troubleshoot the issues.
Node Doctor scans a worker node and reports the health status of the node. Node Doctor can do the following tasks:
-
Identify potential problem areas and provide references to information to help you address those problem areas. See Print Troubleshooting Information.
-
Collect node system information into a support bundle if you need help from Oracle Support. See Create a Support Bundle.
Use Node Doctor only on worker nodes. Because Node Doctor is installed on OKE images, Node Doctor is also available on cluster control plane nodes. Don't use Node Doctor on control plane nodes.
Node Doctor was first delivered in Private Cloud Appliance Release 3.0.2-b1325160. If your node pools were created on that release or later, then you can proceed with the instructions in this topic. If your worker node image is from an earlier release, then that node does not have access to Node Doctor. Note that if your Private Cloud Appliance is running a release that includes Node Doctor, then you could use node cycling to update older worker node images. See Node Cycling an OKE Node Pool.
Connect to the Worker Node Using SSH
Perform the following steps to connect to the worker node that you want to troubleshoot.
-
Ensure that you have a private and public SSH key pair.
You must have the private key that goes with the public key that was added to the node when the node was created.
-
Get the node username. OKE images have the initial username opc configured.
-
Get the IP address of the worker node that you need to troubleshoot.
The IP address is on the Networking tab of the node details page in the Compute Web UI.
-
If the node has a public IP address, use the public IP address.
-
If the node is on a private IP, then connect to the node through the bastion host.
If a bastion host isn't available, see Creating a Bastion.
-
Enter the following command at a shell prompt on your local system (public IP address) or on the bastion host (private IP address):
ssh -i private_key_file username@ip-address
-
private_key_file. The full path and name of the file that contains the private SSH key that goes with the public key that was added to the node when the node was created.
-
username. The default username for the node. This value is probably opc.
-
ip-address. The node IP address that you got in the previous step.
-
Ensure that you have execute permissions for the following script. You run the script later.
$ ls -l /usr/local/bin/node-doctor.sh
-rwxr-xr-x 1 user1 user1 6288 Dec 5 2024 /usr/local/bin/node-doctor.sh
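The permission check in the last step can also be scripted. The following is a minimal sketch, not part of Node Doctor: the function name node_doctor_ready is hypothetical, and the default path is the one shown in the ls -l listing above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Node Doctor): confirm that the
# script exists and has execute permission for the current user.
# The default path is taken from the ls -l listing in the steps above.
node_doctor_ready() {
    script="${1:-/usr/local/bin/node-doctor.sh}"
    if [ ! -e "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 2
    fi
    echo "ok: $script"
}
```

Calling node_doctor_ready with no argument checks the default path. A "missing" result might indicate that the worker node image predates Node Doctor.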
Print Troubleshooting Information
While logged in to the worker node as described in Connect to the Worker Node Using SSH, enter the following command to print information that identifies potential problem areas:
$ sudo /usr/local/bin/node-doctor.sh --check
Use the following command to see more options:
$ sudo /usr/local/bin/node-doctor.sh --help
Create a Support Bundle
If you can't resolve the issue, use the following command to create a support bundle with relevant information for Oracle Support:
$ sudo /usr/local/bin/node-doctor.sh --generate
The support bundle is in the /tmp directory as oke-support-bundle-dateTtime.tar.
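Because the bundle file name includes a date and time, it can be convenient to look up the newest bundle rather than type the name. The following is a minimal sketch. The function name latest_bundle is hypothetical, and the file name pattern is assumed from the bundle name given above.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified support bundle
# in a directory (default /tmp). Prints nothing if no bundle exists.
latest_bundle() {
    dir="${1:-/tmp}"
    # ls -t sorts newest first; errors are hidden when nothing matches.
    ls -t "$dir"/oke-support-bundle-*.tar 2>/dev/null | head -n 1
}
```

For example, tar -tf "$(latest_bundle)" lists the contents of the newest bundle. Depending on the file permissions of the bundle, you might need to run that command with sudo.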
Monitor the /tmp directory to ensure that it doesn't fill up. For example, remove old files by using the rm command.
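One way to find candidates for removal is to list bundles that are older than a chosen age. The following is a minimal sketch. The function name old_bundles and the 7-day default are assumptions for illustration, and the -maxdepth option assumes a find implementation that supports it, such as GNU find.

```shell
#!/bin/sh
# Hypothetical helper: list support bundles in a directory (default /tmp)
# that were last modified more than a given number of days ago (default 7).
# find counts age in whole 24-hour periods, so the cutoff is approximate.
old_bundles() {
    dir="${1:-/tmp}"
    days="${2:-7}"
    find "$dir" -maxdepth 1 -type f \
        -name 'oke-support-bundle-*.tar' -mtime +"$days" -print
}
```

Run df -h /tmp to see how full the file system is, review the output of old_bundles, and then remove the listed files with rm. The function only lists files. It doesn't delete anything.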
See the following resources for information about submitting a Support Request and uploading a bundle: