Using Support Bundles
Support bundles are files of diagnostic data collected from the Private Cloud Appliance that are used to evaluate and fix problems.
Support bundles can be uploaded to Oracle Support automatically or manually. Support bundles are uploaded securely and contain the minimum required data: system identity (not IP addresses), problem symptoms, and diagnostic information such as logs and status.
You can also create a support bundle without uploading it, for example for your own use. Creating a support bundle is a convenient way to collect related data.
Support bundles are created and uploaded in the following ways:
- Oracle Auto Service Request (ASR)
  ASR automatically creates a service request and support bundle when certain hardware faults occur. The service request and support bundle are automatically sent to Oracle Support, and the Private Cloud Appliance administrator is notified. See Using Oracle Auto Service Request.
- asrInitiateBundle
  The asrInitiateBundle command is a PCA-ADMIN command that creates a support bundle, attaches the support bundle to an existing service request, and uploads it to Oracle Support. See Using the asrInitiateBundle Command.
- support-bundles
  The support-bundles command is a management node command that creates a support bundle of a specified type. Oracle Support might ask you to run this command to collect more data related to a service request, or you might want to collect this data for your own use. See Using the support-bundles Command.
- Manual upload to Oracle Support
  Several methods are available for uploading support bundles or other data to Oracle Support. See Uploading Support Bundles to Oracle Support.
Using the asrInitiateBundle Command
The asrInitiateBundle command takes three parameters, all required:
PCA-ADMIN> asrInitiateBundle mode=triage sr=SR_number bundleType=auto
A triage support bundle is collected and automatically attached to service request SR_number. For more information about the triage support bundle, see Triage Mode.
If the ASR service is enabled, bundleType=auto uploads the bundle to Oracle Support using the Phone Home service. For information about the Phone Home service, see Registering Private Cloud Appliance for Oracle Auto Service Request.
Using the support-bundles Command
The support-bundles command collects various types of bundles, or modes, of diagnostic data such as health check status, command outputs, and logs. This topic describes the available modes. The following is the recommended way to use this command:
- Start data collection by specifying triage mode to understand the preliminary status of the Private Cloud Appliance.
- If NOT_HEALTHY appears in the triage mode results, then do one of the following:
  - Use time_slice mode to collect data by time slots. These results can be further narrowed by specifying pod name, job, and k8s_app label.
  - Use smart mode to query data from specific health checkers.
The support-bundles command requires a mode (-m) option. Some modes have additional options.
The following table lists the options that are common to all modes of the support-bundles command.
Option | Description | Required |
---|---|---|
-m | The type of bundle. | yes |
-sr | The service request number. | no |
For most modes, the support-bundles command produces a single archive file. The output archive file is named [SR_number_]pca-support-bundle.current-time.tgz. The SR_number is used if you provided the -sr option. If you are creating the support bundle for a service request, you should specify the SR_number.
For native mode, the support-bundles command produces a directory of archive files.
The archive files are stored in /nfs/shared_storage/support_bundles/ on the management node.
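Because every run adds another archive to the same directory, it can help to pick out the newest one before uploading. The following is a minimal sketch, not part of the product: the function name latest_bundle is hypothetical, and the file name pattern is taken from the naming scheme described above.

```shell
# Print the most recently modified support bundle archive in a directory.
# The default directory and the file name pattern come from the text above;
# the function name is hypothetical.
latest_bundle() {
  dir="${1:-/nfs/shared_storage/support_bundles}"
  # ls -t sorts by modification time, newest first. Errors are hidden so
  # that a directory without archives simply prints nothing.
  ls -t "$dir"/*pca-support-bundle.*.tgz 2>/dev/null | head -n 1
}
```

Calling latest_bundle with no argument searches the default directory. Pass another directory to search there instead.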
Log in to the Management Node
To use the support-bundles command, log in as root to the management node that is running Pacemaker resources. Collect data first from the management node that is running Pacemaker resources, then from other management nodes as needed.
If you do not know which management node is running Pacemaker resources, log in to any management node and check Pacemaker cluster status. The following command shows the Pacemaker cluster resources are running on pcamn01.
[root@pcamn01 ~]# pcs status
Cluster name: mncluster
Stack: corosync
Current DC: pcamn01
...
Full list of resources:
 scsi_fencing (stonith:fence_scsi): Stopped (disabled)
 Resource Group: mgmt-rg
     vip-mgmt-int (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-host (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-ilom (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-lb (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-ext (ocf::heartbeat:IPaddr2): Started pcamn01
     l1api (systemd:l1api): Started pcamn01
     haproxy (ocf::heartbeat:haproxy): Started pcamn01
     pca-node-state (systemd:pca_node_state): Started pcamn01
     dhcp (ocf::heartbeat:dhcpd): Started pcamn01
     hw-monitor (systemd:hw_monitor): Started pcamn01
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
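If you prefer not to read the status output by eye, the node name can be extracted with a small filter. This is only a sketch: the function name active_mgmt_node is hypothetical, and it assumes the resource lines look like the sample output above (resource name first, then "Started" and the node name last).

```shell
# Print the node on which a given Pacemaker resource is started.
# Reads `pcs status` output on standard input.
# The default resource name, vip-mgmt-host, is an assumption taken from the
# sample output above; any resource in the mgmt-rg group would serve.
active_mgmt_node() {
  resource="${1:-vip-mgmt-host}"
  awk -v res="$resource" '$1 == res && $(NF-1) == "Started" { print $NF; exit }'
}

# Usage on a management node (not run here):
#   pcs status | active_mgmt_node
```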
Triage Mode
In triage mode, Prometheus platform_health_check is queried for both HEALTHY and NOT_HEALTHY status. If NOT_HEALTHY is found, use time_slice mode to get more detail.
[root@pcamn01 ~]# support-bundles -m triage
The following files are in the output archive file.
File | Description |
---|---|
| Time stamp and command line to generate this bundle. |
| Pods running in the compute node. |
| Pods running in the management node. |
| Rack installation time and build version. |
| Chunk files in json. |
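The decision to move on from triage mode depends on whether NOT_HEALTHY appears in the results. The following sketch shows one way to check. It assumes that the triage archive has already been extracted into a directory and that unhealthy checks appear in the extracted files as the literal string NOT_HEALTHY. Both are assumptions, and the function name has_unhealthy is hypothetical.

```shell
# Return success (0) if any file under the directory contains NOT_HEALTHY.
# Assumption: the triage archive has already been extracted into the
# directory, and unhealthy checks appear as the literal string NOT_HEALTHY.
has_unhealthy() {
  grep -rq 'NOT_HEALTHY' "$1"
}

# Hypothetical usage after extracting a triage archive into ./triage:
#   if has_unhealthy ./triage; then
#     echo "NOT_HEALTHY found: collect time_slice or smart data next"
#   fi
```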
Time Slice Mode
In time slice mode, data is collected by specifying start and end timestamps.
If you do not specify either the -j or --all option, then data is collected from all health checker jobs.
You can narrow the data collection by specifying any of the following:
- Loki job label
- Loki k8s_app label
- Pod name
[root@pcamn01 ~]# support-bundles -m time_slice -j flannel-checker -s 2021-05-29T22:40:00.000Z \
    -e 2021-06-29T22:40:00.000Z -l INFO
See more examples below.
The time slice mode of the support-bundles command has the following options in addition to the mode and service request number options listed at the beginning of this topic.
- Only one of --job_name, --all, and --k8s_app can be specified.
- If none of --job_name, --all, or --k8s_app is specified, the pod filtering will occur on the default (.+checker).
- The --all option can collect a huge amount of data. You might want to limit your time slice to 48 hours.
Option | Description | Required |
---|---|---|
-j, --job_name | Loki job name. Default value: .+checker. See Label List Query below. | no |
--all | Queries all job names except for jobs known for too much logging, such as audit, kubernetes-audit, and vault-audit, and the k8s_app label pcacoredns. | no |
--k8s_app label | The k8s_app label value. See Label List Query below. | no |
-l | Message level | no |
-s | Start date, in the form used in the examples in this topic (for example, 2022-01-11T00:00:00). | yes |
-e | End date, in the form used in the examples in this topic (for example, 2022-01-12T23:59:59). | yes |
--pod_name pod_name | The pod name (such as kube or network-checker) to filter output based on the pod. Only the starting letters are necessary. | no |
Label List Query
Use the label list query to list the available job names and k8s_app label values.
[root@pcamn01 ~]# support-bundles -m label_list
2021-10-14T23:19:18.265 - support_bundles - INFO - Starting Support Bundles
2021-10-14T23:19:18.317 - support_bundles - INFO - Locating filter-logs Pod
2021-10-14T23:19:18.344 - support_bundles - INFO - Executing command - ['python3', '/usr/lib/python3.6/site-packages/filter_logs/label_list.py']
2021-10-14T23:19:18.666 - support_bundles - INFO -
Label: job
Values: ['admin', 'api-server', 'asr-client', 'asrclient-checker', 'audit', 'cert-checker', 'ceui', 'compute', 'corosync', 'etcd', 'etcd-checker', 'filesystem', 'filter-logs', 'flannel-checker', 'his', 'hms', 'iam', 'k8s-stdout-logs', 'kubelet', 'kubernetes-audit', 'kubernetes-checker', 'l0-cluster-services-checker', 'messages', 'mysql-cluster-checker', 'network-checker', 'ovm-agent', 'ovn-controller', 'ovs-vswitchd', 'ovsdb-server', 'pca-healthchecker', 'pca-nwctl', 'pca-platform-l0', 'pca-platform-l1api', 'pca-upgrader', 'pcsd', 'registry-checker', 'sauron-checker', 'secure', 'storagectl', 'uws', 'vault', 'vault-audit', 'vault-checker', 'zfssa-checker', 'zfssa-log-exporter']
Label: k8s_app
Values: ['admin', 'api', 'asr-client', 'asrclient-checker', 'brs', 'cert-checker', 'compute', 'default-http-backend', 'dr-admin', 'etcd', 'etcd-checker', 'filesystem', 'filter-logs', 'flannel-checker', 'fluentd', 'ha-cluster-exporter', 'has', 'his', 'hms', 'iam', 'ilom', 'kube-apiserver', 'kube-controller-manager', 'kube-proxy', 'kubernetes-checker', 'l0-cluster-services-checker', 'loki', 'loki-bnr', 'mysql-cluster-checker', 'mysqld-exporter', 'network-checker', 'pcacoredns', 'pcadnsmgr', 'pcanetwork', 'pcaswitchmgr', 'prometheus', 'rabbitmq', 'registry-checker', 'sauron-api', 'sauron-checker', 'sauron-grafana', 'sauron-ingress-controller', 'sauron-mandos', 'sauron-operator', 'sauron-prometheus', 'sauron-prometheus-gw', 'sauron-sauron-exporter', 'sauron.oracledx.com', 'storagectl', 'switch-metric', 'uws', 'vault-checker', 'vmconsole', 'zfssa-analytics-exporter', 'zfssa-csi-nodeplugin', 'zfssa-csi-provisioner', 'zfssa-log-exporter']
Examples:
No job label, no k8s_app label, collect log from all health checkers.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
One job ceui.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -j ceui -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
One k8s_app network-checker.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx --k8s_app network-checker -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
All jobs and date.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -s `date -d "2 days ago" -u +"%Y-%m-%dT%H:%M:%S.000Z"` -e `date -u +"%Y-%m-%dT%H:%M:%S.000Z"`
All jobs.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx --all -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
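Computing the start and end timestamps from a single reference time keeps the window an exact length, which is useful when you want to stay within the suggested 48 hours for --all. The following is a minimal sketch that assumes GNU date is available. The function name time_window is hypothetical.

```shell
# Print "<start> <end>" UTC timestamps for a window of N hours ending now.
# Requires GNU date (for the -d option). Default window: 48 hours.
time_window() {
  hours="${1:-48}"
  now=$(date -u +%s)
  fmt='+%Y-%m-%dT%H:%M:%S.000Z'
  start=$(date -u -d "@$((now - hours * 3600))" "$fmt")
  end=$(date -u -d "@$now" "$fmt")
  printf '%s %s\n' "$start" "$end"
}

# Usage on a management node (not run here):
#   set -- $(time_window 48)
#   support-bundles -m time_slice --all -s "$1" -e "$2"
```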
The following files are in the output archive file.
File | Description |
---|---|
| Time stamp and command line to generate this bundle. |
| Chunk files in json. |
Smart Mode
In smart mode, health checkers are queried for recent NOT_HEALTHY status. By default, two days of logs are collected. If you need more than two days of logs, specify the --force option. Use the -hc option to specify a health checker.
[root@pcamn01 ~]# support-bundles -m smart
See more examples below.
The smart mode of the support-bundles command has the following options in addition to the mode and service request number options listed at the beginning of this topic.
If only the end date is given, the query covers the two days before that end date. If only the start date is given, the query covers the two days after that start date. If only the start date is given and it falls within the two-day time range, the default end date (the most recent unhealthy time) is used.
Option | Description | Required |
---|---|---|
-hc | Loki health checker name. See the health checker log files table below. | no |
--errors_only | Level name filtering takes place only on Error, Critical, and Severe. | no |
--force | Force the start date to override the two-day time range limit. | no |
-s | Start date, in the form used in the examples below. Default value: End date minus 2 days | no |
-e | End date, in the form used in the examples below. Default value: Most recent unhealthy time | no |
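The default start date (end date minus two days) can be reproduced outside the tool, for example to preview the window that a given end date implies. This is a sketch of the arithmetic only, not the tool's own implementation. It assumes GNU date, treats the timestamp as UTC, and uses a hypothetical function name, default_start.

```shell
# Print the timestamp two days before a given end timestamp.
# Requires GNU date. The input is read as UTC (a "Z" suffix is appended),
# which is an assumption of this sketch.
default_start() {
  end_epoch=$(date -u -d "${1}Z" +%s) || return 1
  date -u -d "@$((end_epoch - 2 * 86400))" +%Y-%m-%dT%H:%M:%S
}

# Example: default_start 2022-01-15T23:59:59
```

Working in epoch seconds avoids the ambiguity of relative expressions such as "- 2 days" placed directly after a time of day.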
The following table lists the log files for each health checker.
Health Checker | Supporting Log Files |
---|---|
L0_hw_health-checker | |
cert-checker | No logs - only certificate and expiry date (from the checker) |
etcd-checker | |
flannel-checker | |
kubernetes-checker | |
l0-cluster-services-checker | |
mysql-cluster-checker | |
network-checker | |
registry-checker | messages (registry itself does not produce logs) |
vault-checker | |
zfssa-checker | |
Examples:
No -hc. Query unhealthy data from all health checkers.
[root@pcamn01 ~]# support-bundles -m smart -sr 3-xxxxxxxxxxx
Use -hc to specify one health checker.
[root@pcamn01 ~]# support-bundles -m smart -sr 3-xxxxxxxxxxx -hc network-checker
Timestamps with --force.
[root@pcamn01 ~]# support-bundles -m smart -sr 3-xxxxxxxxxxx -s "2022-01-11/00:00:00" -e "2022-01-15/23:59:59" --force
The following files are in the output archive file.
File | Description |
---|---|
| Time stamp and command line to generate this bundle. |
| Chunk files in json. |
Native Mode
Unlike other support bundle modes, the native bundle command returns immediately and the bundle collection runs in the background. Native bundles might take hours to collect. Collection progress information is provided in the native_collection.log file in the bundle directory.
Also unlike other support bundle modes, the output of native bundles is not a single archive file. Instead, a bundle directory is created in the /nfs/shared_storage/support_bundles/ area on the management node. The directory contains the native_collection.log file and a number of tar.gz files.
[root@pcamn01 ~]# support-bundles -m native -t bundle_type [-c component_name] [-sr SR_number]
The native mode of the support-bundles command has the following options in addition to the mode and service request number options listed at the beginning of this topic.
Option | Description | Required |
---|---|---|
-t | Bundle type: zfs-bundle or sosreport | yes |
-c | Component name. This option only applies to type sosreport. | no |
ZFS Bundle
When type is zfs-bundle, a ZFS support bundle collection starts on both ZFS nodes and downloads the new ZFS support bundles into the bundle directory.
[root@pcamn01 ~]# support-bundles -m native -t zfs-bundle
2021-11-16T22:49:30.982 - support_bundles - INFO - Starting Support Bundles
2021-11-16T22:49:31.037 - support_bundles - INFO - Locating filter-logs Pod
2021-11-16T22:49:31.064 - support_bundles - INFO - Executing command - ['python3', '/usr/lib/python3.6/site-packages/filter_logs/native.py', '-t', 'zfs-bundle']
2021-11-16T22:49:31.287 - support_bundles - INFO - LAUNCHING COMMAND: ['python3', '/usr/lib/python3.6/site-packages/filter_logs/native_app.py', '-t', 'zfs-bundle', '--target_directory', '/support_bundles/zfs-bundle_20211116T224931267']
ZFS native bundle collection running to /nfs/shared_storage/support_bundles/zfs-bundle_20211116T224931267
Monitor /nfs/shared_storage/support_bundles/zfs-bundle_20211116T224931267/native_collection.log for progress.
2021-11-16T22:49:31.287 - support_bundles - INFO - Finished running Support Bundles
SOS Report Bundle
When type is sosreport, the component_name is a management node or compute node. If component_name is not specified, the report is collected from all management and compute nodes.
[root@pcamn01 ~]# support-bundles -m native -t sosreport -c pcacn003 -sr SR_number
Uploading Support Bundles to Oracle Support
After you create a support bundle using the support-bundles command as described in Using the support-bundles Command, you can use the methods described in this topic to upload the support bundle to Oracle Support.
To use these methods, you must satisfy the following requirements:
- You must have a My Oracle Support user ID with Create and Update SR permissions granted by the appropriate Customer User Administrator (CUA) for each Support Identifier (SI) being used to upload files.
- For file uploads to existing service requests, the Support Identifier associated with the service request must be in your profile.
- To upload files larger than 2 GB, sending machines must have network access to connect to the My Oracle Support servers at transport.oracle.com to use FTPS and HTTPS.
  The Oracle FTPS service is a "passive" implementation. With an implicit configuration, the initial connection is from the client to the service on a control port of 990 and the connection is then switched to a high port to exchange data. Oracle defines a possible range of the data port of 32000-42000, and depending upon your network configuration you may need to enable outbound connections on both port 990 and 32000-42000. TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 is the only encryption method enabled.
  The Oracle HTTPS diagnostic upload service uses the standard HTTPS port of 443 and does not require any additional ports to be opened.
  When using command line protocols, do not include your password in the command. Enter your password only when prompted.
- Oracle requires the use of TLS 1.2+ for all file transfers.
- Do not upload encrypted or password-protected files, standalone or within an archive. A Service Request update will note this as a corrupted file or reject the upload as disallowed file types were found. Files are encrypted when you use FTPS and HTTPS; additional protections are not required.
- Do not upload files with file type extensions exe, bat, asp, or com, either standalone or within an archive. A Service Request update will note that a disallowed file type was found.
Uploading Files 2 GB or Smaller
Use the SR file upload utility on the My Oracle Support Portal.
- Log in to My Oracle Support with your My Oracle Support user name and password.
- Do one of the following:
  - Create a new service request and in the next step, select the Upload button.
  - Select and open an existing service request.
- Click the Add Attachment button located at the top of the page.
- Click the Choose File button.
- Navigate and select the file to upload.
- Click the Attach File button.
You can also use the methods described in the next section for larger files.
Uploading Files Larger Than 2 GB
You cannot upload a file larger than 200 GB. See Splitting Files.
FTPS
Syntax:
Be sure to include the / character after the service request number.
$ curl -T path_and_filename -u MOS_user_ID ftps://transport.oracle.com/issue/SR_number/
Example:
$ curl -T /u02/files/bigfile.tar -u MOSuserID@example.com ftps://transport.oracle.com/issue/3-1234567890/
HTTPS
Syntax:
Be sure to include the / character after the service request number.
$ curl -T path_and_filename -u MOS_user_ID https://transport.oracle.com/upload/issue/SR_number/
Example:
$ curl -T D:\data\bigfile.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/
Renaming the file during send
$ curl -T D:\data\bigfile.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/NotSoBig.tar
Using a proxy
$ curl -k -T D:\data\bigfile.tar -x proxy.example.com:80 -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/
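Because a missing trailing / is an easy mistake, the HTTPS upload URL can be assembled by a small helper instead of being typed by hand. The following is a sketch: the function name upload_url is hypothetical, and the URL layout is copied from the syntax shown above.

```shell
# Build an HTTPS upload URL for a service request.
# With one argument the URL ends in "/", as the syntax above requires.
# With a second argument the file is stored under that name.
upload_url() {
  sr="${1%/}"          # drop a trailing slash if the caller added one
  printf 'https://transport.oracle.com/upload/issue/%s/%s\n' "$sr" "${2:-}"
}

# Usage (not run here; curl prompts for the password):
#   curl -T /u02/files/bigfile.tar -u MOSuserID@example.com "$(upload_url 3-1234567890)"
```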
Splitting Files
You can split a large file into multiple parts and upload the parts. Oracle Transport will concatenate the segments when you complete uploading all the parts.
Only HTTPS protocol can be used. Only the UNIX split utility can be used. The Microsoft Windows split utility produces an incompatible format.
To reduce upload times, compress the original file prior to splitting.
1. Split the file.
   The following command splits the file file1.tar into 2 GB parts named file1.tar.partaa and file1.tar.partab.
   Important: Specify the .part extension exactly as shown below.
   $ split -b 2048m file1.tar file1.tar.part
2. Upload the resulting file1.tar.partaa and file1.tar.partab files.
   Important: Do not rename these output part files.
   $ curl -T file1.tar.partaa -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/
   $ curl -T file1.tar.partab -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/
3. Send the command to put the parts back together.
   The split files will not be attached to the service request. Only the final concatenated file will be attached to the service request.
   $ curl -X PUT -H X-multipart-total-size:original_size -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/file1.tar?multiPartComplete=true
   In the preceding command, original_size is the size of the original unsplit file as shown by a file listing.
4. Verify the size of the newly-attached file.
   Note: This verification command must be executed immediately after the concatenation command in Step 3. Otherwise, the file will have begun processing and will no longer be available for this command.
   $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/file1.tar
   X-existing-file-size: original_size
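Before uploading, you can check locally that the parts rebuild the original file and record the byte count needed for original_size. The following sketch does both. The function name split_and_check is hypothetical, and it assumes that no other files matching FILE.part* already exist next to the original.

```shell
# Split a file into parts and check that the parts rebuild the original.
# Usage: split_and_check FILE PART_SIZE   (PART_SIZE as accepted by split -b)
# Prints the original size in bytes on success.
split_and_check() {
  file="$1"; size="$2"
  split -b "$size" "$file" "$file.part" || return 1
  # The shell expands the glob in sorted order: partaa, partab, ...
  cat "$file".part* | cmp -s - "$file" || return 1
  # This byte count is the value to use for original_size.
  wc -c < "$file" | tr -d ' '
}

# Usage matching the procedure above (not run here):
#   split_and_check file1.tar 2048m
```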
Resuming an Interrupted HTTPS Upload
You can resume a file upload that terminated abnormally. Resuming works only with HTTPS, not with FTPS. When an upload is interrupted, start by retrieving the size of the partially uploaded file.
1. Determine how much of the file has already been uploaded.
   $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar
   HTTP/1.1 204 No Content
   Date: Tue, 15 Nov 2022 22:53:54 GMT
   Content-Type: text/plain
   X-existing-file-size: already_uploaded_size
   X-Powered-By: Servlet/3.0 JSP/2.2
2. Resume the file upload.
   Note the file size returned in "X-existing-file-size" in Step 1. Use that file size after the -C switch and in the -H "X-resume-offset:" switch.
   $ curl -Calready_uploaded_size -H "X-resume-offset: already_uploaded_size" -T myinfo.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar
3. Verify the final file size.
   $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar -H X-existing-file-size: original_size
   In the preceding command, original_size is the size of the original file as shown by a file listing.
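The offset used when resuming comes from the X-existing-file-size response header, so it can be captured rather than copied by hand. The following is a minimal sketch. The function name existing_size is hypothetical, and it assumes the header appears in the form shown in the sample response above.

```shell
# Extract the X-existing-file-size value from HTTP response headers on stdin.
# Carriage returns are removed first because HTTP header lines end in CRLF.
existing_size() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-existing-file-size" { print $2; exit }'
}

# Usage (not run here; URL stands for the full upload URL of the file):
#   offset=$(curl -sI -u MOSuserID@example.com "$URL" | existing_size)
#   curl -C "$offset" -H "X-resume-offset: $offset" -T myinfo.tar \
#        -u MOSuserID@example.com "$URL"
```

If the header is missing, the function prints nothing, so check that the captured value is not empty before resuming.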