Using Support Bundles
Support bundles are files of diagnostic data collected from the Private Cloud Appliance that are used to evaluate and fix problems.
Support bundles can be uploaded to Oracle Support automatically or manually. Support bundles are uploaded securely and contain the minimum required data: system identity (not IP addresses), problem symptoms, and diagnostic information such as logs and status.
Support bundles can also be created without being uploaded. You might want to create a bundle for your own use: creating a support bundle is a convenient way to collect related data.
Support bundles are created and uploaded in the following ways:

- Oracle Auto Service Request (ASR)

  ASR automatically creates a service request and support bundle when certain hardware faults occur. The service request and support bundle are automatically sent to Oracle Support, and the Private Cloud Appliance administrator is notified. See Using Oracle Auto Service Request.

- asrInitiateBundle

  The asrInitiateBundle command is a PCA-ADMIN command that creates a support bundle, attaches the support bundle to an existing service request, and uploads the support bundle to Oracle Support. See Using the asrInitiateBundle Command.

- support-bundles

  The support-bundles command is a management node command that creates a support bundle of a specified type. Oracle Support might ask you to run this command to collect more data related to a service request, or you might want to collect this data for your own use. See Using the support-bundles Command.

- Manual upload to Oracle Support

  Several methods are available for uploading support bundles or other data to Oracle Support. See Uploading Support Bundles to Oracle Support.
Using the asrInitiateBundle Command
The asrInitiateBundle command takes three parameters, all required:
PCA-ADMIN> asrInitiateBundle mode=triage sr=SR_number bundleType=auto

A triage support bundle is collected and automatically attached to service request SR_number. For more information about the triage support bundle, see Triage Mode.
If the ASR service is enabled, bundleType=auto uploads the bundle to Oracle Support using the Phone Home service. For
information about the Phone Home service, see Registering Private Cloud Appliance for Oracle Auto Service Request. The
bundle is saved on the management node for two days after successful upload. See Using the support-bundles Command.
If you specify mode=native and do not specify any value for
nativeType, then a ZFS_BUNDLE is uploaded.
Using the support-bundles Command
The support-bundles command collects various types of bundles, or modes, of
diagnostic data such as health check status, command outputs, and logs. Depending on the
options provided, these files might contain logs or status. All modes collect files into a
bundle directory.
No more than one support bundle process is allowed at one time. A support bundle lock file is created at the beginning of bundle collection and removed when bundle collection is complete.
All support-bundles commands return immediately, and the bundle collection
runs in the background. This is because bundle collections might take a long time, perhaps
hours.
Bundles are stored for two days, then automatically deleted.
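The one-collection-at-a-time behavior described above is a standard lock-file pattern. The following sketch illustrates it; the lock path, function names, and messages are assumptions for illustration, not the appliance's actual implementation:

```shell
# Hypothetical sketch of the single-bundle lock described above.
# The lock path is an illustrative assumption.
LOCKFILE="${TMPDIR:-/tmp}/support_bundle.lock"

acquire_lock() {
    # mkdir is atomic: it fails if another collection already holds the lock
    mkdir "$LOCKFILE" 2>/dev/null || {
        echo "bundle collection already in progress" >&2
        return 1
    }
}
release_lock() { rmdir "$LOCKFILE"; }

# Acquire the lock, collect in the background, release when done.
acquire_lock && echo "collecting bundle..." && release_lock
```

A second collection attempted while the lock directory exists fails immediately instead of running concurrently.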
The following types of bundles are supported:
- Triage Mode. Collects data about the current status of the Private Cloud Appliance.
- Time Slice Mode. Collects data by time slots. These results can be further narrowed by specifying pod name, job, and k8s_app label.
- Combo Mode. Collects a combination of triage and time slice data.
- Native Mode. Collects data from management, compute, and ZFS nodes and from ILOM and Cisco hosts.
A good way to start investigating an issue is to collect a combo bundle. Look for NOT_HEALTHY in the triage mode results and compare that to what you see in the time_slice mode results.
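Once a combo bundle archive is extracted, a quick scan for unhealthy components might look like the following. The directory layout and chunk file contents here are fabricated for illustration; the real bundle structure may differ:

```shell
# Create a stand-in for an extracted triage bundle (illustrative layout).
mkdir -p triage-bundle
printf '{"job": "network-checker", "status": "NOT_HEALTHY"}\n' > triage-bundle/chunk1.json
printf '{"job": "etcd-checker", "status": "HEALTHY"}\n' > triage-bundle/chunk2.json

# List the chunk files that report an unhealthy component; the matching
# checker job is then a candidate for a narrower time_slice collection.
grep -rl 'NOT_HEALTHY' triage-bundle/
```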
The support-bundles command requires a mode option. All modes accept the
service request number option. See the following table. Time slice and native modes have
additional options.
| Option | Description | Required |
|---|---|---|
| -m mode | The type of bundle. | yes |
| -sr SR_number | The service request number. | no |
The support-bundles command output is stored in the following directory on
the management node, where bundle-type is the mode:
triage, time_slice, combo, or
native:
/nfs/shared_storage/support_bundles/SR_number_bundle-type-bundle_timestamp/
The SR_number is used if you provided the -sr
option. If you are creating the support bundle for a service request, specify the
SR_number.
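Putting the naming rule together, the output directory for a triage bundle attached to SR 3-1234567890 would be assembled as follows; the exact timestamp format is an assumption:

```shell
# Assemble the bundle directory name described above.
SR_number="3-1234567890"
bundle_type="triage"                      # or time_slice, combo, native
timestamp="$(date -u +%Y%m%d%H%M%S)"      # assumed timestamp format
bundle_dir="/nfs/shared_storage/support_bundles/${SR_number}_${bundle_type}-bundle_${timestamp}/"
echo "$bundle_dir"
```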
This directory contains a bundle collection progress file and an archive file. The bundle collection progress file has the following name:

bundle-type_collection.log

The output archive file has the following name:

SR_number_bundle-type-bundle_timestamp.tar.gz
The archive file contains a header.json file with the following default
components:
- current-time - the timestamp
- create-support-bundle - the command line that was used
- sr-number - the SR number associated with the archive file
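A header.json containing the three default components might look like the following. The exact field formatting and the sample command line are assumptions based on the list above:

```shell
# Write a header.json with the three default components listed above
# (field layout is illustrative).
cat > header.json <<EOF
{
  "current-time": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "create-support-bundle": "support-bundles -m triage -sr 3-1234567890",
  "sr-number": "3-1234567890"
}
EOF
cat header.json
```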
Log in to the Management Node
To use the support-bundles command, log in as root to the
management node that is running Pacemaker resources. Collect data first from the management
node that is running Pacemaker resources, then from other management nodes as needed.
If you do not know which management node is running Pacemaker resources, log in to any management node and check the Pacemaker cluster status. The following output shows that the Pacemaker cluster resources are running on pcamn01.
[root@pcamn01 ~]# pcs status
Cluster name: mncluster
Stack: corosync
Current DC: pcamn01 ...
Full list of resources:
 scsi_fencing   (stonith:fence_scsi):   Stopped (disabled)
 Resource Group: mgmt-rg
     vip-mgmt-int   (ocf::heartbeat:IPaddr2):   Started pcamn01
     vip-mgmt-host  (ocf::heartbeat:IPaddr2):   Started pcamn01
     vip-mgmt-ilom  (ocf::heartbeat:IPaddr2):   Started pcamn01
     vip-mgmt-lb    (ocf::heartbeat:IPaddr2):   Started pcamn01
     vip-mgmt-ext   (ocf::heartbeat:IPaddr2):   Started pcamn01
     l1api          (systemd:l1api):            Started pcamn01
     haproxy        (ocf::heartbeat:haproxy):   Started pcamn01
     pca-node-state (systemd:pca_node_state):   Started pcamn01
     dhcp           (ocf::heartbeat:dhcpd):     Started pcamn01
     hw-monitor     (systemd:hw_monitor):       Started pcamn01
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
Triage Mode
In triage mode, Prometheus platform_health_check is
queried for both HEALTHY and NOT_HEALTHY status. If NOT_HEALTHY is found, use
time_slice mode to get more detail.
[root@pcamn01 ~]# support-bundles -m triage
The following files are in the output archive file.
| File | Description |
|---|---|
| header.json | Time stamp and command line to generate this bundle. |
| | Pods running in the compute node. |
| | Hardware component list retrieved from |
| | Pods running in the management node. |
| | Rack installation time and build version. |
| | Chunk files in JSON. |
Time Slice Mode
In time slice mode, data is collected by specifying start and end timestamps. Both of the following options are required:
- -s start_date
- -e end_date
Time slice mode has the following options in addition to the mode and service request
number options. These options help narrow the data collection. If you do not specify either
the -j or --all option, then data is collected from all
health checker jobs.
- Only one of --job_name, --all, and --k8s_app can be specified.
- If none of --job_name, --all, or --k8s_app is specified, pod filtering uses the default (.+checker).
- The --all option can collect a huge amount of data. You might want to limit your time slice to 48 hours.
Example:
[root@pcamn01 ~]# support-bundles -m time_slice -j flannel-checker -s 2021-05-29T22:40:00.000Z \
  -e 2021-06-29T22:40:00.000Z -l INFO
See more examples below.
| Option | Description | Required |
|---|---|---|
| -s start_date | Start date in format The minimum argument is | yes |
| -e end_date | End date in format The minimum argument is | yes |
| -j job_name | Loki job name. Default value: See Label List Query below. | no |
| --k8s_app label | The See Label List Query below. | no |
| --all | Queries all job names except for jobs known for too much logging, such as audit, kubernetes-audit, and vault-audit, and k8s_app label pcacoredns. | no |
| -l level | Message level | no |
| | The pod name (such as | no |
| | Timeout in seconds for a single Loki query. By default it is 180 seconds. | no |
Label List Query
Use the label list query to list the available job names and k8s_app label
values.
[root@pcamn01 ~]# support-bundles -m label_list
2021-10-14T23:19:18.265 - support_bundles - INFO - Starting Support Bundles
2021-10-14T23:19:18.317 - support_bundles - INFO - Locating filter-logs Pod
2021-10-14T23:19:18.344 - support_bundles - INFO - Executing command - ['python3', '/usr/lib/python3.6/site-packages/filter_logs/label_list.py']
2021-10-14T23:19:18.666 - support_bundles - INFO -
Label: job
Values: ['admin', 'api-server', 'asr-client', 'asrclient-checker', 'audit', 'cert-checker', 'ceui',
'compute', 'corosync', 'etcd', 'etcd-checker', 'filesystem', 'filter-logs', 'flannel-checker',
'his', 'hms', 'iam', 'k8s-stdout-logs', 'kubelet', 'kubernetes-audit', 'kubernetes-checker',
'l0-cluster-services-checker', 'messages', 'mysql-cluster-checker', 'network-checker', 'ovm-agent',
'ovn-controller', 'ovs-vswitchd', 'ovsdb-server', 'pca-healthchecker', 'pca-nwctl', 'pca-platform-l0',
'pca-platform-l1api', 'pca-upgrader', 'pcsd', 'registry-checker', 'sauron-checker', 'secure',
'storagectl', 'uws', 'vault', 'vault-audit', 'vault-checker', 'zfssa-checker', 'zfssa-log-exporter']
Label: k8s_app
Values: ['admin', 'api', 'asr-client', 'asrclient-checker', 'brs', 'cert-checker', 'compute',
'default-http-backend', 'dr-admin', 'etcd', 'etcd-checker', 'filesystem', 'filter-logs',
'flannel-checker', 'fluentd', 'ha-cluster-exporter', 'has', 'his', 'hms', 'iam', 'ilom',
'kube-apiserver', 'kube-controller-manager', 'kube-proxy', 'kubernetes-checker',
'l0-cluster-services-checker', 'loki', 'loki-bnr', 'mysql-cluster-checker', 'mysqld-exporter',
'network-checker', 'pcacoredns', 'pcadnsmgr', 'pcanetwork', 'pcaswitchmgr', 'prometheus',
'rabbitmq', 'registry-checker', 'sauron-api', 'sauron-checker', 'sauron-grafana',
'sauron-ingress-controller', 'sauron-mandos', 'sauron-operator', 'sauron-prometheus',
'sauron-prometheus-gw', 'sauron-sauron-exporter', 'sauron.oracledx.com', 'storagectl',
'switch-metric', 'uws', 'vault-checker', 'vmconsole', 'zfssa-analytics-exporter',
'zfssa-csi-nodeplugin', 'zfssa-csi-provisioner', 'zfssa-log-exporter']
Examples:
No job label, no k8s_app label, collect log from all health checkers.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
One job ceui.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -j ceui -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
One k8s_app network-checker.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx --k8s_app network-checker -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
All jobs and date.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -s `date -d "2 days ago" -u +"%Y-%m-%dT%H:%M:%S.000Z"` -e `date -u +"%Y-%m-%dT%H:%M:%S.000Z"`
All jobs.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx --all -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
The following files are in the output archive file.
| File | Description |
|---|---|
| header.json | Time stamp and command line to generate this bundle. |
| | Chunk files in JSON. Time slice bundles have a limit of 500,000 logs per query, from start time. |
| | Rack installation time and build version. |
Combo Mode
The combo mode is a combination of a triage bundle and a time slice
bundle. The output includes an archive file and two collection log files:
triage_collection.log and time_slice_collection.log.
The following files are in the output archive file.
| File | Description |
|---|---|
| | The triage bundle archive file. |
| | The time slice bundle archive file. The time slice data collected is for |
Native Mode
The native_collection.log file in the bundle directory provides collection
progress information. Native bundles can take hours to collect.
The native mode has the following parameters in addition to mode and SR
number.
| Parameter | Description | Required |
|---|---|---|
| -t nativetype | Default value: zfs_bundle | no |
| -c component | Component name, such as the name of a management, compute, or ZFS node, or an ILOM or Cisco host. | no |
The following files are in the output archive file.
| File | Description |
|---|---|
| header.json | Time stamp and command line to generate this bundle. |
| Native bundle files | These files are specific to the |
| | Rack installation time and build version. |
ZFS Bundle
When nativetype is zfs_bundle, collection starts on both ZFS nodes and the new ZFS support bundles are downloaded into the bundle directory. When nativetype is not specified, zfs_bundle is created by default.
[root@pcamn01 ~]# support-bundles -m native -t zfs_bundle
SOS Report Bundle
When nativetype is sosreport, the report is collected from the management node or compute node specified by the --component parameter. If --component is not specified, the report is collected from all management and compute nodes.
[root@pcamn01 ~]# support-bundles -m native -t sosreport -c pcamn01
ILOM Snapshot
When nativeType=ilom_snapshot, the value of the
--component parameter is the ILOM host name of a management node or
compute node. If the --component parameter is not specified, the report is
collected from all ILOM hosts.
[root@pcamn01 ~]# support-bundles -m native -t ilom_snapshot -c ilom-pcacn007
Cisco Bundle
When nativetype is cisco-bundle, the value of the --component parameter is the management host name of an internal Cisco management, aggregation, or access switch.
[root@pcamn01 ~]# support-bundles -m native -t cisco-bundle -c accsn01
To create a cisco-bundle type of collection, the following conditions must
be met:
- The Cisco OBFL module must be enabled on all Private Cloud Appliance Cisco switches. The Cisco OBFL module is enabled by default.
- The Cisco EEM module must be enabled on all Private Cloud Appliance Cisco switches. The Cisco EEM module is enabled by default.
- EEM (Embedded Event Manager) policy
Uploading Support Bundles to Oracle Support
After you create a support bundle using the support-bundles command as
described in Using the support-bundles Command, you can use the methods described in this topic to upload the support
bundle to Oracle Support.
To use these methods, you must satisfy the following requirements:
- You must have a My Oracle Support user ID with Create and Update SR permissions granted by the appropriate Customer User Administrator (CUA) for each Support Identifier (SI) being used to upload files.

- For file uploads to existing service requests, the Support Identifier associated with the service request must be in your profile.

- To upload files larger than 2 GB, sending machines must have network access to connect to the My Oracle Support servers at transport.oracle.com to use FTPS and HTTPS.

  The Oracle FTPS service is a "passive" implementation. With an implicit configuration, the initial connection is from the client to the service on control port 990, and the connection is then switched to a high port to exchange data. Oracle defines a possible data port range of 32000-42000; depending on your network configuration, you might need to enable outbound connections on both port 990 and ports 32000-42000. TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 is the only encryption method enabled.

  The Oracle HTTPS diagnostic upload service uses the standard HTTPS port of 443 and does not require any additional ports to be opened.

  When using command line protocols, do not include your password in the command. Enter your password only when prompted.

- Oracle requires the use of TLS 1.2+ for all file transfers.

- Do not upload encrypted or password-protected files, standalone or within an archive. A Service Request update will note this as a corrupted file or reject the upload because disallowed file types were found. Files are encrypted when you use FTPS and HTTPS; additional protections are not required.

- Do not upload files with file type extensions exe, bat, asp, or com, either standalone or within an archive. A Service Request update will note that a disallowed file type was found.
Uploading Files 2 GB or Smaller
Use the SR file upload utility on the My Oracle Support Portal.
- Log in to My Oracle Support with your My Oracle Support user name and password.

- Do one of the following:

  - Create a new service request and, in the next step, select the Upload button.
  - Select and open an existing service request.

- Click the Add Attachment button located at the top of the page.

- Click the Choose File button.

- Navigate to and select the file to upload.

- Click the Attach File button.
You can also use the methods described in the next section for larger files.
Uploading Files Larger Than 2 GB
You cannot upload a file larger than 200 GB. See Splitting Files.
The curl commands in this section show required options and arguments. You
might want to add options such as --verbose and
--progress-bar to get more information about your upload. The
--progress-meter (more information than --progress-bar)
should be on by default, but it is disabled when curl is writing other
information to stdout. Note that some options might not be available or
might behave differently on some operating systems or some versions of
curl.
The following are the most common messages from uploading bundles to Oracle Support if you use the
--verbose option with the curl command:
- UPLOAD SUCCESSFUL. The bundle is successfully uploaded to Oracle Support.
- LOGIN FAILED. The user has an authentication issue.
- INVALID SR NUMBER. The user does not have attach privilege to this Service Request.
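A wrapper script can branch on these messages after capturing curl's verbose output. The sample output line below is fabricated so the sketch is self-contained:

```shell
# Stand-in for captured curl --verbose output (fabricated for illustration).
printf 'UPLOAD SUCCESSFUL\n' > curl_output.log

# Branch on the result messages listed above.
if grep -q 'UPLOAD SUCCESSFUL' curl_output.log; then
    echo "bundle attached to the service request"
elif grep -q 'LOGIN FAILED' curl_output.log; then
    echo "check My Oracle Support credentials" >&2
elif grep -q 'INVALID SR NUMBER' curl_output.log; then
    echo "check attach privileges for this SR" >&2
fi
```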
FTPS
Syntax:
Be sure to include the / character after the service request number.
$ curl -T path_and_filename -u MOS_user_ID ftps://transport.oracle.com/issue/SR_number/
Example:
$ curl -T /u02/files/bigfile.tar -u MOSuserID@example.com ftps://transport.oracle.com/issue/3-1234567890/
HTTPS
Syntax:
Be sure to include the / character after the service request number.
$ curl -T path_and_filename -u MOS_user_ID https://transport.oracle.com/upload/issue/SR_number/
Example:
$ curl -T D:\data\bigfile.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/
Renaming the file during send
$ curl -T D:\data\bigfile.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/NotSoBig.tar
Using a proxy
$ curl -k -T D:\data\bigfile.tar -x proxy.example.com:80 -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/
Splitting Files
You can split a large file into multiple parts and upload the parts. Oracle Transport will concatenate the segments when you complete uploading all the parts.
Only HTTPS protocol can be used. Only the UNIX split utility can be used. The Microsoft Windows split utility produces an incompatible format.
To reduce upload times, compress the original file prior to splitting.
- Split the file.

  The following command splits the file file1.tar into 2 GB parts named file1.tar.partaa and file1.tar.partab.

  Important: Specify the .part extension exactly as shown below.

  $ split -b 2048m file1.tar file1.tar.part

- Upload the resulting file1.tar.partaa and file1.tar.partab files.

  Important: Do not rename these output part files.

  $ curl -T file1.tar.partaa -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/
  $ curl -T file1.tar.partab -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/

- Send the command to put the parts back together.

  The split files will not be attached to the service request. Only the final concatenated file will be attached to the service request.

  $ curl -X PUT -H X-multipart-total-size:original_size -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/file1.tar?multiPartComplete=true

  In the preceding command, original_size is the size of the original unsplit file as shown by a file listing.

- Verify the size of the newly attached file.

  Note: This verification command must be executed immediately after the concatenation command in Step 3. Otherwise, the file will have begun processing and will no longer be available for this command.

  $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/file1.tar
  X-existing-file-size: original_size
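Before uploading, you can verify locally that the parts reassemble into the original file, since Oracle Transport performs the same concatenation server side. This sketch uses a tiny stand-in payload and a small part size so it runs quickly:

```shell
# Create a stand-in archive, split it, and confirm the parts reassemble.
dd if=/dev/zero of=file1.tar bs=1024 count=5 2>/dev/null  # 5 KB test payload
split -b 2k file1.tar file1.tar.part                      # parts: partaa, partab, partac
cat file1.tar.part* > reassembled.tar
cmp file1.tar reassembled.tar && echo "parts match original"
```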
Resuming an Interrupted HTTPS Upload
You can resume a file upload that terminated abnormally. Resuming can only be done by using HTTPS; resuming does not work with FTPS. When an upload is interrupted, start by retrieving the size of the portion that was already uploaded.

- Determine how much of the file has already been uploaded.

  $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar
  HTTP/1.1 204 No Content
  Date: Tue, 15 Nov 2022 22:53:54 GMT
  Content-Type: text/plain
  X-existing-file-size: already_uploaded_size
  X-Powered-By: Servlet/3.0 JSP/2.2

- Resume the file upload.

  Note the file size returned in "X-existing-file-size" in Step 1. Use that file size after the -C switch and in the -H "X-resume-offset:" switch.

  $ curl -C already_uploaded_size -H "X-resume-offset: already_uploaded_size" -T myinfo.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar

- Verify the final file size.

  $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar
  X-existing-file-size: original_size

  In the preceding command, original_size is the size of the original file as shown by a file listing.
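The resume mechanics can be simulated locally: curl -C transfers only the bytes past the server's existing size, which is equivalent to the following append. File names and sizes here are illustrative:

```shell
# Simulate resuming: the "server" already holds the first 4 bytes.
printf 'ABCDEFGHIJ' > original.bin
head -c 4 original.bin > uploaded.bin
offset=$(wc -c < uploaded.bin)            # the X-existing-file-size value
# Append only the missing tail, as curl -C $offset would during upload.
tail -c +"$((offset + 1))" original.bin >> uploaded.bin
cmp original.bin uploaded.bin && echo "resume complete"
```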