Using Support Bundles
Support bundles are files of diagnostic data collected from the Private Cloud Appliance that are used to evaluate and fix problems.
Support bundles can be uploaded to Oracle Support automatically or manually. Support bundles are uploaded securely and contain the minimum required data: system identity (not IP addresses), problem symptoms, and diagnostic information such as logs and status.
You can also create a support bundle without uploading it, for your own use. Creating a support bundle is a convenient way to collect related diagnostic data.
Support bundles are created and uploaded in the following ways:
- Oracle Auto Service Request (ASR). ASR automatically creates a service request and support bundle when certain hardware faults occur. The service request and support bundle are automatically sent to Oracle Support, and the Private Cloud Appliance administrator is notified. See Using Oracle Auto Service Request.
- asrInitiateBundle. The asrInitiateBundle command is a PCA-ADMIN command that creates a support bundle, attaches the support bundle to an existing service request, and uploads the support bundle to Oracle Support. See Using the asrInitiateBundle Command.
- support-bundles. The support-bundles command is a management node command that creates a support bundle of a specified type. Oracle Support might ask you to run this command to collect more data related to a service request, or you might want to collect this data for your own use. See Using the support-bundles Command.
- Manual upload to Oracle Support. Several methods are available for uploading support bundles or other data to Oracle Support. See Uploading Support Bundles to Oracle Support.
Using the asrInitiateBundle Command
The asrInitiateBundle command takes three parameters, all required:
PCA-ADMIN> asrInitiateBundle mode=triage sr=SR_number bundleType=auto
A triage support bundle is collected and automatically attached to service request SR_number. For more information about the triage support bundle, see Triage Mode.
If the ASR service is enabled, bundleType=auto uploads the bundle to Oracle Support using the Phone Home service. For information about the Phone Home service, see Registering Private Cloud Appliance for Oracle Auto Service Request. The bundle is saved on the management node for two days after successful upload. See Using the support-bundles Command.
If you specify mode=native and do not specify any value for nativeType, then a ZFS_BUNDLE is uploaded.
Using the support-bundles Command
The support-bundles command collects various types, or modes, of diagnostic data such as health check status, command outputs, and logs. Depending on the options provided, these files might contain logs or status. All modes collect files into a bundle directory.
No more than one support bundle process is allowed at a time. A support bundle lock file is created at the beginning of bundle collection and removed when bundle collection is complete.
All support-bundles commands return immediately, and the bundle collection runs in the background, because bundle collections might take a long time, perhaps hours.
Bundles are stored for two days, then automatically deleted.
The following types of bundles are supported:
- Triage Mode. Collects data about the current status of the Private Cloud Appliance.
- Time Slice Mode. Collects data by time slots. These results can be further narrowed by specifying pod name, job, and k8s_app label.
- Combo Mode. Collects a combination of triage and time slice data.
- Native Mode. Collects data from management, compute, and ZFS nodes and from ILOM and Cisco hosts.
A good way to start investigating an issue is to collect a combo bundle. Look for NOT_HEALTHY in the triage mode results and compare that to what you see in the time_slice mode results.
The support-bundles command requires a mode option. All modes accept the service request number option, as shown in the following table. Time slice and native modes have additional options.
| Option | Description | Required |
|---|---|---|
| -m mode | The type of bundle. | yes |
| -sr SR_number | The service request number. | no |
The support-bundles command output is stored in the following directory on the management node, where bundle-type is the mode: triage, time_slice, combo, or native:
/nfs/shared_storage/support_bundles/SR_number_bundle-type-bundle_timestamp/
The SR_number is used if you provided the -sr option. If you are creating the support bundle for a service request, specify the SR_number.
This directory contains a bundle collection progress file and an archive file. The bundle collection progress file has the following name:
bundle-type_collection.log
The output archive file has the following name:
SR_number_bundle-type-bundle_timestamp.tar.gz
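The directory, progress file, and archive names above follow one naming convention, which can be illustrated with a short sketch. The timestamp format and example SR number are assumptions for illustration, not taken from the actual implementation.

```python
from datetime import datetime, timezone

def bundle_paths(sr_number, bundle_type, when):
    """Build bundle directory, archive, and progress-log names following
    the documented convention (timestamp format is an assumption)."""
    stamp = when.strftime("%Y%m%d%H%M%S")  # assumed timestamp layout
    base = f"{sr_number}_{bundle_type}-bundle_{stamp}"
    directory = f"/nfs/shared_storage/support_bundles/{base}/"
    archive = f"{base}.tar.gz"
    progress_log = f"{bundle_type}_collection.log"
    return directory, archive, progress_log

d, a, p = bundle_paths("3-1234567890", "triage",
                       datetime(2022, 1, 11, tzinfo=timezone.utc))
print(d)  # /nfs/shared_storage/support_bundles/3-1234567890_triage-bundle_20220111000000/
print(p)  # triage_collection.log
```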
The archive file contains a header.json file with the following default components:
- current-time: the timestamp
- create-support-bundle: the command line that was used
- sr-number: the SR number associated with the archive file
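A header.json file with these components might look like the following sketch. All values here are hypothetical; the real file is generated by the support-bundles command.

```python
import json

# Hypothetical header.json contents; key names come from the list above,
# values are invented for illustration.
header = {
    "current-time": "2022-01-11T00:00:00.000Z",
    "create-support-bundle": "support-bundles -m triage -sr 3-1234567890",
    "sr-number": "3-1234567890",
}
print(json.dumps(header, indent=2))
```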
Log in to the Management Node
To use the support-bundles command, log in as root to the management node that is running Pacemaker resources. Collect data first from the management node that is running Pacemaker resources, then from other management nodes as needed.
If you do not know which management node is running Pacemaker resources, log in to any management node and check the Pacemaker cluster status. The following command output shows that the Pacemaker cluster resources are running on pcamn01.
[root@pcamn01 ~]# pcs status
Cluster name: mncluster
Stack: corosync
Current DC: pcamn01 ...
Full list of resources:
 scsi_fencing (stonith:fence_scsi): Stopped (disabled)
 Resource Group: mgmt-rg
     vip-mgmt-int (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-host (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-ilom (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-lb (ocf::heartbeat:IPaddr2): Started pcamn01
     vip-mgmt-ext (ocf::heartbeat:IPaddr2): Started pcamn01
     l1api (systemd:l1api): Started pcamn01
     haproxy (ocf::heartbeat:haproxy): Started pcamn01
     pca-node-state (systemd:pca_node_state): Started pcamn01
     dhcp (ocf::heartbeat:dhcpd): Started pcamn01
     hw-monitor (systemd:hw_monitor): Started pcamn01
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
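If you are scripting this check, the active node can be read out of the pcs status text. The following is a minimal sketch using a shortened, hypothetical excerpt of the output; the real output contains more resources.

```python
import re

# Shortened, hypothetical excerpt of `pcs status` output.
PCS_STATUS = """\
Resource Group: mgmt-rg
 vip-mgmt-int (ocf::heartbeat:IPaddr2): Started pcamn01
 haproxy (ocf::heartbeat:haproxy): Started pcamn01
"""

def active_node(pcs_output):
    """Return the node name following the first 'Started' marker."""
    m = re.search(r"Started\s+(\S+)", pcs_output)
    return m.group(1) if m else None

print(active_node(PCS_STATUS))  # pcamn01
```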
Triage Mode
In triage mode, Prometheus platform_health_check is queried for both HEALTHY and NOT_HEALTHY status. If NOT_HEALTHY is found, use time_slice mode to get more detail.
[root@pcamn01 ~]# support-bundles -m triage
The following files are in the output archive file.
| File | Description |
|---|---|
| header.json | Time stamp and command line to generate this bundle. |
| | Pods running in the compute node. |
| | Hardware component list. |
| | Pods running in the management node. |
| | Rack installation time and build version. |
| | Chunk files in JSON. |
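When reviewing triage results, a quick scan for NOT_HEALTHY entries tells you which jobs to follow up on with time_slice mode. The record layout below is hypothetical, for illustration only; the actual files in the bundle may be structured differently.

```python
# Hypothetical health-check records as they might appear in a triage bundle.
records = [
    {"job": "flannel-checker", "status": "HEALTHY"},
    {"job": "network-checker", "status": "NOT_HEALTHY"},
    {"job": "etcd-checker", "status": "HEALTHY"},
]

# Collect the jobs worth investigating further with time_slice mode.
unhealthy = [r["job"] for r in records if r["status"] == "NOT_HEALTHY"]
print(unhealthy)  # ['network-checker']
```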
Time Slice Mode
In time slice mode, data is collected by specifying start and end timestamps. Both of the following options are required:
- -s start_date
- -e end_date
Time slice mode has the following options in addition to the mode and service request number options. These options help narrow the data collection. If you do not specify either the -j or --all option, then data is collected from all health checker jobs.
- Only one of --job_name, --all, and --k8s_app can be specified.
- If none of --job_name, --all, or --k8s_app is specified, pod filtering uses the default job pattern (.+checker).
- The --all option can collect a huge amount of data. You might want to limit your time slice to 48 hours.
Example:
[root@pcamn01 ~]# support-bundles -m time_slice -j flannel-checker -s 2021-05-29T22:40:00.000Z \
  -e 2021-06-29T22:40:00.000Z -l INFO
See more examples below.
| Option | Description | Required |
|---|---|---|
| -s start_date | Start date, in the format shown in the examples (for example, 2022-01-11T00:00:00). | yes |
| -e end_date | End date, in the same format as the start date. | yes |
| -j job_name | Loki job name. See Label List Query below. | no |
| --k8s_app label | The k8s_app label value. See Label List Query below. | no |
| --all | Queries all job names except for jobs known for too much logging, such as audit, kubernetes-audit, and vault-audit, and the k8s_app label pcacoredns. | no |
| -l level | Message level, such as INFO. | no |
| | The pod name. | no |
| | Timeout in seconds for a single Loki query. The default is 180 seconds. | no |
Label List Query
Use the label list query to list the available job names and k8s_app label values.
[root@pcamn01 ~]# support-bundles -m label_list
2021-10-14T23:19:18.265 - support_bundles - INFO - Starting Support Bundles
2021-10-14T23:19:18.317 - support_bundles - INFO - Locating filter-logs Pod
2021-10-14T23:19:18.344 - support_bundles - INFO - Executing command - ['python3', '/usr/lib/python3.6/site-packages/filter_logs/label_list.py']
2021-10-14T23:19:18.666 - support_bundles - INFO -
Label: job
Values: ['admin', 'api-server', 'asr-client', 'asrclient-checker', 'audit', 'cert-checker', 'ceui', 'compute', 'corosync', 'etcd', 'etcd-checker', 'filesystem', 'filter-logs', 'flannel-checker', 'his', 'hms', 'iam', 'k8s-stdout-logs', 'kubelet', 'kubernetes-audit', 'kubernetes-checker', 'l0-cluster-services-checker', 'messages', 'mysql-cluster-checker', 'network-checker', 'ovm-agent', 'ovn-controller', 'ovs-vswitchd', 'ovsdb-server', 'pca-healthchecker', 'pca-nwctl', 'pca-platform-l0', 'pca-platform-l1api', 'pca-upgrader', 'pcsd', 'registry-checker', 'sauron-checker', 'secure', 'storagectl', 'uws', 'vault', 'vault-audit', 'vault-checker', 'zfssa-checker', 'zfssa-log-exporter']
Label: k8s_app
Values: ['admin', 'api', 'asr-client', 'asrclient-checker', 'brs', 'cert-checker', 'compute', 'default-http-backend', 'dr-admin', 'etcd', 'etcd-checker', 'filesystem', 'filter-logs', 'flannel-checker', 'fluentd', 'ha-cluster-exporter', 'has', 'his', 'hms', 'iam', 'ilom', 'kube-apiserver', 'kube-controller-manager', 'kube-proxy', 'kubernetes-checker', 'l0-cluster-services-checker', 'loki', 'loki-bnr', 'mysql-cluster-checker', 'mysqld-exporter', 'network-checker', 'pcacoredns', 'pcadnsmgr', 'pcanetwork', 'pcaswitchmgr', 'prometheus', 'rabbitmq', 'registry-checker', 'sauron-api', 'sauron-checker', 'sauron-grafana', 'sauron-ingress-controller', 'sauron-mandos', 'sauron-operator', 'sauron-prometheus', 'sauron-prometheus-gw', 'sauron-sauron-exporter', 'sauron.oracledx.com', 'storagectl', 'switch-metric', 'uws', 'vault-checker', 'vmconsole', 'zfssa-analytics-exporter', 'zfssa-csi-nodeplugin', 'zfssa-csi-provisioner', 'zfssa-log-exporter']
Examples:
No job label, no k8s_app label: collect logs from all health checkers.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
One job, ceui.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -j ceui -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
One k8s_app, network-checker.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx --k8s_app network-checker -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
All health checker jobs, with dates generated by the date command.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx -s `date -d "2 days ago" -u +"%Y-%m-%dT%H:%M:%S.000Z"` -e `date -u +"%Y-%m-%dT%H:%M:%S.000Z"`
All jobs.
[root@pcamn01 ~]# support-bundles -m time_slice -sr 3-xxxxxxxxxxx --all -s "2022-01-11T00:00:00" -e "2022-01-12T23:59:59"
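The -s and -e values in the examples above can also be generated programmatically. The following sketch builds a 48-hour window (the limit suggested for --all collections) in the timestamp format used in the examples; the helper function name is an illustration, not part of the product.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%S.000Z"  # timestamp format used in the examples

def time_slice_window(hours, end=None):
    """Return (-s, -e) argument values for a window of the given length."""
    end = end or datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return start.strftime(FMT), end.strftime(FMT)

s, e = time_slice_window(48, end=datetime(2022, 1, 13, tzinfo=timezone.utc))
print(s, e)  # 2022-01-11T00:00:00.000Z 2022-01-13T00:00:00.000Z
```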
The following files are in the output archive file.
| File | Description |
|---|---|
| header.json | Time stamp and command line to generate this bundle. |
| | Chunk files in JSON. Time slice bundles have a limit of 500,000 logs per query, from start time. |
| | Rack installation time and build version. |
Combo Mode
The combo mode is a combination of a triage bundle and a time slice bundle. The output includes an archive file and two collection log files: triage_collection.log and time_slice_collection.log.
The following files are in the output archive file.
| File | Description |
|---|---|
| | The triage bundle archive file. |
| | The time slice bundle archive file. |
Native Mode
The native_collection.log file in the bundle directory provides collection progress information. Native bundles can take hours to collect.
The native mode has the following parameters in addition to the mode and SR number.
| Parameter | Description | Required |
|---|---|---|
| -t nativeType | The type of native bundle. Default value: zfs_bundle. | no |
| -c component | Component name, such as the name of a management, compute, or ZFS node, or an ILOM or Cisco host. | no |
The following files are in the output archive file.
| File | Description |
|---|---|
| header.json | Time stamp and command line to generate this bundle. |
| Native bundle files | These files are specific to the nativeType. |
| | Rack installation time and build version. |
ZFS Bundle
When nativeType is zfs_bundle, a ZFS support bundle collection starts on both ZFS nodes, and the new ZFS support bundles are downloaded into the bundle directory. When nativeType is not specified, a zfs_bundle is created by default.
[root@pcamn01 ~]# support-bundles -m native -t zfs_bundle
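The default-type behavior described above amounts to a one-line rule; the following trivial sketch summarizes it (an illustration, not the actual implementation):

```python
def resolve_native_type(native_type=None):
    """zfs_bundle is collected when no nativeType is given."""
    return native_type or "zfs_bundle"

print(resolve_native_type())             # zfs_bundle
print(resolve_native_type("sosreport"))  # sosreport
```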
SOS Report Bundle
When nativeType is sosreport, an SOS report is collected from the management node or compute node specified by the --component parameter. If --component is not specified, the report is collected from all management and compute nodes.
[root@pcamn01 ~]# support-bundles -m native -t sosreport -c pcamn01
ILOM Snapshot
When nativeType=ilom_snapshot, the value of the --component parameter is the ILOM host name of a management node or compute node. If the --component parameter is not specified, the report is collected from all ILOM hosts.
[root@pcamn01 ~]# support-bundles -m native -t ilom_snapshot -c ilom-pcacn007
Cisco Bundle
When nativeType is cisco-bundle, the value of the --component parameter is an internal Cisco management, aggregation, or access switch management host name.
[root@pcamn01 ~]# support-bundles -m native -t cisco-bundle -c accsn01
To create a cisco-bundle type of collection, the following conditions must be met:
- The Cisco OBFL (Onboard Failure Logging) module must be enabled on all Private Cloud Appliance Cisco switches. The Cisco OBFL module is enabled by default on all Private Cloud Appliance Cisco switches.
- The Cisco EEM (Embedded Event Manager) module must be enabled on all Private Cloud Appliance Cisco switches. The Cisco EEM module is enabled by default on all Private Cloud Appliance Cisco switches.
Uploading Support Bundles to Oracle Support
After you create a support bundle using the support-bundles command as described in Using the support-bundles Command, you can use the methods described in this topic to upload the support bundle to Oracle Support.
To use these methods, you must satisfy the following requirements:
- You must have a My Oracle Support user ID with Create and Update SR permissions granted by the appropriate Customer User Administrator (CUA) for each Support Identifier (SI) being used to upload files.
- For file uploads to existing service requests, the Support Identifier associated with the service request must be in your profile.
- To upload files larger than 2 GB, sending machines must have network access to connect to the My Oracle Support servers at transport.oracle.com using FTPS and HTTPS.
The Oracle FTPS service is a "passive" implementation. With an implicit configuration, the initial connection is from the client to the service on a control port of 990, and the connection is then switched to a high port to exchange data. Oracle defines a possible data port range of 32000-42000; depending on your network configuration, you might need to enable outbound connections on both port 990 and ports 32000-42000. TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 is the only encryption method enabled.
The Oracle HTTPS diagnostic upload service uses the standard HTTPS port of 443 and does not require any additional ports to be opened.
When using command line protocols, do not include your password in the command. Enter your password only when prompted.
- Oracle requires the use of TLS 1.2+ for all file transfers.
- Do not upload encrypted or password-protected files, standalone or within an archive. A service request update will note this as a corrupted file or reject the upload because disallowed file types were found. Files are encrypted in transit when you use FTPS or HTTPS; additional protections are not required.
- Do not upload files with the file type extensions exe, bat, asp, or com, either standalone or within an archive. A service request update will note that a disallowed file type was found.
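A pre-upload check for the disallowed extensions can be sketched as follows. The extension list comes from the rules above; the helper function is an illustration, not an Oracle tool.

```python
# Extensions that Oracle Support rejects, per the rules above.
DISALLOWED = {".exe", ".bat", ".asp", ".com"}

def is_allowed(filename):
    """Return False for files Oracle Support will reject by extension."""
    lowered = filename.lower()
    return not any(lowered.endswith(ext) for ext in DISALLOWED)

print(is_allowed("bundle.tar.gz"))  # True
print(is_allowed("tool.EXE"))       # False
```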
Uploading Files 2 GB or Smaller
Use the SR file upload utility on the My Oracle Support Portal.
1. Log in to My Oracle Support with your My Oracle Support user name and password.
2. Do one of the following:
   - Create a new service request and, in the next step, select the Upload button.
   - Select and open an existing service request.
3. Click the Add Attachment button located at the top of the page.
4. Click the Choose File button.
5. Navigate to and select the file to upload.
6. Click the Attach File button.
You can also use the methods described in the next section for larger files.
Uploading Files Larger Than 2 GB
You cannot upload a file larger than 200 GB. See Splitting Files.
The curl commands in this section show required options and arguments. You might want to add options such as --verbose and --progress-bar to get more information about your upload. The --progress-meter option (which gives more information than --progress-bar) should be on by default, but it is disabled when curl is writing other information to stdout. Note that some options might not be available or might behave differently on some operating systems or some versions of curl.
The following are the most common messages from uploading bundles to Oracle Support if you use the --verbose option with the curl command:
- UPLOAD SUCCESSFUL. The bundle was successfully uploaded to Oracle Support.
- LOGIN FAILED. The user has an authentication issue.
- INVALID SR NUMBER. The user does not have attach privilege to this service request.
FTPS
Syntax (be sure to include the / character after the service request number):
$ curl -T path_and_filename -u MOS_user_ID ftps://transport.oracle.com/issue/SR_number/
Example:
$ curl -T /u02/files/bigfile.tar -u MOSuserID@example.com ftps://transport.oracle.com/issue/3-1234567890/
HTTPS
Syntax (be sure to include the / character after the service request number):
$ curl -T path_and_filename -u MOS_user_ID https://transport.oracle.com/upload/issue/SR_number/
Example:
$ curl -T D:\data\bigfile.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/
Renaming the file during send
$ curl -T D:\data\bigfile.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/NotSoBig.tar
Using a proxy
$ curl -k -T D:\data\bigfile.tar -x proxy.example.com:80 -u MOSuserID@example.com https://transport.oracle.com/upload/issue/3-1234567890/
Splitting Files
You can split a large file into multiple parts and upload the parts. Oracle Transport will concatenate the parts when you have finished uploading all of them.
Only the HTTPS protocol can be used. Only the UNIX split utility can be used; the Microsoft Windows split utility produces an incompatible format.
To reduce upload times, compress the original file prior to splitting.
1. Split the file.
   The following command splits the file file1.tar into 2 GB parts named file1.tar.partaa and file1.tar.partab.
   Important: Specify the .part extension exactly as shown below.
   $ split -b 2048m file1.tar file1.tar.part
2. Upload the resulting file1.tar.partaa and file1.tar.partab files.
   Important: Do not rename these output part files.
   $ curl -T file1.tar.partaa -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/
   $ curl -T file1.tar.partab -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/
3. Send the command to put the parts back together.
   The split files will not be attached to the service request. Only the final concatenated file will be attached to the service request.
   $ curl -X PUT -H X-multipart-total-size:original_size -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/file1.tar?multiPartComplete=true
   In the preceding command, original_size is the size of the original unsplit file as shown by a file listing.
Verify the size of the newly-attached file.
Note:
This verification command must be executed immediately after the concatenation command in Step 3. Otherwise, the file will have begun processing and will no longer be available for this command.
$ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/file1.tar X-existing-file-size: original_size
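The part count and the size value needed for the X-multipart-total-size header follow directly from the split size. The following sketch does the arithmetic for a hypothetical 5 GB file.

```python
import math

PART_SIZE = 2048 * 1024 * 1024  # 2048m, as in the split command above

def split_plan(original_size):
    """Return the part count and the reassembly header value."""
    parts = math.ceil(original_size / PART_SIZE)
    header = f"X-multipart-total-size:{original_size}"
    return parts, header

parts, header = split_plan(5 * 1024**3)  # hypothetical 5 GB file
print(parts)   # 3
print(header)  # X-multipart-total-size:5368709120
```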
Resuming an Interrupted HTTPS Upload
You can resume a file upload that terminated abnormally. Resuming can only be done with HTTPS; it does not work with FTPS. When an upload is interrupted, start by retrieving the size of the file portion that was already uploaded.
1. Determine how much of the file has already been uploaded.
   $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar
   HTTP/1.1 204 No Content
   Date: Tue, 15 Nov 2022 22:53:54 GMT
   Content-Type: text/plain
   X-existing-file-size: already_uploaded_size
   X-Powered-By: Servlet/3.0 JSP/2.2
2. Resume the file upload.
   Note the file size returned in "X-existing-file-size" in Step 1. Use that file size after the -C switch and in the -H "X-resume-offset:" switch.
   $ curl -C already_uploaded_size -H "X-resume-offset: already_uploaded_size" -T myinfo.tar -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar
3. Verify the final file size.
   $ curl -I -u MOSuserID@example.com https://transport.oracle.com/upload/issue/SR_number/myinfo.tar
   X-existing-file-size: original_size
   In the preceding command, original_size is the size of the original file as shown by a file listing.
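If you script the resume procedure, the offset is read from the X-existing-file-size header of the curl -I output. The following sketch parses a hypothetical response; the header name is the one shown in Step 1.

```python
# Hypothetical response headers from `curl -I`, as shown in Step 1.
RESPONSE = """\
HTTP/1.1 204 No Content
Content-Type: text/plain
X-existing-file-size: 1048576
"""

def existing_file_size(headers_text):
    """Return the already-uploaded byte count, or 0 if the header is absent."""
    for line in headers_text.splitlines():
        if line.lower().startswith("x-existing-file-size:"):
            return int(line.split(":", 1)[1].strip())
    return 0

offset = existing_file_size(RESPONSE)
print(offset)  # 1048576; pass this to -C and X-resume-offset
```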