9.3.3.3 tfactl diagcollect

Use the tfactl diagcollect command to perform on-demand diagnostic collection.

AHF 23.8

Starting in AHF 23.8, you can upload AHF Insights reports to a pre-authenticated request (PAR) URL. Uploading AHF Insights reports helps Oracle Cloud Operations identify, investigate, track, and resolve system health issues and divergences from best practice configurations quickly and effectively.

Oracle Exadata Database Service on Dedicated Infrastructure (ExaDB-D) and Oracle Base Database Service

To upload an AHF Insights report to a PAR location, run:
tfactl diagcollect -insight -last 1h -par <par_url>
tfactl insight -last 1h -par <par_url>
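
As a minimal sketch, a wrapper can refuse to build the upload command when no PAR URL is supplied. The variable PAR_URL and the function build_par_upload_cmd are hypothetical names chosen for illustration; only the tfactl command line comes from the text above. The sketch prints the command for review and does not run it.

```shell
#!/bin/sh
# Sketch: build the Insights upload command only when a PAR URL is supplied.
# PAR_URL and build_par_upload_cmd are hypothetical names, not part of AHF.

build_par_upload_cmd() {
    par_url=$1
    period=${2:-1h}            # default to the last hour, as in the example above
    if [ -z "$par_url" ]; then
        echo "error: no PAR URL supplied" >&2
        return 1
    fi
    # Single-quote the URL so that the shell does not interpret any characters in it.
    printf "tfactl diagcollect -insight -last %s -par '%s'\n" "$period" "$par_url"
}

# Print the command for review when PAR_URL is set in the environment.
if [ -n "${PAR_URL:-}" ]; then
    build_par_upload_cmd "$PAR_URL" 1h
fi
```

Printing the command first keeps the PAR URL visible for checking before anything is uploaded.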

Oracle Trace File Analyzer Collector can perform three types of on-demand collections:

  • Default collections
  • Event-driven Support Service Request Data Collection (SRDC) collections
  • Custom collections

Syntax

tfactl diagcollect [ [-insight | -noinsight] [component_name1] [component_name2] ... [component_nameN] | [-srdc <srdc_profile>] | [-defips]]  
[-sr <SR#>]
[-node <all|local|n1,n2,..>] 
[-cellnode <all|cel1,cel2,..> [-nomaxcells] | -nocell] 
[-sundiagcompute] 
[-sundiagcell] 
[-tag <tagname>] 
[-z <filename>] 
[-last <n><m|h|d> | -from <time> -to <time> | -for <time>] 
[-nocopy] 
[-notrim] 
[-dryrun] 
[-nodbforegroundfiles] 
[-silent] 
[-cores]
[-collectalldirs] [-collectdir <dir1,dir2..>]
[-collectfiles <file1,..,fileN,dir1,..,dirN> [-onlycollectfiles]] 
[-par <par_url>] 
[-onlyinsights] 
[-request_from <requestor>] 
[-singlearchive] 
[-examples]
Components:
-ips|-database|-asm|-crsclient|-dbclient|-dbwlm|-tns|-rhp|-procinfo|-cvu|-afd|-crs|-cha|-wls|-emagenti|-emagent|-oms|-omsi|-ocm|-emplugins|-em|-acfs|-install|-cfgtools|-os|-ashhtml|-ashtext|-awrhtml|-awrtext|-sosreport|-ahf|-dataguard|-syslens|-hami|-avs|-goldengate|-asr|-sundiagcompute|-sundiagcell|-zdlra|-exaswitch

For detailed help on each component, use tfactl diagcollect [component_name1] [component_name2] ... [component_nameN] -help
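
The following sketch checks component switches against the list shown in the syntax above before printing the matching -help command. The function names are hypothetical, and the list is copied from this section, so it may not match the components available in a given AHF release.

```shell
#!/bin/sh
# Sketch: validate component switches against the documented list and print
# the corresponding -help command. Function names are hypothetical.

COMPONENTS="-ips -database -asm -crsclient -dbclient -dbwlm -tns -rhp -procinfo -cvu -afd -crs -cha -wls -emagenti -emagent -oms -omsi -ocm -emplugins -em -acfs -install -cfgtools -os -ashhtml -ashtext -awrhtml -awrtext -sosreport -ahf -dataguard -syslens -hami -avs -goldengate -asr -sundiagcompute -sundiagcell -zdlra -exaswitch"

is_component() {
    # Succeeds when $1 appears as a whole word in COMPONENTS.
    case " $COMPONENTS " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

component_help_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "error: specify at least one component" >&2
        return 1
    fi
    for c in "$@"; do
        if ! is_component "$c"; then
            echo "error: unknown component: $c" >&2
            return 1
        fi
    done
    printf 'tfactl diagcollect %s -help\n' "$*"
}

component_help_cmd -database -asm
```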

Parameters

Prefix each option with a minus sign (-).

Option Description

[ [-insight | -noinsight] [component_name1] [component_name2] ... [component_nameN] | [-srdc srdc_profile] | [-defips]]

Specify the list of components for which you want to obtain collections, or specify the SRDC name, or specify to include Incident Packaging Service (IPS) Packages for Oracle Automatic Storage Management (Oracle ASM), Oracle Clusterware, and Oracle Databases in the default collection.

-insight: Specify to include the AHF Insights Report in the diagnostic collection.

-noinsight: Specify not to include the AHF Insights Report in the diagnostic collection.

[-defips]: Specify to include the IPS packages for ASM, CRS, and databases in the default collection.

[-sr SR#]

Specify the Service Request number to which Oracle Trace File Analyzer automatically uploads all collections.

-node all|local|n1,n2,...

Collects diagnostics from the nodes specified.

Specify a comma-delimited list of nodes. If you do not specify any nodes, then the command collects diagnostics from all the nodes by default.

For example: tfactl diagcollect -node node1

[-cellnode <all|cel1,cel2,..> [-nomaxcells] | -nocell]

Collects diagnostics from the cells specified.

Specify a comma-delimited list of cells. If you do not specify any cells, then the command collects diagnostics from all the cells by default.

For example: tfactl diagcollect -cellnode all

-nomaxcells: Specify to override the limit of cells for collection.

-nocell: Specify not to include collections on storage cells.

-sundiagcompute

Specify to run sundiag on compute nodes.

-sundiagcell

Specify to run sundiag on storage cells.

-tag description

Use this parameter to create a subdirectory for the resulting collection in the Oracle Trace File Analyzer repository.

-z file_name

Use this parameter to specify an output file name.

[-last nh|d | -from time -to time | -for time]

  • Specify the -last parameter to collect files that have relevant data for the past specific number of hours (h) or days (d). By default, using the command with this parameter also trims files that are large and shows files only from the specified interval.

    You can also use -since, which has the same functionality as -last. This option is included for backward compatibility.

  • Specify the -from and -to parameters (you must use these two parameters together) to collect files that have relevant data during a specific time interval, and trim data before this time where files are large.

    Supported time formats:

    "Mon/dd/yyyy hh:mm:ss"

    "yyyy-mm-dd hh:mm:ss"

    "yyyy-mm-ddThh:mm:ss"

    "yyyy-mm-dd"

  • Specify the -for parameter to collect files that have relevant data for the time given. tfactl collects the files whose first and last timestamps span the time that you specify after -for. No data trimming is done for this option.

    Supported time formats:

    "Mon/dd/yyyy"

    "yyyy-mm-dd"

Note:

If you specify both date and time, then you must enclose both the values in double quotation marks (""). If you specify only the date or the time, then you do not have to enclose the single value in quotation marks.
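
The quoting rule in the note can be sketched as follows. The helper names are hypothetical, and the check covers only the shape of the "yyyy-mm-dd hh:mm:ss" format listed above; it does not confirm that the date is a real calendar date. The sketch prints the command and does not run it.

```shell
#!/bin/sh
# Sketch: compose a -from/-to command from two timestamps in the
# "yyyy-mm-dd hh:mm:ss" format. Function names are hypothetical.

is_timestamp() {
    # Shape check only, for example 2021-07-21 21:52:51.
    printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$'
}

build_range_cmd() {
    from=$1
    to=$2
    if ! is_timestamp "$from" || ! is_timestamp "$to"; then
        echo 'error: expected "yyyy-mm-dd hh:mm:ss"' >&2
        return 1
    fi
    # A value with both date and time is enclosed in double quotation marks.
    printf 'tfactl diagcollect -from "%s" -to "%s"\n' "$from" "$to"
}

build_range_cmd "2021-07-21 21:52:51" "2021-07-21 22:52:51"
```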

-nocopy

Specify this parameter to stop the resultant trace file collection from being copied back to the initiating node. The file remains in the Oracle Trace File Analyzer repository on the executing node.

-notrim

Specify this parameter to stop trimming the files collected.

-dryrun

Creates a text file that contains a list of all the files that would have been collected and which scripts would be run for the specific diagcollect command without actually doing the collection.
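
One possible workflow, shown here only as a sketch, is to preview a collection with -dryrun and then repeat the same options without it. The helper preview_then_collect is a hypothetical name; it prints both commands and does not run them.

```shell
#!/bin/sh
# Sketch: print a -dryrun preview command followed by the real command.
# preview_then_collect is a hypothetical helper that only prints commands.

preview_then_collect() {
    # "$*" holds the collection options, for example: -last 1h -node local
    printf 'tfactl diagcollect %s -dryrun\n' "$*"
    printf 'tfactl diagcollect %s\n' "$*"
}

preview_then_collect -last 1h -node local
```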

-silent

Specify this parameter to run diagnostic collection as a background process.

-cores

Specify this parameter to collect core files when they would not normally be collected.

-collectalldirs

Specify this parameter to collect all files from a directory that has the Collect All flag set to true.

-collectdir dir1,dir2,...dirn

Specify a comma-delimited list of directories. The collection includes all files from these directories, irrespective of type and time constraints, in addition to the components specified.

[-collectfiles file1,..,fileN,dir1,..,dirN [-onlycollectfiles]]

Specify a comma-delimited list of files and directories. The collection includes these files and directories in addition to the components specified.

If -onlycollectfiles is also used, then no other components will be collected.

[-acrlevel system,database,userdata]

Use this parameter to specify the ACR level(s) for redaction.

ACR supports the following three levels:
  • system: For entity types such as hostname, IP, port, and user name.
  • database: For entity types such as dbname, tbsname, svcname, and sqlstmt.
  • userdata: For block dumps and redo log dumps.

-sanitize

Note:

Starting with Oracle Autonomous Health Framework 24.1, the Oracle Trace File Analyzer masking feature is deprecated, and can be desupported in a future release.

For more information, see Deprecated Oracle Trace File Analyzer Masking in Release 24.1.

Sanitize sensitive values in the collection using Adaptive Classification and Redaction (ACR).

This option will significantly increase the elapsed and actual processor time required to complete the collection.

-mask

Note:

Starting with Oracle Autonomous Health Framework 24.1, the Oracle Trace File Analyzer masking feature is deprecated, and can be desupported in a future release.

For more information, see Deprecated Oracle Trace File Analyzer Masking in Release 24.1.

Mask sensitive values in the collection using Adaptive Classification and Redaction (ACR).

This option will significantly increase the elapsed and actual processor time required to complete the collection.

-examples

Specify this parameter to view diagcollect usage examples.

-singlearchive

Specify this parameter to merge remote zip files into a single zip file on the initiating node.

-nodbforegroundfiles

Specify this parameter to filter out all foreground database logs, regardless of their size.

Example 9-97 tfactl diagcollect -onlycollectfiles -collectfiles

tfactl diagcollect -onlycollectfiles -collectfiles
/tmp/tfa/tracedir,/tmp/tfa/tracedir/trace1.log,/tmp/tfa/tracedir2/trace2_dir2.log
-node local -since 1h
Collecting data for local node(s).

TFA is using system timezone for collection, All times shown in UTC.

Collection Id : 20210721225241<hostname>

Detailed Logging at :
/opt/oracle.ahf/data/repository/collection_Wed_Jul_21_22_52_46_UTC_2021_node_local/diagcollect_20210721225241_<hostname>.log
2021/07/21 22:52:51 UTC : NOTE : Any file or directory name containing the
string .com will be renamed to replace .com with dotcom
2021/07/21 22:52:51 UTC : Collection Name : tfa_Wed_Jul_21_22_52_44_UTC_2021.zip
2021/07/21 22:52:51 UTC : Getting list of files satisfying time range
[07/21/2021 21:52:51 UTC, 07/21/2021 22:52:51 UTC]
2021/07/21 22:52:51 UTC : Collecting additional diagnostic information...
2021/07/21 22:52:53 UTC : Collecting ADR incident files...
2021/07/21 22:53:15 UTC : Completed collection of additional diagnostic
information...
2021/07/21 22:53:18 UTC : Completed Local Collection
2021/07/21 22:53:18 UTC : Redacting the collection...
2021/07/21 22:55:06 UTC : Redacted masked Host name :owlo000037-vm1
2021/07/21 22:55:06 UTC : Successfully redacted the collection
.-------------------------------------------.
|             Collection Summary            |
+----------------+-----------+-------+------+
| Host           | Status    | Size  | Time |
+----------------+-----------+-------+------+
| <hostname>     | Completed | 1.6MB |  27s |
'----------------+-----------+-------+------'

Logs are being collected to:
/opt/oracle.ahf/data/repository/collection_Wed_Jul_21_22_52_46_UTC_2021_node_local
/opt/oracle.ahf/data/repository/collection_Wed_Jul_21_22_52_46_UTC_2021_node_local/owlo000037-vm1.tfa_Wed_Jul_21_22_52_44_UTC_2021.zip

9.3.3.3.1 Smart Problem Classification to Help Oracle Support Resolve Service Requests Faster

AHF diagnostic collection now uses Smart Problem Classification to pinpoint the specific problem for which the diagnostic collection is being performed.

You are often required to collect generic collections for all components over a wide range of times. The logs collected as part of diagnostic collections often reveal evidence of multiple types of problems. Consequently, automated log analysis is limited in its effectiveness because of the significant amount of time required to process and analyze log files.

By intelligently presenting you with a list of detected events relevant to the type of collection being performed, Smart Problem Classification allows you to identify the problem by selecting one from the list.

In addition to recording the type of problem, AHF also records the time and location. This information is made available to Oracle Support to help them resolve Service Requests faster. To ensure that the correct targeted collection is made, you can drill down by problem category if the problem you are looking for is not displayed.

Smart Problem Classification is enabled by default when you run tfactl diagcollect. You can, however, disable it when necessary.

Note:

Currently, Smart Problem Classification is not enabled on systems running AIX and Microsoft Windows operating systems.

How to use Smart Problem Classification

  1. When you initiate a diagnostic collection, Oracle Trace File Analyzer queries the events pertinent to the type of collection that occurred during the specified time range. If you do not specify a time range, then by default Oracle Trace File Analyzer queries the events that occurred during the last four hours.
  2. Oracle Trace File Analyzer displays the list of events to choose from.
    1. Select one event from the list. Oracle Trace File Analyzer initiates a collection to collect data for the selected event.
    2. If you opt to enter a new time range, then Oracle Trace File Analyzer prompts you to enter the new time range and returns you to step 2.
    3. If you opt to display problem categories, then Oracle Trace File Analyzer displays a list of categories. Selecting a category displays its sub-categories. After you narrow the selection down to a specific problem, Oracle Trace File Analyzer prompts you to enter the following details:
      1. Time range
      2. Name of the database, if the problem maps to a database.

    After you enter these details, Oracle Trace File Analyzer triggers an SRDC collection for the details provided.

    At any point in time, you can exit from the classification process by selecting "X" from the menu.

Note:

The following collection options do not honor Smart Problem Classification:
  • Collection switches: [-ips, -syslens, -ahf, -awrhtml, -awrtext, -sosreport, -ashhtml, -ashtext]
  • Collection modules: [-srdc, -insight, -em, -emagenti, -emagent, -oms, -omsi]
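
As a rough sketch, a script can check an option list against the switches and modules in the note above to anticipate whether the classification menu is expected. This mirrors the documented list only and is not tfactl's own logic; the variable and function names are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether a diagcollect option list includes any switch that the
# note above says does not honor Smart Problem Classification.
# This mirrors the documented list only; it is not tfactl's own logic.

NO_CLASSIFY_OPTS="-ips -syslens -ahf -awrhtml -awrtext -sosreport -ashhtml -ashtext -srdc -insight -em -emagenti -emagent -oms -omsi"

skips_classification() {
    # Prints the first matching option and succeeds; fails when none match.
    for opt in "$@"; do
        case " $NO_CLASSIFY_OPTS " in
            *" $opt "*) printf '%s\n' "$opt"; return 0 ;;
        esac
    done
    return 1
}

if skips_classification -last 1h -awrhtml >/dev/null; then
    echo "classification menu not expected"
else
    echo "classification menu expected"
fi
```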

Example 9-98 Smart Problem Classification - Examples

To check if Smart Problem Classification is enabled or disabled:
# tfactl get smartprobclassifier
.---------------------------------.
|              node1              |
+-------------------------+-------+
| Configuration Parameter | Value |
+-------------------------+-------+
| smartprobclassifier     | ON    |
'-------------------------+-------'
To collect diagnostics when Smart Problem Classification is enabled:
# tfactl diagcollect -last 1h

AHF has detected following events from 2022-11-04 12:01:56.768 to 2022-11-04 13:01:56.768
All events are displayed in UTC time zone

Choose an event to perform a diagnostic collection:
1  . 2022-11-04 13:01:43.000 [RDBMS.orcl.orcl1] ORA-00600: internal error code, arguments: [kjb], [ch11], [ch24], [], ...
2  . Show problem categories
3  . Enter a different event time
X  . Exit
Please choose the option [1-3]:2

Problem Categories:
1  . ACFS
2  . ASM Configuration
3  . ASM Errors/Other
4  . ASM Instance Crash
5  . CRS Client
6  . CRS Errors/Other
7  . Clusterware Installation
8  . Clusterware Patching
9  . Clusterware Startup
10 . Clusterware Upgrade
11 . Database Corruption
12 . Database Errors/Other
13 . Database Install
14 . Database Instance Eviction/Crash
15 . Database Internal Error
16 . Database Memory
17 . Database Patching
18 . Database Performance
19 . Database RMAN
20 . Database Storage (ASM)
21 . Database Streams/AQ
22 . Database Upgrade
23 . Dataguard
24 . GoldenGate
25 . Node Eviction/Reboot
26 . Problem not listed, provide problem description
X  . Exit
Please select the category of your problem [1-26]:
...
...
To bypass Smart Problem Classification:
# tfactl diagcollect -last 1h -noclassify
Collecting data for all nodes
TFA is using system timezone for collection, All times shown in UTC.
Collection Id: 20221104125517stbm000004-vm15
The -silent flag also bypasses Smart Problem Classification:
# tfactl diagcollect -last 1h -silent
Smart Problem Classifier is ON. Since -silent is passed, Problem Classifier is not processing the request.
To disable Smart Problem Classification:
# tfactl set smartprobclassifier=off
Successfully set smartprobclassifier=OFF
.---------------------------------.
|               node1             |
+-------------------------+-------+
| Configuration Parameter | Value |
+-------------------------+-------+
| smartprobclassifier     | OFF   |
'-------------------------+-------'
To enable Smart Problem Classification:
# tfactl set smartprobclassifier=on
Successfully set smartprobclassifier=ON
.---------------------------------.
|               node1             |
+-------------------------+-------+
| Configuration Parameter | Value |
+-------------------------+-------+
| smartprobclassifier     | ON    |
'-------------------------+-------'
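
For scripted checks, the table printed by tfactl get smartprobclassifier can be parsed as sketched below. The helper parse_classifier_state is a hypothetical name, and the sketch assumes the output keeps the table layout shown in the examples above; it is fed a copy of that sample output rather than a live command.

```shell
#!/bin/sh
# Sketch: extract the smartprobclassifier value from the table layout shown in
# the examples above. parse_classifier_state is a hypothetical helper that
# reads the table on standard input.

parse_classifier_state() {
    # Split rows on "|" and trim spaces from the name and value columns.
    awk -F'|' '
        {
            name = $2; value = $3
            gsub(/^ +| +$/, "", name)
            gsub(/^ +| +$/, "", value)
            if (name == "smartprobclassifier") { print value; found = 1 }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Typical use on a host with AHF installed (not run here):
#   tfactl get smartprobclassifier | parse_classifier_state

parse_classifier_state <<'EOF'
.---------------------------------.
|              node1              |
+-------------------------+-------+
| Configuration Parameter | Value |
+-------------------------+-------+
| smartprobclassifier     | ON    |
'-------------------------+-------'
EOF
```

The function exits with a nonzero status when no smartprobclassifier row is found, so a calling script can distinguish a missing value from OFF.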