Link Visualization

Link lets you perform advanced analysis of log records by combining individual log records from across log sources into groups, based on the fields you’ve selected for linking. You can analyze the groups by using the same fields as the ones you used for linking or additional fields for observing unusual patterns to detect anomalies.

The link command can be used for a variety of use cases. For example, individual log records from business applications can be linked to synthesize business transactions. Groups can also be used to synthesize user sessions from web access logs. Once these linked records have been generated, they can be analyzed for anomalous behavior. Some examples of this anomalous behavior include:

  • Business Transactions that are taking unusually long to execute or are failing.

  • User sessions that are downloading larger amounts of data than normal.

Tip:

To use the Link feature, users need to have a good understanding of their log sources. The Link feature relies on a field or a set of fields that are used to combine individual log records. To generate meaningful associations of log records, it is important to know the relevant fields that can be used for linking the log records.
To understand the application of Link in performing advanced analytics and its advanced features, see Perform Advanced Analytics with Link. These are the features highlighted in the use cases:
  • Link Trend

  • Generating charts with virtual fields

  • Using SQL statement as a field of analysis

  • Generating charts for multiple fields and their values

  • Second level aggregation

  • Time analysis

  • Navigation functions

Use Dictionary Lookup in Link

Similar to cluster, you can use a lookup command to annotate the Link results.

Consider the Link results for FMW WLS Server Access Logs. To use the dictionary lookup to provide names for different pages:

  1. Create a CSV file with the following contents:

    Operator,Condition,Name
    CONTAINS,login,Login Page
    CONTAINS,index,Home Page
    CONTAINS ONE OF REGEXES,"[\.sh$,\.jar$]",Script Access

    Import this as a Dictionary type lookup using the name Page Access Types. This lookup contains one field, Name, that can be returned from each matching row. See Create a Dictionary Lookup.

  2. Use the dictionary in link:

    Add a lookup command after link, as follows:

    'Log Source' = 'FMW WLS Server Access Logs' 
    | link URI, Status 
    | lookup table = 'Page Access Types' select Name using URI

    The value of URI field for each row is evaluated against the rules defined in the Page Access Types dictionary. The Name field is returned from each matching row.

    The Name field contains the value from the dictionary. There can be more than one value for the Name field if the URI matches multiple rows in the dictionary.

  3. Analyze Link data using the dictionary fields:

    The Name field can now be used like any other field in Link. For example, the following query filters by valid values for Name and analyzes the results against the HTTP Status in the response:

    'Log Source' = 'FMW WLS Server Access Logs' 
    | link URI, Status 
    | lookup table = 'Page Access Types' select Name using URI 
    | where Name != null 
    | classify Status, Name as 'Page Analysis'

    This query produces the analytical chart showing the distribution of HTTP Status for various pages. The resulting bubble chart has the page values, such as "Login Page, Home Page", "Home Page, Script Access", "Home Page", "Login Page", and "Script Access", plotted along the Y-axis, and the HTTP status along the X-axis.
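The rule evaluation described in step 2 can be sketched in Python. This is only an illustration of how one URI can pick up several Name values, not the product's implementation. The operator semantics are assumptions: CONTAINS is treated as a plain substring test, and the CONTAINS ONE OF REGEXES condition is treated as a bracketed, comma-separated list of regular expressions. The function names `matches` and `lookup_names` are hypothetical.

```python
import re

# Rules copied from the Page Access Types CSV above: (operator, condition, name).
RULES = [
    ("CONTAINS", "login", "Login Page"),
    ("CONTAINS", "index", "Home Page"),
    ("CONTAINS ONE OF REGEXES", r"[\.sh$,\.jar$]", "Script Access"),
]


def matches(operator, condition, value):
    """Return True if value satisfies one rule (assumed semantics)."""
    if operator == "CONTAINS":
        return condition in value
    if operator == "CONTAINS ONE OF REGEXES":
        # Assumption: the condition is a bracketed, comma-separated regex list.
        patterns = condition.strip("[]").split(",")
        return any(re.search(pattern, value) for pattern in patterns)
    raise ValueError(f"unsupported operator: {operator}")


def lookup_names(uri):
    """Collect Name from every matching row, in dictionary order."""
    return [name for op, cond, name in RULES if matches(op, cond, uri)]
```

Calling `lookup_names("/index/login")` returns both Login Page and Home Page, which mirrors the combined values that appear in the chart.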

Semantic Clustering Using Natural Language Processing

Cluster Visualization allows you to cluster text messages in log records. Cluster works by grouping messages that have a similar number of words in a sentence, and identifying the words that change within those sentences. Cluster does not consider the literal meaning of the words during the grouping.

The new NLP (Natural Language Processing) command supports semantic clustering. Semantic Clustering is done by extracting the relevant keywords from a message and clustering based on these keywords. Two sets of messages that have similar words are grouped together. Each such group is given a deterministic Cluster ID.

The following example shows the usage of NLP clustering and keywords on Linux Syslog Logs:

'Log Source' = 'Linux Syslog Logs'
| link Time, Entity, cluster()
| nlp cluster('Cluster Sample') as 'Cluster ID', 
                    keywords('Cluster Sample') as Keywords
| classify 'Start Time', Keywords, Count, Entity as 'Cluster Keywords'


For more example use cases of semantic clustering, see Examples of Semantic Clustering Using Natural Language Processing.
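The idea of keyword-based grouping with a deterministic Cluster ID can be sketched in Python. This is a simplified illustration under stated assumptions, not the algorithm used by the nlp command. Messages are grouped only when their extracted keyword sets are identical, the keyword list `KEYWORDS` is hypothetical, and the ID is derived from a SHA-1 hash purely to show that the same keywords always yield the same ID.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical keyword list standing in for a dictionary.
KEYWORDS = {"error", "connection", "closed", "timeout", "login", "failed"}


def extract_keywords(message):
    """Return the sorted tuple of known keywords found in the message."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", message)}
    return tuple(sorted(words & KEYWORDS))


def cluster_id(keywords):
    """Derive a stable numeric ID, returned as a string, from the keywords."""
    digest = hashlib.sha1(" ".join(keywords).encode("utf-8")).hexdigest()
    return str(int(digest[:8], 16))


def cluster(messages):
    """Group messages whose extracted keyword sets are identical."""
    groups = defaultdict(list)
    for message in messages:
        groups[cluster_id(extract_keywords(message))].append(message)
    return dict(groups)
```

Because the ID depends only on the keywords, running the sketch twice on the same messages produces the same IDs, which is what makes an ID usable as a filter in a later query.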

nlp Command

The nlp command supports two functions. cluster() can be used to cluster the specified field, and keywords() can be used to extract keywords from the specified field.

The nlp command can be used only after the link command. See NLP Command in Using Oracle Log Analytics Search.

  • nlp cluster():

    cluster() takes the name of a field generated in Link, and returns a Cluster ID for each clustered value. The returned Cluster ID is a number, represented as a string. The Cluster ID can be used in queries to filter the clusters.

    For example:

    nlp cluster('Description') as 'Description ID' - This would extract relevant keywords from the Description field. The Description ID field would contain a unique ID for each generated cluster.

  • nlp keywords():

    Extracts keywords from the specified field values. The keywords are extracted based on a dictionary. The dictionary name can be supplied using the table option. If no dictionary is provided, the out-of-the-box default dictionary NLP General Dictionary is used.

    For example:

    nlp keywords('Description') as Summary - This would extract relevant keywords from the Description field. The keywords are accessible using the Summary field.

    nlp table='My Issues' cluster('Description') as 'Description ID' - Instead of the default dictionary, use the custom dictionary My Issues.

NLP Dictionary

Semantic Clustering works by splitting a message into words, extracting the relevant words and then grouping the messages that have similar words. The quality of clustering thus depends on the relevance of the keywords extracted.

  • A dictionary is used to decide what words in a message should be extracted.
  • The order of items in the dictionary is important. An item in the first row has higher ranking than the item in the second row.
  • A dictionary is created as a .csv file, and imported using the Lookup user interface with Dictionary Type option.
  • It is not necessary to create a dictionary, unless you want to change the ranking of words. The default out-of-the-box NLP General Dictionary is used if no dictionary is specified. It contains pre-trained English words.

See Create a Dictionary Lookup.

Following is an example dictionary iSCSI Errors:

Operator              Condition   Value
CONTAINS IGNORE CASE  error       noun
CONTAINS IGNORE CASE  reported    verb
CONTAINS IGNORE CASE  iSCSI       noun
CONTAINS IGNORE CASE  connection  noun
CONTAINS IGNORE CASE  closed      verb

The first field is reserved for future use. The second field is a word. The third field specifies the type for that word. The type can be any string and can be referred to from the query using the category parameter.

In the above example, the word error has higher ranking than the words reported or iSCSI. Similarly, connection has higher ranking than closed.

Using a Dictionary

Suppose that the following text is seen in the Message field:

Kernel reported iSCSI connection 1:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (2) 
Please verify the storage and network connection for additional faults

The above message is parsed and split into words. Non-alphabetic characters are removed. The following are some of the unique words generated from the split:

Kernel
reported
iSCSI
connection
error
ERR
TCP
CONN
CLOSE
closed
state
...
...

There are a total of 24 words in the message. By default, semantic clustering would attempt to extract 20 words and use these words to perform clustering. In a case like the above, the system needs to know which words are important. This is achieved by using the dictionary.

The dictionary is an ordered list. If iSCSI Errors is used, then NLP would not extract ERR, TCP, or CONN because these words are not included in the dictionary. Similarly, the words error, reported, iSCSI, connection, and closed are given higher priority due to their ranking in the dictionary.
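The effect of an ordered dictionary on keyword extraction can be sketched in Python. This is a minimal illustration, not the product's extraction logic. It assumes that words are matched case-insensitively as whole words, that only dictionary words are kept, and that the result follows dictionary order up to a limit of 20. The names `split_words` and `extract_keywords` are hypothetical.

```python
import re

# Rows of the iSCSI Errors dictionary above, in rank order: (word, type).
ISCSI_ERRORS = [
    ("error", "noun"),
    ("reported", "verb"),
    ("iSCSI", "noun"),
    ("connection", "noun"),
    ("closed", "verb"),
]

MESSAGE = (
    "Kernel reported iSCSI connection 1:0 error "
    "(1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (2) "
    "Please verify the storage and network connection for additional faults"
)


def split_words(message):
    """Split on non-alphabetic characters, as described above."""
    return re.findall(r"[A-Za-z]+", message)


def extract_keywords(message, dictionary, limit=20):
    """Return dictionary words present in the message, highest rank first."""
    present = {word.lower() for word in split_words(message)}
    ranked = [(word, kind) for word, kind in dictionary if word.lower() in present]
    return ranked[:limit]
```

With this split, the sample message yields 24 words, and words such as ERR, TCP, and CONN are dropped because they have no dictionary entry.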

Generate Link Alerts

After you have viewed your log records in the link visualization and determined the boundaries in which the anomalies typically appear, you can create alert rules to get notifications when anomalies are detected.

You can save a maximum of 50 scheduled alerts.

Consider the following order flow application example where the anomalies are detected for transactions that take more than 1 minute to complete.



  1. To create an alert rule that will notify you upon detecting anomalies, you must first define the condition in the query. Edit the highlighted query, and add the where command to define the Group Duration in which the anomalies are found, that is, when it's more than or equal to 60,000 milliseconds. For example:

    'Log Source' = SOALogSource | link Module, 'Context ID' | classify topcount = 300 'Group Duration', Module | where 'Group Duration' >= 60000
  2. Click the down arrow next to Save, and select Save As.

    The dialog box to create the alert rule opens.

  3. Specify the name for the alert under Search Name.

  4. Check the Create alert rule check box.

    The field Rule Name is automatically populated with the alert name that you specified earlier. The Enable Rule check box is enabled by default.

  5. Under condition type, select Fixed Threshold.

  6. Under Results, specify the warning and critical thresholds for the notification actions. For example, if you want a warning notification when one or more anomalies are detected, and a critical notification when five or more anomalies are detected, then select the greater than or equal to operator, warning threshold 1, and critical threshold 5.

  7. Schedule the interval at which the test must be run to detect anomalies. For example, Every Day. This will depend on the frequency of collecting your logs, and the number of log records that you expect to be analyzed on a regular basis.

    The time period of the logs analyzed for the saved search alert is the same as the run period. For example, if you select 15-minute interval, then the logs are checked for the last 15 minutes at that specific time.

    You can select Every Hour, Every Day, Every Week, or a Custom setting for any value between 15 minutes and 21 days as the Schedule Interval. Your saved search runs automatically based on the interval that you specify.

    If you select Every Hour, then you can optionally specify to exclude Weekend or Non-business hours from the schedule.

    If you select Every Day, then you can optionally specify to Exclude Weekend from the schedule.

  8. If you want to customize your alert message, then under Customize Message Format, select Use custom message. You can customize any or all of the messages available under this section. For details, see Step 8 in Create An Alert Rule.

  9. In Notifications, specify the recipients of the alert notifications and in Remediation Action, select the action that must be performed automatically in response to an alert. For details, see Step 9 and Step 10 in Create An Alert Rule.

    Click Save.

The alert is now created. You can visit the Alert Rules page to view the alert that you just created, and edit it, if required. See View and Edit Alert Rules.

To view the alerts generated, see View the Entity Details for an Alert.
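The fixed-threshold condition from steps 1 and 6 can be sketched in Python. This is an illustration of the comparison logic only, and it assumes the alert result is the number of groups that satisfy the where condition. The function names and the returned severity strings are hypothetical.

```python
def count_anomalies(group_durations_ms, min_duration_ms=60000):
    """Count groups whose Group Duration meets the where condition."""
    return sum(1 for duration in group_durations_ms if duration >= min_duration_ms)


def alert_severity(result_count, warning=1, critical=5):
    """Apply a 'greater than or equal to' fixed threshold to the count."""
    if result_count >= critical:
        return "critical"
    if result_count >= warning:
        return "warning"
    return None
```

For example, two transactions of 60 seconds or longer in the run period would give a count of 2, which meets the warning threshold but not the critical one.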

Use the Getting Started Panel

If you’re new to using Link, then you can familiarize yourself with the following features by using the Getting Started Panel:
  1. On the results table header, click the Open the Getting Started panel icon to open the Getting Started Panel.
  2. On the Getting Started tab, click the Show Tips link to view some useful tips to explore options on the visualization of the Link feature.

    Click Hide Tips.

  3. Click on the Sample Link Commands tab. View and edit some of the sample link commands.

    You can select to Run a link command that’s listed under Available Sample Link Commands or View the link commands listed under All Sample Link Commands.

  4. Click on the Link Builder tab, and run the wizard to select the Log Source, select up to four fields in Link By, select up to two fields in Analyze Fields, and click Run Link to build custom queries. You can select multiple fields at once before running the query, which saves time because you do not have to wait for a background query to complete after dragging and dropping each field.

    Click Clear to clear the selection.

For example, if you select EBS Concurrent Request Logs - Enhanced log source from the available sample link command and run it, you can obtain the following information:

  • Requests that have already completed execution within the selected time window

  • Currently running requests that show anomalous run times

  • Ability to create an Alert to identify specific requests that took an anomalous run time to complete, or are still running but with an anomalous run time

Analyze Chart Options

The following chart options are available to analyze the groups that are displayed by the Link query:

Analyze Chart Option Utility

Chart Type

Select from the bubble, scatter, tree map, and sunburst type of charts to view the groups. By default, a bubble chart is displayed.

  • Bubble Chart: To analyze the data from three fields, where each field can have multiple values. The position of the bubble is determined by the values of the first and second fields, which are plotted on the x and y axes, and the size of the bubble is determined by the third field.

  • Scatter Chart: To analyze the data from two numeric fields, to see how much one parameter affects the other.

  • Tree Map: To analyze the data from multiple fields that are both hierarchical and fractional, with the help of interactive nested rectangles.

  • Sunburst Chart: To analyze hierarchical data from multiple fields. The hierarchy is represented in the form of concentric rings, with the innermost ring representing the top of the hierarchy.

Height

Increase or decrease the height of the chart to suit your screen size.

Swap X Y axis

You can swap the values plotted along the x and y axes for better visualization.

Show Anomalies

View the anomalies among the groups displayed on the chart.

Highlight Anomaly Baselines

If you’ve selected to view the anomalies, then you can highlight the baselines for those anomalies.

Show Group Count Legend

Toggle the display of the Group Count legend.

Zoom and Scroll

Select Marquee zoom or Marquee select to dynamically view the data on the chart or to scroll and select multiple groups.

When displaying Problem Priority, Analyze charts display colors that match the severity of Problem Priority.

You can create multiple Analyze charts. Click Analyze > Create Chart. Configure each chart by clicking Chart Options > Chart Settings > Edit Chart for that chart.

Histogram Chart Options

Histogram shows the dispersion of log records over the time period and can be used to drill down into a specific set of log records.

You can generate charts for the log records, groups and numeric display fields. Select a row to view the range highlighted in the histogram.

The following chart options are available to view the group data on the histogram:

Histogram Chart Option Utility

Chart Type

Select from the following types of visualization to view the group data:

  • Bar: The log records are displayed as segmented columns against the time period. This is the default display chart.

  • Marker Only: The size of the log records against the specific time is represented by a marker.

  • Line Without Marker: The size of the log records against the specific time is plotted with the line tracing the number that represents the size.

  • Line With Marker: The size of the log records against the specific time is plotted with the line tracing the marker that represents the size.

  • Line With Area: This is similar to a line chart, but the area between the line and the axis is covered with color. The colored area represents the volume of data.

Show Combined Chart

This option combines all the individual charts into a single chart.

Note:

  • You can modify the Height and Width of the charts to optimize the visualization and view multiple charts on one line.

  • When viewing multiple charts, you can deselect the Show Correlated Tooltips check box to show only one tooltip at a time.

  • When using the log scale, the Bar or Line With Marker type of chart is recommended.

Example: For generating a chart for the numeric eval command, let's consider the example query:

* 
| rename 'Content Size' as sz 
| where sz > 0 
| link 'Log Source' 
| stats avg(sz) as 'Avg Sz', earliest(sz) as FirstSz, latest(sz) as LastSz 
| eval Delta = LastSz - FirstSz 
| eval Rate = Delta / 'Avg Sz'

Here, the log source is the field considered for Link By. The chart is generated for Delta, Rate, and Avg Sz after the computations performed as specified in the eval command. The resulting Line With Area charts for the above fields are displayed as below:
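The arithmetic behind this example query can be sketched in Python. This is an illustration only. It assumes each record is a (log source, timestamp, content size) tuple, that earliest and latest pick the size from the first and last record by timestamp, and the function name `delta_and_rate` is hypothetical.

```python
from collections import defaultdict


def delta_and_rate(records):
    """records: iterable of (log_source, timestamp, content_size)."""
    by_source = defaultdict(list)
    for source, timestamp, size in records:
        if size > 0:  # mirrors: where sz > 0
            by_source[source].append((timestamp, size))

    results = {}
    for source, rows in by_source.items():
        rows.sort()  # order by timestamp to find earliest and latest
        sizes = [size for _, size in rows]
        avg_sz = sum(sizes) / len(sizes)
        delta = sizes[-1] - sizes[0]  # LastSz - FirstSz
        results[source] = {"Avg Sz": avg_sz, "Delta": delta, "Rate": delta / avg_sz}
    return results
```

Each key of the returned dictionary corresponds to one group, that is, one Log Source value, and the three computed values correspond to the three charted fields.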



Groups Table

The groups table displays the result of the analysis by listing the groups and the corresponding values for the following default fields:

Column          Details
Field(s)        The field that is used to analyze the group
Count           The number of log records in the group
Start Time      The start of the time period for which the logs are considered for the analysis
End Time        The end of the time period for which the logs are considered for the analysis
Group Duration  The duration of the log event for the group

When displaying Problem Priority, groups table displays colors that match the severity of Problem Priority.
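The way the default columns relate to the underlying log records can be sketched in Python. This is a simplified model, not the product's implementation. It assumes each record carries a numeric Time in epoch milliseconds, that Start Time and End Time are the earliest and latest record times in the group, and that Group Duration is their difference. The function name `link` and the sample field names in the usage note are illustrative.

```python
from collections import defaultdict


def link(records, link_fields):
    """Group records by the link fields and compute the default columns.

    records: list of dicts, each with a numeric 'Time' (epoch milliseconds).
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in link_fields)
        groups[key].append(record["Time"])

    table = []
    for key, times in groups.items():
        start, end = min(times), max(times)
        table.append({
            "Fields": key,
            "Count": len(times),
            "Start Time": start,
            "End Time": end,
            "Group Duration": end - start,
        })
    return table
```

For instance, linking by Module and Context ID turns every distinct pair of values into one row, and a row whose Group Duration is 65000 represents a transaction that spanned 65 seconds.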

Features for Bubble Charts in Link Analysis

Use the following features to edit the bubble chart:

Change the Title of the Bubble Chart

To improve the readability of the chart and for friendly analysis, you can change the title of the bubble chart by using the option in the Analyze dialog box.

To modify the title of the bubble chart, click the Analyze icon, update the value of the Chart Title field in the Analyze dialog box, and click OK.

As a result, the title of the chart is now changed to the value that you provided.

Control the Color of the Bubbles in the Chart

To plot along the X-axis, you can select a numeric, string, or time field. Only a numeric or string field can be used for the Y-axis.

  • Any fields can be used to control the color of the bubbles. There are no restrictions about the types of the fields.

  • Numeric fields can be used for controlling the size of the bubbles. The values of the fields control the size of the bubbles. The larger the values, the larger the bubbles.

For steps to select the fields for controlling the color of the bubbles in the chart, see Add More Fields for Analysis Using Size and Color.

The following chart shows the Time Taken for Requests, which is plotted along Y-axis, and also the Application and Job that are involved in the analysis:



By default, the Link Analyze chart automatically selects a color palette based on the values in the chart. To select a different palette or to add additional field values, click the Color link. In the following example, the field Method has HTTP Method color palette applied for different values:

'Log Source' = 'FMW WLS Server Access Logs'
| link Time, Method
| classify Time, Method, Count as 'HTTP Methods Trend'


Features for Fields in Link Analysis

Use the following features to work with the fields in the Link visualization:

Add More than Two Fields

Add more than two fields to the analysis. Each field that is added for analysis appears as a column in the Groups Table.

Consider the following example:



Select the field from the Fields panel > click the Options icon > use the Add to Display Fields option to extract their values.

As a result, the Groups table has the columns for the fields Event Start Time, Event End Time, unique(Application), and unique(Program Details).

Rename the Fields by Editing the Query

By default, the fields that you add to the Display Fields panel will be displayed in the column names of the Groups Table with the name of the function that was used to create the field. Edit the query to give names to the fields.

Consider the following example for the query that is currently used to run link feature:

'Log Source' = 'EBS Concurrent Request Logs - Enhanced'
| link 'Request ID'
| stats earliest('Event Start Time') as 'Request Start Time', 
latest('Event End Time') as 'Request End Time',
unique(Application),
unique('Program Details')  
| eval 'Time Taken' = 'Request End Time' - 'Request Start Time'
| classify topcount = 300 'Request Start Time', 'Time Taken' as 'Request Analysis'

To change the names of the fields unique(Application) to Application Name and unique('Program Details') to Job, modify the query:

'Log Source' = 'EBS Concurrent Request Logs - Enhanced'
| link 'Request ID'
| stats earliest('Event Start Time') as 'Request Start Time', 
latest('Event End Time') as 'Request End Time',
unique(Application) as 'Application Name',
unique('Program Details') as Job  
| eval 'Time Taken' = 'Request End Time' - 'Request Start Time'
| classify topcount = 300 'Request Start Time', 'Time Taken' as 'Request Analysis'

After renaming the fields, you can refer to the fields using the new names. The column names in the Groups Table will have the new names of the fields.

Add More Fields for Analysis Using Size and Color

In the bubble chart, two fields are used to plot along the x-axis and y-axis. The remaining fields can be used to control the size and color of the bubbles in the chart.

To add more fields for analysis in the bubble chart:

  1. Click the Analyze icon > Click Create Chart. The Analyze dialog box is displayed.

  2. Select the field to plot along the X-axis. This can be a numeric, string, or time field.

  3. Select the field to plot along the Y-axis. This can be a numeric or string field.

  4. In the Size / Color panel, select the fields that must be used for defining the size and colors of the bubbles in the chart. Any fields can be used for controlling the color, but numeric fields must be used to control the size of the bubbles.

  5. Click OK.

Additionally, Group Count is available as a field to control the size and color.

The classify command is now run with multiple fields, in the order specified in the Analyze selection. The following bubble chart shows multiple fields:



In the above example,

  • The field Request Start Time is plotted along X-axis
  • The field Time Taken is plotted along Y-axis
  • The string fields Application Name and Job are used for controlling the size and color of the bubbles in the chart

Furthermore, the Groups alias is changed to Requests, and Log Records alias is changed to Concurrent Request Logs.

Instant Analysis of Multiple Fields Using the Link Analyzer Chart

Slice and dice data using multiple filters in the Analyzer Chart.

Use Filter Options > Show Search Filters to enable the filters:



Features for Groups in Link Analysis

Use the following features to modify the groups:

Change the Group Alias

Each row in the link table corresponds to a Group. In the following example, the link command is run using the Request ID field. Therefore, each row of the table represents a request. You can change the alias for Groups and Log Records tabs.

The following example shows the bubble chart in the Groups tab. The adjacent Log Records tab can also be seen in the image:



Click the Search and Table Options icon > Click Display Options > Under Alias Options, modify the Groups Alias and Log Records Alias values.

The Group Alias is used when there is only one item in the Groups table.

Join Multiple Groups Using the Map Command

Use the map command to join multiple sub-groups from the existing linked Groups. This is useful to assign a Session ID for related events, or to correlate events across different servers or log sources.

For example, the following query joins Out of Memory events with other events that occur within the preceding 30 minutes, and colors these groups to highlight a context for the Out of Memory outage:

* | link Server, Label
  | createView [ *   | where Label = 'Out of Memory' 
                     | rename Entity as 'OOM Server', 'Start Time' as 'OOM Begin Time' ] as 'Out of Memory Events'
  | sort Entity, 'Start Time'
  | map [ * | where Label != 'Out of Memory' and Server = 'OOM Server' and 'Start Time' >= dateAdd('OOM Begin Time', minute,-30) and 'Start Time' <= 'OOM Begin Time'
            | eval Context = Yes ] using 'Out of Memory Events'
  | highlightgroups color = yellow [ * | where Context = Yes ] as '30 Minutes before Out of Memory'
  | highlightgroups priority = high [ * | where Label = 'Out of Memory' ] as 'Server Out of Memory'


See Map Command in Using Oracle Log Analytics Search.
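The join condition in the query above can be sketched in Python. This is an illustration of the matching logic only, not of how the map command is implemented. It assumes each group is a dict with Server, Label, and a datetime Start Time, and the function name `mark_context` is hypothetical.

```python
from datetime import datetime, timedelta


def mark_context(groups, window=timedelta(minutes=30)):
    """Return groups that start within `window` before an Out of Memory
    group on the same server."""
    oom_events = [g for g in groups if g["Label"] == "Out of Memory"]
    flagged = []
    for group in groups:
        if group["Label"] == "Out of Memory":
            continue
        for oom in oom_events:
            same_server = group["Server"] == oom["Server"]
            in_window = (
                oom["Start Time"] - window <= group["Start Time"] <= oom["Start Time"]
            )
            if same_server and in_window:
                flagged.append(group)
                break  # one matching Out of Memory event is enough
    return flagged
```

The flagged groups play the role of the rows where Context is set in the query, which are then highlighted as the 30 minutes before the outage.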

Create Sub-Groups Using the Createview Command

Use the createview command to create sub-groups from the existing linked groups. This can be used in conjunction with the map command to join groups.

For example, you can group all the Out of Memory errors using the following command:

* | link Entity, Label 
  | createView  [ * | where Label = 'Out of Memory' ] as 'Out of Memory Events'

See Createview Command in Using Oracle Log Analytics Search.

Search and Highlight Link Groups

Use the highlightgroups command to search one or more columns in the Link results and highlight specific groups. You can optionally assign a priority to the highlighted regions. The priority is used to color the regions. You can also explicitly specify a color.

Optionally, you can specify an alias for the highlight. This alias is displayed on mouse over on the highlighted region. The alias can also be used to turn on or off the highlight using the Hide/Show Highlights option under the Options menu.

For example:

* | link Label 
| highlightgroups priority = medium [ * | where Label in ('I/O Error', 'Socket Timeout') ] 
| highlightgroups priority = high   [ * | where Label = 'Stuck Thread'] as 'Stuck Thread Events'
| highlightgroups color = #68C182   [ * | where Label = 'Service Started'] as Startup


See Highlightgroups Command in Using Oracle Log Analytics Search.