6 Troubleshoot Stream Analytics

After you provision and run a pipeline, you may encounter issues with it. Some of those issues are explained here. Each pipeline is composed of various stages; a stage can be a stream, query, or pattern.

Troubleshoot Live Output

For every pipeline, there is one Spark Streaming pipeline running on the Spark cluster. If a Stream Analytics pipeline uses one or more Query or Pattern stages, the pipeline runs one or more continuous queries for each of these stages.

For more information about continuous queries, see Understanding Oracle CQL.

If there are no output events in the Live Output Table for a Query or Pattern stage, use the following steps to determine or narrow down the problem:

  1. Ensure that the Pipeline is Deployed Successfully

  2. Ensure that the Input Stream is Supplying a Continuous Stream of Events to the Pipeline

  3. Ensure that CQL Queries for Each Query Stage Emit Output

  4. Ensure that the Output of a Stage is Available

Ensure that the Pipeline is Deployed Successfully

You can deploy pipelines to any Spark cluster (version 1.6).

Follow the steps in the sections below to verify that the pipeline is deployed and running successfully on the Spark cluster.

Verify Pipeline Deployment on an Oracle Big Data Cloud Service - Compute Edition Based Spark Cluster

  1. Go to the PSM user interface and open the home page for your Oracle Big Data Cloud Service - Compute Edition (BDCSCE) instance.

  2. Click the hamburger menu next to the instance name, and then click Big Data Cluster Console.

    Illustration: big_data_cluster_home.png

  3. Enter the login credentials and open the Big Data Cluster Console home page.

  4. Navigate to the Jobs tab.

    You can see a list of jobs. Each job corresponds to a Spark pipeline running on your BDCSCE cluster.

    Illustration: jobs_logs.png

  5. Find the entry corresponding to your pipeline and check the status. For more information, see Determine the Spark Application Name Corresponding to a Pipeline.

    If you see the status as Running, then the pipeline is currently deployed and running successfully.

  6. Click the hamburger menu corresponding to the required job, and then click Logs to fetch container-wise logs.

    You can download these files for further debugging. If you have command-line access to the cluster, you can also fetch the logs with the YARN CLI, as shown in the sketch after these steps.
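
A minimal sketch, assuming the pipeline runs as a YARN application and that you have SSH access to a cluster node; the application ID shown is hypothetical:

    # List running applications to find the application ID of your pipeline
    yarn application -list -appStates RUNNING

    # Fetch the aggregated container logs (substitute your application ID)
    yarn logs -applicationId application_1490000000000_0001 > pipeline_logs.txt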

Verify Pipeline Deployment on an Apache Spark Installation Based Spark Cluster

  1. Open the Spark Master user interface.

    Illustration: spark_master_ui.png

  2. Find the entry corresponding to your pipeline and check the status. For more information, see Determine the Spark Application Name Corresponding to a Pipeline.

    If you see the status as Running, then the pipeline is currently deployed and running successfully. You can also verify this from the command line, as shown below.
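
A minimal command-line check, assuming a standalone Spark cluster with the master web UI on the default port 8080; spark-master-host is a placeholder for your master host name:

    # The standalone master serves cluster state as JSON at /json;
    # grep for your application name (example name from this chapter)
    curl -s http://spark-master-host:8080/json/ | grep "sx_2_49_12_pipe1_draft"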

Ensure that the Input Stream is Supplying a Continuous Stream of Events to the Pipeline

You must have a continuous supply of events from the input stream.

  1. Go to the Catalog.

  2. Locate and click the stream you want to troubleshoot.

  3. Check the value of the topicName property under the Source Type Parameters section.

  4. Listen to the Kafka topic where the input stream for the pipeline is received.

    Since this topic is created using Kafka APIs, you cannot consume this topic with REST APIs.

    1. Listen to the Kafka topic hosted on Oracle Event Hub Cloud Service. You must use Apache Kafka utilities or any other relevant tool to listen to the topic.

      Follow these steps to listen to the Kafka topic:

      1. Determine the Zookeeper address: go to the Oracle Event Hub Cloud Service Platform home page and find the IP address of Zookeeper.

      2. Use the following command to listen to the Kafka topic:
        ./kafka-console-consumer.sh --zookeeper IPAddress:2181 --topic nano
    2. Listen to the Kafka topic hosted on a standard Apache Kafka installation.

      You can listen to the Kafka topic using utilities from a Kafka installation. kafka-console-consumer.sh is a utility script available as part of any Kafka installation.

      Follow these steps to listen to the Kafka topic:

      1. Determine the Zookeeper address of your Apache Kafka installation based cluster.

      2. Use the following command to listen to the Kafka topic (a topic-listing sketch follows these steps):
        ./kafka-console-consumer.sh --zookeeper IPAddress:2181 --topic nano
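
If you are unsure of the exact topic name, or want to sample only a few events, the following is a minimal sketch using standard Kafka utilities; IPAddress and the topic name nano are the same placeholders used above:

    # List all topics registered in Zookeeper to confirm the input topic exists
    ./kafka-topics.sh --zookeeper IPAddress:2181 --list

    # Consume a bounded sample of events instead of an endless stream
    ./kafka-console-consumer.sh --zookeeper IPAddress:2181 --topic nano --max-messages 5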

Ensure that CQL Queries for Each Query Stage Emit Output

Monitor the CQL queries using CQL Engine Metrics to check whether they are emitting output events.

Follow these steps to check the output events:

  1. Open the CQL Engine Query Details page. For more information, see Access CQL Engine Metrics.

  2. Check that at least one partition has Total Output Events greater than zero under the Execution Statistics section.

    Illustration: cql_engine_query_details.png

     If your query is running without errors and input data is arriving continuously, the Total Output Events count keeps rising.

Ensure that the Output of a Stage is Available

One of the essential steps in troubleshooting a pipeline is to ensure that the output of a stage is available in the monitor topic.

Follow these steps to check if the output stream is available in the monitor topic:

  1. Ensure that you stay in the Pipeline Editor and do not click Done. Otherwise, the pipeline is undeployed.

  2. Right-click anywhere in the browser and click Inspect.

  3. Select Network from the top tab and then select WS.

  4. Refresh the browser.

    New websocket connections are created.

  5. Locate a websocket whose URL has a parameter with the name topic.

    The value of the topic param is the name of the Kafka topic where the output of this stage is pushed.

    Illustration: websocket_network.png

  6. Listen to the Kafka topic where the output of the stage is being pushed.

    Since this topic is created using Kafka APIs, you cannot consume this topic with REST APIs. Follow these steps to listen to the Kafka topic:

    1. Listen to the Kafka topic hosted on Oracle Event Hub Cloud Service. You must use Apache Kafka utilities or any other relevant tool to listen to the topic.

      Follow these steps to listen to the Kafka topic:

      1. Determine the Zookeeper address: go to the Oracle Event Hub Cloud Service Platform home page and find the IP address of Zookeeper.

      2. Use the following command to listen to the Kafka topic:
        ./kafka-console-consumer.sh --zookeeper IPAddress:2181 --topic sx_2_49_12_pipe1_draft_st60
    2. Listen to the Kafka topic hosted on a standard Apache Kafka installation.

      You can listen to the Kafka topic using utilities from a Kafka installation. kafka-console-consumer.sh is a utility script available as part of any Kafka installation.

      Follow these steps to listen to the Kafka topic:

      1. Determine the Zookeeper address of your Apache Kafka installation based cluster.

      2. Use the following command to listen to the Kafka topic (a variation that replays earlier events follows these steps):
        ./kafka-console-consumer.sh --zookeeper IPAddress:2181 --topic sx_2_49_12_pipe1_draft_st60
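
If the stage produced output before you attached the consumer, you can replay the topic from the beginning and cap the sample size. A minimal sketch using standard kafka-console-consumer.sh options, with the same placeholder topic name:

    # Replay up to 10 events from the start of the output topic
    ./kafka-console-consumer.sh --zookeeper IPAddress:2181 --topic sx_2_49_12_pipe1_draft_st60 --from-beginning --max-messages 10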


Determine the Spark Application Name Corresponding to a Pipeline

Perform the following steps to determine the Spark application name that corresponds to a pipeline.

  1. Navigate to Catalog.

  2. Open the required pipeline.

  3. Ensure that you stay in the Pipeline Editor and do not click Done. Otherwise, the pipeline gets undeployed.

  4. Right-click anywhere in the browser and select Inspect.

  5. Go to the WS tab under the Network tab.

  6. Refresh the browser.

    New websocket connections are created.

  7. Locate a websocket whose URL has a parameter with the name topic.

    The value of the topic param is the name of the Kafka topic where the output of this stage (query or pattern) is pushed.

    Illustration: ws_network.png

    The topic name has the form AppName_StageId. You can derive the pipeline name from the topic name by removing the trailing _StageId. In the above snapshot, the pipeline name is sx_2_49_12_pipe1_draft. A shell sketch of this derivation follows.
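
A minimal shell sketch of this derivation, using the topic name from the snapshot above:

    # Strip the trailing _StageId (here, "_st60") to recover the pipeline name
    topic="sx_2_49_12_pipe1_draft_st60"
    echo "${topic%_*}"    # prints sx_2_49_12_pipe1_draft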

Access CQL Engine Metrics

When a pipeline with a query or pattern stage is deployed to a Spark cluster, the complex event processing is performed by CQL engines running inside the Spark cluster. You can monitor these CQL engines using the CQL Engine Metrics pages.

CQL queries can aggregate, correlate, filter, and pattern-match over a stream of events. Spark provides an out-of-the-box application UI (commonly running on <host>:4040) that helps you monitor a running Spark Streaming pipeline. Because CQL queries also run as part of the Spark Streaming pipeline, the Spark UI is extended to include monitoring capabilities for CQL queries.
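
As a quick reachability check, you can also query the Spark monitoring REST API, which serves the same data as the UI (available in Spark 1.4 and later); driver-host is a placeholder, and 4040 is the default UI port:

    # Lists the applications known to this UI; a non-empty JSON array
    # confirms that the Spark Streaming pipeline UI is up
    curl -s http://driver-host:4040/api/v1/applications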

To access CQL Engine metrics:

  1. Create a pipeline with at least one query or pattern stage.

  2. Navigate to the Spark Master user interface.

    Illustration: application_master.png

  3. Click the CQL Engine tab.

    Illustration: cql_engine.png

    You can see the details of all queries running inside a Spark CQL pipeline. This page also shows various streams/relations and external relations registered as part of the pipeline.

  4. Click any query to see its details. The query details page shows partition-wise details about that running query.

  5. Click a specific partition link to see further details about the query plan and operator-level processing. This page shows the operator-level details of the query for that partition.

    Illustration: cql_engine_detailed_analysis.png

Troubleshoot Pipeline Deployment

Sometimes pipeline deployment fails with the following exception:

Spark pipeline did not start successfully after 60000 ms.

This exception usually occurs when there are not enough free resources on your cluster.

Workaround:

Use an external Spark cluster, or configure the cluster with more resources (for example, machines with more memory and cores), and then redeploy the pipeline.
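
To check whether the cluster has free resources before redeploying, you can inspect the standalone master's JSON endpoint. A minimal sketch, assuming a standalone Spark cluster with the master web UI on the default port 8080; spark-master-host is a placeholder:

    # Reports total and used cores and memory for the cluster
    curl -s http://spark-master-host:8080/json/ | python -m json.tool | grep -E '"cores|"memory'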