Oracle® Enterprise Data Quality for Product Data Endeca Connector Installation and User's Guide
Release 11g R1 (11.1.1.6)

E29135-03

C Endeca Connector Robustness

The Endeca Connector supports high availability through:

The Endeca Connector supports parallel processing and load balancing through:

This is accomplished in software, without requiring additional hardware such as redundant clustered servers, although such hardware solutions are fully supported. A robust software solution reduces hardware infrastructure costs. It also provides parallel processing, load balancing, and high availability for the Endeca Connector Adapter when it runs as part of Endeca Forge processing.

Endeca Connector Redundancy

Redundancy is accomplished by having multiple Oracle DataLens Servers, all set up to process DSAs and to load and process the same data lenses. This is configured with the multiple Oracle DataLens Server configuration parameters supported by the Endeca Connector.

These multiple redundant servers eliminate the need for additional hardware support for redundancy.
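As an illustration only, the following Python sketch shows one way a set of PDQ_SERVER_n parameters could be gathered into an ordered server list. The `host:port` value format is inferred from the log excerpts in this appendix, and the function name `parse_pdq_servers` and the host `DLFProdServerThree` are hypothetical; this is not the connector's actual code.

```python
import re

def parse_pdq_servers(params):
    """Collect PDQ_SERVER_n entries, ordered by n, as (name, host, port) tuples.

    Assumes each value has the form "host:port" (an assumption based on the
    log output, not a documented format).
    """
    pattern = re.compile(r"^PDQ_SERVER_(\d+)$")
    found = []
    for key, value in params.items():
        match = pattern.match(key)
        if match:
            host, _, port = value.partition(":")
            found.append((int(match.group(1)), key, host, int(port)))
    found.sort()  # order by the numeric suffix, so PDQ_SERVER_1 comes first
    return [(key, host, port) for _, key, host, port in found]

# Hypothetical pass-through parameters; DLFProdServerThree is invented here.
params = {
    "DSA_MAP": "endeca_demo_dimensions",
    "PDQ_SERVER_2": "DLFProdServerTwo:2229",
    "PDQ_SERVER_1": "DLFProdServerOne:2229",
    "PDQ_SERVER_3": "DLFProdServerThree:2229",
}
servers = parse_pdq_servers(params)
for name, host, port in servers:
    print(name, host, port)
```

Sorting by the numeric suffix keeps PDQ_SERVER_1 first regardless of the order in which the parameters are supplied.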

Note:

Three servers are defined in the example, though there is no limit to the number of Oracle DataLens Servers that you can add.

These multiple redundant servers are used by both the Endeca Connector Add-In Discovery components (and the Deletion components) and the Endeca Connector PdqAdapter.

This solution is completely flexible and will work with almost any Oracle DataLens Server topology, such as the following:

Configuring the DSA and Data Lens

The Endeca Connector DSA must be made available to all the Oracle DataLens Servers in any of the Development or Production Server Groups.

Each Oracle DataLens Server must have the All check box selected so that all of the deployed data lenses used by the Forge process DSAs are loaded, as in the following:

[Figure: image048.png, described by the surrounding text]

Go to the Oracle DataLens Server Administration web page and ensure this option is set for each Oracle DataLens Server in the appropriate Development and Production server groups. For more information, see Oracle Enterprise Data Quality for Product Data Oracle DataLens Server Administration Guide.

Round-Robin Support for Oracle DataLens Servers

The Endeca Connector DSA Transformation Add-Ins (the discovery processes) use a round-robin approach to select an initial server for data processing. This means that a job will not start until an Oracle DataLens Server is verified to be up and running. Server selection is controlled by the PDQ_SERVER parameters, which are set in the PdqAdapter pass-through parameters and used by all the components of the Endeca Connector.

The round-robin checking always starts with PDQ_SERVER_1 and then checks PDQ_SERVER_2 and finally PDQ_SERVER_3.

Note that the Endeca Connector Adapter is more sophisticated and keeps track of the last accessed server when doing the round-robin fail-over.

The Oracle DataLens Servers all have a "ping servlet" so that the Endeca Connector can ensure not only that the server is running, but also that the Oracle DataLens Server service is running and processing requests.
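The selection order described above can be sketched in Python as follows. This is a minimal illustration, not the connector's implementation: the `/datalens/Ping` path is taken from the log excerpt in this appendix, while the function names, the timeout, and the rule that an HTTP 200 response means "healthy" are assumptions.

```python
import http.client
from urllib.request import urlopen

def http_ping(host, port, timeout=5.0):
    """Return True if the ping URL answers with HTTP 200 (assumed health rule)."""
    url = f"http://{host}:{port}/datalens/Ping"
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, DNS failures, timeouts, and HTTP errors.
        return False

def select_initial_server(servers, ping=http_ping):
    """Check servers in order (PDQ_SERVER_1 first) and return the first live one."""
    for name, host, port in servers:
        if ping(host, port):
            return name, host, port
    raise RuntimeError("No Oracle DataLens Server responded to a ping")

# Usage with a stub ping, so the sketch runs without a network:
# the first server is treated as down, as in the log excerpt.
servers = [
    ("PDQ_SERVER_1", "DLFProdServerOne", 2229),
    ("PDQ_SERVER_2", "DLFProdServerTwo", 2229),
]

def stub_ping(host, port):
    return host != "DLFProdServerOne"

chosen = select_initial_server(servers, ping=stub_ping)
print("Using", chosen[0])
```

Passing the ping function as a parameter keeps the selection logic testable without live servers.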

Add-In Transforms

Following is an example of the round-robin server connection from the Oracle DataLens Server log file for a "Discover Precedence" DSA job.

INFO 16 Sep 2008 15:53:07 [] -  PDQ-Endeca Connector Dimension Discovery Version 11.1.1.6.0, Build 20120804 Copyright (c) 2012, 2012, Oracle and/or its affiliates. All rights reserved.  
INFO 16 Sep 2012 15:53:10 [] - Attempted 0 times to connect to http:// DLFProdServerOne:2229/datalens/Ping
ERROR 16 Sep 2012 15:53:10 [] - Failed to connect to server (http:// DLFProdServerOne:2229/datalens/Ping)[PingRequest]: DLFProdServerOne  
INFO 16 Sep 2012 15:53:10 [] - PdqAdapter parameters: 
    DSA_MAP = endeca_demo_dimensions
    REPLACE_UNDERSCORES_ONLY = true
    USE_PDQ_rPREFIX = false
    Using PDQ_SERVER_2 (DLFProdServerTwo:2229)
INFO 16 Sep 2012 15:53:10 [] - Connecting to Server DLFProdServerTwo and port 2229
INFO 16 Sep 2012 15:53:10 [] - Running on  DataLens Admin server DLFProdServerTwo:2229 

In this example, the connector failed to get a response from DLFProdServerOne and connected to DLFProdServerTwo instead. The log also reports which PDQ_SERVER is being used.

Endeca Connector Fail-over

The Endeca Connector fail-over is a component that works when processing the actual data with the PdqAdapter during the Endeca Forge processing. It has an advantage over a hardware fail-over solution: if a problem is encountered, the Endeca Connector fail-over resubmits only the current data chunk to an alternate server, and the Forge processing continues. For example, if a job is processing chunk 15 of a total of 20 chunks, the fail-over resubmits data chunk 15 to a redundant Oracle DataLens Server, and Forge continues without ever being aware that an Oracle DataLens Server went down.

A hardware fail-over will require that the Forge job is re-submitted from the start.

The fail-over will occur if the following occur:

Note:

The Endeca Connector Adapter keeps track of the last accessed server among all the servers defined when doing the round-robin fail-over and will use this information to determine which server to send a chunk of data to for re-processing.
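The chunk-level fail-over can be sketched in Python as follows. The exact selection rule used by the Endeca Connector Adapter is not documented here, so this sketch assumes one plausible policy: start with the last server that succeeded and move to the next server in the list when a connection error occurs. The class name `FailoverDispatcher` and the `send` callback are hypothetical.

```python
class FailoverDispatcher:
    """Resubmit a failed chunk to the next server, remembering the last good one."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.last_index = 0  # index of the last accessed (successful) server

    def submit(self, chunk, send):
        """Try each server at most once, starting from the last accessed server."""
        count = len(self.servers)
        for offset in range(count):
            index = (self.last_index + offset) % count
            try:
                result = send(self.servers[index], chunk)
            except ConnectionError:
                continue  # this server is down; retry the same chunk elsewhere
            self.last_index = index
            return result
        raise RuntimeError("All Oracle DataLens Servers failed for this chunk")

# Usage with a stub transport: the first server is always unreachable.
attempts = []

def stub_send(server, chunk):
    attempts.append((server, chunk))
    if server == "Endeca51:2229":
        raise ConnectionError(server + " is not responding")
    return "chunk %d processed on %s" % (chunk, server)

dispatcher = FailoverDispatcher(["Endeca51:2229", "admin1-M6300:2229"])
results = [dispatcher.submit(number, stub_send) for number in (1, 2, 3)]
for line in results:
    print(line)
```

Only the chunk that hit the failure is sent twice; later chunks go straight to the server that last succeeded, so the caller never restarts the whole job.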

Here is the result of pulling the plug on one of the Oracle DataLens Servers:

Endeca51:2229-2 2009.02.04_03:51:00 Running a data chunk on the DLS Server Endeca51:2229
Endeca51:2229-2 2009.02.04_03:51:01 Processing a chunk of 9950 records on the Endeca51:2229 DLS server
Endeca51:2229-2 2009.02.04_03:51:31 DLF Server Endeca51:2229 is not responding
Endeca51:2229-2 2009.02.04_03:51:31 Warning: Caught a Connection Exception, trying another server...
Endeca51:2229-2 2009.02.04_03:52:13 DLF Server Endeca51:2229 is not responding
Endeca51:2229-2 2009.02.04_03:52:13 Retrying the chunk with the DL Server admin1-M6300:2229
Endeca51:2229-2 2009.02.04_03:52:14 Processing a chunk of 9950 records on the admin1-M6300:2229 DLS server
Endeca51:2229-2 2009.02.04_03:52:15 Using Job Id: 182
Endeca51:2229-2 2009.02.04_03:53:04 Job#182 DLS Server returned 7600 records from the chunk

In the preceding example, PDQ_SERVER_1 is pinged again to verify that the failure was not just a transient network issue. Then the server is hot-swapped to PDQ_SERVER_2 and the entire chunk is re-submitted. The last line in the preceding log snippet is the first line of re-submitted data for this current chunk.

The following error message will be output to the log file if the WebLogic Server is stopped or fails:

admin1-M6300:2229-1 2009.02.04_03:45:07 Warning: Caught a Job Failed Fault, trying another server...

Example of Single Threaded Versus Multiple Oracle DataLens Servers

This first example is of the Endeca Connector Adapter running with a single Oracle DataLens Server.

In the following example, all the data chunks are processed in parallel threads (one per server) on three separate Oracle DataLens Servers.

PDQ-Endeca Connector Adapter (Endeca Java Manipulator) Version 11.1.1.6.0, Build 20120804 Copyright (c) 2008, 2012, Oracle and/or its affiliates. All rights reserved.
PDQ_SERVER_1 - Adding required DataLens Server cwellell-M6300:2229
PDQ_SERVER_2 - Adding optional High-Availability, Load-Balanced, Parallel-Processing DataLens Server Endeca51:2229
PDQ_SERVER_3 - Adding optional High-Availability, Load-Balanced, Parallel-Processing DataLens Server cwellell-VM:2229
cwellell-M6300:2229-1 2009.02.12_01:10:07.000    Running a data chunk on the DLS Server cwellell-M6300:2229
cwellell-M6300:2229-1 2009.02.12_01:10:07.343  Processing a chunk of 10000 records on the cwellell-M6300:2229 DLS server
cwellell-M6300:2229-1 2009.02.12_01:10:08.796            Using Job Id: 208
cwellell-M6300:2229-1 2009.02.12_01:10:23.031            Job#208 DLS Server returned 8965 records from the chunk

cwellell-VM:2229-3 2009.02.12_01:10:12.890       Running a data chunk on the DLS Server cwellell-VM:2229
cwellell-VM:2229-3 2009.02.12_01:10:13.984       DLF Server cwellell-VM:2229 is not responding
cwellell-VM:2229-3 2009.02.12_01:10:14.125       Retrying the chunk with the DL Server Endeca51:2229
cwellell-VM:2229-3 2009.02.12_01:10:14.359     Processing a chunk of 3041 records on the Endeca51:2229 DLS server
cwellell-VM:2229-3 2009.02.12_01:10:20.859               Using Job Id: 10
cwellell-VM:2229-3 2009.02.12_01:11:17.343               Job#10 DLS Server returned 1954 records from the chunk

Endeca51:2229-2 2009.02.12_01:10:11.250  Running a data chunk on the DLS Server Endeca51:2229
Endeca51:2229-2 2009.02.12_01:10:11.921        Processing a chunk of 9950 records on the Endeca51:2229 DLS server
Endeca51:2229-2 2009.02.12_01:10:43.515          Using Job Id: 11
Endeca51:2229-2 2009.02.12_01:11:06.500          Job#11 DLS Server returned 7600 records from the chunk

********* Processed 3 Chunks  with 22991 total input lines *********
********* Updated   18519 total lines by the PDQ-Endeca Connector *********
********* Completed the PDQ-Endeca Connector processing in 73 seconds *********
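The one-thread-per-server pattern visible in the preceding log can be sketched in Python as follows. This is an illustration under stated assumptions, not the adapter's implementation: the function `process_in_parallel`, the shared work queue, and the `send` callback are hypothetical, and fail-over handling is omitted for brevity.

```python
import queue
import threading

def process_in_parallel(servers, chunks, send):
    """Run one worker thread per server; each worker pulls chunks until none remain."""
    pending = queue.Queue()
    for number, chunk in enumerate(chunks, start=1):
        pending.put((number, chunk))

    results = {}
    lock = threading.Lock()

    def worker(server):
        while True:
            try:
                number, chunk = pending.get_nowait()
            except queue.Empty:
                return  # no more work for this server's thread
            output = send(server, chunk)
            with lock:
                results[number] = output

    threads = [threading.Thread(target=worker, args=(s,)) for s in servers]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Return the outputs in chunk order, regardless of which server handled them.
    return [results[number] for number in sorted(results)]

# Usage with a stub transform standing in for a DataLens Server call.
def stub_send(server, chunk):
    return [record.upper() for record in chunk]

servers = ["cwellell-M6300:2229", "Endeca51:2229", "cwellell-VM:2229"]
chunks = [["a", "b"], ["c"], ["d", "e", "f"]]
processed = process_in_parallel(servers, chunks, stub_send)
print(processed)
```

Because each thread takes the next chunk as soon as it is free, faster servers naturally handle more chunks, which is one simple way to get load balancing without a separate scheduler.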