Oracle® Enterprise Data Quality for Product Data Java API Interface Guide Release 11g R1 (11.1.1.6) Part Number E29133-03
This chapter describes how the DSA API is used with the Oracle DataLens Server.
This interface is the preferred way to run any data transformations on the Oracle DataLens Server. It directly executes the DSAs that are deployed to the Oracle DataLens Server and should be used in place of the Oracle DataLens API.
The WfgClient class is used as an interface to the Oracle DataLens Server. This class provides methods to perform DSA transformations.
The output records of a DSA do not necessarily match the input records one-to-one. A one-to-one match may occur, depending on the map. More likely, there will be multiple output steps, and each step will contain results for only a subset of the input data. Some output steps may return no data, and in other cases multiple output records may be returned for a single input record.
This means that the DSAs should pass the original ID into the processing, usually as the first data field. This provides a means for matching the output result data with the original input data.
Where data is only being processed and there is no need to link the results back to each individual input record, passing the ID through the DSA is not needed.
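The matching step described above can be sketched with plain Java collections. This is a minimal illustration, not part of the DSA API: it assumes each output row carries the original ID as its first field, and the class name ResultMatcher, the method groupByInputId, and the sample rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; it does not use or belong to the Oracle DataLens API.
class ResultMatcher {

    // Groups output rows by the ID held in their first field.
    // An input ID with no output rows maps to an empty list.
    static Map<String, List<List<String>>> groupByInputId(
            List<String> inputIds, List<List<String>> outputRows) {
        Map<String, List<List<String>>> byId = new LinkedHashMap<>();
        for (String id : inputIds) {
            byId.put(id, new ArrayList<>());
        }
        for (List<String> row : outputRows) {
            if (row.isEmpty()) {
                continue; // no ID field to match on
            }
            List<List<String>> rows = byId.get(row.get(0));
            if (rows != null) {
                rows.add(row);
            }
        }
        return byId;
    }

    public static void main(String[] args) {
        List<String> inputIds = Arrays.asList("0", "1", "2");
        // Assumed sample output: two rows for ID 0, none for ID 1, one for ID 2.
        List<List<String>> outputRows = Arrays.asList(
                Arrays.asList("0", "Resistor", "20 Ohm"),
                Arrays.asList("0", "Resistor", "alternate"),
                Arrays.asList("2", "Capacitor", "10 uF"));
        Map<String, List<List<String>>> matched = groupByInputId(inputIds, outputRows);
        for (Map.Entry<String, List<List<String>>> entry : matched.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " output row(s)");
        }
    }
}
```

Keeping an entry for every input ID, even when no rows come back, makes it easy to see which input records produced no output in a given step.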
Use the following to transform data:
Import the WfgClient class and its supporting classes with the following lines:
import oracle.pdq.api1.api.client.WfgClient;
import oracle.pdq.api1.api.client.WfgResultLine;
import oracle.pdq.api1.api.client.WfgRequestLine;
import oracle.pdq.api1.api.bean.Fault;
import oracle.pdq.api1.api.iface.ErrorIF;
import oracle.pdq.api1.api.util.Priorities;
An instance of the WfgClient class needs to be created with the Oracle DataLens Server name and port.
Actual parameters:
Server Name - This can be either a machine name (such as "Production") or an IP address (such as "127.0.0.1").
Server Port - This is the port number of the server. By default, the Oracle DataLens Server is installed on port 2229.
Encryption flag - False uses normal HTTP communication; true uses the secure HTTPS.
Client Code - This is the "secret code" that the Oracle DataLens Server provides with your server license to prevent unauthorized access to the Oracle DataLens Server via this API. This code is built into the Oracle DataLens Server license and is only active if requested as part of the license. This value can be left blank if the server license has no code.
Application - The name of the application that initiated the client request to the server. This name is used to accumulate server statistics on the Oracle DataLens Server Administration Web Pages.
// Create WfgClient object
WfgClient wfgClient;
wfgClient = new WfgClient(serverName, SERVER_PORT, ENCRYPTION, clientCode, APPLICATION);
This is a brief example of creating the input data list. First, you create the list from an array of static data as follows:
private static final String m_inputData[][] = {
    {"0", "Res, 20 Ohm"},
    {"1", "Res, Net 4 W"}
};
The preceding data is just an example. Your data will come from your application, from an input file, or from a database query.
In any case, the data needs to be put into an input list for the Java API to process the data. Following is an example of creating and populating the list using the example data. In this case, each input record is added as a list of String fields, which is useful for data that is already separated into fields.
// Set up this list of String fields for the request
List list = new ArrayList();
for (int i = 0; i < m_inputData.length; i++) {
    List fields = new ArrayList();   // Create a List of Strings
    fields.add(m_inputData[i][0]);   // Add the ID data field
    fields.add(m_inputData[i][1]);   // Add the Description field
    list.add(fields);
}
A list is passed to the runRtJob method and a single job ID is returned. The runRtJob method is called just a single time with a single list of data.
Actual parameters:
Job ID - The DSA Job ID obtained from the runJob call.
DSA Name - The name of the DSA to run on the Oracle DataLens Server.
Description - A description of this particular job.
List - The list with a list of String input fields.
// Start the DSA job with our data
// NOTE: Input data with a List containing a list of string attributes.
// This is useful for already separated data
m_wfgClient.setLinesFromFields(list);
int m_jobID = m_wfgClient.runJob(PMapName, "My Job");
The preceding call is using the following default values:
Job Priority of medium
Job run-time locale of USA English
In any case, the data needs to be put into an input list for the Java API to process the data. Following is an example of creating and populating the list using the example data. In this case, the input data needs to be separated using the character separator. The following example uses the Tab character. This interface is best used when there is only one field of input data to be processed.
List list = new ArrayList();
for (int i = 0; i < m_inputData.length; i++) {
    // Join the ID and the description with the Tab separator
    list.add(new WfgRequestLine(m_inputData[i][0] + "\t" + m_inputData[i][1]));
}
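When a record has more than two fields, joining them by hand becomes error-prone, because a field that itself contains the separator would shift every later column. The following is a minimal sketch of a helper that builds one tab-separated line and rejects such fields. The class TabLineBuilder and the method toLine are hypothetical and produce plain strings only; each resulting string would then be wrapped in a WfgRequestLine as in the preceding example.

```java
// Hypothetical helper; it builds plain strings and does not call the Oracle DataLens API.
class TabLineBuilder {

    static final char SEPARATOR = '\t';

    // Joins the fields with the Tab separator. A null field becomes an empty field,
    // and a field that contains the separator is rejected.
    static String toLine(String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            String field = fields[i] == null ? "" : fields[i];
            if (field.indexOf(SEPARATOR) >= 0) {
                throw new IllegalArgumentException(
                        "Field " + i + " contains the separator character");
            }
            if (i > 0) {
                line.append(SEPARATOR);
            }
            line.append(field);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        String[][] inputData = {
            {"0", "Res, 20 Ohm"},
            {"1", "Res, Net 4 W"}
        };
        for (String[] record : inputData) {
            // Show the Tab as a visible marker when printing
            System.out.println(toLine(record).replace("\t", "<TAB>"));
        }
    }
}
```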
Now process the data using the list you created. The list contains the WfgRequestLine objects that have been initialized with the tab-separated input data.
// Start the DSA job with our data
wfgClient.setLines(list);
int m_jobID = wfgClient.runJob(PMapName, "Comment: API Test Job 1");
When the job has finished, the transformed data can be retrieved from the server back to the client application. The getResultData method is called just a single time and returns a list of WfgResultLine objects containing the result data.
The call will wait until the job has finished processing the data before control is returned to the program with the result data.
// Get the DSA Results!
boolean waitForResults = true;
resultData = wfgClient.getResultData(jobID, waitForResults);
You can check the job status and do other processing while waiting for the job to complete. To do so, call the getResultData method without waiting for the results. If the job is still processing the input data, the method will throw a fault indicating that the job has not completed.
try {
    // Get the DSA Results without waiting for the job to finish
    boolean waitForResults = false;
    resultData = wfgClient.getResultData(jobID, waitForResults);
} catch (Fault f) {
    // Check if the job has not completed yet
    if (f.getErrorCode() == ErrorIF.ERROR_NOT_COMPLETED)
    . . .
}
Server faults that can be thrown from a call to getResultData include the following ErrorIF errors:
ERROR_JOB_CANCELED
ERROR_CANCEL_FAILED
ERROR_COPY_FAILED
ERROR_JOB_FAILED
ERROR_NOT_COMPLETED
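The not-completed case described above lends itself to a bounded polling loop. The following is a minimal sketch under stated assumptions: ResultSource and NotCompletedException are hypothetical stand-ins for a non-waiting getResultData call and for a fault carrying the ERROR_NOT_COMPLETED code, so that the example runs without the Oracle libraries. The attempt limit and pause length are arbitrary values chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types belong to the Oracle DataLens API.
class ResultPoller {

    // Stand-in for a fault whose error code is ERROR_NOT_COMPLETED.
    static class NotCompletedException extends Exception {
        NotCompletedException() {
            super("job has not completed yet");
        }
    }

    // Stand-in for a non-waiting result request.
    interface ResultSource {
        List<String> fetch() throws NotCompletedException;
    }

    // Tries up to maxAttempts times, pausing between attempts.
    // Returns null if the job never completes or the thread is interrupted.
    static List<String> poll(ResultSource source, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return source.fetch();
            } catch (NotCompletedException e) {
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(pauseMillis); // other work could be done here instead
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Fake job that reports "not completed" twice before returning data.
        ResultSource fakeJob = () -> {
            if (++calls[0] < 3) {
                throw new NotCompletedException();
            }
            return Arrays.asList("0\tResistor\t20 Ohm");
        };
        List<String> result = poll(fakeJob, 5, 10L);
        System.out.println("Attempts: " + calls[0] + ", lines: " + result.size());
    }
}
```

In a real client, the other fault codes listed above would normally end the loop immediately, since they indicate a canceled or failed job rather than one that is still running.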
When the job has finished, the transformed data can be retrieved from the server back to the client application. The getResultData method is called just a single time for each output step. Each call returns a list of WfgResultLine objects containing the result data, just as with jobs that have a single output step.
The call will wait until the job has finished processing the data before control is returned to the program with the result data.
// Get the DSA Results!
resultData = wfgClient.getResultData(jobID, stepName, waitForResults);
The call to getResultData can be made synchronously or asynchronously as demonstrated above.
The result data is returned as a list of WfgResultLine objects.
This is how the result data fields should be pulled from the output lines. This list interface will always maintain all the columns of output data, even if there is no data for a particular output data field. In this case, the data field result will be a null value.
The following code excerpt demonstrates pulling out the individual data lines, with the individual data fields.
// Iterate through the result data lines
Iterator iter = resultData.iterator();
while (iter.hasNext()) {
    WfgResultLine resultLine = (WfgResultLine) iter.next();
    List outFields = resultLine.getDataFields();
    // Iterate through the result data fields
    Iterator i2 = outFields.iterator();
    while (i2.hasNext()) {
        String outField = (String) i2.next();
        System.out.print(outField);
        if (i2.hasNext()) System.out.print(", ");
    }
    System.out.println(" ");
}
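Because a field with no data comes back as a null value, printing the fields directly shows the text "null" in the output. The following is a minimal sketch of a null-safe join that renders such fields as empty strings instead. The class FieldFormatter and the method joinFields are hypothetical, and the sketch works on a plain list of strings rather than on a WfgResultLine.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; it operates on plain lists and does not call the Oracle DataLens API.
class FieldFormatter {

    // Joins the fields with the delimiter, keeping every column position
    // and writing an empty string where a field is null.
    static String joinFields(List<String> fields, String delimiter) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                out.append(delimiter);
            }
            String field = fields.get(i);
            out.append(field == null ? "" : field);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed sample row: the second output column has no data.
        List<String> outFields = Arrays.asList("0", null, "20 Ohm");
        System.out.println(joinFields(outFields, ", "));
    }
}
```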
This is a simple way to get to the result data for testing. The following code excerpt demonstrates pulling out each line of tab-separated output data.
Note:
This example works if you have specified an alternate separator character.
Iterator iter = resultData.iterator();
while (iter.hasNext()) {
    WfgResultLine resultLine = (WfgResultLine) iter.next();
    System.out.println(resultLine.getData());
}
The DSA client can list jobs, which can also be listed from the Oracle DataLens Server Administration Web Pages. The following types of lists can be retrieved from the server.
All Jobs (also since a particular date)
All jobs that have not completed
All jobs for a particular submitter (also since a particular date)
All not-completed jobs for a particular submitter
All jobs for a particular approver
The following code shows the calls in the order listed above:
List allJobs = wfgClient.listAllJobs(sinceTS);
List ncJobs = wfgClient.listNCJobs();
List submitterJobs = wfgClient.listSubmitterJobs(submitter, sinceTS);
List ncSubmitterJobs = wfgClient.listNCSubmitterJobs(submitter);
List approverJobs = wfgClient.listApproverJobs(approver);
These calls all return lists of WfgJobInfo objects.
Information can also be obtained from a single job given the Job ID. The following Java code example shows this:
WfgJobInfo jobInfo = wfgClient.listJob(jobID);
This call returns a single WfgJobInfo object with the job details.
Additionally, all the details on the steps are returned as well. To get the steps, use the getSteps method call as shown in the following example:
List steps = jobInfo.getSteps();
These steps are a list of WfgJobStepInfo objects with all the details on the individual job steps.
The DSA API can use a text file as input and a text file as output. The complete path to the input file and the complete path to the output directory are needed. Use the setters to toggle on the input/output directory locations as in the following example:
// Setting the input file and output directory toggles on file processing
wfgClient.setOutputDirectory(outputLocation);
wfgClient.setInputFilePath(filePath);
jobID = wfgClient.runJob(transformProcess, desc);
These file input paths and the file output paths are sent directly to the Oracle DataLens Server. This means that the paths must be paths that are relative to the server. For example, if you give the path to an input file as:
C:/temp/raw_data.txt
This file is read from the C drive on the server machine, not the C drive on the client machine. The output directory is likewise a path on the server machine.
The source path can be a UNC path to a file on a remote machine.
Here is an example:
//node_name/shared/test.txt
WfgClient
These are options that can be used by the WfgClient. In fact, these settings can be used by any of the Oracle DataLens Server Client API classes. For a complete list of methods in the WfgClient class and additional information, see the Javadoc documentation as described in "Related Documents".
Setting the retry count is useful to control the amount of time that the client attempts to connect to the Oracle DataLens Server. The default is to retry 20 times. This could be a problem in an interactive user environment, where the user cannot wait a couple of minutes while WfgClient is attempting to connect to the server. In these cases you could set the retry count to 1 or even 0. See also PingClient, which can be used to check whether a particular server is responding.
// Just set the retry to one for starting the job, then use the default
wfgClient.setRetryCount(1);
jobID = wfgClient.runRtJob(transformProcess, jobPriority, desc, rtLocale, input);
wfgClient.setRetryCountToDefault();
By default, data filtering is turned on for all input data. This filters out any inadvertent control characters that may be interspersed in your input data. Such characters can cause problems with processing, and sometimes with sending the data from the client to the server via HTTP as XML SOAP documents. Tab characters are never filtered out.
// By default filtering is turned on and nothing needs to be done
wfgClient.setFilterData(false);
jobID = wfgClient.runRtJob(transformProcess, jobPriority, desc, rtLocale, input);
In the preceding example, the parameter input (with the List of input data) will be filtered.
Where the filtering encounters control characters in the input data, they are replaced with the "?" character. This helps you track down the source and exact location of the control characters. The data lens can ignore the "?" character when processing the input lines.
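The substitution described above can be illustrated with a few lines of plain Java. This is only a sketch of the described behavior, not the filter that the client actually uses: it assumes that "control characters" means the ISO control ranges reported by Character.isISOControl, and the class ControlCharFilter and the method filter are hypothetical.

```java
// Hypothetical illustration of the described filtering; not the Oracle implementation.
class ControlCharFilter {

    // Replaces each control character with '?', except Tab, which is kept
    // because it serves as the default field separator.
    static String filter(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean replace = c != '\t' && Character.isISOControl(c);
            out.append(replace ? '?' : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed sample: a BEL character (code 7) embedded in a description.
        String raw = "0\tRes," + (char) 7 + " 20 Ohm";
        System.out.println(filter(raw));
    }
}
```

Running a similar check on your own data before submitting a job is one way to locate stray control characters at the source.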
By default, a job priority of medium is used for all jobs.
This is the priority the job will be given on the server for processing. Large overnight batch jobs should be given a priority of low. Small jobs with few input records, or requests that need a quick response (such as users waiting for a response), should get a priority of high. All other jobs should use a priority of medium. The number of concurrent jobs that can be run on the server is also controlled by the priority of the job (for more information, use the Configuration link on the Oracle DataLens Server Administration Web Pages). The following priority values are available from the Priorities class in the edqp-api.jar:
Priorities.PRIORITY_LOW
Priorities.PRIORITY_MEDIUM
Priorities.PRIORITY_HIGH
// Set the job priority
wfgClient.setPriority(Priorities.PRIORITY_HIGH);
By default, a run-time locale of USA English (en_US) is used.
Set the locale to use for output of this job.
// Set the run-time locale
wfgClient.setRuntimeLocale(RT_LOCALE);
By default, a field separator character of tab is used.
// Set the field separator character
wfgClient.setFieldSeparator('|');
Note:
If you are using a separator character other than the default, then the separator character must be specified when pulling the data fields from the WfgResultLine data object.
List fields = wfgResultLine.getDataFields(FIELD_SEPARATOR_CHAR);
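If you ever split a raw result line yourself, for example in a test harness, be aware that characters such as '|' have a special meaning in Java regular expressions. The following is a minimal sketch that splits a line on an arbitrary separator character. It illustrates the idea only and makes no claim about how getDataFields works internally; the class SeparatorSplitter and the method splitFields are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; it works on plain strings and does not call the Oracle DataLens API.
class SeparatorSplitter {

    // Splits a line on the given separator character.
    // Pattern.quote makes characters such as '|' match literally,
    // and the -1 limit keeps trailing empty fields so column positions stay stable.
    static List<String> splitFields(String line, char separator) {
        String regex = Pattern.quote(String.valueOf(separator));
        return Arrays.asList(line.split(regex, -1));
    }

    public static void main(String[] args) {
        // Assumed sample line with two empty trailing columns.
        List<String> fields = splitFields("0|Resistor||", '|');
        System.out.println(fields.size() + " fields: " + fields);
    }
}
```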
By default, tracing is toggled off when a new WfgClient object is created.
When toggled on, it dumps the client information to standard output prior to sending the request to the server. This is only used for debugging and should never be toggled on in a production environment.
// Toggle on client data to standard output
wfgClient.setTrace(true);
If set, an API request that would return a list or update a file will instead email the results to the specified user.
// Send the results to the following user
wfgClient.setEmailAddress("user1@systems.com");
A DSA that updates a database will continue to update the database.
A DSA can be defined to return the results to an email address. This will work regardless of this API email setting. In fact, the email address in the DSA will take precedence over this email set in the API.
If set, an API request that would return a list or update a file will instead send the results to the specified FTP location.
// Send the results to the following FTP site
wfgClient.setFtpName("internal");
This value should not be set if setEmailAddress is being used. In addition, the FTP name (internal in this example) must be set up on the Oracle DataLens Server.
By default, database parameters are not used.
This is used where the input map is expecting input from a database query and the query requires parameters that must be passed in.
Create a list of parameters and then set the database parameters as shown in the following code excerpt:
// Set the database parameters
List dbParams = new ArrayList();
dbParams.add("first_parameter");
dbParams.add("second_parameter");
wfgClient.setDbParameters(dbParams);