8 Using the EDQ Configuration API

EDQ provides a set of REST-based interfaces that enable you to perform various configuration tasks programmatically, using any preferred programming language.

In this chapter, the EDQ service is assumed to be installed at:

http://edqserver:8001/edq

This chapter describes these interfaces and the operations that you can perform with them.

8.1 REST Interface for Projects

The REST interface for working with EDQ projects is:

http://edqserver:8001/edq/config/projects

This interface allows you to retrieve a list of projects, create a project, and delete a project.

8.1.1 Retrieving a List of EDQ Projects

To get a list of all projects available in the current EDQ installation, run an HTTP GET operation on the REST interface for EDQ projects:

GET http://edqserver:8001/edq/config/projects

If the request succeeds, the list of projects is returned in JSON format:

[  
   {  
      "id":10,
      "name":"My Project"
   },
   {  
      "id":12,
      "name":"Scratch Project"
   }
]
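As a minimal sketch, assuming the response body has the shape shown above, a client could look up a project ID by name as follows. The helper name find_project_id is hypothetical and not part of EDQ; the sample string repeats the response above.

```python
import json


def find_project_id(response_body, project_name):
    """Return the ID of the named project, or None if it is not listed."""
    for project in json.loads(response_body):
        if project["name"] == project_name:
            return project["id"]
    return None


# Sample body copied from the response shown above.
sample = '[{"id":10,"name":"My Project"},{"id":12,"name":"Scratch Project"}]'
print(find_project_id(sample, "Scratch Project"))
```

The returned ID is the value to pass as pid to the other interfaces in this chapter.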

8.1.2 Creating a Project

To create a new project, create a JSON object that describes the project and send it in the body of an HTTP POST request.

For example:

POST http://edqserver:8001/edq/config/projects

{ "name" : "Profile Customer Names" , 
  "description" : "Profile my customers" }

This request returns an OK response with a body similar to the following:

{"id":14,"name":"Profile Customer Names"}
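The request above could be assembled with the Python standard library roughly as follows. This is a sketch, not a definitive client: the function name is hypothetical, the Content-Type header is an assumption about what the server expects, and authentication (which this chapter does not describe) is left out. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://edqserver:8001/edq"  # installation URL assumed in this chapter


def build_create_project_request(name, description):
    """Build (but do not send) the POST request that creates a project."""
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/config/projects",
        data=body,
        method="POST",
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
    )


request = build_create_project_request("Profile Customer Names", "Profile my customers")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; an error status such as 500 is then raised as urllib.error.HTTPError.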

However, if an error occurs while creating a project, a "500 Internal Server Error" response is returned, along with an error message similar to the following:

"Profile Customer Names" already exists (Code: 205,130)

8.1.3 Deleting a Project

To delete a project, call HTTP DELETE on the REST interface and specify the project to delete as a query parameter.

There are two ways to specify a project for deletion:

  • By ID using pid=<NN>

  • By name using pname=<Name>

To delete a project by ID, do the following:

DELETE http://edqserver:8001/edq/config/projects?pid=14

To delete a project by name, do the following:

DELETE http://edqserver:8001/edq/config/projects?pname=Profile%20Customer%20Names

In both cases, the result is a string similar to the following:

Project Profile Customer Names deleted

If an invalid project is specified, a 406 'Not Acceptable' response is returned with an appropriate message, for example:

Bad project ID "14" (Code: 205,454)

or

No project named "Profile Customer Names" (Code: 205,453)
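The two forms of the DELETE URL can be generated as in the following sketch. The function name is hypothetical; only the pid and pname parameters come from the text above.

```python
from urllib.parse import quote, urlencode

PROJECTS_URL = "http://edqserver:8001/edq/config/projects"


def project_delete_url(pid=None, pname=None):
    """Return the DELETE URL for a project identified by ID or by name."""
    if (pid is None) == (pname is None):
        raise ValueError("specify exactly one of pid or pname")
    params = {"pid": pid} if pid is not None else {"pname": pname}
    # quote_via=quote encodes spaces as %20, matching the examples above.
    return PROJECTS_URL + "?" + urlencode(params, quote_via=quote)


print(project_delete_url(pid=14))
print(project_delete_url(pname="Profile Customer Names"))
```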

8.2 REST Interface for Data Stores

You can query and manipulate data stores in EDQ using the following interface:

http://edqserver:8001/edq/config/datasources

This interface allows you to retrieve a list of data stores, create a data store, and delete a data store.

8.2.1 Retrieving a List of Data Stores

To get a list of data stores, call the interface with a valid project, specified by name or ID.

For example:

GET http://edqserver:8001/edq/config/datasources?pid=14

If successful, an OK response is returned along with a list of data stores in the response body.

For example:

[  
   {  
      "client":false,
      "id":36,
      "name":"Individuals",
      "properties":[  
         {  
            "name":"quote",
            "value":"\""
         },
         {  
            "name":"encoding",
            "value":"ISO-8859-1"
         },
         {  
            "name":"file",
            "value":"Customer/customerindividuals.csv"
         },
         {  
            "name":"cols",
            "value":""
         },
         {  
            "name":"project",
            "value":"59"
         },
         {  
            "name":"hdr",
            "value":"1"
         },
         {  
            "name":"usepr",
            "value":"0"
         },
         {  
            "name":"skip",
            "value":""
         },
         {  
            "name":"sep",
            "value":","
         }
      ],
      "species":"servertxt"
   }
]
 

8.2.2 Creating a Data Store

To create a data store, create a JSON object that describes the data store and POST it to the endpoint, specifying the project (by name or ID) that will own the data store.

For example:

A server-based .csv file that has been placed in the landing area:

POST http://edqserver:8001/edq/config/datasources?pid=14
{  
   "client":false,
   "name":"Individuals",
   "properties":[  
      {  
         "name":"quote",
         "value":"\""
      },
      {  
         "name":"encoding",
         "value":"ISO-8859-1"
      },
      {  
         "name":"file",
         "value":"Customer/customerindividuals.csv"
      },
      {  
         "name":"hdr",
         "value":"1"
      },
      {  
         "name":"usepr",
         "value":"0"
      },
      {  
         "name":"sep",
         "value":","
      }
   ],
   "species":"servertxt"
}

A server-based Oracle schema:

POST http://edqserver:8001/edq/config/datasources?pid=14
{  
   "client":false,
   "name":"Staging",
   "properties":[  
      {  
         "name":"service",
         "value":"sid"
      },
      {  
         "name":"sid",
         "value":"orcl"
      },
      {  
         "name":"user",
         "value":"staging"
      },
      {  
         "name":"port",
         "value":"1521"
      },
      {  
         "name":"password",
         "value":"staging"
      },
      {  
         "name":"host",
         "value":"localhost"
      }
   ],
   "species":"oracle"
}

If successful, an OK response is returned along with the name and ID of the data store, as shown in the following example:

{"id":42,"name":"Staging"}

The value of the species parameter depends on the type of data store. For example, for an Oracle database, the value of species is "oracle". Each species has its own set of properties.

The following table lists the properties for the species "oracle":

Property   Type                   Required   Description
host       String                 Yes        The machine hosting the database
port       Number                 Yes        Port number of the database
sid        String                 Yes        Database identifier
service    Choice of sid or srv   Yes        Whether the database identifier above is a SID or a SERVICE (srv) name
user       String                 Yes        User to log in as
password   String                 No         Password for the user
schema     String                 No         The schema to use (usually left empty)

The following table lists the properties for the species "servertxt":

Property   Type      Required   Description
file       String    Yes        The name and location of the file in the landing area
usepr      Boolean   Yes        Use project-specific landing area
hdr        Boolean   Yes        Treat the first line as header
sep        String    No         The field delimiter
quote      Choice    No         Quote character
cols       Integer   No         Number of columns to read
encoding   String    Yes        The character encoding of text
skip       Integer   No         The number of lines to skip at the start

Note:

For Boolean types, use 0 for false and 1 for true.

For quote, the value must be a double quote ("\""), a single quote ("'"), or an empty value ("").

For the species "other", which uses a JDBC connection, the properties are listed in the following table:

Property   Type     Required   Description
driver     String   Yes        The JDBC driver Java class
url        String   Yes        The address of the database
user       String   No         The user to log in as
password   String   No         The user's password
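Because every species uses the same list of name/value pairs, a small helper can build the "properties" array from an ordinary dictionary. The following is a sketch with hypothetical function names; it assumes, as in the examples above, that all values are sent as strings and that Booleans are sent as 1 or 0.

```python
import json


def to_properties(settings):
    """Convert a dict into the list of name/value pairs used in "properties"."""
    properties = []
    for name, value in settings.items():
        if isinstance(value, bool):
            value = "1" if value else "0"  # Booleans are sent as 1 or 0
        properties.append({"name": name, "value": str(value)})
    return properties


def oracle_data_store(name, host, port, sid, user, password=""):
    """Build the JSON object for a server-based Oracle data store."""
    return {
        "client": False,
        "name": name,
        "properties": to_properties({
            "service": "sid", "sid": sid, "user": user,
            "port": port, "password": password, "host": host,
        }),
        "species": "oracle",
    }


store = oracle_data_store("Staging", "localhost", 1521, "orcl", "staging", "staging")
print(json.dumps(store, indent=3))
```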

8.2.3 Deleting a Data Store

To delete a data store, call HTTP DELETE on the endpoint, specifying either a data store ID or a valid project (by name or ID) and a data store name. For example:

DELETE http://edqserver:8001/edq/config/datasources?id=42

or

DELETE http://edqserver:8001/edq/config/datasources?pid=14&name=Staging

When the deletion is successful, an OK response is returned without any response body.

8.3 REST Interface for Snapshots

The REST interface for snapshots is:

http://edqserver:8001/edq/config/snapshots

It allows you to retrieve a list of snapshots, create a snapshot, and delete a snapshot.

8.3.1 Retrieving a List of Snapshots

To retrieve a list of snapshots, specify a valid project. For example:

GET http://edqserver:8001/edq/config/snapshots?pid=14

or

GET http://edqserver:8001/edq/config/snapshots?pname=Profile%20Customer%20Names

If successful, an OK response is returned with a list of snapshots in the response body.

For example:

[  
   {  
      "columns":[  
         "TITLE",
         "FULLNAME",
         "GIVENNAMES",
         "FAMILYNAME",
         "NAMETYPE",
         "PRIMARYNAME",
         "ADDRESS1",
         "ADDRESS2",
         "ADDRESS3",
         "ADDRESS4",
         "CITY",
         "STATE",
         "POSTALCODE"
      ],
      "datasource":"Individuals",
      "name":"Individuals",
      "table":"customerindividuals.csv"
   }
]

8.3.2 Creating a Snapshot

To create a snapshot, create a JSON object that describes the snapshot and POST it to the interface, specifying the project where the snapshot will be created.

For example:

POST http://edqserver:8001/edq/config/snapshots?pid=14
{  
   "name":"Individuals",
   "description":"Customer data",
   "datasource":"Individuals",
   "table":"customerindividuals.csv",
   "columns":[  
      "TITLE",
      "FULLNAME",
      "GIVENNAMES",
      "FAMILYNAME",
      "NAMETYPE",
      "PRIMARYNAME",
      "ADDRESS1",
      "ADDRESS2",
      "ADDRESS3",
      "ADDRESS4",
      "CITY",
      "STATE",
      "POSTALCODE"
   ],
   "sampling":{  
      "number":100,
      "offset":0,
      "ordering":"ascending",
      "count":"true"
   }
}

If successful, an OK response is returned with the snapshot ID in the response body.

For example:

{"id":68,"name":"Individuals"}

8.3.3 Deleting a Snapshot

To delete a snapshot, specify either a snapshot ID or a valid project and snapshot name.

For example:

DELETE http://edqserver:8001/edq/config/snapshots?id=68

or

DELETE http://edqserver:8001/edq/config/snapshots?pid=14&name=Individuals

or

DELETE http://edqserver:8001/edq/config/snapshots?pname=Profile%20Customer%20Names&name=Individuals

When the specified snapshot is deleted successfully, an OK response is returned with a message in the response body similar to the following:

Snapshot Individuals deleted

8.4 REST Interface for Processes

The interface for EDQ processes is:

http://edqserver:8001/edq/config/processes

Using this interface, you can retrieve a list of processes and delete a process.

The following sub-level interface allows you to create a simple profiling process:

http://edqserver:8001/edq/config/processes/simpleprocess

See the Creating a Simple Process section for details.

8.4.1 Retrieving a List of Processes

To get a list of processes in a project, call HTTP GET on the processes interface with a valid project, specified by name or ID.

Example:

GET http://edqserver:8001/edq/config/processes?pid=14

or

GET http://edqserver:8001/edq/config/processes?pname=Profile%20Customer%20Names

An OK response is returned along with a list of processes.

Example:

[{"name":"Profile Names","id":31}]

If the request is not successful, a 404 'Not Found' or 500 'Internal Server Error' response is returned, with a string in the response body describing the error.

8.4.2 Deleting a Process

To delete a process, specify either the process ID, or the project ID or name, and the process name. For example:

DELETE http://edqserver:8001/edq/config/processes?id=31

or

DELETE http://edqserver:8001/edq/config/processes?pid=14&name=Profile%20Names

or

DELETE http://edqserver:8001/edq/config/processes?pname=Profile%20Customer%20Names&name=Profile%20Names

When the deletion is successful, an OK response is returned with a string message in the response body, as shown in the following example:

Process Profile Names deleted

If the deletion is not successful, one of the following errors is returned along with a response string:

  • 404 "Not Found"

  • 500 "Internal Server Error"

8.4.3 Creating a Simple Process

The interface currently supports creation of simple profiling processes only. To create a simple process, create a JSON object that describes the process and use HTTP POST to send it to the interface, specifying the project where the process should be created.

For example:

POST http://edqserver:8001/edq/config/processes/simpleprocess?pid=14
{  
   "name":"Profile Names",
   "description":"Profile Individuals Names",
   "reader":{  
      "name":"Read from Individuals",
      "stageddata":"Individuals"
   },
   "processors":[  
      {  
         "name":"Do Quickstats",
         "type":"dn:quickstatsprofiler",
         "columnlist":[  
            "GivenNames",
            "FamilyName"
         ]
      },
      {  
         "name":"Do Frequency Profiling",
         "type":"dn:attributefrequencycountsprofiler"
      }
   ]
}

When the simple process is created successfully, an OK response is returned along with the name and ID of the process in the response body.

For example:

{"id":33,"name":"Profile Names"}

An error response is returned in the following cases:

  • 404 'Not Found' if the project does not exist

  • 400 'Bad Request' if the JSON object is malformed

  • 500 'Internal Server Error' if a server error occurs during creation, along with a string in the response body describing the error.

The full list of supported processors is given in the following table:

Processor                      Type
Quick Stats Profiler           dn:quickstatsprofiler
Data Types Profiler            dn:datatypesprofiler
Max/Min Profiler               dn:maxandminprofiler
Length Profiler                dn:lengthprofiler
Record Completeness Profiler   dn:recordcompletenessprofiler
Character Profiler             dn:characterprofiler
Frequency Profiler             dn:attributefrequencycountsprofiler
Patterns Profiler              dn:attributepatternsprofiler
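A client can check processor types against this list before posting a process definition. The sketch below uses a hypothetical builder function; the reader name "Read from ..." simply follows the naming used in the example above and is not required by the interface as far as this chapter states.

```python
# Processor types copied from the table above.
SUPPORTED_TYPES = {
    "dn:quickstatsprofiler",
    "dn:datatypesprofiler",
    "dn:maxandminprofiler",
    "dn:lengthprofiler",
    "dn:recordcompletenessprofiler",
    "dn:characterprofiler",
    "dn:attributefrequencycountsprofiler",
    "dn:attributepatternsprofiler",
}


def simple_process(name, description, staged_data, processors):
    """Build the JSON object for a simple profiling process.

    processors is a list of (name, type, columns) tuples; columns may be None.
    """
    items = []
    for proc_name, proc_type, columns in processors:
        if proc_type not in SUPPORTED_TYPES:
            raise ValueError("unsupported processor type: " + proc_type)
        item = {"name": proc_name, "type": proc_type}
        if columns:
            item["columnlist"] = list(columns)
        items.append(item)
    return {
        "name": name,
        "description": description,
        "reader": {"name": "Read from " + staged_data, "stageddata": staged_data},
        "processors": items,
    }


payload = simple_process(
    "Profile Names",
    "Profile Individuals Names",
    "Individuals",
    [
        ("Do Quickstats", "dn:quickstatsprofiler", ["GivenNames", "FamilyName"]),
        ("Do Frequency Profiling", "dn:attributefrequencycountsprofiler", None),
    ],
)
print(payload["processors"])
```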

8.5 REST Interfaces for Jobs

The /config/jobs interface allows you to retrieve a list of jobs, delete a job, and create a simple job.

The URL for this interface is similar to the following:

http://edqserver:8001/edq/config/jobs

Another REST interface, /jobs, allows you to run a job, cancel a running job, get the status of a job, and get the details of all running jobs.

The URL for this interface is similar to the following:

http://edqserver:8001/edq/jobs

8.5.1 Retrieving a List of Jobs

You can get a list of jobs for a project using HTTP GET. You must specify at least one project as a query parameter; more than one project can be specified.

The project can be specified by using the pid or pname in the query parameter, as shown in the following example:

GET http://edqserver:8001/edq/config/jobs?pid=14

or

GET http://edqserver:8001/edq/config/jobs?pname=Profile%20Customer%20Names

The response lists all the jobs for the specified project or projects, as shown in the following example:

[  
   {  
      "id":99,
      "name":"Profile Names Job"
   },
   {  
      "id":98,
      "name":"Profile Individuals Job"
   }
]

8.5.2 Deleting a Job

To delete a job, specify either a valid job ID, or a valid project (using the pid or pname query parameter) and a valid job name.

To delete a job by job ID:

DELETE http://edqserver:8001/edq/config/jobs?id=99

To delete a job by job name:

DELETE http://edqserver:8001/edq/config/jobs?pid=14&name=Profile%20Names%20Job

or

DELETE http://edqserver:8001/edq/config/jobs?pname=Profile%20Customer%20Names&name=Profile%20Names%20Job

When the job is deleted successfully, a message appears to confirm the deletion:

Job Profile Names Job deleted

8.5.3 Creating a Simple Job

A simple job has a single phase and contains a single process. To create a simple job, specify a valid project to own the job (using the pid or pname query parameter) and create a JSON object that describes the job.

For example:

POST http://edqserver:8001/edq/config/jobs/simplejob?pid=14
{ 
"name" : "Profile Names Job" , 
"process" : "Profile Names" ,
"description" : "Profile Customer Names" ,
"resultsdrilldown" : "none"
}

The resultsdrilldown attribute can have any of the following values: none, sample, limited, all. The sample and limited values have the same effect.
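Generating the job definition with a JSON library, rather than writing it by hand, guarantees well-formed punctuation and makes it easy to check the resultsdrilldown value first. A minimal sketch, with a hypothetical function name:

```python
import json

# Values listed in the text above.
DRILLDOWN_VALUES = ("none", "sample", "limited", "all")


def simple_job(name, process, description="", resultsdrilldown="none"):
    """Build the JSON object that describes a simple job."""
    if resultsdrilldown not in DRILLDOWN_VALUES:
        raise ValueError("resultsdrilldown must be one of " + ", ".join(DRILLDOWN_VALUES))
    return {
        "name": name,
        "process": process,
        "description": description,
        "resultsdrilldown": resultsdrilldown,
    }


job = simple_job("Profile Names Job", "Profile Names", "Profile Customer Names")
print(json.dumps(job, indent=3))
```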

8.5.4 Running a Job

With the /jobs/run interface, you can run a named job using HTTP POST. The required parameters are the project name or project ID and the job name. Optionally, you can specify a run label and overrides.

The following example shows the request for running a job named "Real-time Start All" for the project "Profile Customer Name":

POST http://edqserver:8001/edq/jobs/run
{  
   "project":"Profile Customer Name",
   "job":"Real-time Start All",
   "overrides":[  
      {  
         "name":"a",
         "value":"b"
      },
      {  
         "name":"c",
         "value":"d"
      }
   ]
}

The JSON response to this request would be similar to the following:

{
"executionID": 2,
"runeverywhere": false
}

In this example, the response for the job "Real-time Start All" has "runeverywhere" set to false, which implies that the job runs in only one place. In such cases, an executionID is returned for the job. This executionID can be used to cancel the job and to query its status.

However, if the value of "runeverywhere" were true, then only the "jobtype" would be returned in the JSON response. For "runeverywhere" jobs, cancel and query calls are not supported.
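A client therefore needs to handle both response shapes. The following sketch, with a hypothetical function name, returns the execution ID when one is present and None otherwise, so that the caller knows not to attempt cancel or status calls.

```python
import json


def execution_id_from_run_response(body):
    """Return the execution ID, or None for a "runeverywhere" job."""
    result = json.loads(body)
    if result.get("runeverywhere"):
        return None
    # Also None when the response carries no executionID at all.
    return result.get("executionID")


# Sample body with the fields shown in the response above.
print(execution_id_from_run_response('{"executionID": 2, "runeverywhere": false}'))
```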

8.5.5 Cancelling a Running Job

To cancel a running job, use the HTTP POST operation. The interface URL is similar to the following:

POST http://edqserver:8001/edq/jobs/cancel

You need only the executionID of the job to cancel it. The following example shows the request body for cancelling the job with executionID 12:

{ 
"executionID": 12, 
"type" : "immediate" 
}

The following options are available with the "type" parameter:

  • immediate: This option cancels the job as early as possible.

  • keepresults: This option cancels the job but retains the results that have been generated so far.

  • shutdown: This option is used to cancel or shut down a job that runs a web service.

No response body is returned when a job is cancelled successfully. If the cancellation is unsuccessful, an HTTP error such as "404 Not Found" or "500 Internal Server Error" is returned.

8.5.6 Getting the Status of a Job

To get the status of an individual job, pass the executionID in the URL, which looks similar to the following:

GET http://edqserver:8001/edq/jobs/status?xid=executionID

For example, if the execution ID of the job "Real-time Start All" is 14, the URL would be:

GET http://edqserver:8001/edq/jobs/status?xid=14

The JSON response would be as follows:

{   "complete": true,   
   "endtime": "2016-04-20T08:45:22+01:00",
   "executionid": 14,
   "job": "Real-time START ALL",
   "project": "Profile Customer Name",
   "server": "edqserver",
   "starttime": "2016-04-20T08:44:48+01:00",
   "status": "finished"}

In this example, the "Real-time Start All" job triggers other jobs. Once all the jobs in the project have been triggered, the status of executionID 14 shows finished. However, the jobs triggered by the "Real-time Start All" job may still show the status running.
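A client that needs to wait for a job typically polls this interface until "complete" is true. The sketch below assumes the response fields shown above; the function names, polling interval, and poll limit are illustrative choices and not part of EDQ. The fetch function is injectable so that the loop can be exercised without a server.

```python
import json
import time
import urllib.request

STATUS_URL = "http://edqserver:8001/edq/jobs/status?xid={}"


def fetch_status(execution_id):
    """Fetch and decode the status of one execution (performs a network call)."""
    with urllib.request.urlopen(STATUS_URL.format(execution_id)) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_complete(execution_id, fetch=fetch_status, interval=5.0, max_polls=60):
    """Poll until "complete" is true and return the final status dict."""
    for _ in range(max_polls):
        status = fetch(execution_id)
        if status.get("complete"):
            return status
        time.sleep(interval)
    raise TimeoutError("execution %d did not complete in time" % execution_id)
```

For example, wait_until_complete(14) would block until execution 14 reports completion. As noted above, completion of a launcher job does not imply that the jobs it triggered have finished.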

8.5.7 Getting the Details of All Running Jobs

The status of all running jobs in a project can be retrieved using the /jobs/running interface, with a URL similar to the following:

GET http://edqserver:8001/edq/jobs/running

The following example shows the output in JSON format:

[
   {
      "complete": false,
      "executionid": 4,
      "job": "Real-time Individual Clean",
      "project": "Profile Customer Name",
      "server": "edqserver",
      "starttime": "2016-04-19T10:05:30.74+01:00",
      "status": "running"
   },
   {
      "complete": false,
      "executionid": 8,
      "job": "Real-time Address Match",
      "project": "Profile Customer Name",
      "server": "edqserver",
      "starttime": "2016-04-19T10:06:11.755+01:00",
      "status": "running"
   },
   {
      "complete": false,
      "executionid": 5,
      "job": "Real-time Entity Clean",
      "project": "Profile Customer Name",
      "server": "edqserver",
      "starttime": "2016-04-19T10:05:32.778+01:00",
      "status": "running"
   },
   {
      "complete": false,
      "executionid": 6,
      "job": "Real-time Address Clean",
      "project": "Profile Customer Name",
      "server": "edqserver",
      "starttime": "2016-04-19T10:05:32.782+01:00",
      "status": "running"
   }
]

Optionally, you can filter the results with query parameters for the project name, job name, and run label. To return jobs regardless of run label, omit the run label parameter; to return only jobs with no run label, set it to an empty value. The URL in this case would be similar to the following:

/jobs/running?[project=project[&job=job]][&runlabel=]
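The distinction between an omitted and an empty run label can be kept explicit when building the URL. In this sketch the function name is hypothetical; the parameter names project, job, and runlabel are taken from the URL pattern above, and the treatment of an empty run label is one reading of the text rather than confirmed behavior.

```python
from urllib.parse import quote, urlencode

RUNNING_URL = "http://edqserver:8001/edq/jobs/running"


def running_jobs_url(project=None, job=None, runlabel=None):
    """Build the /jobs/running URL; None omits a parameter, "" keeps it empty."""
    candidates = (("project", project), ("job", job), ("runlabel", runlabel))
    params = [(key, value) for key, value in candidates if value is not None]
    if not params:
        return RUNNING_URL
    return RUNNING_URL + "?" + urlencode(params, quote_via=quote)


print(running_jobs_url())
print(running_jobs_url(project="Profile Customer Name", runlabel=""))
```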

8.6 REST Interface for Reference Data

Reference data can exist within a project or outside of all projects at system level. To refer to reference data at system level, specify the project ID as 0 or pid=0.

The interface for reference data is:

http://edqserver:8001/edq/config/referencedata

The interface for reference data content is:

http://edqserver:8001/edq/config/referencedata/contents

You can use these interfaces to retrieve a list of reference data, retrieve the contents of reference data, create reference data, and delete reference data.

8.6.1 Retrieving a List of Reference Data

To get a list of reference data, specify pid=0 for reference data defined at system level, or identify a project using the pid or pname parameter.

For system-level reference data, using pid=0:

GET http://edqserver:8001/edq/config/referencedata?pid=0

For a project, using pname:

GET http://edqserver:8001/edq/config/referencedata?pname=Profile%20Customer%20Names

A list of JSON objects representing the reference data is returned. The output looks similar to the following:

[  
   {  
      "activerows":2,
      "category":"charactertokeymap",
      "columns":[  
         {  
            "key":true,
            "name":"Name",
            "type":"STRING",
            "unique":true
         },
         {  
            "name":"Value",
            "type":"STRING",
            "value":true
         }
      ],
      "id":39,
      "name":"Tokens",
      "totalrows":2
   }
]

8.6.2 Retrieving Contents of Reference Data

To list the contents of reference data, use the reference data contents interface. You must specify a valid project or use pid=0 for system level.

For example:

GET http://edqserver:8001/edq/config/referencedata/contents?pid=0&id=40

or

GET http://edqserver:8001/edq/config/referencedata/contents?pid=0&name=ShortNameMap

This returns information about the reference data rows, as shown in the following code snippet:

{  
   "activerows":4,
   "columns":[  
      {  
         "key":true,
         "name":"ShortName",
         "type":"STRING",
         "unique":true,
         "value":true
      },
      {  
         "key":true,
         "name":"LongName",
         "type":"STRING",
         "value":true
      }
   ],
   "description":"Map short names to long names",
   "id":43,
   "name":"ShortNameMap",
   "rows":[  
      {  
         "data":[  
            "Jeff",
            "Jeffrey"
         ]
      },
      {  
         "data":[  
            "Jon",
            "Jonathan"
         ]
      }
   ],
   "totalrows":4
}
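Each row's "data" array appears to list values in the same order as the "columns" array. Under that assumption, the following sketch (with a hypothetical function name and a trimmed copy of the response above) turns the contents into a list of dictionaries keyed by column name.

```python
import json


def rows_as_dicts(contents):
    """Pair each row's "data" values with the column names, in order."""
    names = [column["name"] for column in contents["columns"]]
    return [dict(zip(names, row["data"])) for row in contents["rows"]]


# Trimmed copy of the response shown above.
sample = json.loads("""
{
   "columns":[{"name":"ShortName","type":"STRING"},
              {"name":"LongName","type":"STRING"}],
   "name":"ShortNameMap",
   "rows":[{"data":["Jeff","Jeffrey"]},
           {"data":["Jon","Jonathan"]}]
}
""")
for row in rows_as_dicts(sample):
    print(row["ShortName"], "->", row["LongName"])
```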

8.6.3 Creating Reference Data

To create reference data, create a JSON object describing the reference data and POST it to the interface, specifying either pid=0 for system level or a valid project.

For example:

POST http://edqserver:8001/edq/config/referencedata?pname=Profile%20Customer%20Names

{ 
"name" : "ShortNameMap",
"description" : "Map short names to long names",
"columns":
        [
            {
                "key": true,
                "name": "ShortName",
                "type": "STRING",
                "unique": true,
                "value": true
            },
            {
                "key": true,
                "name": "LongName",
                "type": "STRING",
                "value": true
            }
    ],
"rows":
    [
        {
            "data":
            [
                "Jeff",
                "Jeffrey"     
            ]
        },
        {
            "data":
            [
                "Jon",
                "Jonathan"
            ]
        }
  ]
}

On successful creation, a response similar to the following is returned:

{"id":40,"name":"ShortNameMap"}

8.6.4 Deleting Reference Data

To delete reference data, call HTTP DELETE on the reference data interface, with either a valid reference data ID or a valid project (including pid=0 for system level) and a valid reference data name.

To delete by reference data ID:

DELETE http://edqserver:8001/edq/config/referencedata?id=40

To delete by reference data name:

DELETE http://edqserver:8001/edq/config/referencedata?pid=0&name=ShortNameMap

After a successful deletion, the response returns a string message, such as the following:

Reference data ShortNameMap deleted

8.7 REST Interface for Web Services

The interface for web services is:

http://edqserver:8001/edq/config/webservices

It allows you to retrieve a list of web services, create or update a web service, and delete a web service, using the GET, POST, and DELETE operations respectively.

8.7.1 Retrieving a List of Web Services

You can get a list of the web services defined for one or more projects by using the HTTP GET operation on the web services interface. Specify each project using either the pid or pname parameter.

For example:

GET http://edqserver:8001/edq/config/webservices?pid=14&pid=20

A successful call returns a list of web services, and their input and output interfaces, in the response body:

[  
   {  
      "id":1,
      "inputs":{  
         "attributes":[  
            {  
               "name":"Name",
               "type":"STRING"
            }
         ],
         "multirecord":false
      },
      "name":"Long Names",
      "outputs":{  
         "attributes":[  
            {  
               "name":"LongName",
               "type":"STRING"
            }
         ],
         "multirecord":false
      }
   }
]

8.7.2 Creating or Updating a Web Service

You can create and update web services by creating an appropriate JSON object, which you then POST to the web services interface.

To create a web service, specify a valid project (by name or ID), as shown in the following example:

POST http://edqserver:8001/edq/config/webservices?pid=14

{  
   "name":"Name Gender",
   "inputs":{  
      "attributes":[  
         {  
            "name":"Name",
            "type":"STRING"
         }
      ],
      "multirecord":false
   },
   "outputs":{  
      "attributes":[  
         {  
            "name":"Gender",
            "type":"STRING"
         }
      ],
      "multirecord":false
   }
}

If successful, the name and ID of the web service are returned in the response body.

Example:

{"id":4,"name":"Name Gender"}

To update a web service, use a JSON object that is identical in structure, but with an additional id attribute to identify the existing web service. For an update, you do not specify a project.

Example:

POST http://edqserver:8001/edq/config/webservices

{  
   "id":4,
   "name":"Name Gender",
   "inputs":{  
      "attributes":[  
         {  
            "name":"First Name",
            "type":"STRING"
         },
         {  
            "name":"Last Name",
            "type":"STRING"
         }
      ],
      "multirecord":false
   },
   "outputs":{  
      "attributes":[  
         {  
            "name":"Gender",
            "type":"STRING"
         }
      ],
      "multirecord":false
   }
}

If successful, the name and ID of the web service are returned in the response body, as shown in the following example:

{"id":4,"name":"Name Gender"}

8.7.3 Deleting a Web Service

To delete a web service, call HTTP DELETE on the web service interface. Specify either the web service ID or a valid project (by name or ID) and the web service name.

For example:

DELETE http://edqserver:8001/edq/config/webservices?id=5

or

DELETE http://edqserver:8001/edq/config/webservices?pid=14&name=Name%20Gender

If successful, an OK response is returned without a response body.

8.8 Example: Profiling from an External Application

Consider a scenario where an external application needs to profile data in a table in an Oracle database, using EDQ. In such a case, you can programmatically profile this table using the REST-based APIs. For this example, a CUSTOMERS table in a CustomerDB database will be used.

To generate and run the profiling job on the CUSTOMERS table, the following tasks are performed:

  1. Create a project by using the following URL:

    POST http://edqserver:8001/edq/config/projects

    The project name (pname) is "Profile Customer". The JSON code is:

    {
    "name":"Profile Customer", 
    "description": "Profiling customers in the CUSTOMERS table"
    }
    
  2. Create a data store using an Oracle database, CustomerDB, by using the following URL:

    POST http://edqserver:8001/edq/config/datasources?pid=4

    Example JSON code to create the data store:

    {  
       "client":false,
       "name":"CustomersDB",
       "properties":[  
          {  
             "name":"service",
             "value":"sid"
          },
          {  
             "name":"sid",
             "value":"orcl"
          },
          {  
             "name":"user",
             "value":"CRM"
          },
          {  
             "name":"port",
             "value":"1521"
          },
          {  
             "name":"password",
             "value":"welcome123"
          },
          {  
             "name":"host",
             "value":"localhost"
          }
       ],
       "species":"oracle"
    }
    

    Note:

    To determine the pid or the project ID for the project "Profile Customer", use the HTTP GET operation with the URL:

    GET http://edqserver:8001/edq/config/projects

  3. Create a snapshot by using the following URL:

    POST http://edqserver:8001/edq/config/snapshots?pid=4

    The JSON code for creating the snapshot is:

    {  
       "name":"CustomersDB.Customers",
       "description":"Customer details",
       "datasource":"CustomersDB",
       "table":"Customers",
       "columns":[  
          "ID",
          "FULLNAME",
          "GIVENNAME",
          "FAMILYNAME",
          "Street",
          "City",
          "State",
          "PostalCode",
          "State",
          "Phone",
          "Cell",
          "Work",
          "eMail",
          "DoB",
          "Gender",
          "Active",
          "CreditLimit",
          "StartDate",
          "EndDate"
       ],
       "sampling":{  
          "number":100,
          "offset":0,
          "ordering":"ascending",
          "count":"true"
       }
    }
    

    The response is similar to the following:

    {
       "id": 85,
       "name": "CustomersDB.Customers"
    }
    

    The snapshot with the name "CustomersDB.Customers" is created.

  4. Create a simple process by using the following URL:

    POST http://edqserver:8001/edq/config/processes/simpleprocess?pid=4

    For this example, a simple process is created with a Quickstats Profiler and a Frequency Profiler, both profiling only the name fields. This can be done using the following example JSON:

    {  
       "name":"Profile Names",
       "description":"Profile Customer Names",
       "reader":{  
          "name":"Read from Customers",
          "stageddata":"Connection to Customers"
       },
       "processors":[  
          {  
             "name":"Do Quickstats",
             "type":"dn:quickstatsprofiler",
             "columnlist":[  
                "GIVENNAME",
                "FAMILYNAME"
             ]
          },
          {  
             "name":"Do Frequency Profiling",
             "type":"dn:attributefrequencycountsprofiler"
          }
       ]
    }
    

    The response to this request is:

    {
       "id": 267,
       "name": "Profile Names"
    }
    
  5. Create a simple job by using the following URL:

    POST http://edqserver:8001/edq/config/jobs/simplejob?pid=4

    {  
       "name":"Profile Customer Job",
       "process":"Profile Names",
       "description":"Profiling Customer Names",
       "resultsdrilldown":"none"
    }
    

    The response to the request is:

    {
      "id": 211,
      "name": "Profile Customer Job"
    }
    
  6. Run the job using the following URL:

    POST http://edqserver:8001/edq/jobs/run

    The JSON code for running the job "Profile Customer Job", which in turn would run the profiling process, is:

    {
    "project":"Profile Customer",
    "job":"Profile Customer Job"
    }
    

    The response is:

    {
       "executionID": 20,
       "runeverywhere": false
    }
    

Once this job is running, you can check the status of this execution of the job using the following URL:

GET http://edqserver:8001/edq/jobs/status?xid=20

This request returns the status of the job, as shown in the following example:

{
   "complete": true,
   "endtime": "2016-04-29T14:05:41+01:00",
   "executionid": 20,
   "job": "Profile Customer Job",
   "project": "Profile Customer",
   "server": "edq_server1",
   "starttime": "2016-04-29T14:05:38+01:00",
   "status": "finished"
}

If required, you can cancel the job using the following URL:

POST http://edqserver:8001/edq/jobs/cancel

The JSON code to cancel a job is:

{ 
"executionID": 12345, 
"type" : "immediate" 
}

This would cancel the job instantly, without saving the results. For other options that can be used with "type", see Cancelling a Running Job.

To log in to EDQ Director to view the results of a profiling job that was executed successfully, use the following URL:

http://edqserver:8001/edq/blueprints/director/jnlp?projectid=1&processid=1&processornum=2

The projectid and processid values are the IDs returned by the corresponding REST API calls, and processornum is set to 2, which identifies the first processor after the reader.

This URL opens the Director UI with the focus on the first profiling processor in the job so that its results can be viewed immediately.
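An external application could assemble this URL from the IDs it collected earlier in the workflow. The following sketch uses a hypothetical function name and, for illustration, the project ID (4) and process ID (267) from the steps above.

```python
from urllib.parse import urlencode

DIRECTOR_URL = "http://edqserver:8001/edq/blueprints/director/jnlp"


def director_url(project_id, process_id, processor_num=2):
    """Return the Director launch URL focused on one processor of a process."""
    query = urlencode({
        "projectid": project_id,
        "processid": process_id,
        # 2 is the first processor after the reader, as described above.
        "processornum": processor_num,
    })
    return DIRECTOR_URL + "?" + query


print(director_url(4, 267))
```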

An external application may include an option to remove generated jobs, which would make the relevant deletion calls. The simplest approach is to delete the whole project. For details on deleting a project, see Deleting a Project.