Using SPARQL Gateway with RDF Data

Storing SPARQL Queries and XSL Transformations
Specifying a Timeout Value
Specifying Best Effort Query Execution
Specifying a Content Type Other Than text/xml

The primary interface for an application to interact with SPARQL Gateway is through a URL with the following format:

http://host:port/sparqlgateway/sg?<SPARQL_ENDPOINT>&<SPARQL_QUERY>&<XSLT>

In the preceding format:

Storing SPARQL Queries and XSL Transformations

If it is not feasible for an application to accept a very long URL, you can specify the location of the SPARQL query and the XSL transformation in the <SPARQL_QUERY> and <XSLT> portions of the URL format described in Using SPARQL Gateway with RDF Data, using any of the following approaches:

  • Store the SPARQL queries and XSL transformations in the SPARQL Gateway Web application itself.

    To do this, unpack the sparqlgateway.war file, and store the SPARQL queries and XSL transformations in the top-level directory; then pack the sparqlgateway.war file and redeploy it.

    The sparqlgateway.war file includes the following example files: qb1.sparql (SPARQL query) and default.xslt (XSL transformation).

    Note

    Use the file extension .sparql for SPARQL query files, and the file extension .xslt for XSL transformation files.

    The syntax for specifying these files (using the provided example file names) is wq=qb1.sparql for a SPARQL query file and wx=default.xslt for an XSL transformation file.

    If you want to customize the default XSL transformations, see the examples in Customizing the Default XSLT File.

    If you specify wx=noop.xslt, XSL transformation is not performed and the SPARQL response is returned "as is" to the client.

  • Store the SPARQL queries and XSL transformations in a file system directory, and make sure that the directory is accessible to the deployed SPARQL Gateway Web application.

    By default, the directory is set to /tmp, as shown in the following <init-param> setting:

    <init-param>
    <param-name>sparql_gateway_repository_filedir</param-name>
    <param-value>/tmp/</param-value>
    </init-param> 

    It is recommended that you customize this directory before deploying the SPARQL Gateway. To change the directory setting, edit the text in between the <param-value> and </param-value> tags.

    The following examples specify a SPARQL query file and an XSL transformation file that are in the directory specified in the <init-param> element for sparql_gateway_repository_filedir:

    fq=qb1.sparql
    fx=myxslt1.xslt

  • Make the SPARQL queries and XSL transformations accessible from a website.

    By default, the website directory is set to http://127.0.0.1/queries/, as shown in the following <init-param> setting:

    <init-param>
    <param-name>sparql_gateway_repository_url</param-name>
    <param-value>http://127.0.0.1/queries/</param-value>
    </init-param> 

    Customize this directory before deploying the SPARQL Gateway. To change the website setting, edit the text in between the <param-value> and </param-value> tags.

    The following example specifies a SPARQL query file and an XSL transformation file that are at the URL specified in the <init-param> element for sparql_gateway_repository_url:

    uq=qb1.sparql
    ux=myxslt1.xslt 

    Internally, SPARQL Gateway computes the appropriate complete URL, fetches the content, starts query execution, and applies the XSL transformation to the query response XML.
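
The three approaches above differ only in the parameter prefix used in the URL (wq/wx, fq/fx, or uq/ux). The following Python sketch assembles the <SPARQL_QUERY> and <XSLT> portions for a stored query and transformation. It is an illustration only: the function name and the location labels ("webapp", "filedir", "url") are hypothetical and not part of SPARQL Gateway.

```python
from urllib.parse import quote

# Parameter prefixes described above (the dictionary keys are hypothetical labels):
#   w = stored in the SPARQL Gateway Web application itself
#   f = stored in the file system directory (sparql_gateway_repository_filedir)
#   u = stored on a website (sparql_gateway_repository_url)
_PREFIX = {"webapp": "w", "filedir": "f", "url": "u"}


def stored_query_params(location, query_file, xslt_file):
    """Return the '<SPARQL_QUERY>&<XSLT>' portion of a gateway URL."""
    if location not in _PREFIX:
        raise ValueError(f"unknown location: {location!r}")
    p = _PREFIX[location]
    # Percent-encode the file names so they are safe inside a query string.
    return f"{p}q={quote(query_file, safe='')}&{p}x={quote(xslt_file, safe='')}"


print(stored_query_params("webapp", "qb1.sparql", "default.xslt"))
print(stored_query_params("filedir", "qb1.sparql", "myxslt1.xslt"))
```

The returned string is appended after the <SPARQL_ENDPOINT> portion of the URL, separated by an ampersand.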

Configure the OracleSGDS Data Source

If an Oracle database is used for storage of and access to SPARQL queries and XSL transformations for SPARQL Gateway, then you must configure a data source named OracleSGDS.

To create this data source, follow the instructions in Use Oracle WebLogic Server; however, specify OracleSGDS as the data source name instead of OracleSemDS.

If the OracleSGDS data source is configured and available, the SPARQL Gateway servlet automatically creates all the necessary tables and indexes upon initialization.

Specifying a Timeout Value

When you submit a potentially long-running query using the URL format described in Using SPARQL Gateway with RDF Data, you can limit the execution time by specifying a timeout value in milliseconds. For example, the following URL format includes a timeout specification (&t=1000) that ends the SPARQL query execution started from SPARQL Gateway after 1000 milliseconds (1 second):

http://host:port/sparqlgateway/sg?<SPARQL_ENDPOINT>&<SPARQL_QUERY>&<XSLT>&t=1000

If a query has not finished when the timeout occurs, SPARQL Gateway constructs an empty SPARQL response.

Note that even if SPARQL Gateway times out a query execution at the HTTP connection level, the query may still be running on the server side. The actual behavior will be vendor-dependent.
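
A client might add the timeout parameter programmatically, as in the following Python sketch. The helper names with_timeout and fetch and the margin_s argument are hypothetical. The socket timeout passed to urlopen is a separate, client-side safety net chosen for this example; it is not a SPARQL Gateway setting.

```python
import urllib.request


def with_timeout(gateway_url, timeout_ms):
    """Append the t=<milliseconds> parameter to a SPARQL Gateway URL."""
    if not isinstance(timeout_ms, int) or timeout_ms <= 0:
        raise ValueError("timeout must be a positive integer (milliseconds)")
    return f"{gateway_url}&t={timeout_ms}"


def fetch(gateway_url, timeout_ms, margin_s=5.0):
    """Request the URL and return the response body as bytes.

    The client waits slightly longer than the gateway timeout so that the
    gateway has time to return its (possibly empty) SPARQL response.
    """
    url = with_timeout(gateway_url, timeout_ms)
    with urllib.request.urlopen(url, timeout=timeout_ms / 1000 + margin_s) as resp:
        return resp.read()
```

Because a timed-out query yields an empty SPARQL response rather than an error, the caller should be prepared to handle a result with no rows.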

Specifying Best Effort Query Execution

Note

You can specify best effort query execution only if you also specify a timeout value (described in the previous section, Specifying a Timeout Value).

When you submit a potentially long-running query using the URL format described in Using SPARQL Gateway with RDF Data, if you specify a timeout value, you can also specify a "best effort" limitation on the query. For example, the following shows the URL format with a timeout specification of 1000 milliseconds (1 second) and a best effort specification (&b=t):

http://host:port/sparqlgateway/sg?<SPARQL_ENDPOINT>&<SPARQL_QUERY>&<XSLT>&t=1000&b=t

The web.xml file includes two parameter settings that affect the behavior of the best effort option: sparql_gateway_besteffort_maxrounds and sparql_gateway_besteffort_maxthreads. The following shows the default definitions:

<init-param>
<param-name>sparql_gateway_besteffort_maxrounds</param-name>
<param-value>10</param-value>
</init-param> 
<init-param>
<param-name>sparql_gateway_besteffort_maxthreads</param-name>
<param-value>3</param-value>
</init-param> 

When a SPARQL SELECT query is executed in best effort style, a series of queries is executed with an increasing LIMIT value in the SPARQL query body. (The core idea is based on the observation that a SPARQL query runs faster with a smaller LIMIT setting.) SPARQL Gateway starts query execution with a "LIMIT 1" setting. Ideally, this query finishes before the timeout is reached. If it does, the next query has its LIMIT setting increased, and each subsequent query has a higher limit still. The maximum number of query executions is controlled by the sparql_gateway_besteffort_maxrounds parameter.

If it is possible to run the series of queries in parallel, the sparql_gateway_besteffort_maxthreads parameter controls the degree of parallelism.
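
The best effort idea can be illustrated with the following Python sketch. It is not the SPARQL Gateway implementation. The growth factor of 10, the sequential (single-threaded) execution, and the choice to return the result of the last round that finished in time are all assumptions made for this example. run_query is a hypothetical callable that executes the query with the given LIMIT and returns its rows.

```python
import time


def best_effort(run_query, timeout_s, max_rounds=10, growth=10):
    """Run run_query(limit) with growing limits until the deadline.

    Returns the result of the last round that completed before the
    deadline, or None if no round finished in time.
    """
    deadline = time.monotonic() + timeout_s
    limit, best = 1, None          # the first round uses LIMIT 1
    for _ in range(max_rounds):    # corresponds to the maxrounds setting
        if time.monotonic() >= deadline:
            break
        result = run_query(limit)
        if time.monotonic() > deadline:
            break                  # this round overran; keep the previous result
        best = result
        limit *= growth            # assumed growth policy, for illustration only
    return best


# Usage with a stand-in query function that returns `limit` rows instantly.
rows = best_effort(lambda limit: list(range(limit)), timeout_s=60, max_rounds=3)
print(len(rows))
```

In this sketch a larger timeout or a higher max_rounds value lets more rounds run, so more rows can be returned.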

Specifying a Content Type Other Than text/xml

By default, SPARQL Gateway assumes that XSL transformations generate XML, and so the default content type set for the HTTP response is text/xml. However, if your application requires a response format other than XML, you can specify the content type in an additional URL parameter (with syntax &rt=), using the following format:

http://host:port/sparqlgateway/sg?<SPARQL_ENDPOINT>&<SPARQL_QUERY>&<XSLT>&rt=<content_type>

Note that <content_type> must be URL encoded.
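
As an illustration of the URL encoding requirement, the following Python sketch percent-encodes a content type before appending it as the rt parameter. The helper name with_content_type is hypothetical; the encoding itself uses the standard library function urllib.parse.quote.

```python
from urllib.parse import quote


def with_content_type(gateway_url, content_type):
    """Append rt=<URL-encoded content type> to a SPARQL Gateway URL."""
    # safe="" forces "/" to be encoded as %2F as well.
    return f"{gateway_url}&rt={quote(content_type, safe='')}"


print(quote("text/plain", safe=""))  # text%2Fplain
```

For example, text/plain is sent as rt=text%2Fplain.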