5 Using Oracle XQuery for Hadoop

This chapter explains how to use Oracle XQuery for Hadoop to extract and transform large volumes of semistructured data.

5.1 What Is Oracle XQuery for Hadoop?

Oracle XQuery for Hadoop is a transformation engine for semistructured big data. It translates transformations expressed in the XQuery language into a series of MapReduce jobs, which run in parallel on an Apache Hadoop cluster. You can focus on data movement and transformation logic instead of the complexities of Java and MapReduce, without sacrificing scalability or performance.

The input data can be located in a file system accessible through the Hadoop File System API, such as the Hadoop Distributed File System (HDFS), or stored in Oracle NoSQL Database. Oracle XQuery for Hadoop can write the transformation results to Hadoop files, Oracle NoSQL Database, or Oracle Database.

Oracle XQuery for Hadoop also provides extensions to Apache Hive to support massive XML files.

Oracle XQuery for Hadoop is based on mature industry standards including XPath, XQuery, and XQuery Update Facility. It is integrated with other Oracle products, which enables Oracle XQuery for Hadoop to:

- Load data efficiently into Oracle Database using Oracle Loader for Hadoop.
- Provide read and write support to Oracle NoSQL Database.

The following figure provides an overview of the data flow using Oracle XQuery for Hadoop.

Figure 5-1 Oracle XQuery for Hadoop Data Flow


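5.2 Getting Started With Oracle XQuery for Hadoop

Oracle XQuery for Hadoop is designed for use by XQuery developers. If you are already familiar with XQuery, then you are ready to begin. If you are new to XQuery, then you must first acquire the basics of the language; this guide does not cover them.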


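5.2.1 Basic Steps

Take the following basic steps when using Oracle XQuery for Hadoop:

1. The first time you use Oracle XQuery for Hadoop, ensure that the software is installed and configured. See "Oracle XQuery for Hadoop Setup."
2. Log in to a node of the Hadoop cluster or to a system set up as a Hadoop client for the cluster.
3. Create an XQuery transformation that uses the Oracle XQuery for Hadoop functions. It can use various adapters for input and output. See "About the Oracle XQuery for Hadoop Functions" and "Creating an XQuery Transformation."
4. Execute the XQuery transformation. See "Running Queries."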

5.2.2 Example: Hello World!

Follow these steps to create and run a simple query using Oracle XQuery for Hadoop:

Create a text file named hello.txt in the current directory that contains the line Hello:
```
$ echo "Hello" > hello.txt
```
Copy the file to HDFS:
```
$ hdfs dfs -copyFromLocal hello.txt
```

Create a query file named hello.xq in the current directory with the following content:

import module "oxh:text";
for $line in text:collection("hello.txt")
return text:put($line || " World!")

Run the query:

$ hadoop jar $OXH_HOME/lib/oxh.jar hello.xq -output ./myout -print
13/11/21 02:41:57 INFO hadoop.xquery: OXH: Oracle XQuery for Hadoop 4.2.0 ((build 4.2.0-cdh5.0.0-mr1 @mr2). Copyright (c) 2014, Oracle.  All rights reserved.
13/11/21 02:42:01 INFO hadoop.xquery: Submitting map-reduce job "oxh:hello.xq#0" id="3593921f-c50c-4bb8-88c0-6b63b439572b.0", inputs=[hdfs://bigdatalite.localdomain:8020/user/oracle/hello.txt], output=myout
     .
     .
     .

Check the output file:

$ hdfs dfs -cat ./myout/part-m-00000
Hello World!

5.3 About the Oracle XQuery for Hadoop Functions

Oracle XQuery for Hadoop reads from and writes to big data sets using collection and put functions:

- A collection function reads data from Hadoop files or Oracle NoSQL Database as a collection of items. A Hadoop file is one that is accessible through the Hadoop File System API. On Oracle Big Data Appliance and most Hadoop clusters, this file system is the Hadoop Distributed File System (HDFS).
- A put function adds a single item to a data set stored in Oracle Database, Oracle NoSQL Database, or a Hadoop file.

The following example is a simple Oracle XQuery for Hadoop query that reads items from one source and writes them to another:

for $x in collection(...)
return put($x)

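Oracle XQuery for Hadoop comes with a set of adapters that you can use to define put and collection functions for specific formats and sources. Each adapter has two components:

- A set of built-in put and collection functions that are predefined for your convenience.
- A set of XQuery function annotations that you can use to define custom put and collection functions.

Other commonly used functions are also included in Oracle XQuery for Hadoop.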

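5.3.1 About the Adapters

Following are brief descriptions of the Oracle XQuery for Hadoop adapters.

Avro File Adapter

The Avro file adapter provides access to Avro container files stored in HDFS. It includes collection and put functions for reading from and writing to Avro container files.

JSON File Adapter

The JSON file adapter provides access to JSON files stored in HDFS. It contains a collection function for reading JSON files, and a group of helper functions for parsing JSON data directly. You must use another adapter to write the output.

Oracle Database Adapter

The Oracle Database adapter loads data into Oracle Database. This adapter supports a custom put function for direct output to a table in an Oracle database using JDBC or OCI. If a live connection to the database is not available, the adapter also supports output to Data Pump or delimited text files in HDFS; the files can be loaded into the Oracle database with a different utility, such as SQL*Loader, or using external tables. This adapter does not move data out of the database, and therefore does not have collection or get functions.

See "Software Requirements" for the supported versions of Oracle Database.

Oracle NoSQL Database Adapter

The Oracle NoSQL Database adapter provides access to data stored in Oracle NoSQL Database. The data can be read or written as tables, Avro, XML, binary XML, or text. This adapter includes collection, get, and put functions.

Sequence File Adapter

The sequence file adapter provides access to Hadoop sequence files. A sequence file is a Hadoop format composed of key-value pairs.

This adapter includes collection and put functions for reading from and writing to HDFS sequence files that contain text, XML, or binary XML.

Solr Adapter

The Solr adapter provides functions to create full-text indexes and load them into Apache Solr servers.

Text File Adapter

The text file adapter provides access to text files, such as CSV files. It contains collection and put functions for reading from and writing to text files.

The JSON file adapter extends the support for JSON objects stored in text files.

XML File Adapter

The XML file adapter provides access to XML files stored in HDFS. It contains collection functions for reading large XML files. You must use another adapter to write the output.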

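5.3.2 About Other Modules for Use With Oracle XQuery for Hadoop

You can use functions from these additional modules in your queries:

- Standard XQuery functions: The standard XQuery math functions are available.
- Hadoop functions: The Hadoop module is a group of functions that are specific to Hadoop.
- Duration, date, and time functions: This group of functions parses duration, date, and time values.
- String-processing functions: These functions add and remove white space that surrounds data values.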

5.4 Creating an XQuery Transformation

This section describes how to create XQuery transformations using Oracle XQuery for Hadoop.

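5.4.1 XQuery Transformation Requirements

You create a transformation for Oracle XQuery for Hadoop the same way as any other XQuery transformation, except that you must comply with these additional requirements:

The main XQuery expression (the query body) must be in one of the following forms:
```
FLWOR₁
```
or
```
(FLWOR₁, FLWOR₂,... , FLWOR_N)
```
In this syntax, each FLWOR is a top-level XQuery FLWOR expression (FLWOR stands for For, Let, Where, Order by, Return).
Each top-level FLWOR expression must have a for clause that iterates over an Oracle XQuery for Hadoop collection function. This for clause cannot have a positional variable.

See "Oracle XQuery for Hadoop Reference" for the collection functions.
Each top-level FLWOR expression can have optional let, where, and group by clauses. Other types of clauses are invalid, such as order by, count, and window clauses.
Each top-level FLWOR expression must return one or more results from calling an Oracle XQuery for Hadoop put function. See "Oracle XQuery for Hadoop Reference" for the put functions.
The query body must be an updating expression. Because all put functions are classified as updating functions, all Oracle XQuery for Hadoop queries are updating queries.

In Oracle XQuery for Hadoop, a %*:put annotation indicates that the function is updating. The %updating annotation or updating keyword is not required with it.
See Also:
- The FLWOR expressions section of XQuery 3.1: An XML Query Language
- For more information about updating expressions, the "Extensions to XQuery 1.0" section of W3C XQuery Update Facility 1.0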

5.4.2 About XQuery Language Support

Oracle XQuery for Hadoop supports W3C XQuery 3.1, except for the following:

- FLWOR window clause
- FLWOR count clause
- namespace constructors
- fn:parse-ietf-date
- fn:transform
- higher-order XQuery functions

For the language, see W3C XQuery 3.1: An XML Query Language.

For the functions, see W3C XPath and XQuery Functions and Operators.

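5.4.3 Accessing Data in the Hadoop Distributed Cache

You can use the Hadoop distributed cache facility to access auxiliary job data. This mechanism can be useful in a join query when one side is a relatively small file. The query might execute faster if the smaller file is accessed from the distributed cache.

To place a file into the distributed cache, use the -files Hadoop command line option when calling Oracle XQuery for Hadoop. A query that reads a file from the distributed cache must call the fn:doc function for XML, and either fn:unparsed-text or fn:unparsed-text-lines for text files. See Example 5-7.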

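5.4.4 Calling Custom Java Functions from XQuery

Oracle XQuery for Hadoop is extensible with custom external functions implemented in the Java language. A Java implementation must be a static method with the parameter and return types as defined by the XQuery API for Java (XQJ) specification.

A custom Java function binding is defined in Oracle XQuery for Hadoop by annotating an external function definition with the %ora-java:binding annotation. This annotation has the following syntax: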

%ora-java:binding("java.class.name[#method]")

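java.class.name: The fully qualified name of a Java class that contains the implementation method.
method: A Java method name. It defaults to the XQuery function name. Optional.

See Example 5-8 for an example of %ora-java:binding.

All JAR files that contain custom Java functions must be listed in the -libjars command line option. For example: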

hadoop jar $OXH_HOME/lib/oxh.jar -libjars myfunctions.jar query.xq

See Also:

"XQuery API for Java (XQJ)" at

http://www.jcp.org/en/jsr/detail?id=225

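5.4.5 Accessing User-Defined XQuery Library Modules and XML Schemas

Oracle XQuery for Hadoop supports user-defined XQuery library modules and XML schemas when you comply with these criteria:

- Locate the library module or XML schema file in the same directory where the main query resides on the client calling Oracle XQuery for Hadoop.
- Import the library module or XML schema from the main query using the location URI parameter of the import module or import schema statement.
- Specify the library module or XML schema file in the -files command line option when calling Oracle XQuery for Hadoop.

For an example of using user-defined XQuery library modules and XML schemas, see Example 5-9.

See Also:

The location URIs section of XQuery 3.1: An XML Query Language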

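5.4.6 XQuery Transformation Examples

For these examples, the following text files are in HDFS. The files contain a log of visits to different web pages. Each line represents a visit to a web page and contains the time, user name, page visited, and the status code.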

mydata/visits1.log  
 
2013-10-28T06:00:00, john, index.html, 200
2013-10-28T08:30:02, kelly, index.html, 200
2013-10-28T08:32:50, kelly, about.html, 200
2013-10-30T10:00:10, mike, index.html, 401
 
mydata/visits2.log  
 
2013-10-30T10:00:01, john, index.html, 200
2013-10-30T10:05:20, john, about.html, 200
2013-11-01T08:00:08, laura, index.html, 200
2013-11-04T06:12:51, kelly, index.html, 200
2013-11-04T06:12:40, kelly, contact.html, 200

Example 5-1 Basic Filtering

This query filters the log for pages visited by user kelly and writes the matching lines to a text file:

import module "oxh:text";

for $line in text:collection("mydata/visits*.log")
let $split := fn:tokenize($line, "\s*,\s*")
where $split[2] eq "kelly"
return text:put($line)

The query creates text files in the output directory that contain the following lines:

2013-11-04T06:12:51, kelly, index.html, 200
2013-11-04T06:12:40, kelly, contact.html, 200
2013-10-28T08:30:02, kelly, index.html, 200
2013-10-28T08:32:50, kelly, about.html, 200
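
The filter in Example 5-1 can be checked on a workstation with a short sketch. This is plain Python rather than Oracle XQuery for Hadoop; the function name visits_by is hypothetical, and the data is the sample log shown above. Because MapReduce does not guarantee output order, only the set of lines is comparable with the listing above.

```python
import re

# Sample lines copied from mydata/visits1.log and mydata/visits2.log.
VISITS = [
    "2013-10-28T06:00:00, john, index.html, 200",
    "2013-10-28T08:30:02, kelly, index.html, 200",
    "2013-10-28T08:32:50, kelly, about.html, 200",
    "2013-10-30T10:00:10, mike, index.html, 401",
    "2013-10-30T10:00:01, john, index.html, 200",
    "2013-10-30T10:05:20, john, about.html, 200",
    "2013-11-01T08:00:08, laura, index.html, 200",
    "2013-11-04T06:12:51, kelly, index.html, 200",
    "2013-11-04T06:12:40, kelly, contact.html, 200",
]


def visits_by(user, lines):
    """Keep the lines whose second comma-separated field equals `user`."""
    kept = []
    for line in lines:
        split = re.split(r"\s*,\s*", line)  # same pattern as fn:tokenize in the query
        if split[1] == user:                # XQuery's $split[2] is 1-based
            kept.append(line)
    return kept


kelly_lines = visits_by("kelly", VISITS)
print("\n".join(kelly_lines))
```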

Example 5-2 Group By and Aggregation

The next query computes the number of page visits per day:

import module "oxh:text";
 
for $line in text:collection("mydata/visits*.log")
let $split := fn:tokenize($line, "\s*,\s*")
let $time := xs:dateTime($split[1])
let $day := xs:date($time)
group by $day
return text:put($day || " => " || fn:count($line))

The query creates text files that contain the following lines:

2013-10-28 => 3
2013-10-30 => 3
2013-11-01 => 1
2013-11-04 => 2
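
As a cross-check of the per-day totals, the grouping can be emulated locally. The sketch below is plain Python under the assumption that the sample log above is the complete input; visits_per_day is a hypothetical helper, not part of Oracle XQuery for Hadoop.

```python
import re
from collections import Counter

# Sample lines copied from mydata/visits1.log and mydata/visits2.log.
VISITS = [
    "2013-10-28T06:00:00, john, index.html, 200",
    "2013-10-28T08:30:02, kelly, index.html, 200",
    "2013-10-28T08:32:50, kelly, about.html, 200",
    "2013-10-30T10:00:10, mike, index.html, 401",
    "2013-10-30T10:00:01, john, index.html, 200",
    "2013-10-30T10:05:20, john, about.html, 200",
    "2013-11-01T08:00:08, laura, index.html, 200",
    "2013-11-04T06:12:51, kelly, index.html, 200",
    "2013-11-04T06:12:40, kelly, contact.html, 200",
]


def visits_per_day(lines):
    """Count log lines per calendar day (the 'group by $day' step)."""
    counts = Counter()
    for line in lines:
        timestamp = re.split(r"\s*,\s*", line)[0]
        day = timestamp.split("T")[0]  # date part of the dateTime value
        counts[day] += 1
    return dict(counts)


per_day = visits_per_day(VISITS)
for day in sorted(per_day):
    print(f"{day} => {per_day[day]}")
```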

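Example 5-3 Inner Joins

This example queries the following text file in HDFS, in addition to the other files. The file contains user profile information such as user ID, full name, and age, separated by colons (:).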

mydata/users.txt  
 
john:John Doe:45
kelly:Kelly Johnson:32
laura:Laura Smith:
phil:Phil Johnson:27

The following query performs a join between users.txt and the log files. It computes how many times users older than 30 visited each page.

import module "oxh:text";
 
for $userLine in text:collection("mydata/users.txt")
let $userSplit := fn:tokenize($userLine, "\s*:\s*")
let $userId := $userSplit[1]
let $userAge := xs:integer($userSplit[3][. castable as xs:integer])
 
for $visitLine in text:collection("mydata/visits*.log")
let $visitSplit := fn:tokenize($visitLine, "\s*,\s*")
let $visitUserId := $visitSplit[2]
where $userId eq $visitUserId and $userAge gt 30
group by $page := $visitSplit[3]
return text:put($page || " " || fn:count($userLine))

The query creates text files that contain the following lines:

about.html 2
contact.html 1
index.html 4
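
The join and the age filter can be emulated locally to confirm these counts. This is a plain-Python sketch, not the way Oracle XQuery for Hadoop executes the join; it assumes user IDs are unique in users.txt, and page_counts_for_users_over is a hypothetical name. Ages that are not integers are skipped, mirroring the castable as xs:integer predicate in the query.

```python
import re
from collections import Counter

# Sample data copied from mydata/users.txt and the two visit logs.
USERS = [
    "john:John Doe:45",
    "kelly:Kelly Johnson:32",
    "laura:Laura Smith:",
    "phil:Phil Johnson:27",
]
VISITS = [
    "2013-10-28T06:00:00, john, index.html, 200",
    "2013-10-28T08:30:02, kelly, index.html, 200",
    "2013-10-28T08:32:50, kelly, about.html, 200",
    "2013-10-30T10:00:10, mike, index.html, 401",
    "2013-10-30T10:00:01, john, index.html, 200",
    "2013-10-30T10:05:20, john, about.html, 200",
    "2013-11-01T08:00:08, laura, index.html, 200",
    "2013-11-04T06:12:51, kelly, index.html, 200",
    "2013-11-04T06:12:40, kelly, contact.html, 200",
]


def page_counts_for_users_over(min_age, users, visits):
    """Count visits per page, restricted to known users older than min_age."""
    ages = {}
    for line in users:
        user_id, _name, age = re.split(r"\s*:\s*", line)
        if age.isdigit():  # an empty or non-numeric age is ignored
            ages[user_id] = int(age)
    counts = Counter()
    for line in visits:
        _time, user_id, page, _code = re.split(r"\s*,\s*", line)
        if user_id in ages and ages[user_id] > min_age:
            counts[page] += 1
    return dict(counts)


page_counts = page_counts_for_users_over(30, USERS, VISITS)
for page in sorted(page_counts):
    print(page, page_counts[page])
```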

The next query computes the number of visits for each user who visited any page. It omits users who never visited any page.

import module "oxh:text";
 
for $userLine in text:collection("mydata/users.txt")
let $userSplit := fn:tokenize($userLine, "\s*:\s*")
let $userId := $userSplit[1]
 
for $visitLine in text:collection("mydata/visits*.log")
   [$userId eq fn:tokenize(., "\s*,\s*")[2]]
 
group by $userId
return text:put($userId || " " || fn:count($visitLine))

The query creates text files that contain the following lines:

john 3
kelly 4
laura 1

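Note:

When the results of two collection functions are joined, only equijoins are supported. If one or both sources are not from a collection function, then any join condition is allowed.

Example 5-4 Left Outer Joins

This example is similar to the second query in Example 5-3, but also counts users who did not visit any page.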

import module "oxh:text";
 
for $userLine in text:collection("mydata/users.txt")
let $userSplit := fn:tokenize($userLine, "\s*:\s*")
let $userId := $userSplit[1]
 
for $visitLine allowing empty in text:collection("mydata/visits*.log")
   [$userId eq fn:tokenize(., "\s*,\s*")[2]]
 
group by $userId
return text:put($userId || " " || fn:count($visitLine))

The query creates text files that contain the following lines:

john 3
kelly 4
laura 1
phil 0
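
The difference between the inner join in Example 5-3 and the left outer join here comes down to whether users with zero matching visits are kept. The following plain-Python sketch illustrates that distinction on the sample data; visits_per_user and its keep_unvisited flag are hypothetical names used only for this illustration.

```python
import re

# Sample data copied from mydata/users.txt and the two visit logs.
USERS = [
    "john:John Doe:45",
    "kelly:Kelly Johnson:32",
    "laura:Laura Smith:",
    "phil:Phil Johnson:27",
]
VISITS = [
    "2013-10-28T06:00:00, john, index.html, 200",
    "2013-10-28T08:30:02, kelly, index.html, 200",
    "2013-10-28T08:32:50, kelly, about.html, 200",
    "2013-10-30T10:00:10, mike, index.html, 401",
    "2013-10-30T10:00:01, john, index.html, 200",
    "2013-10-30T10:05:20, john, about.html, 200",
    "2013-11-01T08:00:08, laura, index.html, 200",
    "2013-11-04T06:12:51, kelly, index.html, 200",
    "2013-11-04T06:12:40, kelly, contact.html, 200",
]


def visits_per_user(users, visits, keep_unvisited):
    """Count visits per known user; optionally keep users with no visits."""
    counts = {re.split(r"\s*:\s*", line)[0]: 0 for line in users}
    for line in visits:
        user_id = re.split(r"\s*,\s*", line)[1]
        if user_id in counts:  # visits by unknown users (mike) never join
            counts[user_id] += 1
    if keep_unvisited:
        return counts  # left outer join: zero counts stay
    return {user: n for user, n in counts.items() if n > 0}  # inner join


outer = visits_per_user(USERS, VISITS, keep_unvisited=True)
inner = visits_per_user(USERS, VISITS, keep_unvisited=False)
print(outer)
print(inner)
```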

Example 5-5 Semijoins

The next query finds users who have ever visited a page:

import module "oxh:text";
 
for $userLine in text:collection("mydata/users.txt")
let $userId := fn:tokenize($userLine, "\s*:\s*")[1]
 
where some $visitLine in text:collection("mydata/visits*.log")
satisfies $userId eq fn:tokenize($visitLine, "\s*,\s*")[2]
 
return text:put($userId)

The query creates text files that contain the following lines:

john
kelly
laura

Example 5-6 Multiple Outputs

The next query finds web page visits with a 401 code and writes them to trace* files using the text:trace() function. It writes the remaining visit records into the default output files.

import module "oxh:text";
 
for $visitLine in text:collection("mydata/visits*.log")
let $visitCode := xs:integer(fn:tokenize($visitLine, "\s*,\s*")[4])
return if ($visitCode eq 401) then text:trace($visitLine) else text:put($visitLine)

The query generates a trace* text file that contains the following line:

2013-10-30T10:00:10, mike, index.html, 401

The query also generates default output files that contain the following lines:

2013-10-30T10:00:01, john, index.html, 200
2013-10-30T10:05:20, john, about.html, 200
2013-11-01T08:00:08, laura, index.html, 200
2013-11-04T06:12:51, kelly, index.html, 200
2013-11-04T06:12:40, kelly, contact.html, 200
2013-10-28T06:00:00, john, index.html, 200
2013-10-28T08:30:02, kelly, index.html, 200
2013-10-28T08:32:50, kelly, about.html, 200

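Example 5-7 Accessing Auxiliary Input Data

The next query is an alternative version of the second query in Example 5-3, but it uses the fn:unparsed-text-lines function to access a file in the Hadoop distributed cache: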

import module "oxh:text";
 
for $visitLine in text:collection("mydata/visits*.log")
let $visitUserId := fn:tokenize($visitLine, "\s*,\s*")[2]
 
for $userLine in fn:unparsed-text-lines("users.txt")
let $userSplit := fn:tokenize($userLine, "\s*:\s*")
let $userId := $userSplit[1]
 
where $userId eq $visitUserId
 
group by $userId
return text:put($userId || " " || fn:count($visitLine))

The hadoop command to run the query must use the Hadoop -files option. See "Accessing Data in the Hadoop Distributed Cache."

hadoop jar $OXH_HOME/lib/oxh.jar -files users.txt query.xq

The query creates text files that contain the following lines:

john 3
kelly 4
laura 1

Example 5-8 Calling a Custom Java Function from XQuery

The next query formats input data using the java.lang.String#format method.

import module "oxh:text";
 
declare %ora-java:binding("java.lang.String#format")
   function local:string-format($pattern as xs:string, $data as xs:anyAtomicType*) as xs:string external;
 
for $line in text:collection("mydata/users*.txt")
let $split := fn:tokenize($line, "\s*:\s*")
return text:put(local:string-format("%s,%s,%s", $split))

The query creates text files that contain the following lines:

john,John Doe,45
kelly,Kelly Johnson,32
laura,Laura Smith,
phil,Phil Johnson,27

See Also:

Java Platform Standard Edition 7 API Specification for Class String.

Example 5-9 Using User-Defined XQuery Library Modules and XML Schemas

This example uses a library module named mytools.xq:

module namespace mytools = "urn:mytools";
 
declare %ora-java:binding("java.lang.String#format")
   function mytools:string-format($pattern as xs:string, $data as xs:anyAtomicType*) as xs:string external;

The next query is equivalent to the previous one, but it calls the string-format function from the mytools.xq library module:

import module namespace mytools = "urn:mytools" at "mytools.xq";
import module "oxh:text";
 
for $line in text:collection("mydata/users*.txt")
let $split := fn:tokenize($line, "\s*:\s*")
return text:put(mytools:string-format("%s,%s,%s", $split))

The query creates text files that contain the following lines:

john,John Doe,45
kelly,Kelly Johnson,32
laura,Laura Smith,
phil,Phil Johnson,27

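Example 5-10 Filtering Dirty Data Using a Try/Catch Expression

The XQuery try/catch expression can be used to broadly handle cases where input data is in an unexpected form, corrupted, or missing. The next query reads an input file, ages.txt, that contains a user name followed by the user's age.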

USER      AGE
------------------
john    45
kelly
laura   36
phil    OLD!

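Notice that the first two lines of this file contain header text, that the entry for Kelly is missing the age, and that the age entry for Phil is a dirty value. For each user in the file, the query writes out the user name and whether the user is over 40 or not.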

import module "oxh:text";

for $line in text:collection("ages.txt")
let $split := fn:tokenize($line, "\s+")
return
   try {

      let $user := $split[1]
      let $age := $split[2] cast as xs:integer
      return
        if ($age gt 40) then
          text:put($user || " is over 40")
        else 
          text:put($user || " is not over 40")

   } catch * {
      text:trace($err:code || " : " || $line)
   }

The query generates an output text file that contains the following lines:

john is over 40
laura is not over 40

The query also generates a trace* file that contains the following lines:

err:FORG0001 : USER        AGE
err:XPTY0004 : ------------------
err:XPTY0004 : kelly
err:FORG0001 : phil        OLD!
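
The same routing of clean and dirty records can be pictured with an ordinary try/except block. The sketch below is plain Python and only an analogy: Python's IndexError and ValueError stand in for the XQuery error codes, and classify is a hypothetical function name. A missing age surfaces as IndexError (compare err:XPTY0004 above) and a non-numeric age as ValueError (compare err:FORG0001).

```python
import re

# Lines copied from the ages.txt sample above.
AGES = [
    "USER      AGE",
    "------------------",
    "john    45",
    "kelly",
    "laura   36",
    "phil    OLD!",
]


def classify(lines):
    """Return (output_lines, trace_lines) for the age check."""
    output, trace = [], []
    for line in lines:
        split = re.split(r"\s+", line)
        try:
            user = split[0]
            age = int(split[1])  # IndexError if missing, ValueError if not a number
            if age > 40:
                output.append(f"{user} is over 40")
            else:
                output.append(f"{user} is not over 40")
        except (IndexError, ValueError) as err:
            # Dirty records are diverted instead of stopping the run.
            trace.append(f"{type(err).__name__} : {line}")
    return output, trace


output, trace = classify(AGES)
print("\n".join(output))
print("\n".join(trace))
```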

5.5 Running Queries

To run a query, call the oxh utility using the hadoop jar command. The following is the basic syntax:

hadoop jar $OXH_HOME/lib/oxh.jar [generic options] query.xq -output directory [-clean] [-ls] [-print] [-sharelib hdfs_dir] [-skiperrors] [-version]

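5.5.1 Oracle XQuery for Hadoop Options

query.xq

Identifies the XQuery file. See "Creating an XQuery Transformation."

-clean

Deletes all files from the output directory before running the query. If you use the default directory, Oracle XQuery for Hadoop always cleans the directory, even when this option is omitted.

-exportliboozie directory

Copies Oracle XQuery for Hadoop dependencies to the specified directory. Use this option to add Oracle XQuery for Hadoop to the Hadoop distributed cache and the Oozie shared library. External dependencies are also copied, so ensure that environment variables such as KVHOME, OLH_HOME, and OXH_SOLR_MR_HOME are set for use by the related adapters (Oracle NoSQL Database, Oracle Database, and Solr).

-ls

Lists the contents of the output directory after running the query.

-output directory

Specifies the output directory of the query. The put functions of the file adapters create files in this directory. Written values are spread across one or more files. The number of files created depends on how the query is distributed among tasks. The default output directory is /tmp/oxh-user_name/output.

See "About the Oracle XQuery for Hadoop Functions" for a description of put functions.

-print

Prints the contents of all files in the output directory to the standard output (your screen). When printing Avro files, each record prints as JSON text.

-sharelib hdfs_dir

Specifies the HDFS folder location containing Oracle XQuery for Hadoop and third-party libraries.

-skiperrors

Turns on error recovery, so that an error does not halt processing.

All errors that occur during query processing are counted, and the total is logged at the end of the query. The error messages of the first 20 errors per task are also logged. See the oracle.hadoop.xquery.skiperrors.counters, oracle.hadoop.xquery.skiperrors.max, and oracle.hadoop.xquery.skiperrors.log.max configuration properties.

-version

Displays the Oracle XQuery for Hadoop version and exits without running a query.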
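
The behavior described for -skiperrors (count every error, log only the first few per task, keep processing) can be pictured with a small sketch. This is plain Python and not how Oracle XQuery for Hadoop is implemented; run_task, process, and log_max are hypothetical names, with log_max playing the role of the per-task logging limit of 20.

```python
def run_task(records, process, log_max=20):
    """Process records, skipping failures instead of stopping.

    Returns (results, error_count, logged_messages). Every failure is
    counted, but only the first `log_max` messages are kept.
    """
    results = []
    error_count = 0
    logged = []
    for record in records:
        try:
            results.append(process(record))
        except Exception as err:  # recover from any per-record error
            error_count += 1
            if len(logged) < log_max:
                logged.append(f"{type(err).__name__}: {record!r}")
    return results, error_count, logged


# Three of the five records cannot be converted; only two messages are logged.
results, errors, logged = run_task(["1", "x", "3", "y", "z"], int, log_max=2)
print(results, errors, logged)
```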

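5.5.2 Generic Options

You can include any generic hadoop command-line option. Oracle XQuery for Hadoop implements the org.apache.hadoop.util.Tool interface and follows the standard Hadoop methods for building MapReduce applications.

The following generic options are commonly used with Oracle XQuery for Hadoop:

-conf job_config.xml

Identifies the job configuration file. See "Oracle XQuery for Hadoop Configuration Properties."

When you work with the Oracle Database or Oracle NoSQL Database adapters, you can set various job properties in this file. See "Oracle Loader for Hadoop Configuration Properties and Corresponding %oracle-property Annotations" and "Oracle NoSQL Database Adapter Configuration Properties."

-D property=value

Identifies a configuration property. See "Oracle XQuery for Hadoop Configuration Properties."

-files

Specifies a comma-delimited list of files that are added to the distributed cache. See "Accessing Data in the Hadoop Distributed Cache."

See Also:

For full descriptions of the generic options, go to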

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html#Generic_Options

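5.5.3 About Running Queries Locally

When developing queries, you can run them locally before submitting them to the cluster. A local run enables you to see how the query behaves on small data sets and diagnose potential problems quickly.

In local mode, relative URIs resolve against the local file system instead of HDFS, and the query runs in a single process.

To run a query in local mode:

Set the Hadoop -jt and -fs generic arguments to local. This example runs the query described in "Example: Hello World!" in local mode: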
```
$ hadoop jar $OXH_HOME/lib/oxh.jar -jt local -fs local ./hello.xq -output ./myoutput -print
```
Check the result file in the local output directory of the query, as shown in this example:
```
$ cat ./myoutput/part-m-00000
Hello World!
```

5.6 Running Queries from Apache Oozie

Apache Oozie is a workflow tool that enables you to run multiple MapReduce jobs in a specified order and, optionally, at a scheduled time. Oracle XQuery for Hadoop provides an Oozie action node that you can use to run Oracle XQuery for Hadoop queries from an Oozie workflow.

5.6.1 Getting Started Using the Oracle XQuery for Hadoop Oozie Action

Follow these steps to execute your queries in an Oozie workflow:

1. The first time you use Oozie with Oracle XQuery for Hadoop, ensure that Oozie is configured correctly. See "Configuring Oozie for the Oracle XQuery for Hadoop Action."
2. Develop your queries in Oracle XQuery for Hadoop the same as always.
3. Create a workflow XML file like the one shown in Example 5-11. You can use the XML elements listed in "Supported XML Elements."
4. Set the Oozie job parameters. The following parameter is required:
```
oozie.use.system.libpath=true
```
See Example 5-13.
5. Run the job using syntax like the following:
```
oozie job -oozie http://example.com:11000/oozie -config filename -run
```
See Also:
"Oozie Command Line Usage" in the Apache Oozie Command Line Interface Utilities at

https://oozie.apache.org/docs/4.0.0/DG_CommandLineTool.html#Oozie_Command_Line_Usage

5.6.2 Supported XML Elements

The Oracle XQuery for Hadoop action extends Oozie's Java action. It supports the following optional child XML elements with the same syntax and semantics as the Java action:

archive
configuration
file
job-tracker
job-xml
name-node
prepare

See Also:

The Java action description in the Oozie Specification at

https://oozie.apache.org/docs/4.0.0/WorkflowFunctionalSpec.html#a3.2.7_Java_Action

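In addition, the Oracle XQuery for Hadoop action supports the following elements:

script: The location of the Oracle XQuery for Hadoop query file. Required.

The query file must be in the workflow application directory. A relative path is resolved against the application directory.

Example: <script>myquery.xq</script>

output: The output directory of the query. Required.

The output element has an optional clean attribute. Set this attribute to true to delete the output directory before the query is run. If the output directory already exists and the clean attribute is either not set or set to false, an error occurs. The output directory cannot exist when the job runs.

Example: <output clean="true">/user/jdoe/myoutput</output>

Any error raised while running the query causes Oozie to perform the error transition for the action.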

5.6.3 Example: Hello World

This example uses the following files:

workflow.xml: Describes an Oozie action that sets two configuration values for the query in hello.xq: an HDFS file and the string World!

The HDFS input file is /user/jdoe/data/hello.txt and contains this string:
```
Hello
```
See Example 5-11.
hello.xq: Runs a query using Oracle XQuery for Hadoop.

See Example 5-12.
job.properties: Lists the job properties for Oozie. See Example 5-13.

To run the example, use this command:

oozie job -oozie http://example.com:11000/oozie -config job.properties -run

After the job runs, the /user/jdoe/myoutput output directory contains a file with the text "Hello World!"

Example 5-11 The workflow.xml File for Hello World

This file is named /user/jdoe/hello-oozie-oxh/workflow.xml. It uses variables that are defined in the job.properties file.

<workflow-app xmlns="uri:oozie:workflow:0.4" name="oxh-helloworld-wf">
  <start to="hello-node"/>
  <action name="hello-node">
    <oxh xmlns="oxh:oozie-action:v1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>

      <!-- 
        The configuration can be used to parameterize the query.
      -->
      <configuration>
        <property>
          <name>myinput</name>
          <value>${nameNode}/user/jdoe/data/hello.txt</value>
        </property>
        <property>
          <name>mysuffix</name>
          <value> World!</value>
        </property>
      </configuration>
 
      <script>hello.xq</script>

      <output clean="true">${nameNode}/user/jdoe/myoutput</output>

    </oxh>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>OXH failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

Example 5-12 The hello.xq File for Hello World

This file is named /user/jdoe/hello-oozie-oxh/hello.xq.

import module "oxh:text";

declare variable $input := oxh:property("myinput");
declare variable $suffix := oxh:property("mysuffix");

for $line in text:collection($input)
return
  text:put($line || $suffix)

Example 5-13 The job.properties File for Hello World

oozie.wf.application.path=hdfs://example.com:8020/user/jdoe/hello-oozie-oxh
nameNode=hdfs://example.com:8020
jobTracker=example.com:8032
oozie.use.system.libpath=true

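5.7 Oracle XQuery for Hadoop Configuration Properties

Oracle XQuery for Hadoop uses the generic methods of specifying configuration properties in the hadoop command. Use the -conf option to identify configuration files, and the -D option to specify individual properties. See "Running Queries."

See Also:

Hadoop documentation for job configuration files at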

http://wiki.apache.org/hadoop/JobConfFile

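Property Descriptions

oracle.hadoop.xquery.lib.share

Type: String

Default Value: Not defined.

Description: Identifies an HDFS directory that contains the libraries for Oracle XQuery for Hadoop and third-party software. For example:

http://path/to/shared/folder

All HDFS files must be in the same directory.

Alternatively, use the -sharelib option on the command line.

Pattern Matching: You can use pattern-matching characters in a directory name. If multiple directories match the pattern, then the directory with the most recent modification timestamp is used.

To specify a directory name, use alphanumeric characters and, optionally, any of the following special pattern-matching characters: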

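Pattern	Description
?	Matches any one character.
*	Matches zero or more characters.
[abc]	Matches one character in the set {a, b, c}.
[a-b]	Matches one character in the range from a to b. Character a must be less than or equal to character b.
[^a]	Matches one character that is not in the character set or range {a}. The caret (^) must immediately follow the left bracket, with no spaces.
\c	Removes (escapes) any special meaning of character c.
{ab,cd}	Matches a string from the string set {ab, cd}.
{ab,c{de,fh}}	Matches a string from the string set {ab, cde, cfh}.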
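
The two brace patterns in the table are the least intuitive, so a short sketch of how nested alternatives expand may help. This is plain Python written for illustration only, not the Hadoop glob implementation: expand_braces is a hypothetical helper, it handles only the {..} alternatives, and it does not interpret ?, *, brackets, or the \c escape.

```python
def expand_braces(pattern):
    """Expand {a,b} alternatives, including nested groups, into plain strings."""
    start = pattern.find("{")
    if start == -1:
        return [pattern]

    # Find the brace that closes the first group, tracking nesting depth.
    depth = 0
    end = -1
    for index in range(start, len(pattern)):
        if pattern[index] == "{":
            depth += 1
        elif pattern[index] == "}":
            depth -= 1
            if depth == 0:
                end = index
                break
    if end == -1:
        raise ValueError("unbalanced braces in " + pattern)

    # Split the group body on commas that are not inside a nested group.
    options, current, depth = [], "", 0
    for ch in pattern[start + 1:end]:
        if ch == "," and depth == 0:
            options.append(current)
            current = ""
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        current += ch
    options.append(current)

    # Substitute each option and keep expanding whatever braces remain.
    prefix, suffix = pattern[:start], pattern[end + 1:]
    expanded = []
    for option in options:
        expanded.extend(expand_braces(prefix + option + suffix))
    return expanded


print(expand_braces("{ab,cd}"))
print(expand_braces("{ab,c{de,fh}}"))
print(expand_braces("/user/{oozie,jdoe}/share/lib/{oxh,*/oxh*}"))
```

The last call uses jdoe as a stand-in user name; it yields four search paths, two under /user/oozie and two under /user/jdoe, each still containing any * wildcards for the file system to resolve.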

Oozie libraries: The value oxh:oozie expands automatically to /user/{oozie,user}/share/lib/{oxh,*/oxh*}, which is a common search path for supported Oozie versions. The user is the current user name. However, the Oracle XQuery for Hadoop Oozie action ignores this setting when running queries, because all libraries are preinstalled in HDFS.

oracle.hadoop.xquery.output

Type: String

Default Value: /tmp/oxh-user_name/output. The user_name is the name of the user running Oracle XQuery for Hadoop.

Description: Sets the output directory for the query. This property is equivalent to the -output command line option. See "Oracle XQuery for Hadoop Options."

oracle.hadoop.xquery.scratch

Type: String

Default Value: /tmp/oxh-user_name/scratch. The user_name is the name of the user running Oracle XQuery for Hadoop.

Description: Sets the HDFS temp directory for Oracle XQuery for Hadoop to store temporary files.

oracle.hadoop.xquery.timezone

Type: String

Default Value: Client system time zone

Description: The XQuery implicit time zone, which is used in a comparison or arithmetic operation when a date, time, or datetime value does not have a time zone. The value must be in the format described by the Java TimeZone class. See the TimeZone class description in Java 7 API Specification at

http://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html

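oracle.hadoop.xquery.skiperrors

Type: Boolean

Default Value: false

Description: Set to true to turn on error recovery, or set to false to stop processing when an error occurs. This property is equivalent to the -skiperrors command line option.

oracle.hadoop.xquery.skiperrors.counters

Type: Boolean

Default Value: true

Description: Set to true to group errors by error code, or set to false to report all errors in a single counter.

oracle.hadoop.xquery.skiperrors.max

Type: Integer

Default Value: Unlimited

Description: Sets the maximum number of errors that a single MapReduce task can recover from.

oracle.hadoop.xquery.skiperrors.log.max

Type: Integer

Default Value: 20

Description: Sets the maximum number of errors that a single MapReduce task logs.

log4j.logger.oracle.hadoop.xquery

Type: String

Default Value: Not defined

Description: Configures the log4j logger for each task with the specified threshold level. Set the property to one of these values: OFF, FATAL, ERROR, WARN, INFO, DEBUG, ALL. If this property is not set, then Oracle XQuery for Hadoop does not configure log4j.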

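5.8 Third-Party Licenses for Bundled Software

Oracle XQuery for Hadoop depends on the following third-party products:

These software packages are licensed under the Apache 2.0 License.

Unless otherwise specifically noted, or as required under the terms of the third-party license (for example, LGPL), the licenses and statements herein, including all statements regarding Apache-licensed code, are intended as notices only.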

5.8.1 ANTLR 3.2

[The BSD License]

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

5.8.2 Apache Ant 1.9.8

This product includes software developed by The Apache Software Foundation (http://www.apache.org).

This product includes also software developed by:

the W3C consortium (http://www.w3c.org)
the SAX project (http://www.saxproject.org)

Portions of this software were originally based on the following:

voluntary contributions made by Paul Eng on behalf of the Apache Software Foundation that were originally developed at iClick, Inc., software copyright (c) 1999

W3C® SOFTWARE NOTICE AND LICENSE

This work (and included software, documentation such as READMEs, or other related items) is being provided by the copyright holders under the following license. By obtaining, using and/or copying this work, you (the licensee) agree that you have read, understood, and will comply with the following terms and conditions.

Permission to copy, modify, and distribute this software and its documentation, with or without modification, for any purpose and without fee or royalty is hereby granted, provided that you include the following on ALL copies of the software and documentation or portions thereof, including modifications:

The full text of this NOTICE in a location viewable to users of the redistributed or derivative work.
Any pre-existing intellectual property disclaimers, notices, or terms and conditions. If none exist, the W3C Software Short Notice should be included (hypertext is preferred, text is permitted) within the body of any redistributed or derivative code.
Notice of any changes or modifications to the files, including the date changes were made. (We recommend you provide URIs to the location from which the code is derived.)

THIS SOFTWARE AND DOCUMENTATION IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.

COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE SOFTWARE OR DOCUMENTATION.

The name and trademarks of copyright holders may NOT be used in advertising or publicity pertaining to the software without specific, written prior permission. Title to copyright in this software and any associated documentation will at all times remain with copyright holders.

This formulation of W3C's notice and license became active on December 31 2002. This version removes the copyright ownership notice such that this license can be used with materials other than those owned by the W3C, reflects that ERCIM is now a host of the W3C, includes references to this specific dated version of the license, and removes the ambiguous grant of "use". Otherwise, this version is the same as the previous version and is written so as to preserve the Free Software Foundation's assessment of GPL compatibility and OSI's certification under the Open Source Definition. Please see our Copyright FAQ for common questions about using materials from our site, including specific terms and conditions for packages like libwww, Amaya, and Jigsaw. Other questions about this notice can be directed to site-policy@w3.org.

Joseph Reagle <site-policy@w3.org>

This license came from: http://www.megginson.com/SAX/copying.html

However please note future versions of SAX may be covered under http://saxproject.org/?selected=pd

SAX2 is Free!

I hereby abandon any property rights to SAX 2.0 (the Simple API for XML), and release all of the SAX 2.0 source code, compiled code, and documentation contained in this distribution into the Public Domain. SAX comes with NO WARRANTY or guarantee of fitness for any purpose.

David Megginson, david@megginson.com

2000-05-05

5.8.3 Stax2 API 3.1.4

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS," AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

5.8.4 Xerces 2 Java 2.11.0

Apache Xerces Java Copyright 1999-2010 The Apache Software Foundation This product includes software developed at The Apache Software Foundation (http://www.apache.org/). Portions of this software were originally based on the following:

5.8.5 XMLBeans 2.6.4

Apache License

Version 2.0, January 2004

http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

Definitions.

"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.

"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
1. You must give any other recipients of the Work or Derivative Works a copy of this License; and
2. You must cause any modified files to carry prominent notices stating that You changed the files; and
3. You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
4. If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work

To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

===========================================================================

ADDITIONAL LICENSES COVERING PARTS OF THIS DISTRIBUTION:

This distribution includes W3C XML Schema documents Copyright (c) 2001-2003 World Wide Web Consortium. These schemas are licensed under the W3C Software License, which is included in the same directory as the schemas. The license can also be found at: http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231.

License

By obtaining and/or copying this work, you (the licensee) agree that you have read, understood, and will comply with the following terms and conditions.

Permission to copy, modify, and distribute this work, with or without modification, for any purpose and without fee or royalty is hereby granted, provided that you include the following on ALL copies of the work or portions thereof, including modifications:

The full text of this NOTICE in a location viewable to users of the redistributed or derivative work.

Any pre-existing intellectual property disclaimers, notices, or terms and conditions. If none exist, the W3C Software and Document Short Notice should be included.

Notice of any changes or modifications, through a copyright statement on the new code or document such as "This software or document includes material copied from or derived from [title and URI of the W3C document]. Copyright © [YEAR] W3C® (MIT, ERCIM, Keio, Beihang)."

Disclaimers

THIS WORK IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE OR DOCUMENT WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.

COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE SOFTWARE OR DOCUMENT.

The name and trademarks of copyright holders may NOT be used in advertising or publicity pertaining to the work without specific, written prior permission. Title to copyright in this work will at all times remain with copyright holders.

5.8.6 Woodstox XML Parser 5.0.2

This copy of Woodstox XML processor is licensed under the Apache (Software) License, version 2.0 ("the License"). See the License for details about distribution rights, and the specific rights regarding derivate works.

You may obtain a copy of the License at:

http://www.apache.org/licenses/

A copy is also included with both the downloadable source code package and jar that contains class bytecodes, as file "ASL 2.0". In both cases, that file should be located next to this file: in source distribution the location should be "release-notes/asl"; and in jar "META-INF/"

This product currently only contains code developed by authors of specific components, as identified by the source code files.

Since product implements StAX API, it has dependencies to StAX API classes.

For additional credits (generally to people who reported problems) see CREDITS file.