4 ビッグ・データ環境でのプロパティ・グラフの使用

この章では、ビッグ・データ環境でのプロパティ・グラフ・データの作成、格納および操作に関する概念と使用方法の情報を提供します。

4.1 プロパティ・グラフについて

プロパティ・グラフでは、プロパティ(キー値ペア)をグラフの頂点およびエッジに簡単に関連付けることができ、大量のデータ・セット間の関係に基づく分析操作が可能になります。

4.1.1 プロパティ・グラフとは

プロパティ・グラフは、オブジェクトまたは頂点のセットと、これらのオブジェクトをつなぐ矢印またはエッジのセットで構成されます。頂点とエッジには複数のプロパティを含めることができ、それらはキー値ペアとして表されます。

各頂点には一意の識別子があり、次のものを含めることができます。

出力エッジのセット
入力エッジのセット
プロパティの集まり

各エッジには一意の識別子があり、次のものを含めることができます。

出力頂点
入力頂点
2つの頂点間の関係を示すテキスト・ラベル
プロパティの集まり

図4-1は、2つの頂点と1つのエッジを持つ非常に単純なプロパティ・グラフを示しています。2つの頂点には、識別子1および2があります。両方の頂点には、プロパティnameおよびageがあります。エッジは、出力頂点1から入力頂点2に向かっています。このエッジは、テキスト・ラベルknowsと、頂点1と2の関係のタイプを識別するプロパティtypeで表されます。

図4-1 単純なプロパティ・グラフの例

「図4-1 単純なプロパティ・グラフの例」の説明

プロパティ・グラフ・データ・モデルは標準に基づいていませんが、W3C標準ベースのResource Description Framework (RDF)グラフ・データ・モデルに類似しています。プロパティ・グラフ・データ・モデルはRDFよりも単純であり、精密ではありません。このような相違点により、これは次のようなユースケースでよい候補になります。

ソーシャル・ネットワークでの影響の特定
トレンドおよび顧客行動の予測
パターン一致に基づく関係の発見
キャンペーンをカスタマイズするためのクラスタの特定

4.1.2 プロパティ・グラフのビッグ・データ・サポートとは

プロパティ・グラフは、Hadoopのビッグ・データに対してサポートされます。このサポートは、データ・アクセス・レイヤーと分析レイヤーで構成されます。Hadoopでのデータベースの選択肢により、スケーラブルで一貫したストレージ管理が提供されます。

図4-2は、Oracleプロパティ・グラフのアーキテクチャを示しています。

図4-2 Oracleプロパティ・グラフのアーキテクチャ

「図4-2 Oracleプロパティ・グラフのアーキテクチャ」の説明

4.1.2.1 分析レイヤー

分析レイヤーにより、HadoopクラスタでMapReduceプログラムを使用して、プロパティ・グラフを分析できるようになります。これは、パス計算、ランキング、コミュニティ検出、レコメンダ・システムなどを含む、30を超える分析機能を提供します。

4.1.2.2 データ・アクセス・レイヤー

データ・アクセス・レイヤーは、プロパティ・グラフの作成および削除、頂点とエッジの追加および削除、キー値ペアを使用した頂点とエッジの検索、テキスト索引の作成、およびその他の操作の実行に使用できる、Java APIのセットを提供します。Java APIには、プロパティ・グラフ・データ・モデル用のTinkerPop Blueprintsグラフ・インタフェースの実装が含まれています。また、これらのAPIは、広く普及しているオープンソースのテキスト索引付けおよび検索エンジンであるApache LuceneおよびApache SolrCloudと統合されています。

4.1.2.3 ストレージ管理

プロパティ・グラフは、Oracle NoSQL DatabaseまたはApache HBaseのいずれかに格納できます。これらのデータベースは両方とも成熟しており、スケーラブルで、効率的なナビゲーション、問合せおよび分析をサポートします。両方とも、プロパティ・グラフの頂点およびエッジをモデル化するために表を使用します。

4.1.2.4 RESTful Webサービス

RESTful Webサービスを使用して、グラフ・データにアクセスしてグラフ操作を実行することもできます。たとえば、Linux curlコマンドを使用して、頂点とエッジを取得したり、グラフ要素を追加および削除したりすることができます。

4.2 プロパティ・グラフのデータ形式について

次のグラフ形式がサポートされます。

4.2.1 GraphMLデータ形式

GraphMLファイル形式では、XMLを使用してグラフを記述します。例4-1は、図4-1に示されているプロパティ・グラフのGraphMLの記述を示しています。

関連項目:

『The GraphML File Format』

http://graphml.graphdrawing.org/

例4-1 単純なプロパティ・グラフのGraphMLの記述

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
    <key id="name" for="node" attr.name="name" attr.type="string"/>
    <key id="age" for="node" attr.name="age" attr.type="int"/>
    <key id="type" for="edge" attr.name="type" attr.type="string"/>
    <graph id="PG" edgedefault="directed">
        <node id="1">
            <data key="name">Alice</data>
            <data key="age">31</data>
        </node>
        <node id="2">
            <data key="name">Bob</data>
            <data key="age">27</data>
        </node>
        <edge id="3" source="1" target="2" label="knows">
            <data key="type">friends</data>
        </edge>
    </graph>
</graphml>

4.2.2 GraphSONデータ形式

GraphSONファイル形式では、JavaScript Object Notation (JSON)に基づいてグラフを記述します。例4-2は、図4-1に示されているプロパティ・グラフのGraphSONの記述を示しています。

関連項目:

『GraphSON Reader and Writer Library』

https://github.com/tinkerpop/blueprints/wiki/GraphSON-Reader-and-Writer-Library

例4-2 単純なプロパティ・グラフのGraphSONの記述

{
    "graph": {
        "mode":"NORMAL",
        "vertices": [
            {
                "name": "Alice",
                "age": 31,
                "_id": "1",
                "_type": "vertex"
            },
            {
                "name": "Bob",
                "age": 27,
                "_id": "2",
                "_type": "vertex"
            }       
        ],
        "edges": [
            {
                "type": "friends",
                "_id": "3",
                "_type": "edge",
                "_outV": "1",
                "_inV": "2",
                "_label": "knows"
            }
        ]
    }
}

4.2.3 GMLデータ形式

Graph Modeling Language (GML)ファイル形式では、ASCIIを使用してグラフを記述します。例4-3は、図4-1に示されているプロパティ・グラフのGMLの記述を示しています。

関連項目:

『GML: A Portable Graph File Format』、Michael Himsolt著

http://www.fim.uni-passau.de/fileadmin/files/lehrstuhl/brandenburg/projekte/gml/gml-technical-report.pdf

例4-3 単純なプロパティ・グラフのGMLの記述

graph [
   comment "Simple property graph"
   directed 1
   IsPlanar 1
   node [
      id 1
      label "1"
      name "Alice"
      age 31
        ]
   node [
      id 2
      label "2"
      name "Bob"
      age 27
        ]
   edge [
      source 1
      target 2
      label "knows"
      type "friends"
        ]
      ]

4.2.4 Oracleフラット・ファイル形式

Oracleフラット・ファイル形式はプロパティ・グラフのみを記述します。これは、その他のファイル形式よりも正確で、より優れたデータ型サポートを提供します。Oracleフラット・ファイル形式は、グラフを記述するために、頂点用とエッジ用にそれぞれ1つずつ、2つのファイルを使用します。レコードのフィールドはカンマで区切られます。

例4-4は、図4-1に示されているプロパティ・グラフを記述するOracleフラット・ファイルを示しています。

関連項目:

「Oracleフラット・ファイル形式の定義」

例4-4 単純なプロパティ・グラフのOracleフラット・ファイルの記述

頂点ファイル:

1,name,1,Alice,,
1,age,2,,31,
2,name,1,Bob,,
2,age,2,,27,

エッジ・ファイル:

1,1,2,knows,type,1,friends,,

4.3 プロパティ・グラフの開始

プロパティ・グラフを開始するには、次のようにします。

プロパティ・グラフを初めて使用する際、ソフトウェアがインストールされ、稼働可能であることを確認します。
Java APIで提供されるクラスを使用して、Javaプログラムを作成します。
「プロパティ・グラフ・データ用のJava APIの使用」を参照してください。

4.4 プロパティ・グラフ・データ用のJava APIの使用

プロパティ・グラフの作成には、Java APIを使用したプロパティ・グラフとそのオブジェクトの作成が含まれます。

4.4.1 Java APIの概要

プロパティ・グラフで使用できるJava APIには、次のものが含まれます。

4.4.1.1 Oracle Big Data Spatial and Graph Java API

Oracle Big Data Spatial and Graphのプロパティ・グラフ・サポートは、Apache HBaseおよびOracle NoSQL Database用のデータベース固有のAPIを提供します。データ・アクセス・レイヤーAPI (oracle.pg.*)は、Oracle NoSQL DatabaseおよびApache HBaseに格納されているプロパティ・グラフ用のTinkerPop Blueprints API、テキスト検索および索引付けを実装します。

Oracle Big Data Spatial and Graph APIを使用するには、次のクラスをJavaプログラムにインポートします。

import oracle.pg.nosql.*; // or oracle.pg.hbase.*
import oracle.pgx.config.*;
import oracle.pgx.common.types.*;

TinkerPop Blueprints Java APIも含めます。

関連項目:

Oracle Big Data Spatial and Graph Java APIリファレンス

4.4.1.2 TinkerPop Blueprints Java API

TinkerPop Blueprintsはプロパティ・グラフ・データ・モデルをサポートします。このAPIは、主にBig Data Spatial and Graphのデータ・アクセス・レイヤーJava APIを介して使用する、グラフ操作のためのユーティリティを提供します。

Blueprints APIを使用するには、次のクラスをJavaプログラムにインポートします。

import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.Edge;

関連項目:

『Blueprints: A Property Graph Model Interface API』

http://www.tinkerpop.com/docs/javadocs/blueprints/2.3.0/index.html

4.4.1.3 Apache Hadoop Java API

Apache Hadoop Java APIを使用すると、JavaコードをHadoop分散フレームワーク内で実行するMapReduceプログラムとして作成できます。

Hadoop Java APIを使用するには、次のクラスをJavaプログラムにインポートします。次に例を示します。

import org.apache.hadoop.conf.Configuration;

関連項目:

『Apache Hadoop Main 2.5.0-cdh5.3.2 API』

http://archive.cloudera.com/cdh5/cdh/5/hadoop/api/

4.4.1.4 Oracle NoSQL Database Java API

The Oracle NoSQL Database APIを使用すると、キー値(KV)ストアを作成および移入し、Hadoop、HiveおよびOracle Databaseへのインタフェースを提供できます。

Oracle NoSQL Databaseをグラフ・データ・ストアとして使用するには、次のクラスをJavaプログラムにインポートします。次に例を示します。

import oracle.kv.*; 
import oracle.kv.table.TableOperation;

関連項目:

Oracle NoSQL Database Java APIリファレンス

http://docs.oracle.com/cd/NOSQL/html/javadoc/

4.4.1.5 Apache HBase Java API

Apache HBase APIを使用すると、キー値ペアを作成および操作できます。

HBaseをグラフ・データ・ストアとして使用するには、次のクラスをJavaプログラムにインポートします。次に例を示します。

import org.apache.hadoop.hbase.*; 
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.*; 
import org.apache.hadoop.hbase.util.Bytes; 
import org.apache.hadoop.conf.Configuration;

関連項目:

『HBase 0.98.6-cdh5.3.2 API』

http://archive.cloudera.com/cdh5/cdh/5/hbase/apidocs/index.html?overview-summary.html

4.4.2 グラフ・データのパラレル・ロード

グラフ・データのパラレル・ロードを実行するために、Java APIが提供されています。

頂点ファイル(または入力ストリーム)のセットとエッジ・ファイル(または入力ストリーム)のセットがある場合、それらを複数のチャンクに分割し、並列でデータベースにロードできます。チャンクの数は、ユーザーが指定した並列度(DOP)によって決定されます。

並列性は、頂点およびエッジ・フラット・ファイルを複数のチャンクに分割するスプリッタ・スレッドと、別個のデータベース接続を使用して各チャンクをデータベースにロードするローダー・スレッドによって実現されます。スプリッタ・スレッドとローダー・スレッドを接続するために、Javaパイプが使用されます -- スプリッタ: PipedOutputStreamおよびローダー: PipedInputStream。

データ・ロードAPIの最も単純な使用方法は、1つのプロパティ・グラフ・インスタンス、1つの頂点ファイル、1つのエッジ・ファイルおよびDOPを指定することです。

次のロード・プロセスの例では、最適化されたOracleフラット・ファイル形式の頂点ファイルおよびエッジ・ファイルに格納されたグラフ・データをロードし、並列度48でロードを実行します。

opgdl = OraclePropertyGraphDataLoader.getInstance();
vfile = "/home/alwu/pg-bda-nosql/demo/connections.opv";
efile = "/home/alwu/pg-bda-nosql/demo/connections.ope";
opgdl.loadData(opg, vfile, efile, 48);

4.4.2.1 パーティションを使用したパラレル・データ・ロード

データ・ロードAPIを使用することにより、複数のパーティションを使用してデータをデータベースにロードできます。このAPIは、プロパティ・グラフ、頂点ファイル、エッジ・ファイル、DOP、パーティション合計数およびパーティション・オフセット(0からパーティション合計数 - 1まで)を必要とします。たとえば、2つのパーティションを使用してデータをロードするには、パーティション・オフセットは0および1である必要があります。つまり、グラフ全体をロードするために2つのデータ・ロードAPIコールがあり、これら2つのAPIコールの相違点は、パーティション・オフセット(0および1)のみになります。

次のコード・フラグメントは、4つのパーティションを使用してグラフ・データをロードします。データ・ローダーの各コールは、単一システム上または複数のシステム上の別々のJavaクライアントを使用して処理できます。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

int totalPartitions = 4;
int dop= 32; // degree of parallelism for each client.

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
SimpleLogBasedDataLoaderListenerImpl dll = SimpleLogBasedDataLoaderListenerImpl.getInstance(100 /* frequency */,
                                       true /* Continue on error */);

// Run the data loading using 4 partitions (Each call can be run from a
// separate Java Client)

// Partition 1
OraclePropertyGraphDataLoader opgdlP1 = OraclePropertyGraphDataLoader.getInstance();
opgdlP1.loadData(opg, szOPVFile, szOPEFile, dop,  
   4 /* Total number of partitions, default 1 */,
   0 /* Partition to load (from 0 to totalPartitions - 1, default 0 */,
   dll);

// Partition 2
OraclePropertyGraphDataLoader opgdlP2 = OraclePropertyGraphDataLoader.getInstance();
opgdlP2.loadData(opg, szOPVFile, szOPEFile, dop,  4 /* Total number of partitions, default 1 */,
 1 /* Partition to load (from 0 to totalPartitions - 1, default 0 */, dll);


// Partition 3
OraclePropertyGraphDataLoader opgdlP3 = OraclePropertyGraphDataLoader.getInstance();
opgdlP3.loadData(opg, szOPVFile, szOPEFile, dop,  4 /* Total number of partitions, default 1 */,
 2 /* Partition to load (from 0 to totalPartitions - 1, default 0 */, dll);

// Partition 4
OraclePropertyGraphDataLoader opgdlP4 = OraclePropertyGraphDataLoader.getInstance();
opgdlP4.loadData(opg, szOPVFile, szOPEFile, dop,  4 /* Total number of partitions, default 1 */,
 3 /* Partition to load (from 0 to totalPartitions - 1, default 0 */, dll);

4.4.2.2 ファインチューニングを使用したパラレル・データ・ロード

データ・ロードAPIは、ロードされるソース頂点ファイルおよびエッジ・ファイル内のこれらの行のファインチューニングもサポートします。頂点(またはエッジ)オフセット行数と頂点(またはエッジ)最大行数を指定できます。オフセット行数から最大行数までのデータがロードされます。最大行数が-1である場合、ロード・プロセスはファイルの終わりに達するまでデータをスキャンします。

次のコード・フラグメントは、ファインチューニングを使用してグラフ・データをロードします。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

int totalPartitions = 4;
int dop= 32; // degree of parallelism for each client.

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
SimpleLogBasedDataLoaderListenerImpl dll = SimpleLogBasedDataLoaderListenerImpl.getInstance(100 /* frequency */,
                                       true /* Continue on error */);

// Run the data loading using fine tuning
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(m_opg, m_szOPVFile, m_szOPEFile,
 m_lVertexOffsetlines /* offset of lines to start loading 
 from partition, default 0*/,
 m_lEdgeOffsetlines /* offset of lines to start loading 
 from partition, default 0*/,
 m_lVertexMaxlines /* maximum number of lines to start loading 
 from partition, default -1 (all lines in partition)*/,
 m_lEdgeMaxlines /* maximun number of lines to start loading 
 from partition, default -1 (all lines in partition)*/,
 dop,
 totalPartitions /* Total number of partitions, default 1 */,
 idPartition /* Partition to load (from 0 to totalPartitions - 1, 
 default 0 */,
 dll);

4.4.2.3 複数のファイルを使用したパラレル・データ・ロード

Oracle Big Data Spatial and Graphは、データベースへの複数の頂点ファイルおよび複数のエッジ・ファイルのロードもサポートします。指定された頂点ファイルがDOPチャンクに分割され、DOPスレッドを使用してデータベースに並列にロードされます。同様に、複数のエッジ・ファイルも分割されて、並列にロードされます。

次のコード・フラグメントは、パラレル・データ・ロードAPIを使用して、複数の頂点ファン・ファイルおよびエッジ・ファイルを並列にロードします。この例では、入力ファイルを保持するために2つの文字列配列szOPVFilesおよびszOPEFilesを使用しています。この例では1つの頂点ファイルと1つのエッジ・ファイルのみを使用していますが、これら2つの配列に複数の頂点ファイルと複数のエッジ・ファイルを指定できます。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

String[] szOPVFiles = new String[] {"../../data/connections.opv"};
String[] szOPEFiles = new String[] {"../../data/connections.ope"};

// Clear existing vertices/edges in the property graph 
opg.clearRepository();
opg.setQueueSize(100); // 100 elements

// This object will handle parallel data loading over the property graph
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();

opgdl.loadData(opg, szOPVFiles, szOPEFiles, dop);

System.out.println("Total vertices: " + opg.countVertices());
System.out.println("Total edges: " + opg.countEdges());

4.4.2.4 グラフ・データのパラレル取得

プロパティ・グラフのパラレル問合せは、頂点(またはエッジ)のパラレル・スキャンを実行するための単純なJava APIを提供します。パラレル取得は、バックエンド・データベースとの分割間でのデータの配分を利用する最適化されたソリューションであり、各分割は別個のデータベース接続を使用して問い合せられます。

パラレル取得では、各要素が特定の分割からのすべての頂点(またはエッジ)を保持する配列が生成されます。問い合せられたシャードのサブセットは、特定の開始分割IDと指定された接続の配列のサイズによって分離されます。これにより、サブセットは[開始, 開始 - 1 + 接続の配列のサイズ]の範囲内の分割を考慮します。N分割を含む頂点表にあるすべての分割に、([0, N - 1]の範囲の)整数IDが割り当てられることに注意してください。

次のコードは、Apache HBaseを使用してプロパティ・グラフをロードし、接続の配列を開き、開いた接続を使用してパラレル問合せを実行してすべての頂点およびエッジを取得します。getVerticesPartitioned (getEdgesPartitioned)メソッドの呼出し数は、分割の合計数と使用される接続の数によって制御されます。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

// Clear existing vertices/edges in the property graph 
opg.clearRepository(); 

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";

// This object will handle parallel data loading
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 

// Create connections used in parallel query
HConnection hConns= new HConnection[dop];
for (int i = 0; i < dop; i++) { 
Configuration conf_new = 
HBaseConfiguration.create(opg.getConfiguration());
hConns[i] = HConnectionManager.createConnection(conf_new); 
}


long lCountV = 0;
// Iterate over all the vertices’ splits to count all the vertices
for (int split = 0; split < opg.getVertexTableSplits(); 
 split += dop) { 
Iterable<Vertex>[] iterables 
 = opg.getVerticesPartitioned(hConns /* Connection array */, 
 true /* skip store to cache */, 
 split /* starting split */); 
lCountV += consumeIterables(iterables); /* consume iterables using 
 threads */
}

// Count all vertices
System.out.println("Vertices found using parallel query: " + lCountV);

long lCountE = 0;
// Iterate over all the edges’ splits to count all the edges
for (int split = 0; split < opg.getEdgeTableSplits(); 
 split += dop) { 
Iterable<Edge>[] iterables 
 = opg.getEdgesPartitioned(hConns /* Connection array */, 
 true /* skip store to cache */, 
 split /* starting split */); 
lCountE += consumeIterables(iterables); /* consume iterables using 
 threads */
}

// Count all edges
System.out.println("Edges found using parallel query: " + lCountE);

// Close the connections to the database after completed
for (int idx = 0; idx < conns.length; idx++) { 
conns[idx].close();
}

4.4.2.5 サブグラフ抽出のための要素フィルタ・コールバックの使用

Oracle Big Data Spatial and Graphは、ユーザー定義の要素フィルタ・コールバックを使用した、容易なサブグラフ抽出のサポートを提供します。要素フィルタ・コールバックは、サブグラフに頂点(またはエッジ)を保持するために、頂点(またはエッジ)が満たす必要のある条件セットを定義します。ユーザーはVertexFilterCallbackおよびEdgeFilterCallback APIインタフェースを実装することによって、独自の要素フィルタを定義できます。

次のコード・フラグメントは、頂点が政治的役割を持たず、その出生地が米国であるかどうかを検証するVertexFilterCallbackを実装します。

/**
* VertexFilterCallback to retrieve a vertex from the United States 
* that does not have a political role 
*/
private static class NonPoliticianFilterCallback 
implements VertexFilterCallback
{
@Override
public boolean keepVertex(OracleVertexBase vertex) 
{
String country = vertex.getProperty("country");
String role = vertex.getProperty("role");

if (country != null && country.equals("United States")) {
if (role == null || !role.toLowerCase().contains("political")) {
return true;
}
}

return false;
}

public static NonPoliticianFilterCallback getInstance()
{
return new NonPoliticianFilterCallback();
}
}

次のコード・フラグメントは、VertexFilterCallbackを使用して特定の入力頂点に接続しているエッジのうち、その接続が政治家ではなく米国出身であるエッジのみを保持するEdgeFilterCallbackを実装します。

/**
 * EdgeFilterCallback to retrieve all edges connected to an input 
 * vertex with "collaborates" label, and whose vertex is from the 
 * United States with a role different than political
*/
private static class CollaboratorsFilterCallback 
implements EdgeFilterCallback
{
private VertexFilterCallback m_vfc;
private Vertex m_startV;

public CollaboratorsFilterCallback(VertexFilterCallback vfc, 
 Vertex v) 
{
m_vfc = vfc;
m_startV = v; 
}

@Override
public boolean keepEdge(OracleEdgeBase edge) 
{
if ("collaborates".equals(edge.getLabel())) {
if (edge.getVertex(Direction.IN).equals(m_startV) && 
m_vfc.keepVertex((OracleVertex) 
edge.getVertex(Direction.OUT))) {
return true;
}
else if (edge.getVertex(Direction.OUT).equals(m_startV) && 
 m_vfc.keepVertex((OracleVertex) 
edge.getVertex(Direction.IN))) {
return true;
}
}

return false;
}

public static CollaboratorsFilterCallback
getInstance(VertexFilterCallback vfc, Vertex v)
{
return new CollaboratorsFilterCallback(vfc, v);
}

}

前に定義したフィルタ・コールバックを使用して、次のコード・フラグメントは、プロパティ・グラフをロードし、フィルタ・コールバックのインスタンスを作成し、その後に政治家ではなく米国出身であるBarack Obamaのすべての協力者を取得します。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

// Clear existing vertices/edges in the property graph 
opg.clearRepository(); 

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";

// This object will handle parallel data loading
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 

// VertexFilterCallback to retrieve all people from the United States // who are not politicians
NonPoliticianFilterCallback npvfc = NonPoliticianFilterCallback.getInstance();

// Initial vertex: Barack Obama
Vertex v = opg.getVertices("name", "Barack Obama").iterator().next();

// EdgeFilterCallback to retrieve all collaborators of Barack Obama 
// from the United States who are not politicians
CollaboratorsFilterCallback cefc = CollaboratorsFilterCallback.getInstance(npvfc, v);

Iterable<<Edge> obamaCollabs = opg.getEdges((String[])null /* Match any 
of the properties */,
cefc /* Match the 
EdgeFilterCallback */
);
Iterator<<Edge> iter = obamaCollabs.iterator();

System.out.println("\n\n--------Collaborators of Barack Obama from " +
 " the US and non-politician\n\n");
long countV = 0;
while (iter.hasNext()) {
Edge edge = iter.next(); // get the edge
// check if obama is the IN vertex
if (edge.getVertex(Direction.IN).equals(v)) {
 System.out.println(edge.getVertex(Direction.OUT) + "(Edge ID: " + 
 edge.getId() + ")"); // get out vertex
}
else {
System.out.println(edge.getVertex(Direction.IN)+ "(Edge ID: " + 
 edge.getId() + ")"); // get in vertex
}

countV++;
}

デフォルトでは、すべての頂点の取得、すべてのエッジの取得(および並列アプローチ)などのすべての読取り操作は、opg.setVertexFilterCallback(vfc)およびopg.setEdgeFilterCallback(efc)メソッドを使用してプロパティ・グラフに関連付けられたフィルタ・コールバックを使用します。フィルタ・コールバック・セットがない場合は、すべての頂点(またはエッジ)およびエッジが取得されます。

次のコード・フラグメントは、プロパティ・グラフでデフォルトのエッジ・フィルタ・コールバック・セットを使用して、エッジを取得します。

// VertexFilterCallback to retrieve all people from the United States // who are not politicians
NonPoliticianFilterCallback npvfc = NonPoliticianFilterCallback.getInstance();

// Initial vertex: Barack Obama
Vertex v = opg.getVertices("name", "Barack Obama").iterator().next();

// EdgeFilterCallback to retrieve all collaborators of Barack Obama 
// from the United States who are not politicians
CollaboratorsFilterCallback cefc = CollaboratorsFilterCallback.getInstance(npvfc, v);

opg.setEdgeFilterCallback(cefc);

Iterable<Edge> obamaCollabs = opg.getEdges();
Iterator<Edge> iter = obamaCollabs.iterator();

System.out.println("\n\n--------Collaborators of Barack Obama from " +
 " the US and non-politician\n\n");
long countV = 0;
while (iter.hasNext()) {
Edge edge = iter.next(); // get the edge
// check if obama is the IN vertex
if (edge.getVertex(Direction.IN).equals(v)) {
 System.out.println(edge.getVertex(Direction.OUT) + "(Edge ID: " + 
 edge.getId() + ")"); // get out vertex
}
else {
System.out.println(edge.getVertex(Direction.IN)+ "(Edge ID: " + 
 edge.getId() + ")"); // get in vertex
}

countV++;
}

4.4.2.6 プロパティ・グラフ・データの読取りでの最適化フラグの使用

Oracle Big Data Spatial and Graphは、グラフ反復パフォーマンスを向上させるために、最適化フラグのサポートを提供します。最適化フラグにより、情報がない、または最小限の情報(ID、ラベルおよび入力/出力頂点など)を持つオブジェクトとして、頂点(またはエッジ)を処理できます。これにより、反復中の各頂点(またはエッジ)の処理にかかる時間が削減されます。

次の表に、プロパティ・グラフで頂点(またはエッジ)を処理するときに使用できる最適化フラグを示します。

最適化フラグ	説明
DO_NOT_CREATE_OBJECT	頂点またはエッジを処理するときに、事前定義済の定数オブジェクトを使用します。
JUST_EDGE_ID	エッジの処理時に、IDのみを持つエッジ・オブジェクトを作成します。
JUST_LABEL_EDGE_ID	エッジの処理時に、IDとラベルのみを持つエッジ・オブジェクトを作成します。
JUST_LABEL_VERTEX_EDGE_ID	エッジの処理時に、ID、ラベルおよび入力/出力頂点IDのみを持つエッジ・オブジェクトを作成します。
JUST_VERTEX_EDGE_ID	エッジの処理時に、IDおよび入力/出力頂点IDのみを持つエッジ・オブジェクトを作成します。
JUST_VERTEX_ID	頂点の処理時に、IDのみを持つ頂点オブジェクトを作成します。

次のコード・フラグメントは、最適化フラグのセットを使用して、プロパティ・グラフの頂点およびエッジからすべてのIDのみを取得します。すべての頂点およびエッジの読取りによって取得されたオブジェクトには、IDのみが含まれ、キー/値プロパティや追加情報は含まれません。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

// Clear existing vertices/edges in the property graph 
opg.clearRepository(); 

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";

// This object will handle parallel data loading
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 


// Optimization flag to retrieve only vertices IDs
OptimizationFlag optFlagVertex = OptimizationFlag.JUST_VERTEX_ID;

// Optimization flag to retrieve only edges IDs
OptimizationFlag optFlagEdge = OptimizationFlag.JUST_EDGE_ID;

// Print all vertices
Iterator<Vertex> vertices = 
opg.getVertices((String[])null /* Match any of the 
properties */,
null /* Match the VertexFilterCallback */, 
optFlagVertex /* optimization flag */ 
).iterator();

System.out.println("----- Vertices IDs----");
long vCount = 0;
while (vertices.hasNext()) {
OracleVertex v = vertices.next();
System.out.println((Long) v.getId());
vCount++;
}
System.out.println("Vertices found: " + vCount);


// Print all edges
Iterator<Edge> edges 
opg.getEdges((String[])null /* Match any of the 
properties */,
null /* Match the EdgeFilterCallback */, 
optFlagEdge /* optimization flag */ 
).iterator();

System.out.println("----- Edges ----");
long eCount = 0;
while (edges.hasNext()) {
Edge e = edges.next();
System.out.println((Long) e.getId());
eCount++;
}
System.out.println("Edges found: " + eCount);

デフォルトでは、すべての頂点の取得、すべてのエッジの取得(および並列アプローチ)などのすべての読取り操作は、opg.setDefaultVertexOptFlag(optFlagVertex)およびopg.setDefaultEdgeOptFlag(optFlagEdge)メソッドを使用してプロパティ・グラフに関連付けられた最適化フラグを使用します。頂点またはエッジの処理に対する最適化フラグが定義されていない場合、頂点およびエッジに関するすべての情報が取得されます。

次のコード・フラグメントは、プロパティ・グラフでデフォルト最適化フラグのセットを使用して、その頂点およびエッジからすべてのIDのみを取得します。

// Optimization flag to retrieve only vertices IDs
OptimizationFlag optFlagVertex = OptimizationFlag.JUST_VERTEX_ID;

// Optimization flag to retrieve only edges IDs
OptimizationFlag optFlagEdge = OptimizationFlag.JUST_EDGE_ID;

opg.setDefaultVertexOptFlag(optFlagVertex);
opg.setDefaultEdgeOptFlag(optFlagEdge);

Iterator<Vertex> vertices = opg.getVertices().iterator();
System.out.println("----- Vertices IDs----");
long vCount = 0;
while (vertices.hasNext()) {
OracleVertex v = vertices.next();
System.out.println((Long) v.getId());
vCount++;
}
System.out.println("Vertices found: " + vCount);


// Print all edges
Iterator<Edge> edges = opg.getEdges().iterator();
System.out.println("----- Edges ----");
long eCount = 0;
while (edges.hasNext()) {
Edge e = edges.next();
System.out.println((Long) e.getId());
eCount++;
}
System.out.println("Edges found: " + eCount);

4.4.2.7 プロパティ・グラフのサブグラフの属性の追加および削除

Oracle Big Data Spatial and Graphは、ユーザーがカスタマイズした操作コールバックを使用した、頂点またはエッジ(あるいはその両方)のサブグラフへの属性(キー/値ペア)の更新をサポートします。操作コールバックは、(特定の属性および値を追加または削除することによって)頂点(またはエッジ)を更新するために、頂点(またはエッジ)が満たす必要のある条件セットを定義します。

VertexOpCallbackおよびEdgeOpCallback APIインタフェースを実装することにより、独自の属性操作を定義できます。更新操作に含まれる頂点(またはエッジ)によって満たされる必要のある条件を定義するneedOpメソッド、および要素の更新時に使用されるキー名とキー値をそれぞれ返すgetAttributeKeyNameメソッドとgetAttributeKeyValueメソッドをオーバーライドする必要があります。

次のコード・フラグメントは、Barack Obamaの協力者のみに関連付けられたobamaCollaborator属性を操作するVertexOpCallbackを実装します。このプロパティの値は、協力者の役職に基づいて指定されます。

private static class CollaboratorsVertexOpCallback 
implements VertexOpCallback
{
private OracleVertexBase m_obama;
private List<Vertex> m_obamaCollaborators;

public CollaboratorsVertexOpCallback(OraclePropertyGraph opg)
{
// Get a list of Barack Obama'sCollaborators
m_obama = (OracleVertexBase) opg.getVertices("name", 
 "Barack Obama")
.iterator().next();

Iterable<Vertex> iter = m_obama.getVertices(Direction.BOTH, 
"collaborates");
m_obamaCollaborators = OraclePropertyGraphUtils.listify(iter);
}

public static CollaboratorsVertexOpCallback 
getInstance(OraclePropertyGraph opg)
{
return new CollaboratorsVertexOpCallback(opg);
}

/**
 * Add attribute if and only if the vertex is a collaborator of Barack 
 * Obama
*/
@Override
public boolean needOp(OracleVertexBase v)
{
return m_obamaCollaborators != null && 
 m_obamaCollaborators.contains(v);
}

@Override
public String getAttributeKeyName(OracleVertexBase v)
{
return "obamaCollaborator";
}

/**
 * Define the property's value based on the vertex role
 */
@Override
public Object getAttributeKeyValue(OracleVertexBase v)
{
String role = v.getProperty("role");
role = role.toLowerCase();
if (role.contains("political")) {
return "political";
}
else if (role.contains("actor") || role.contains("singer") ||
 role.contains("actress") || role.contains("writer") ||
 role.contains("producer") || role.contains("director")) {
return "arts";
}
else if (role.contains("player")) {
return "sports";
}
else if (role.contains("journalist")) {
return "journalism";
}
else if (role.contains("business") || role.contains("economist")) {
return "business";
}
else if (role.contains("philanthropist")) {
return "philanthropy";
}
return " ";
}
}

次のコード・フラグメントは、Barack Obamaの敵対者のみに関連付けられたobamaFeud属性を操作するEdgeOpCallbackを実装します。このプロパティの値は、協力者の役職に基づいて指定されます。

private static class FeudsEdgeOpCallback 
implements EdgeOpCallback
{
private OracleVertexBase m_obama;
private List<Edge> m_obamaFeuds;

public FeudsEdgeOpCallback(OraclePropertyGraph opg)
{
// Get a list of Barack Obama's feuds
m_obama = (OracleVertexBase) opg.getVertices("name", 
 "Barack Obama")
.iterator().next();

Iterable<Vertex> iter = m_obama.getVertices(Direction.BOTH, 
"feuds");
m_obamaFeuds = OraclePropertyGraphUtils.listify(iter);
}

public static FeudsEdgeOpCallback getInstance(OraclePropertyGraph opg)
{
return new FeudsEdgeOpCallback(opg);
}

/**
 * Add attribute if and only if the edge is in the list of Barack Obama's 
 * feuds
*/
@Override
public boolean needOp(OracleEdgeBase e)
{
return m_obamaFeuds != null && m_obamaFeuds.contains(e);
}

@Override
public String getAttributeKeyName(OracleEdgeBase e)
{
return "obamaFeud";
}

/**
 * Define the property's value based on the in/out vertex role
 */
@Override
public Object getAttributeKeyValue(OracleEdgeBase e)
{
OracleVertexBase v = (OracleVertexBase) e.getVertex(Direction.IN);
if (m_obama.equals(v)) {
v = (OracleVertexBase) e.getVertex(Direction.OUT);
}
String role = v.getProperty("role");
role = role.toLowerCase();

if (role.contains("political")) {
return "political";
}
else if (role.contains("actor") || role.contains("singer") ||
 role.contains("actress") || role.contains("writer") ||
 role.contains("producer") || role.contains("director")) {
return "arts";
}
else if (role.contains("journalist")) {
return "journalism";
}
else if (role.contains("player")) {
return "sports";
}
else if (role.contains("business") || role.contains("economist")) {
return "business";
}
else if (role.contains("philanthropist")) {
return "philanthropy";
}
return " ";
}
}

前に定義した操作コールバックを使用して、次のコード・フラグメントはプロパティ・グラフをロードし、操作コールバックのインスタンスを作成し、その後にOraclePropertyGraphでaddAttributeToAllVerticesおよびaddAttributeToAllEdgesメソッドを使用して属性を適切な頂点およびエッジに追加します。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

// Clear existing vertices/edges in the property graph 
opg.clearRepository(); 

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";

// This object will handle parallel data loading
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 

// Create the vertex operation callback
CollaboratorsVertexOpCallback cvoc = CollaboratorsVertexOpCallback.getInstance(opg);

// Add attribute to all people collaborating with Obama based on their role
opg.addAttributeToAllVertices(cvoc, true /** Skip store to Cache */, dop);

// Look up for all collaborators of Obama
Iterable<Vertex> collaborators = opg.getVertices("obamaCollaborator", "political");
System.out.println("Political collaborators of Barack Obama " + getVerticesAsString(collaborators));

collaborators = opg.getVertices("obamaCollaborator", "business");
System.out.println("Business collaborators of Barack Obama " + 
getVerticesAsString(collaborators));

// Add an attribute to all people having a feud with Barack Obama to set
// the type of relation they have
FeudsEdgeOpCallback feoc = FeudsEdgeOpCallback.getInstance(opg);
opg.addAttributeToAllEdges(feoc, true /** Skip store to Cache */, dop);

// Look up for all feuds of Obama
Iterable<Edge> feuds = opg.getEdges("obamaFeud", "political");
System.out.println("\n\nPolitical feuds of Barack Obama " + getEdgesAsString(feuds));

feuds = opg.getEdges("obamaFeud", "business");
System.out.println("Business feuds of Barack Obama " + 
getEdgesAsString(feuds));

次のコード・フラグメントは、属性obamaCollaboratorにphilanthropyという値を持つ頂点を削除した後でAPI removeAttributeFromAllVerticesをコールするために使用できるVertexOpCallbackの実装を定義します。また、属性obamaFeudにbusinessという値を持つエッジを削除した後でAPI removeAttributeFromAllEdgesをコールするために使用できるEdgeOpCallbackの実装も定義します。

System.out.println("\n\nRemove 'obamaCollaborator' property from all the" + 
 "philanthropy collaborators");
PhilanthropyCollaboratorsVertexOpCallback pvoc = philanthropyCollaboratorsVertexOpCallback.getInstance();

opg.removeAttributeFromAllVertices(pvoc);

System.out.println("\n\nRemove 'obamaFeud' property from all the" + "business feuds");
BusinessFeudsEdgeOpCallback beoc = BusinessFeudsEdgeOpCallback.getInstance();

opg.removeAttributeFromAllEdges(beoc);

/**
 * Implementation of a EdgeOpCallback to remove the "obamaCollaborators" 
 * property from all people collaborating with Barack Obama that have a 
 * philanthropy role
 */
private static class PhilanthropyCollaboratorsVertexOpCallback implements VertexOpCallback
{
  public static PhilanthropyCollaboratorsVertexOpCallback getInstance()
  {
     return new PhilanthropyCollaboratorsVertexOpCallback();
  }
  
  /**
   * Remove attribute if and only if the property value for   
   * obamaCollaborator is Philanthropy
   */
  @Override
  public boolean needOp(OracleVertexBase v)
  {
    String type = v.getProperty("obamaCollaborator");
    return type != null && type.equals("philanthropy");
  }

  @Override
  public String getAttributeKeyName(OracleVertexBase v)
  {
    return "obamaCollaborator";
  }

  /**
   * Define the property's value. In this case can be empty
   */
  @Override
  public Object getAttributeKeyValue(OracleVertexBase v)
  {
    return " ";
  }
}

/**
 * Implementation of a EdgeOpCallback to remove the "obamaFeud" property
 * from all connections in a feud with Barack Obama that have a business role
 */
private static class BusinessFeudsEdgeOpCallback implements EdgeOpCallback
{
  public static BusinessFeudsEdgeOpCallback getInstance()
  {
    return new BusinessFeudsEdgeOpCallback();
  }

  /**
   * Remove attribute if and only if the property value for obamaFeud is       
   * business
   */
  @Override
  public boolean needOp(OracleEdgeBase e)
  {
    String type = e.getProperty("obamaFeud");
    return type != null && type.equals("business");
  }

 @Override
 public String getAttributeKeyName(OracleEdgeBase e)
 {
   return "obamaFeud";
 }

 /**
  * Define the property's value. In this case can be empty
  */
  @Override
  public Object getAttributeKeyValue(OracleEdgeBase e)
  {
    return " ";
  }
}

4.4.2.8 プロパティ・グラフ・メタデータの取得

データベース内のすべてのグラフ名など、グラフのメタデータと統計を取得できます。各グラフについて、最小/最大頂点ID、最小/最大エッジID、頂点プロパティ名、エッジ・プロパティ名、グラフ頂点内の分割の数、およびパラレル表スキャンをサポートするエッジ表などを取得します。

次のコード・フラグメントは、バックエンド・データベース(Oracle NoSQL DatabaseまたはApache HBase)に格納されている既存のプロパティ・グラフのメタデータおよび統計を取得します。必要な引数は、各データベースで異なります。

// Get all graph names in the database
List<String> graphNames = OraclePropertyGraphUtils.getGraphNames(dbArgs);

for (String graphName : graphNames) {
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args, 
graphName);

System.err.println("\n Graph name: " + graphName);
System.err.println(" Total vertices: " + 
 opg.countVertices(dop));
 
System.err.println(" Minimum Vertex ID: " + 
 opg.getMinVertexID(dop));
System.err.println(" Maximum Vertex ID: " + 
 opg.getMaxVertexID(dop));

Set<String> propertyNamesV = new HashSet<String>();
opg.getVertexPropertyNames(dop, 0 /* timeout,0 no timeout */,
 propertyNamesV);

System.err.println(" Vertices property names: " + 
getPropertyNamesAsString(propertyNamesV));

System.err.println("\n\n Total edges: " + opg.countEdges(dop));
System.err.println(" Minimum Edge ID: " + opg.getMinEdgeID(dop));
System.err.println(" Maximum Edge ID: " + opg.getMaxEdgeID(dop));

Set<String> propertyNamesE = new HashSet<String>();
opg.getEdgePropertyNames(dop, 0 /* timeout,0 no timeout */, 
 propertyNamesE);

System.err.println(" Edge property names: " +
getPropertyNamesAsString(propertyNamesE));

System.err.println("\n\n Table Information: ");
System.err.println("Vertex table number of splits: " + 
 (opg.getVertexTableSplits()));
System.err.println("Edge table number of splits: " + 
 (opg.getEdgeTableSplits()));

4.4.3 プロパティ・グラフ・インスタンスのオープンおよびクローズ

プロパティ・グラフを記述する際、次のOracle Property Graphクラスを使用して、プロパティ・グラフ・インスタンスを適切にオープンおよびクローズします。

OraclePropertyGraph.getInstance: Oracleプロパティ・グラフのインスタンスをオープンします。このメソッドには、接続情報とグラフ名の2つのパラメータがあります。接続情報のフォーマットは、バックエンド・データベースとしてHBaseまたはOracle NoSQL Databaseのいずれを使用しているかによって異なります。
OraclePropertyGraph.clearRepository: プロパティ・グラフ・インスタンスからすべての頂点およびエッジを削除します。
OraclePropertyGraph.shutdown: グラフ・インスタンスをクローズします。

さらに、Oracle NoSQL DatabaseまたはHBase APIからの適切なクラスを使用する必要があります。

4.4.3.1 Oracle NoSQL Databaseの使用

Oracle NoSQL Databaseでは、OraclePropertyGraph.getInstanceメソッドはKVストア名、ホスト・コンピュータ名およびポート番号を接続に使用します。

String kvHostPort = "cluster02:5000";
String kvStoreName = "kvstore";
String kvGraphName = "my_graph";

// Use NoSQL Java API
KVStoreConfig kvconfig = new KVStoreConfig(kvStoreName, kvHostPort);

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(kvconfig, kvGraphName);
opg.clearRepository(); 
//     .
//     .  Graph description
//     .
// Close the graph instance
opg.shutdown();

アプリケーションでインメモリー分析機能が必要な場合、GraphConfigBuilderを使用してOracle NoSQL Databaseに対してグラフconfigを作成し、引数としてconfigを指定してOraclePropertyGraphをインスタンス化することをお薦めします。

例として、次のコード・スニペットはグラフconfigを作成し、OraclePropertyGraphインスタンスを取得し、いくつかのデータをそのグラフにロードして、インメモリー・アナリストを取得します。

    import oracle.pgx.api.Pgx;
    import oracle.pgx.config.*;
    import oracle.pgx.api.analyst.*;
    import oracle.pgx.common.types.*;
 
    ...
 
    String[] hhosts = new String[1];
    hhosts[0]          = "my_host_name:5000"; // need customization
    String szStoreName =  "kvstore";          // need customization
    String szGraphName = "my_graph";
    int dop            =  8;
 
    PgNosqlGraphConfig cfg = GraphConfigBuilder.forNosql()
                                               .setName(szGraphName)
                                               .setHosts(Arrays.asList(hhosts))
                                               .setStoreName(szStoreName)
                                               .addEdgeProperty("lbl", PropertyType.STRING, "lbl")
                                               .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000")
                                               .build();
 
    OraclePropertyGraph opg = OraclePropertyGraph.getInstance(cfg);  
 
    String szOPVFile = "../../data/connections.opv";
    String szOPEFile = "../../data/connections.ope";
 
    // perform a parallel data load
    OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
    opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 
 
    ...
    Analyst analyst = opg.getInMemAnalyst();
    ...

4.4.3.2 Apache HBaseの使用

Apache HBaseでは、OraclePropertyGraph.getInstanceメソッドはHadoopノードとApache HBaseポート番号を接続に使用します。

String hbQuorum = "bda01node01.example.com, bda01node02.example.com, bda01node03.example.com";
String hbClientPort = "2181"
String hbGraphName = "my_graph";
 
// Use HBase Java APIs
Configuration conf = HBaseConfiguration.create();
  conf.set("hbase.zookeeper.quorum", hbQuorum);
  conf.set("hbase.zookeper.property.clientPort", hbClientPort);
HConnection conn = HConnectionManager.createConnection(conf);
 
// Open the property graph
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(conf, conn, hbGraphName);
opg.clearRepository(); 
//     .
//     .  Graph description
//     .
// Close the graph instance
opg.shutdown();
// Close the HBase connection
conn.close();

アプリケーションでインメモリー分析機能が必要な場合、GraphConfigBuilderを使用してグラフconfigを作成し、引数としてconfigを指定してOraclePropertyGraphをインスタンス化することをお薦めします。

例として、次のコード・スニペットはインメモリー分析の構成を設定し、Apache HBaseに対してグラフconfigを作成し、OraclePropertyGraphインスタンスをインスタンス化し、インメモリー・アナリストを取得して、グラフ内のトライアングルの数をカウントします。

int dop = 8;
Map<PgxConfig.Field, Object> confPgx = new HashMap<PgxConfig.Field, Object>();
confPgx.put(PgxConfig.Field.ENABLE_GM_COMPILER, false);
confPgx.put(PgxConfig.Field.NUM_WORKERS_IO, dop + 2);
confPgx.put(PgxConfig.Field.NUM_WORKERS_ANALYSIS, 8); // <= # of physical cores
confPgx.put(PgxConfig.Field.NUM_WORKERS_FAST_TRACK_ANALYSIS, 2);
confPgx.put(PgxConfig.Field.SESSION_TASK_TIMEOUT_SECS, 0);  // no timeout set
confPgx.put(PgxConfig.Field.SESSION_IDLE_TIMEOUT_SECS, 0);  // no timeout set
PgxConfig.init(confPgx);
 
int iClientPort = Integer.parseInt(szClientPort);
int iSplitsPerRegion = 2;
 
PgHbaseGraphConfig cfg = GraphConfigBuilder.forHbase()
                           .setName(hbGraphName)
                           .setZkQuorum(hbQuorum)
                           .setZkQuorum(szQuorum)
                           .setZkClientPort(iClientPort)
                           .setZkSessionTimeout(60000)
                           .setMaxNumConnections(dop)
                           .setLoadEdgeLabel(true)
                           .setSplitsPerRegion(splitsPerRegion)
                           .addEdgeProperty("lbl", PropertyType.STRING, "lbl")
                           .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000")
                           .build();
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(cfg);   
Analyst analyst = opg.getInMemAnalyst();
long triangles = analyst.countTriangles(false).get();

4.4.4 頂点の作成

頂点を作成するには、次のOracle Property Graphメソッドを使用します。

OraclePropertyGraph.addVertex: グラフに頂点インスタンスを追加します。
OracleVertex.setProperty: 頂点にキー値プロパティを割り当てます。
OraclePropertyGraph.commit: すべての変更をプロパティ・グラフ・インスタンスに保存します。

次のコード・フラグメントは、opgプロパティ・グラフ・インスタンスに年齢、名前、体重、身長、性別のプロパティを指定して、V1およびV2という名前の2つの頂点を作成します。v1プロパティはデータ型を明示的に設定します。

// Create vertex v1 and assign it properties as key-value pairs
Vertex v1 = opg.addVertex(1l);
  v1.setProperty("age",  Integer.valueOf(31));
  v1.setProperty("name", "Alice");
  v1.setProperty("weight", Float.valueOf(135.0f));
  v1.setProperty("height", Double.valueOf(64.5d));
  v1.setProperty("female", Boolean.TRUE);
  
Vertex v2 = opg.addVertex(2l);
  v2.setProperty("age",  27);
  v2.setProperty("name", "Bob");
  v2.setProperty("weight", Float.valueOf(156.0f));
  v2.setProperty("height", Double.valueOf(69.5d));
  v2.setProperty("female", Boolean.FALSE);

4.4.5 エッジの作成

エッジを作成するには、次のOracle Property Graphメソッドを使用します。

OraclePropertyGraph.addEdge: グラフにエッジ・インスタンスを追加します。
OracleEdge.setProperty: エッジにキー値プロパティを割り当てます。

次のコード・フラグメントは、2つの頂点(v1およびv2)と1つのエッジ(e1)を作成します。

// Add vertices v1 and v2
Vertex a = opg.addVertex(1l);
v1.setProperty("name", "Alice");
v1.setProperty("age", 31);

Vertex v2 = opg.addVertex(2l);  
v2.setProperty("name", "Bob");
v2.setProperty("age", 27);

// Add edge e1
Edge e1 = opg.addEdge(1l, v1, v2, "knows");
e1.setProperty("type", "friends");

4.4.6 頂点およびエッジの削除

頂点インスタンスとエッジ・インスタンスを個別に、またはすべて同時に削除できます。次のメソッドを使用します。

OraclePropertyGraph.removeEdge: 指定されたエッジをグラフから削除します。
OraclePropertyGraph.removeVertex: 指定された頂点をグラフから削除します。
OraclePropertyGraph.clearRepository: プロパティ・グラフ・インスタンスからすべての頂点およびエッジを削除します。

次のコード・フラグメントは、グラフ・インスタンスからエッジe1と頂点v1を削除します。頂点を削除するときに、隣接するエッジもグラフから削除されます。これは、すべてのエッジは開始頂点と終了頂点を持っている必要があるためです。開始頂点または終了頂点を削除した後、そのエッジは有効なエッジではなくなります。

// Remove edge e1
opg.removeEdge(e1);

// Remove vertex v1
opg.removeVertex(v1);

OraclePropertyGraphインスタンスからすべてのコンテンツを削除するには、OraclePropertyGraph.clearRepositoryメソッドを使用できます。ただし、この操作は元に戻せないため、慎重に使用してください。

4.4.7 プロパティ・グラフの削除

データベースからプロパティ・グラフを削除するには、OraclePropertyGraphUtils.dropPropertyGraphメソッドを使用します。このメソッドには、接続情報とグラフ名の2つのパラメータがあります。

接続情報のフォーマットは、バックエンド・データベースとしてHBaseまたはOracle NoSQL Databaseのいずれを使用しているかによって異なります。これは、OraclePropertyGraph.getInstanceに指定する接続情報と同じです。

4.4.7.1 Oracle NoSQL Databaseの使用

Oracle NoSQL Databaseでは、OraclePropertyGraphUtils.dropPropertyGraphメソッドはKVストア名、ホスト・コンピュータ名およびポート番号を接続に使用します。このコード・フラグメントは、my_graphという名前のグラフをOracle NoSQL Databaseから削除します。

String kvHostPort = "cluster02:5000";
String kvStoreName = "kvstore";
String kvGraphName = "my_graph";

// Use NoSQL Java API
KVStoreConfig kvconfig = new KVStoreConfig(kvStoreName, kvHostPort);

// Drop the graph
OraclePropertyGraphUtils.dropPropertyGraph(kvconfig, kvGraphName);

4.4.7.2 Apache HBaseの使用

Apache HBaseでは、OraclePropertyGraphUtils.dropPropertyGraphメソッドはHadoopノードとApache HBaseポート番号を接続に使用します。このコード・フラグメントは、my_graphという名前のグラフをApache HBaseから削除します。

String hbQuorum = "bda01node01.example.com, bda01node02.example.com, bda01node03.example.com";
String hbClientPort = "2181"
String hbGraphName = "my_graph";
 
// Use HBase Java APIs
Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.zookeeper.quorum", hbQuorum);
    conf.set("hbase.zookeper.property.clientPort", hbClientPort);
HConnection conn = HConnectionManager.createConnection(conf);
 
// Drop the graph
OraclePropertyGraphUtils.dropPropertyGraph(conf, hbGraphName);

4.5 プロパティ・グラフ・データのテキスト索引付けの管理

Oracle Big Data Spatial and Graphの索引付けは、特定のキー/値またはキー/テキスト・ペア別に要素を高速に読み取ることを可能にします。これらの索引は、要素タイプ(頂点またはエッジ)、キー(および値)のセットおよび索引タイプに基づいて作成されます。

Oracle Big Data Spatial and Graphは、手動と自動の2つのタイプの索引付け構造をサポートしています。

自動テキスト索引は、プロパティ・キーのセット別に頂点またはエッジの自動索引付けを行います。この主な目的は、特定のキー/値ペアに基づく頂点およびエッジの問合せパフォーマンスを向上させることです。
手動テキスト索引付けでは、プロパティ・グラフの指定された頂点およびエッジのセットに対して複数の索引を定義できます。索引に含めるグラフ要素を指定する必要があります。

Oracle Big Data Spatial and Graphは、Oracle NoSQL DatabaseおよびApache HBaseのプロパティ・グラフに対して手動および自動テキスト索引を作成するためのAPIを提供します。索引は、使用可能な検索エンジン、Apache LuceneおよびSolrCloudを使用して管理されます。この節の続きでは、データ・アクセス・レイヤーのプロパティ・グラフ機能を使用してテキスト索引を作成する方法に焦点を当てます。

4.5.1 Apache Lucene検索エンジンによる自動索引の使用

提供されているExampleNoSQL6およびExampleHBase6の例では、入力ファイルからプロパティ・グラフを作成し、頂点に自動テキスト索引を作成し、Apache Luceneを使用していくつかのテキスト検索問合せを実行します。

次のコード・フラグメントは、name、role、religion、countryのプロパティ・キーを指定して、既存のプロパティ・グラフの頂点に自動索引を作成します。自動テキスト索引は、/home/data/text-indexディレクトリの下にある4つのサブディレクトリに格納されます。Apache Luceneデータ型処理が有効になります。この例では、再索引付けタスクのためにDOP (並列度) 4を使用しています。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
      args,  szGraphName);
 
String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
 
// Do a parallel data loading
OraclePropertyGraphDataLoader opgdl = 
OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 
    
// Create an automatic index using Apache Lucene engine. 
// Specify Index Directory parameters (number of directories, 
// number of connections to database, batch size, commit size, 
// enable datatypes, location)
OracleIndexParameters indexParams = 
     OracleIndexParameters.buildFS(4, 4, 10000, 50000, true, 
             "/home/data/text-index ");
opg.setDefaultIndexParameters(indexParams);
    
// specify indexed keys
String[] indexedKeys = new String[4];
indexedKeys[0] = "name";
indexedKeys[1] = "role";
indexedKeys[2] = "religion";
indexedKeys[3] = "country";
 
// Create auto indexing on above properties for all vertices
opg.createKeyIndex(indexedKeys, Vertex.class);

デフォルトでは、索引はopg.setDefaultIndexParameters(indexParams)メソッドを使用してプロパティ・グラフに関連付けられたOracleIndexParametersに基づいて構成されます。

別のパラメータ・セットを指定して索引を作成することもできます。これは次のコード・スニペットに示されています。

// Create an OracleIndexParameters object to get Index configuration (search engine, etc).
OracleIndexParameters indexParams = OracleIndexParameters.buildFS(args)  
 
// Create auto indexing on above properties for all vertices
opg.createKeyIndex("name", Vertex.class, indexParams.getParameters());

次の例のコード・フラグメントは、すべての頂点に対して問合せを実行し、キー値ペアname:Barack Obamaを持つすべての一致する頂点を検索します。この操作は、テキスト索引の検索を実行します。

さらに、getVertices APIコールでuseWildCardsパラメータを指定することにより、ワイルドカード検索がサポートされます。ワイルドカード検索は、指定されたプロパティ・キーに対して自動索引が有効化されている場合にのみサポートされます。Apache Luceneを使用したテキスト検索構文の詳細は、https://lucene.apache.org/core/2_9_4/queryparsersyntax.htmlを参照してください。

// Find all vertices with name Barack Obama. 
    Iterator<Vertices> vertices = opg.getVertices("name", "Barack Obama").iterator();
    System.out.println("----- Vertices with name Barack Obama -----");
    countV = 0;
    while (vertices.hasNext()) {
      System.out.println(vertices.next());
      countV++;
    }
    System.out.println("Vertices found: " + countV);
 
   // Find all vertices with name including keyword "Obama"
   // Wildcard searching is supported.
    boolean useWildcard = true;
    Iterator<Vertices> vertices = opg.getVertices("name", "*Obama*").iterator();
    System.out.println("----- Vertices with name *Obama* -----");
    countV = 0;
    while (vertices.hasNext()) {
      System.out.println(vertices.next());
      countV++;
    }
    System.out.println("Vertices found: " + countV);

前述のコード例では、次のような出力が生成されます。

----- Vertices with name Barack Obama-----
Vertex ID 1 {name:str:Barack Obama, role:str:political authority, occupation:str:44th president of United States of America, country:str:United States, political party:str:Democratic, religion:str:Christianity}
Vertices found: 1
 
----- Vertices with name *Obama* -----
Vertex ID 1 {name:str:Barack Obama, role:str:political authority, occupation:str:44th president of United States of America, country:str:United States, political party:str:Democratic, religion:str:Christianity}
Vertices found: 1

関連項目:

サンプル・プログラムの探索

4.5.2 SolrCloud検索エンジンによる手動索引の使用

提供されているExampleNoSQL7およびExampleHBase7の例では、入力ファイルからプロパティ・グラフを作成し、エッジに手動テキスト索引を作成し、いくつかのデータを索引に含めて、Apache SolrCloudを使用していくつかのテキスト検索問合せを実行します。

SolrCloudを使用する場合、「ZookeeperへのコレクションのSolrCloud構成のアップロード」で説明されているように、最初にテキスト索引のためのコレクションの構成をApache Zookeeperにロードする必要があります。

次のコード・フラグメントは、4つのシャード(ノードごとに1つのシャード)とレプリケーション係数1を使用して既存のプロパティ・グラフに手動テキスト索引を作成します。シャードの数は、SolrCloudクラスタ上のノードの数と一致します。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args, 
                                                          szGraphName);
 
String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
 
// Do a parallel data loading
OraclePropertyGraphDataLoader opgdl = 
OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 
    
// Create a manual text index using SolrCloud// Specify Index Directory parameters: configuration name, Solr Server URL, Solr Node set, 
// replication factor, zookeeper timeout (secs),
// maximum number of shards per node,  
   // number of connections to database, batch size, commit size, 
         // write timeout (in secs)
             String configName = "opgconfig";
             String solrServerUrl = "nodea:2181/solr"
             String solrNodeSet = "nodea:8983_solr,nodeb:8983_solr," +  
                                  "nodec:8983_solr,noded:8983_solr";
 
         int zkTimeout = 15;
         int numShards = 4;
         int replicationFactor = 1;
         int maxShardsPerNode = 1;
 
OracleIndexParameters indexParams = 
                 OracleIndexParameters.buildSolr(configName, 
                                       solrServerUrl, 
                                       solrNodeSet, 
                                       zkTimeout,
                                       numShards,
                                       replicationFactor,
                                       maxShardsPerNode,
                                       4,
                                       10000,
                                       500000,
                                       15);
opg.setDefaultIndexParameters(indexParams);
    
 
// Create manual indexing on above properties for all vertices
OracleIndex<Edge> index = ((OracleIndex<Edge>) opg.createIndex("myIdx", Edge.class));
 
Vertex v1 = opg.getVertices("name", "Barack Obama").iterator().next();
 
Iterator<Edge> edges
                = v1.getEdges(Direction.OUT, "collaborates").iterator();
 
          while (edges.hasNext()) {
             Edge edge = edges.next();
             Vertex vIn = edge.getVertex(Direction.IN);
             index.put("collaboratesWith", vIn.getProperty("name"), edge);
          }

次のコード・フラグメントは、手動索引に対して問合せを実行して、キー値ペアcollaboratesWith:Beyonceを持つすべてのエッジを取得します。さらに、get APIコールでuseWildCardsパラメータを指定することにより、ワイルドカード検索がサポートされます。

// Find all edges with collaboratesWith Beyonce. 
   // Wildcard searching is supported using true parameter.
    edges = index.get("collaboratesWith", "Beyonce").iterator();
    System.out.println("----- Edges with name Beyonce -----");
    countE = 0;
    while (edges.hasNext()) {
      System.out.println(edges.next());
      countE++;
    }
    System.out.println("Edges found: "+ countE);
 
// Find all vertices with name including Bey*. 
   // Wildcard searching is supported using true parameter.
    edges = index.get("collaboratesWith", "*Bey*", true).iterator();
    System.out.println("----- Edges with collaboratesWith Bey* -----");
    countE = 0;
    while (edges.hasNext()) {
      System.out.println(edges.next());
      countE++;
    }
    System.out.println("Edges found: " + countE);

前述のコード例では、次のような出力が生成されます。

----- Edges with name Beyonce -----
Edge ID 1000 from Vertex ID 1 {country:str:United States, name:str:Barack Obama, occupation:str:44th president of United States of America, political party:str:Democratic, religion:str:Christianity, role:str:political authority} =[collaborates]=> Vertex ID 2 {country:str:United States, music genre:str:pop soul , name:str:Beyonce, role:str:singer actress} edgeKV[{weight:flo:1.0}]
Edges found: 1
 
----- Edges with collaboratesWith Bey* -----
Edge ID 1000 from Vertex ID 1 {country:str:United States, name:str:Barack Obama, occupation:str:44th president of United States of America, political party:str:Democratic, religion:str:Christianity, role:str:political authority} =[collaborates]=> Vertex ID 2 {country:str:United States, music genre:str:pop soul , name:str:Beyonce, role:str:singer actress} edgeKV[{weight:flo:1.0}]
Edges found: 1

関連項目:

サンプル・プログラムの探索

4.5.3 データ型の処理

Oracleのプロパティ・グラフ・サポートは、値のデータ型に基づいて要素のキー/値ペアを索引付けおよび格納します。データ型処理の主な目的は、数値や日付範囲の問合せなど、包括的な問合せのサポートを提供することです。

デフォルトでは、特定のキー/値ペアに対する検索は、その値のデータ型に基づく問合せ式まで行われます。たとえば、キー値ペアage:30を持つ頂点を検索する場合は、整数のデータ型が指定されたすべての年齢フィールドに対して問合せが実行されます。値が問合せ式である場合は、API get(String key, Object value, Class dtClass, Boolean useWildcards)をコールすることにより、検索する値のデータ型クラスも指定できます。データ型が指定されていない場合、問合せ式はすべての使用可能なデータ型と一致します。

ブール演算子を使用して処理する場合は、問合せで適切な一致が見つかるように、後続の各キー/値ペアにデータ型の接頭辞/接尾辞を追加する必要があります。次の各トピックでは、Apache LuceneおよびSolrCloudでこの接頭辞/接尾辞を追加する方法について説明します。

4.5.3.1 Apache Luceneでのデータ型識別子の追加

Luceneのデータ型処理が有効化されている場合、問合せ式のキーの接尾辞として適切なデータ型識別子を追加する必要があります。これは、キーに対してString.concat()の演算を実行することによって行うことができます。Luceneのデータ型処理が無効化されている場合、データ型識別子を値Stringに接頭辞として挿入する必要があります。表4-1は、Apache Luceneを使用したテキスト索引付けに使用可能なデータ型識別子を示しています(LuceneIndexのJavadocも参照)。

表4-1 Apache Luceneデータ型識別子

Luceneデータ型識別子	説明
TYPE_DT_STRING	文字列
TYPE_DT_BOOL	ブール
TYPE_DT_DATE	日付
TYPE_DT_FLOAT	Float
TYPE_DT_DOUBLE	Double
TYPE_DT_INTEGER	整数
TYPE_DT_SERIALIZABLE	シリアライズ可能

次のコード・フラグメントは、Luceneのデータ型処理を使用してエッジに手動索引を作成し、データを追加してから、手動索引に対して問合せを実行し、ワイルドカードを使用してキー/値ペアcollaboratesWith:Beyonce AND country1:United*を持つすべてのエッジを取得します。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args,               
                                                          szGraphName);
 
String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
 
// Do a parallel data loading
OraclePropertyGraphDataLoader opgdl = 
OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 
    
// Specify Index Directory parameters (number of directories, 
   // number of connections to database, batch size, commit size, 
         // enable datatypes, location)
OracleIndexParameters indexParams = 
     OracleIndexParameters.buildFS(4, 4, 10000, 50000, true, 
            "/ home/data/text-index ");
opg.setDefaultIndexParameters(indexParams);
// Create manual indexing on above properties for all edges
OracleIndex<Edge> index = ((OracleIndex<Edge>) opg.createIndex("myIdx", Edge.class));
 
Vertex v1 = opg.getVertices("name", "Barack Obama").iterator().next();
 
Iterator<Edge> edges
                = v1.getEdges(Direction.OUT, "collaborates").iterator();
 
          while (edges.hasNext()) {
             Edge edge = edges.next();
             Vertex vIn = edge.getVertex(Direction.IN);
             index.put("collaboratesWith", vIn.getProperty("name"), edge);
             index.put("country", vIn.getProperty("country"), edge);
          }
 
// Wildcard searching is supported using true parameter.
    String key = "country";
    key = key.concat(String.valueOf(oracle.pg.text.lucene.LuceneIndex.TYPE_DT_STRING));
   
    String queryExpr = "Beyonce AND " + key + ":United*";
    edges = index.get("collaboratesWith", queryExpr, true /*UseWildcard*/).iterator();
    System.out.println("----- Edges with query: " + queryExpr + " -----");
    countE = 0;
    while (edges.hasNext()) {
      System.out.println(edges.next());
      countE++;
    }
    System.out.println("Edges found: "+ countE);

前述のコード例では、次のような出力が生成される可能性があります。

----- Edges with name Beyonce AND country1:United* -----
Edge ID 1000 from Vertex ID 1 {country:str:United States, name:str:Barack Obama, occupation:str:44th president of United States of America, political party:str:Democratic, religion:str:Christianity, role:str:political authority} =[collaborates]=> Vertex ID 2 {country:str:United States, music genre:str:pop soul , name:str:Beyonce, role:str:singer actress} edgeKV[{weight:flo:1.0}]
Edges found: 1

次のコード・フラグメントは、頂点に自動索引を作成し、Luceneのデータ型処理を無効化し、データを追加してから、前の例からの手動索引に対して問合せを実行し、ワイルドカードを使用してキー/値ペアcountry:United* AND role:1*political*を持つすべての頂点を取得します。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args,               
                                                          szGraphName);
 
String szOPVFile = "../data/connections.opv";
String szOPEFile = "../data/connections.ope";
 
// Do a parallel data loading
OraclePropertyGraphDataLoader opgdl = 
OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 
    
// Create an automatic index using Apache Lucene engine. 
// Specify Index Directory parameters (number of directories, 
   // number of connections to database, batch size, commit size, 
         // enable datatypes, location)
OracleIndexParameters indexParams = 
     OracleIndexParameters.buildFS(4, 4, 10000, 50000, false, "/ home/data/text-index ");
opg.setDefaultIndexParameters(indexParams);
    
// specify indexed keys
String[] indexedKeys = new String[4];
indexedKeys[0] = "name";
indexedKeys[1] = "role";
indexedKeys[2] = "religion";
indexedKeys[3] = "country";
 
// Create auto indexing on above properties for all vertices
opg.createKeyIndex(indexedKeys, Vertex.class);
 
// Wildcard searching is supported using true parameter.
    String value = "*political*";
    value = String.valueOf(LuceneIndex.TYPE_DT_STRING) + value;
String queryExpr = "United* AND role:" + value;
    
 
vertices = opg.getVertices("country", queryExpr,  true /*useWildcard*/).iterator();
    System.out.println("----- Vertices with query: " + queryExpr + " -----");
    countV = 0;
    while (vertices.hasNext()) {
      System.out.println(vertices.next());
      countV++;
    }
    System.out.println("Vertices found: " + countV);

前述のコード例では、次のような出力が生成される可能性があります。

----- Vertices with query: United* and role:1*political* -----
Vertex ID 30 {name:str:Jerry Brown, role:str:political authority, occupation:str:34th and 39th governor of California, country:str:United States, political party:str:Democratic, religion:str:roman catholicism}
Vertex ID 24 {name:str:Edward Snowden, role:str:political authority, occupation:str:system administrator, country:str:United States, religion:str:buddhism}
Vertex ID 22 {name:str:John Kerry, role:str:political authority, country:str:United States, political party:str:Democratic, occupation:str:68th United States Secretary of State, religion:str:Catholicism}
Vertex ID 21 {name:str:Hillary Clinton, role:str:political authority, country:str:United States, political party:str:Democratic, occupation:str:67th United States Secretary of State, religion:str:Methodism}
Vertex ID 19 {name:str:Kirsten Gillibrand, role:str:political authority, country:str:United States, political party:str:Democratic, occupation:str:junior United States Senator from New York, religion:str:Methodism}
Vertex ID 13 {name:str:Ertharin Cousin, role:str:political authority, country:str:United States, political party:str:Democratic}
Vertex ID 11 {name:str:Eric Holder, role:str:political authority, country:str:United States, political party:str:Democratic, occupation:str:United States Deputy Attorney General}
Vertex ID 1 {name:str:Barack Obama, role:str:political authority, occupation:str:44th president of United States of America, country:str:United States, political party:str:Democratic, religion:str:Christianity}
Vertices found: 8

4.5.3.2 SolrCloudでのデータ型識別子の追加

SolrCloudテキスト索引のブール演算では、問合せ式のキーの接尾辞として適切なデータ型識別子を追加する必要があります。これは、キーに対してString.concat()の演算を実行することによって行うことができます。表4-2は、SolrCloudを使用したテキスト索引付けに使用可能なデータ型識別子を示しています(SolrIndexのJavadocも参照)。

表4-2 SolrCloudデータ型識別子

Solrデータ型識別子	説明
TYPE_DT_STRING	文字列
TYPE_DT_BOOL	ブール
TYPE_DT_DATE	日付
TYPE_DT_FLOAT	Float
TYPE_DT_DOUBLE	Double
TYPE_DT_INTEGER	整数
TYPE_DT_SERIALIZABLE	シリアライズ可能

次のコード・フラグメントは、SolrCloudを使用してエッジに手動索引を作成し、データを追加してから、手動索引に対して問合せを実行し、ワイルドカードを使用してキー/値ペアcollaboratesWith:Beyonce AND country1:United*を持つすべてのエッジを取得します。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args,       
                                                          szGraphName);
 
String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
 
// Do a parallel data loading
OraclePropertyGraphDataLoader opgdl = 
OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 
    
// Create a manual text index using SolrCloud// Specify Index Directory parameters: configuration name, Solr Server URL, Solr Node set, 
// replication factor, zookeeper timeout (secs),
// maximum number of shards per node,  
   // number of connections to database, batch size, commit size, 
         // write timeout (in secs)
             String configName = "opgconfig";
             String solrServerUrl = "nodea:2181/solr"
             String solrNodeSet = "nodea:8983_solr,nodeb:8983_solr," +  
                                  "nodec:8983_solr,noded:8983_solr";
 
         int zkTimeout = 15;
         int numShards = 4;
         int replicationFactor = 1;
         int maxShardsPerNode = 1;
 
OracleIndexParameters indexParams = 
                 OracleIndexParameters.buildSolr(configName, 
                                       solrServerUrl, 
                                       solrNodeSet, 
                                       zkTimeout,
                                       numShards,
                                       replicationFactor,
                                       maxShardsPerNode,
                                       4,
                                       10000,
                                       500000,
                                       15);
opg.setDefaultIndexParameters(indexParams);
    
// Create manual indexing on above properties for all vertices
OracleIndex<Edge> index = ((OracleIndex<Edge>) opg.createIndex("myIdx", Edge.class));
 
Vertex v1 = opg.getVertices("name", "Barack Obama").iterator().next();
 
Iterator<Edge> edges
                = v1.getEdges(Direction.OUT, "collaborates").iterator();
 
          while (edges.hasNext()) {
             Edge edge = edges.next();
             Vertex vIn = edge.getVertex(Direction.IN);
             index.put("collaboratesWith", vIn.getProperty("name"), edge);
             index.put("country", vIn.getProperty("country"), edge);
          }
 
// Wildcard searching is supported using true parameter.
    String key = "country";
    key = key.concat(oracle.pg.text.solr.SolrIndex.TYPE_DT_STRING);
   
    String queryExpr = "Beyonce AND " + key + ":United*";
    edges = index.get("collaboratesWith", queryExpr, true /** UseWildcard*/).iterator();
    System.out.println("----- Edges with query: " + query + " -----");
    countE = 0;
    while (edges.hasNext()) {
      System.out.println(edges.next());
      countE++;
    }
    System.out.println("Edges found: "+ countE);

前述のコード例では、次のような出力が生成される可能性があります。

----- Edges with name Beyonce AND country_str:United* -----
Edge ID 1000 from Vertex ID 1 {country:str:United States, name:str:Barack Obama, occupation:str:44th president of United States of America, political party:str:Democratic, religion:str:Christianity, role:str:political authority} =[collaborates]=> Vertex ID 2 {country:str:United States, music genre:str:pop soul , name:str:Beyonce, role:str:singer actress} edgeKV[{weight:flo:1.0}]
Edges found: 1

4.5.4 ZookeeperへのコレクションのSolrCloud構成のアップロード

Oracle Big Data Spatial and Graphプロパティ・グラフでSolrCloudテキスト索引を使用する前に、コレクションの構成をZookeeperにアップロードする必要があります。これは、SolrCloudクラスタ・ノードのいずれかでZkCliツールを使用して実行できます。

事前定義済のコレクション構成ディレクトリは、インストール・ホームの下にあるdal/opg-solr-configにあります。次に、PropertyGraph構成ディレクトリのアップロード方法の例を示します。

インストール・ホームの下にあるdal/opg-solr-configを、Solrクラスタ・ノードのいずれかにある/tmpディレクトリにコピーします。次に例を示します。
```
scp –r dal/opg-solr-config user@solr-node:/tmp
```

同じノードでZkCliツールを使用して、次の例のように次のコマンドを実行します。

$SOLR_HOME/bin/zkcli.sh -zkhost 127.0.0.1:2181/solr -cmd upconfig –confname opgconfig -confdir /tmp/opg-solr-config

4.5.5 プロパティ・グラフ・データのテキスト索引の構成設定の更新

Oracleのプロパティ・グラフ・サポートは、Apache LuceneおよびSolrCloudとの統合によって手動および自動テキスト索引を管理します。作成時に、テキスト検索で使用される検索エンジンとその他の構成設定を指定するOracleIndexParametersオブジェクトを作成する必要があります。プロパティ・グラフのテキスト索引が作成された後、これらの構成設定を変更することはできません。自動索引の場合、すべての頂点索引キーは単一のテキスト索引によって管理され、すべてのエッジ索引キーは最初の頂点またはエッジ・キーが索引付けされるときに指定された構成を使用して、異なるテキスト索引によって管理されます。

構成設定を変更する必要がある場合は、最初に現在の索引を無効にしてから、新しいOracleIndexParametersオブジェクトを使用してそれを再度作成する必要があります。次のコード・フラグメントは、既存のプロパティ・グラフに対して2つのApache Luceneベースの自動索引(頂点およびエッジ)を作成し、それらを無効にしてから、SolrCloudを使用するために再作成します。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
      args,  szGraphName);
 
String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
 
// Do parallel data loading
OraclePropertyGraphDataLoader opgdl =
OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop);
 
// Create an automatic index using Apache Lucene.
// Specify Index Directory parameters (number of directories,
// number of connections to database, batch size, commit size,
// enable datatypes, location)
OracleIndexParameters luceneIndexParams =
     OracleIndexParameters.buildFS(4, 4, 10000, 50000, true,
             "/home/data/text-index ");
 
// Specify indexed keys
String[] indexedKeys = new String[4];
indexedKeys[0] = "name";
indexedKeys[1] = "role";
indexedKeys[2] = "religion";
indexedKeys[3] = "country";
 
// Create auto indexing on above properties for all vertices
opg.createKeyIndex(indexedKeys, Vertex.class, luceneIndexParams.getParameters());
 
// Create auto indexing on weight for all edges
opg.createKeyIndex("weight", Edge.class, luceneIndexParams.getParameters());
 
// Disable auto indexes to change parameters
opg.getOracleIndexManager().disableVertexAutoIndexer();
opg.getOracleIndexManager().disableEdgeAutoIndexer();
 
 
// Recreate text indexes using SolrCloud
// Specify Index Directory parameters: configuration name, Solr Server URL, Solr Node set,
// replication factor, zookeeper timeout (secs),
// maximum number of shards per node,
// number of connections to database, batch size, commit size,
// write timeout (in secs)
String configName = "opgconfig";
String solrServerUrl = "nodea:2181/solr"
String solrNodeSet = "nodea:8983_solr,nodeb:8983_solr," +
   "nodec:8983_solr,noded:8983_solr";
 
int zkTimeout = 15;
int numShards = 4;
int replicationFactor = 1;
int maxShardsPerNode = 1;
 
OracleIndexParameters solrIndexParams =
OracleIndexParameters.buildSolr(configName,
                                solrServerUrl,
                                solrNodeSet,
                                zkTimeout,
                                numShards,
                                replicationFactor,
                                maxShardsPerNode,
                                4,
                                10000,
                                500000,
                                15);
 
// Create auto indexing on above properties for all vertices
opg.createKeyIndex(indexedKeys, Vertex.class, solrIndexParams.getParameters());
 
// Create auto indexing on weight for all edges
opg.createKeyIndex("weight", Edge.class, solrIndexParams.getParameters());

4.5.6 プロパティ・グラフ・データのテキスト索引でのパラレル問合せの使用

Oracle Big Data Spatial and Graphのテキスト索引では、パラレル問合せ実行を使用して、特定のキー/値またはキー/テキスト・ペア別に数百万の頂点およびエッジに対してテキスト問合せを実行できます。

パラレル・テキスト問合せは、SolrCloud(またはApache Luceneのサブディレクトリ)内のシャード間での索引データの配分を利用する最適化されたソリューションであり、各シャードは別個の索引接続を使用して問い合せられます。これには、読取り操作のパフォーマンスを向上させて索引から複数の要素を取得するために、複数のスレッドとSolrCloud (またはApache Lucene)検索エンジンへの接続が関連します。このアプローチでは、スコアに基づく一致結果のランク付けは行われません。

パラレル・テキスト問合せでは、各要素がシャードからの指定されたK/Vペアに一致する属性を持つすべての頂点(またはエッジ)を保持する配列が生成されます。問い合せられたシャードのサブセットは、特定の開始サブディレクトリIDと指定された接続の配列のサイズによって区切られます。これにより、サブセットは[開始, 開始 - 1 + 接続の配列のサイズ]の範囲内のシャードを考慮します。Nシャードを含む索引にあるすべてのシャードに、([0, N - 1]の範囲の)整数IDが割り当てられることに注意してください。

Apache Luceneを使用したパラレル・テキスト問合せ

LuceneIndexでgetPartitionedメソッドを呼び出して、サブディレクトリのセットへの接続の配列(SearcherManagerオブジェクト)、検索するキー/値ペア、および開始サブディレクトリIDを指定することにより、Apache Luceneを使用したパラレル・テキスト問合せを使用できます。各サブディレクトリは索引のその他のサブディレクトリに依存していないため、各接続が適切なサブディレクトリにリンクしている必要があります。

次のコード・フラグメントは、Apache Lucene検索エンジンを使用して自動テキスト索引を生成し、パラレル・テキスト問合せを実行します。LuceneIndexクラスでのgetPartitionedメソッドの呼出し数は、サブディレクトリの合計数と使用される接続の数によって制御されます。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

// Clear existing vertices/edges in the property graph 
opg.clearRepository(); 

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";

// This object will handle parallel data loading
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 

// Create an automatic index
OracleIndexParameters indexParams 
= OracleIndexParameters.buildFS(dop /* number of directories */,
dop /* number of connections
used when indexing */,
10000 /* batch size before commit*/,
500000 /* commit size before Lucene commit*/,
true /* enable datatypes */,
"./lucene-index" /* index location */);

opg.setDefaultIndexParameters(indexParams);

// Create auto indexing on name property for all vertices
System.out.println("Create automatic index on name for vertices");
opg.createKeyIndex("name", Vertex.class);

// Get the SolrIndex object 
LuceneIndex<Vertex> index = (LuceneIndex<Vertex>) opg.getAutoIndex(Vertex.class);


long lCount = 0;
for (int split = 0; split < index.getTotalShards(); 
 split += conns.length) {
// Gets a connection object from subdirectory split to 
//(split + conns.length)
for (int idx = 0; idx < conns.length; idx++) { 
conns[idx] = index.getOracleSearcherManager(idx + split); 
}

// Gets elements from split to split + conns.length
Iterable<Vertex>[] iterAr }
= index.getPartitioned(conns /* connections */, 
 "name"/* key */, 
 "*" /* value */, 
 true /* wildcards */, 
 split /* start split ID */);

lCount = countFromIterables(iterAr); /* Consume iterables in parallel */

// Close the connections to the sub-directories after completed
for (int idx = 0; idx < conns.length; idx++) { 
conns[idx].close();
}
}

// Count all vertices
System.out.println("Vertices found using parallel query: " + lCount);

SolrCloudを使用したパラレル・テキスト検索

SolrIndexでgetPartitionedメソッドを呼び出して、SolrCloudへの接続の配列(CloudSolrServerオブジェクト)、検索するキー/値ペア、および開始シャードIDを指定することにより、SolrCloudを使用したパラレル・テキスト問合せを使用できます。

次のコード・フラグメントは、SolrCloud検索エンジンを使用して自動テキスト索引を生成し、パラレル・テキスト問合せを実行します。SolrIndexクラスでのgetPartitionedメソッドの呼出し数は、索引内のシャードの合計数と使用される接続の数によって制御されます。

OraclePropertyGraph opg = OraclePropertyGraph.getInstance(
 args, szGraphName);

// Clear existing vertices/edges in the property graph 
opg.clearRepository(); 

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";

// This object will handle parallel data loading
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop); 

String configName = "opgconfig";
String solrServerUrl = args[4];//"localhost:2181/solr"
String solrNodeSet = args[5]; //"localhost:8983_solr";
 
int zkTimeout = 15; // zookeeper timeout in seconds
int numShards = Integer.parseInt(args[6]); // number of shards in the index
int replicationFactor = 1; // replication factor
int maxShardsPerNode = 1; // maximum number of shards per node
 
// Create an automatic index using SolrCloud
OracleIndexParameters indexParams = 
 OracleIndexParameters.buildSolr(configName, 
 solrServerUrl, 
 solrNodeSet, 
 zkTimeout /* zookeeper timeout in seconds */,
 numShards /* total number of shards */,
 replicationFactor /* Replication factor */,
 maxShardsPerNode /* maximum number of shardsper node*/,
 4 /* dop used for scan */,
 10000 /* batch size before commit*/,
 500000 /* commit size before SolrCloud commit*/,
 15 /* write timeout in seconds */);

opg.setDefaultIndexParameters(indexParams);

// Create auto indexing on name property for all vertices
System.out.println("Create automatic index on name for vertices");
opg.createKeyIndex("name", Vertex.class);

// Get the SolrIndex object 
SolrIndex<Vertex> index = (SolrIndex<Vertex>) opg.getAutoIndex(Vertex.class);

// Open an array of connections to handle connections to SolrCloud needed for parallel text search
CloudSolrServer[] conns = new CloudSolrServer[dop];


for (int idx = 0; idx < conns.length; idx++) {
conns[idx] = index.getCloudSolrServer(15 /* write timeout in 
secs*/);
}

// Iterate to cover all the shards in the index
long lCount = 0;
for (int split = 0; split < index.getTotalShards(); 
 split += conns.length) {
// Gets elements from split to split + conns.length
Iterable<Vertex>[] iterAr = index.getPartitioned(conns /* connections */, 
 "name"/* key */, 
 "*" /* value */, 
 true /* wildcards */, 
 split /* start split ID */);

lCount = countFromIterables(iterAr); /* Consume iterables in parallel */
}

// Close the connections to the sub-directories after completed
for (int idx = 0; idx < conns.length; idx++) { 
conns[idx].shutdown();
}

// Count results
System.out.println("Vertices found using parallel query: " + lCount);

4.6 セキュアなOracle NoSQL Databaseのサポート

Oracle Big Data Spatial and Graphプロパティ・グラフ・サポートは、Oracle NoSQL Databaseのセキュア・インストールと非セキュア・インストールの両方で機能します。このトピックでは、セキュアなOracle NoSQL Database設定でプロパティ・グラフ機能を使用する方法についての情報を提供します。ここでは、セキュアなOracle NoSQL Databaseがすでにインストールされていると想定します(http://docs.oracle.com/cd/NOSQL/html/SecurityGuide/secure_installation.htmlにある『Oracle NoSQL Databaseセキュリティ・ガイド』の「Oracle NoSQL Databaseのセキュア・インストールの実行」で説明されているプロセス)。

セキュア・データベースにアクセスするには、正しい資格情報を持っている必要があります。次のようなユーザーを作成します。

kv-> plan create-user -name myusername -admin -wait

このユーザーにreadwriteおよびdbaadminロールを付与します。次に例を示します。

kv-> plan grant -user myusername -role readwrite -wait
kv-> plan grant -user myusername -role dbadmin -wait

client.securityファイルからlogin_properties.txtを生成する際、ユーザー名が正しいことを確認します。次に例を示します。

oracle.kv.auth.username=myusername

Oracleプロパティ・グラフのクライアント側で、セキュアなOracle NoSQL Databaseと対話するためのセキュリティ関連ファイルおよびライブラリを持っている必要があります。最初に、次のファイル(ディレクトリ)をKVROOT/security/からクライアント側にコピーします。

client.security
client.trust	
login.wallet/
login_properties.txt

セキュア・データベースへのアクセスに必要なパスワードを保持するためにOracle Walletを使用する場合、次の3つのライブラリをクライアント側にコピーして、クラス・パスを正しく設定します。

oraclepki.jar
osdt_cert.jar
osdt_core.jar

データベースとOracleプロパティ・グラフのクライアント側を正しく構成した後、次の2つのアプローチのいずれかを使用して、セキュアなNoSQL Databaseに格納されているグラフに接続できます。

Java VM設定を次の形式で使用して、ログイン・プロパティ・ファイルを指定します。
```
-Doracle.kv.security=/<your-path>/login_properties.txt
```
このJava VMプロパティは、J2EEコンテナにデプロイされたアプリケーション(インメモリー分析を含む)にも設定できます。たとえば、WebLogicサーバーを起動する前に、次の形式で環境変数を設定してログイン・プロパティ構成ファイルを参照できます。
```
setenv JAVA_OPTIONS "-Doracle.kv.security=/<your-path>/login_properties.txt"
```
次に、通常どおりにOraclePropertyGraph.getInstance(kconfig, szGraphName)を呼び出してOraclePropertyGraphインスタンスを作成します。

OraclePropertyGraph.getInstance(kconfig, szGraphName, username, password, truStoreFile)を呼び出し、usernameおよびpasswordにはセキュアなOracle NoSQL Databaseにアクセスするための正しい資格情報、truStoreFileにはクライアント側トラスト・ストア・ファイルclient.trustへのパスを指定します。

次のコード・フラグメントは、セキュアなOracle NoSQL Databaseにプロパティ・グラフを作成し、データをロードしてから、グラフ内の頂点およびエッジの数をカウントします。

// This object will handle operations over the property graph 
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(kconfig,
szGraphName,
username,
password,
truStoreFile);

// Clear existing vertices/edges in the property graph 
opg.clearRepository();
opg.setQueueSize(100); // 100 elements

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
// This object will handle parallel data loading over the property graph
System.out.println("Load data for graph " + szGraphName);
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop);
// Count all vertices
long countV = 0;
Iterator<Vertex> vertices = opg.getVertices().iterator();
while (vertices.hasNext()) {
vertices.next();
countV++;
}

System.out.println("Vertices found: " + countV);
// Count all edges
long countE = 0;
Iterator<Edge> edges = opg.getEdges().iterator();
while (edges.hasNext()) {
edges.next();
countE++;
}

System.out.println("Edges found: " + countE);

4.7 セキュアなApache HBase/Hadoopのサポート

Oracle Big Data Spatial and Graphのプロパティ・グラフを保護するために、Apache HBaseでKerberos認証を使用することが推奨されます。

Oracleのプロパティ・グラフ・サポートは、Cloudera Hadoop (CDH)クラスタのセキュア・インストールと非セキュア・インストールの両方で機能します。このトピックでは、セキュアなApache HBaseインストールについての情報を提供します。

Oracle Big Data Spatial and Graphのプロパティ・グラフを保護するために、Apache HBaseでKerberos認証を使用することが推奨されます。

このトピックでは、Kerberosを使用してセキュアなApache HBaseが構成されており、クライアント・マシンにKerberosライブラリがインストールされ、正しい資格情報を持っていると想定します。詳細は、http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_sg_hbase_authentication.htmlの『Configuring Kerberos Authentication for HBase』を参照してください。Kerberosクラスタおよびクライアントの設定方法の情報は、http://web.mit.edu/kerberos/krb5-latest/doc/index.htmlの『MIT Kerberos Documentation』を参照してください。

クライアント側で、対応HDFSデーモンと対話するために、Kerberos資格情報を持っている必要があります。さらに、(krb5.confにある)Kerberos構成情報を変更して、レルムおよびホスト名マッピングをセキュアなCDHクラスタで使用されるKerberosレルムに含める必要があります。

次のコード・フラグメントは、BDA.COM上のセキュアなCDHクラスタで使用されるレルムとホスト名マッピングを示しています。

[libdefaults]
 default_realm = EXAMPLE.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = yes

[realms]
 EXAMPLE.COM = {
kdc = hostname1.example.com:88
kdc = hostname2.example.com:88
admin_server = hostname1.example.com:749
default_domain = example.com
 }
BDA.COM = {
kdc = hostname1.bda.com:88
kdc = hostname2.bda.com:88
admin_server = hostname1.bda.com:749
default_domain = bda.com
 }

[domain_realm]
 .example.com = EXAMPLE.COM
 example.com = EXAMPLE.COM
 .bda.com = BDA.COM
 bda.com = BDA.COM

krb5.confを変更した後、Java Authentication and Authorization Service (JAAS)構成ファイルを使用して資格情報をアプリケーションに提供することにより、Apache HBaseに格納されているグラフに接続できます。これは、非セキュアなApache HBaseインストールを使用するアプリケーションがすでに存在する場合に、コードをまったく変更することなく前述の例と同じ機能を提供します。

JAAS構成が設定されたHBaseでプロパティ・グラフ・サポートを使用するには、次の形式の内容を含むファイルを作成し、keytabおよびprincipalエントリを独自の情報に置き換えます。

Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
useTicketCache=true
keyTab="/path/to/your/keytab/user.keytab"
principal="your-user/your.fully.qualified.domain.name@YOUR.REALM";
};

次のコード・フラグメントは、BDA.COM上のセキュアなCDHクラスタで使用されるレルムを含むJAASファイルの例を示しています。

Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
useTicketCache=true
keyTab="/path/to/keytab/user.keytab"
principal="hbaseuser/hostname1@BDA.COM";
};

セキュアなHBaseアプリケーションを実行するには、java.security.auth.login.configフラグを使用して、作成したJAAS構成ファイルを指定する必要があります。次の形式のコマンドを使用して、アプリケーションを実行できます。

java -Djava.security.auth.login.config=/path/to/your/jaas.conf/ -classpath ./classes/:../../lib/'*' YourJavaApplication

その後、通常どおりにOraclePropertyGraph.getInstance(conf, hconn, szGraphName)を呼び出してOracleプロパティ・グラフを作成できます。

Oracle Big Data Spatial and Graphプロパティ・グラフ・サポートをセキュアなApache HBaseインストールで使用するための別の方法は、セキュアなHBase構成を使用することです。次のコード・フラグメントは、prepareSecureConfig()を使用してセキュアなHBase構成を取得する方法を示しています。このAPIは、Apache HadoopおよびApache HBaseで使用されるセキュリティ認証設定、および認可されたチケットを認証および取得するためのKerberos資格情報セットを必要とします。

次のコード・フラグメントは、セキュアなApache HBaseにプロパティ・グラフを作成し、データをロードしてから、グラフ内の頂点およびエッジの数をカウントします。

String szQuorum= "hostname1,hostname2,hostname3";
String szCliPort = "2181"; 
String szGraph = "SecureGraph";

String hbaseSecAuth="kerberos";
String hadoopSecAuth="kerberos";
String hmKerberosPrincipal="hbase/_HOST@BDA.COM";
String rsKerberosPrincipal="hbase/_HOST@BDA.COM";
String userPrincipal = "hbase/hostname1@BDA.COM";
String keytab= "/path/to/your/keytab/hbase.keytab";
int dop= 8;

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", szQuorum);
conf.set("hbase.zookeeper.property.clientPort", szCliPort);

// Prepare the secure configuration providing the credentials in the keytab
conf = OraclePropertyGraph.prepareSecureConfig(conf, 
 hbaseSecAuth, 
 hadoopSecAuth, 
 hmKerberosPrincipal, 
 rsKerberosPrincipal, 
 userPrincipal, 
 keytab);
HConnection hconn = HConnectionManager.createConnection(conf);

OraclePropertyGraph opg=OraclePropertyGraph.getInstance(conf, hconn, szGraph);
opg.setInitialNumRegions(24);
opg.clearRepository();

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";

// Do a parallel data loading
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop);
opg.commit();

4.8 プロパティ・グラフ・データでのGroovyシェルの使用

Oracle Big Data Spatial and Graphプロパティ・グラフ・サポートには、組込みのGroovyシェル(オリジナルのGremlin Groovyシェル・スクリプトに基づく)が含まれています。このコマンドライン・シェル・インタフェースを使用して、Java APIを参照できます。

Groovyシェルを起動するには、インストール・ホーム(デフォルトでは/opt/oracle/oracle-spatial-graph/property_graph)の下にあるdal/groovyディレクトリに移動します。次に例を示します。

cd /opt/oracle/oracle-spatial-graph/property_graph/dal/groovy/

ここには、Oracle NoSQL DatabaseとApache HBaseにそれぞれ接続するためのスクリプトgremlin-opg-nosql.shおよびgremlin-opg-hbase.shが含まれています。

次の例は、Oracle NoSQL Databaseに接続し、グラフ名myGraphを持つOraclePropertyGraphのインスタンスを取得し、いくつかのサンプル・グラフ・データをロードして、頂点およびエッジのリストを取得します。

$ ./gremlin-opg-nosql.sh
 
opg-nosql>
opg-nosql> hhosts = new String[1];
==>null
 
opg-nosql> hhosts[0] = "bigdatalite:5000";
==>bigdatalite:5000
 
opg-nosql> cfg = GraphConfigBuilder.forNosql().setName("myGraph").setHosts(Arrays.asList(hhosts)).setStoreName("mystore").addEdgeProperty("lbl", PropertyType.STRING, "lbl").addEdgeProperty("weight", PropertyType.DOUBLE, "1000000").build();
==>{"db_engine":"NOSQL","loading":{},"format":"pg","name":"myGraph","error_handling":{},"hosts":["bigdatalite:5000"],"node_props":[],"store_name":"mystore","edge_props":[{"type":"string","name":"lbl","default":"lbl"},{"type":"double","name":"weight","default":"1000000"}]}
 
opg-nosql> opg = OraclePropertyGraph.getInstance(cfg);
==>oraclepropertygraph with name myGraph
 
opg-nosql> opgdl = OraclePropertyGraphDataLoader.getInstance();
==>oracle.pg.nosql.OraclePropertyGraphDataLoader@576f1cad
 
opg-nosql> opgdl.loadData(opg, new FileInputStream("../../data/connections.opv"), new FileInputStream("../../data/connections.ope"), 1, 1, 0, null);
==>null
 
opg-nosql> opg.getVertices();
==>Vertex ID 5 {country:str:Italy, name:str:Pope Francis, occupation:str:pope, religion:str:Catholicism, role:str:Catholic religion authority}
[... other output lines omitted for brevity ...]
 
opg-nosql>  opg.getEdges();
==>Edge ID 1139 from Vertex ID 64 {country:str:United States, name:str:Jeff Bezos, occupation:str:business man} =[leads]=> Vertex ID 37 {country:str:United States, name:str:Amazon, type:str:online retailing} edgeKV[{weight:flo:1.0}]
[... other output lines omitted for brevity ...]

次の例は、インメモリー分析用のいくつかの構成パラメータをカスタマイズします。ここでは、Apache HBaseに接続し、グラフ名myGraphを持つOraclePropertyGraphのインスタンスを取得し、いくつかのサンプル・グラフ・データをロードし、頂点およびエッジのリストを取得し、インメモリー・アナリストを取得して、組込み分析の1つであるトライアングル・カウンティングを実行します。

$ ./gremlin-opg-hbase.sh
opg-hbase>
opg-hbase> dop=2;   // degree of parallelism
==>2
opg-hbase> confPgx = new HashMap<PgxConfig.Field, Object>();
opg-hbase> confPgx.put(PgxConfig.Field.ENABLE_GM_COMPILER, false);
==>null
opg-hbase> confPgx.put(PgxConfig.Field.NUM_WORKERS_IO, dop + 2);
==>null
opg-hbase> confPgx.put(PgxConfig.Field.NUM_WORKERS_ANALYSIS, 3);
==>null
opg-hbase> confPgx.put(PgxConfig.Field.NUM_WORKERS_FAST_TRACK_ANALYSIS, 2);
==>null
opg-hbase> confPgx.put(PgxConfig.Field.SESSION_TASK_TIMEOUT_SECS, 0);
==>null
opg-hbase> confPgx.put(PgxConfig.Field.SESSION_IDLE_TIMEOUT_SECS, 0);
==>null
opg-hbase> PgxConfig.init(confPgx);
==>null
opg-hbase>
 
opg-hbase> iClientPort = 2181;
==>2181
 
opg-hbase> cfg = GraphConfigBuilder.forHbase() .setName("myGraph") .setZkQuorum("bigdatalite") .setZkClientPort(iClientPort) .setZkSessionTimeout(60000) .setMaxNumConnections(dop) .setLoadEdgeLabel(true) .setSplitsPerRegion(1) .addEdgeProperty("lbl", PropertyType.STRING, "lbl") .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000") .build();
==>{"splits_per_region":1,"max_num_connections":2,"node_props":[],"format":"pg","load_edge_label":true,"name":"myGraph","zk_client_port":2181,"zk_quorum":"bigdatalite","edge_props":[{"type":"string","default":"lbl","name":"lbl"},{"type":"double","default":"1000000","name":"weight"}],"loading":{},"error_handling":{},"zk_session_timeout":60000,"db_engine":"HBASE"}
 
opg-hbase> opg = OraclePropertyGraph.getInstance(cfg);  
==>oraclepropertygraph with name myGraph
 
 
opg-hbase> opgdl = OraclePropertyGraphDataLoader.getInstance();
==>oracle.pg.hbase.OraclePropertyGraphDataLoader@3451289b
 
opg-hbase> opgdl.loadData(opg, "../../data/connections.opv", "../../data/connections.ope", 1, 1, 0, null);
==>null
 
opg-hbase> opg.getVertices();
==>Vertex ID 78 {country:str:United States, name:str:Hosain Rahman, occupation:str:CEO of Jawbone}
...
 
opg-hbase> opg.getEdges();
==>Edge ID 1139 from Vertex ID 64 {country:str:United States, name:str:Jeff Bezos, occupation:str:business man} =[leads]=> Vertex ID 37 {country:str:United States, name:str:Amazon, type:str:online retailing} edgeKV[{weight:flo:1.0}]
[... other output lines omitted for brevity ...]
 
opg-hbase> analyst = opg.getInMemAnalyst();
==>oracle.pgx.api.analyst.Analyst@534df5ff
 
opg-hbase>  triangles = analyst.countTriangles(false).get();
==>22

Java APIの詳細は、インストール・ホーム(デフォルトでは/opt/oracle/oracle-spatial-graph/property_graph/)の下のdoc/dal/およびdoc/pgx/にあるJavadocリファレンス情報を参照してください。

4.9 サンプル・プログラムの探索

ソフトウェア・インストールには、プロパティ・グラフの作成および操作方法を習得するために使用できる、プログラムの例のディレクトリが含まれています。

4.9.1 サンプル・プログラムについて

サンプル・プログラムは、examples/dalという名前のインストール・サブディレクトリに配布されます。例はHBaseおよびOracle NoSQL Database用にレプリケートされるため、選択したバックエンド・データベースに対応するプログラムのセットを使用できます。表4-3にいくつかのプログラムを示します。

表4-3 プロパティ・グラフのプログラム例(一部)

プログラム名	説明
ExampleNoSQL1 ExampleHBase1	1つの頂点からなる最小プロパティ・グラフを作成し、様々なデータ型のプロパティを頂点に設定し、保存済のグラフの記述をデータベースに問い合せます。
ExampleNoSQL2 ExampleHBase2	Example1と同じ最小プロパティ・グラフを作成してから、それを削除します。
ExampleNoSQL3 ExampleHBase3	複数の頂点およびエッジを持つグラフを作成します。いくつかの頂点およびエッジを明示的に削除し、その他は他の必須オブジェクトを削除することによって暗黙的に削除します。この例は、オブジェクトの現在のリストを表示するために、データベースを繰り返し問い合せます。

プログラム名

説明

ExampleNoSQL1

ExampleHBase1

1つの頂点からなる最小プロパティ・グラフを作成し、様々なデータ型のプロパティを頂点に設定し、保存済のグラフの記述をデータベースに問い合せます。

ExampleNoSQL2

ExampleHBase2

Example1と同じ最小プロパティ・グラフを作成してから、それを削除します。

ExampleNoSQL3

ExampleHBase3

複数の頂点およびエッジを持つグラフを作成します。いくつかの頂点およびエッジを明示的に削除し、その他は他の必須オブジェクトを削除することによって暗黙的に削除します。この例は、オブジェクトの現在のリストを表示するために、データベースを繰り返し問い合せます。

4.9.2 サンプル・プログラムのコンパイルおよび実行

Javaソース・ファイルをコンパイルおよび実行するには、次のようにします。

例のディレクトリに変更します。
```
cd examples/dal
```
Javaコンパイラを使用します。
```
javac -classpath ../../lib/'*' filename.java
```
例: javac -classpath ../../lib/'*' ExampleNoSQL1.java
コンパイルしたコードを実行します。
```
java -classpath ../../lib/'*':./ filename args
```
引数は、グラフを格納するためにOracle NoSQL DatabaseまたはApache HBaseのいずれを使用しているかによって異なります。値はOraclePropertyGraph.getInstanceに渡されます。

Apache HBaseの引数の説明

HBaseの例を使用している場合は、次の引数を指定します。

quorum: "node01.example.com, node02.example.com, node03.example.com"など、HBaseの実行ノードを識別する名前のカンマ区切りリスト。
client_port: "2181"など、HBaseクライアントのポート番号。
graph_name: "customer_graph"など、グラフの名前。

Oracle NoSQL Databaseの引数の説明

NoSQLの例を使用している場合は、次の引数を指定します。

host_name: "cluster02:5000"など、Oracle NoSQL Database登録のクラスタ名およびポート番号。
store_name: "kvstore"など、キー値ストアの名前
graph_name: "customer_graph"など、グラフの名前。

4.9.3 出力の例について

プログラムの例は、System.out.println を使用して、プロパティ・グラフの記述を格納先のデータベース(Oracle NoSQL DatabaseまたはApache HBaseのいずれか)から取得します。キー名、データ型および値はコロンで区切られます。たとえば、weight:flo:30.0は、キー名がweight、データ型がfloat、値が30.0であることを示しています。

表4-4に、出力で使用されるデータ型の略称を示します。

表4-4 プロパティ・グラフのデータ型の略称

略称	データ型
bol	ブール
dat	日付
dbl	double
flo	float
int	整数
ser	シリアライズ可能
str	文字列

4.9.4 例: プロパティ・グラフの作成

ExampleNoSQL1およびExampleHBase1は、1つの頂点からなる最小プロパティ・グラフを作成します。例4-5のコード・フラグメントは、v1という名前の頂点を作成し、様々なデータ型のプロパティを設定します。次に、保存済のグラフの記述をデータベースに問い合せます。

例4-5 プロパティ・グラフの作成

// Create a property graph instance named opg
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args);
 
// Clear all vertices and edges from opg
    opg.clearRepository();

// Create vertex v1 and assign it properties as key-value pairs
    Vertex v1 = opg.addVertex(1l);
    v1.setProperty("age",  Integer.valueOf(18));
    v1.setProperty("name", "Name");
    v1.setProperty("weight", Float.valueOf(30.0f));
    v1.setProperty("height", Double.valueOf(1.70d));
    v1.setProperty("female", Boolean.TRUE);
 
// Save the graph in the database
    opg.commit();

// Display the stored vertex description
System.out.println("Fetch 1 vertex: " + opg.getVertices().iterator().next());
 
// Close the graph instance
    opg.shutdown();

OraclePropertyGraph.getInstance引数(args)は、グラフを格納するためにOracle NoSQL DatabaseまたはApache HBaseのいずれを使用しているかによって異なります。「サンプル・プログラムのコンパイルおよび実行」を参照してください。

System.out.printlnは次の出力を表示します。

Fetch 1 vertex: Vertex ID 1 {age:int:18, name:str:Name, weight:flo:30.0, height:dbl:1.7, female:bol:true}

次については、プロパティ・グラフ・サポートのJavadoc (デフォルトでは/opt/oracle/oracle-spatial-graph/property_graph/doc/pgx)を参照してください。

OraclePropertyGraph.addVertex
OraclePropertyGraph.clearRepository
OraclePropertyGraph.getInstance
OraclePropertyGraph.getVertices
OraclePropertyGraph.shutdown
Vertex.setProperty

4.9.5 例: プロパティ・グラフの削除

ExampleNoSQL2およびExampleHBase2は、「例: プロパティ・グラフの作成」で作成したようなグラフを作成し、それをデータベースから削除します。

例4-6のコード・フラグメントは、グラフを削除します。OraclePropertyGraphUtils.dropPropertyGraph引数の説明は、「サンプル・プログラムのコンパイルおよび実行」を参照してください。

例4-6 プロパティ・グラフの削除

// Drop the property graph from the database
OraclePropertyGraphUtils.dropPropertyGraph(args);

// Display confirmation that the graph was dropped
System.out.println("Graph " + graph_name + " dropped. ");

System.out.printlnは次の出力を表示します。

Graph graph_name dropped.

OraclePropertyGraphUtils.dropPropertyGraphのJavadocを参照してください。

4.9.6 例: 頂点およびエッジの追加と削除

ExampleNoSQL3およびExampleHBase3は、頂点とエッジの両方を追加および削除します。

例4-7 頂点の作成

例4-7のコード・フラグメントは、3つの頂点を作成します。これは例4-5をシンプルにしたバリエーションです。

// Create a property graph instance named opg
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args);
 
// Clear all vertices and edges from opg
    opg.clearRepository();

// Add vertices a, b, and c
    Vertex a = opg.addVertex(1l);
    a.setProperty("name", "Alice");
    a.setProperty("age", 31);
  
    Vertex b = opg.addVertex(2l);  
    b.setProperty("name", "Bob");
    b.setProperty("age", 27);
 
    Vertex c = opg.addVertex(3l);
    c.setProperty("name", "Chris");
    c.setProperty("age", 33);

例4-8 エッジの作成

例4-8のコード・フラグメントは、頂点a、bおよびcを使用してエッジを作成します。

// Add edges e1, e2, and e3
    Edge e1 = opg.addEdge(1l, a, b, "knows");
    e1.setProperty("type", "partners");
 
    Edge e2 = opg.addEdge(2l, a, c, "knows");
    e2.setProperty("type", "friends");
 
    Edge e3 = opg.addEdge(3l, b, c, "knows");
    e3.setProperty("type", "colleagues");

例4-9 エッジおよび頂点の削除

例4-9のコード・フラグメントは、エッジe3と頂点bを明示的に削除します。これにより、頂点bに接続していたエッジe1が暗黙的に削除されます。

 // Remove edge e3
    opg.removeEdge(e3);

// Remove vertex b and all related edges
    opg.removeVertex(b);

例4-10 頂点およびエッジの問合せ

この例は、データベースに問い合せて、オブジェクトがいつ追加および削除されるかを表示します。例4-10のコード・フラグメントは、使用されるメソッドを表示します。

// Print all vertices
    vertices = opg.getVertices().iterator();
    System.out.println("----- Vertices ----");
    vCount = 0;
    while (vertices.hasNext()) {
      System.out.println(vertices.next());
      vCount++;
    }
    System.out.println("Vertices found: " + vCount);
 
    // Print all edges
    edges = opg.getEdges().iterator();
    System.out.println("----- Edges ----");
    eCount = 0;
    while (edges.hasNext()) {
      System.out.println(edges.next());
      eCount++;
    }
    System.out.println("Edges found: " + eCount);

このトピックの例では、次のような出力が生成されます。

----- Vertices ----
Vertex ID 3 {name:str:Chris, age:int:33}
Vertex ID 1 {name:str:Alice, age:int:31}
Vertex ID 2 {name:str:Bob, age:int:27}
Vertices found: 3
----- Edges ----
Edge ID 2 from Vertex ID 1 {name:str:Alice, age:int:31} =[knows]=> Vertex ID 3 {name:str:Chris, age:int:33} edgeKV[{type:str:friends}]
Edge ID 3 from Vertex ID 2 {name:str:Bob, age:int:27} =[knows]=> Vertex ID 3 {name:str:Chris, age:int:33} edgeKV[{type:str:colleagues}]
Edge ID 1 from Vertex ID 1 {name:str:Alice, age:int:31} =[knows]=> Vertex ID 2 {name:str:Bob, age:int:27} edgeKV[{type:str:partners}]
Edges found: 3
 Remove edge Edge ID 3 from Vertex ID 2 {name:str:Bob, age:int:27} =[knows]=> Vertex ID 3 {name:str:Chris, age:int:33} edgeKV[{type:str:colleagues}]
----- Vertices ----
Vertex ID 1 {name:str:Alice, age:int:31}
Vertex ID 2 {name:str:Bob, age:int:27}
Vertex ID 3 {name:str:Chris, age:int:33}
Vertices found: 3
----- Edges ----
Edge ID 2 from Vertex ID 1 {name:str:Alice, age:int:31} =[knows]=> Vertex ID 3 {name:str:Chris, age:int:33} edgeKV[{type:str:friends}]
Edge ID 1 from Vertex ID 1 {name:str:Alice, age:int:31} =[knows]=> Vertex ID 2 {name:str:Bob, age:int:27} edgeKV[{type:str:partners}]
Edges found: 2
Remove vertex Vertex ID 2 {name:str:Bob, age:int:27}
----- Vertices ----
Vertex ID 1 {name:str:Alice, age:int:31}
Vertex ID 3 {name:str:Chris, age:int:33}
Vertices found: 2
----- Edges ----
Edge ID 2 from Vertex ID 1 {name:str:Alice, age:int:31} =[knows]=> Vertex ID 3 {name:str:Chris, age:int:33} edgeKV[{type:str:friends}]
Edges found: 1

4.10 Oracleフラット・ファイル形式の定義

プロパティ・グラフは、頂点用の記述ファイルとエッジ用の記述ファイルの2つのフラット・ファイルに定義できます。

4.10.1 プロパティ・グラフ記述ファイルについて

ファイルのペアにより、プロパティ・グラフが記述されます。

頂点ファイル: プロパティ・グラフの頂点を記述します。このファイルには、.opvというファイル名拡張子が付きます。
エッジ・ファイル: プロパティ・グラフのエッジを記述します。このファイルには、.opeというファイル名拡張子が付きます。

これら2つのファイルは同じベース名を共有することをお薦めします。たとえば、simple.opvおよびsimple.opeによってプロパティ・グラフを定義します。

4.10.2 頂点ファイル

頂点ファイルの各行は、プロパティ・グラフの1つの頂点を記述する1つのレコードです。1つのレコードは1つの頂点の1つのキー値プロパティを記述できるため、複数のプロパティを持つ頂点を記述する場合は複数のレコード/行が使用されます。

レコードには、カンマで区切られた6つのフィールドが含まれています。各レコードには、値があるかどうかに関係なく、すべてのフィールドを区切るために5つのカンマが含まれている必要があります。

vertex_ID, key_name, value_type, value, value, value

表4-5に、頂点ファイルのレコードを構成するフィールドを示します。

表4-5 頂点ファイルのレコードの形式

フィールド番号	Name(1)	説明
1	vertex_ID	頂点を一意に識別する整数
2	key_name	キー値ペアのキーの名前頂点にプロパティが指定されていない場合は、空白(`%20`)を入力します。この例は、プロパティが指定されていない頂点1を示しています。 1,%20,,,,
3	value_type	キー値ペアの値のデータ型を表す整数。 `1` 文字列 `2` 整数 `3` Float `4` Double `5` 日付 `6` ブール `10` シリアライズ可能Javaオブジェクト
4	value	数値でも日付でもない場合のkey_nameのエンコードされたnullでない値
5	value	数値である場合のkey_nameのエンコードされたnullでない値
6	value	日付である場合のkey_nameのエンコードされたnullでない値日付の形式を識別するには、Java `SimpleDateFormat`クラスを使用します。この例は、`2015-03-26Th00:00:00.000-05:00`という日付形式を示しています。 SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'Th'HH:mm:ss.SSSXXX"); encode(sdf.format((java.util.Date) value));

頂点の必須グループ化: 1つの頂点には複数のプロパティを指定でき、頂点ファイルには、頂点IDとその頂点の1つのプロパティの組合せごとに1つのレコード(フラット・ファイル内の単一テキスト行によって表される)が含まれます。頂点ファイルでは、各頂点のすべてのレコードがまとめてグループ化されている(つまり、他の頂点のレコードが間に入っていない)必要があります。これは任意の方法で行えますが、便利な方法は、頂点ファイルのレコードを頂点ID別に昇順(または降順)にソートすることです。(ただし、頂点ファイルのすべてのレコードを頂点ID別にソートする必要があるわけではありません。これはグループ化要件を満たすための単なる1つの方法です。)

4.10.3 エッジ・ファイル

エッジ・ファイルの各行は、プロパティ・グラフの1つのエッジを記述する1つのレコードです。1つのレコードは1つのエッジの1つのキー値プロパティを記述できるため、複数のプロパティを持つエッジを記述する場合は複数のレコードが使用されます。

レコードには、カンマで区切られた9つのフィールドが含まれています。各レコードには、値があるかどうかに関係なく、すべてのフィールドを区切るために8つのカンマが含まれている必要があります。

edge_ID, source_vertex_ID, destination_vertex_ID, edge_label, key_name, value_type, value, value, value

表4-6に、エッジ・ファイルのレコードを構成するフィールドを示します。

表4-6 エッジ・ファイルのレコードの形式

フィールド番号	Name(2)	説明
1	edge_ID	エッジを一意に識別する整数
2	source_vertex_ID	エッジの出力始点のvertex_ID。
3	destination_vertex_ID	エッジの入力終点のvertex_ID。
4	edge_label	2つの頂点間の関係を記述する、エッジのエンコードされたラベル
5	key_name	キー値ペアのキーのエンコードされた名前エッジにプロパティが指定されていない場合は、空白(`%20`)を入力します。この例は、プロパティが指定されていないエッジ100を示しています。 100,1,2,likes,%20,,,,
6	value_type	キー値ペアの値のデータ型を表す整数。 `1` 文字列 `2` 整数 `3` Float `4` Double `5` 日付 `6` ブール `10` シリアライズ可能Javaオブジェクト
7	value	数値でも日付でもない場合のkey_nameのエンコードされたnullでない値
8	value	数値である場合のkey_nameのエンコードされたnullでない値
9	value	日付である場合のkey_nameのエンコードされたnullでない値日付の形式を識別するには、Java `SimpleDateFormat`クラスを使用します。この例は、`2015-03-26Th00:00:00.000-05:00`という日付形式を示しています。 SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'Th'HH:mm:ss.SSSXXX"); encode(sdf.format((java.util.Date) value));

エッジの必須グループ化: 1つのエッジには複数のプロパティを指定でき、エッジ・ファイルには、エッジIDとそのエッジの1つのプロパティの組合せごとに1つのレコード(フラット・ファイル内の単一テキスト行によって表される)が含まれます。エッジ・ファイルでは、各エッジのすべてのレコードがまとめてグループ化されている(つまり、他のエッジのレコードが間に入っていない)必要があります。これは任意の方法で行えますが、便利な方法は、エッジ・ファイルのレコードをエッジID別に昇順(または降順)にソートすることです。(ただし、エッジ・ファイルのすべてのレコードをエッジID別にソートする必要があるわけではありません。これはグループ化要件を満たすための単なる1つの方法です。)

4.10.4 特殊文字のエンコーディング

頂点およびエッジ・ファイルのエンコーディングはUTF-8です。表4-7に、頂点またはエッジ・プロパティ(キー値ペア)またはエッジ・ラベルに表示されるときに文字列としてエンコードされる必要のある特殊文字をリストします。その他すべての文字は、エンコーディングは必要ありません。

表4-7 Oracleフラット・ファイル形式での特殊文字コード

特殊文字	文字列のエンコーディング	説明
`%`	`%25`	パーセント
`\t`	`%09`	タブ
	`%20`	空白
`\n`	`%OA`	改行
`\r`	`%OD`	リターン
`,`	`%2C`	カンマ

4.10.5 Oracleフラット・ファイル形式でのプロパティ・グラフの例

Oracleフラット・ファイル形式でのプロパティ・グラフの例を次に示します。この例では、2つの頂点(JohnおよびMary)と、JohnがMaryの友人であることを示す単一のエッジがあります。

%cat simple.opv
1,age,2,,10,
1,name,1,John,,
2,name,1,Mary,,
2,hobby,1,soccer,,
 
%cat simple.ope
100,1,2,friendOf,%20,,,,

4.11 Pythonユーザー・インタフェースの例

The Oracle Big Data Spatial and Graphのプロパティ・グラフ・サポートには、Pythonユーザー・インタフェースの例が含まれています。これは、様々なプロパティ・グラフ操作を実行するPythonスクリプトおよびモジュールの例のセットを起動できます。

Pythonユーザー・インタフェースの例のインストール手順は、インストール・ホーム(デフォルトでは/opt/oracle/oracle-spatial-graph)の下にある/property_graph/examples/pyopg/READMEファイルにあります。

/property_graph/examples/pyopg/にあるPythonスクリプトの例は、Oracle Spatial and Graphプロパティ・グラフとともに使用でき、ニーズに応じてその例(またはそのコピー)を変更および拡張できます。

例を実行するためにユーザー・インタフェースを起動するには、pyopg.shスクリプトを使用します。

例には次のものが含まれています。

例1: Oracle NoSQL Databaseに接続して、頂点およびエッジの数の単純なチェックを実行します。これを実行するには、次のようにします。
```
cd /opt/oracle/oracle-spatial-graph/property_graph/examples/pyopg
./pyopg.sh
 
connectONDB("mygraph", "kvstore", "localhost:5000")
print "vertices", countV()
print "edges", countE()
```
前の例で、mygraphはOracle NoSQL Databaseに格納されているグラフの名前で、kvstoreおよびlocalhost:5000はOracle NoSQL Databaseにアクセスするための接続情報です。これらは環境に応じてカスタマイズする必要があります。
例2: Apache HBaseに接続して、頂点およびエッジの数の単純なチェックを実行します。これを実行するには、次のようにします。
```
cd /opt/oracle/oracle-spatial-graph/property_graph/examples/pyopg
./pyopg.sh
 
connectHBase("mygraph", "localhost", "2181")
print "vertices", countV()
print "edges", countE()
```
前の例で、mygraphはApache HBaseに格納されているグラフの名前で、localhostおよび2181はApache HBaseにアクセスするための接続情報です。これらは環境に応じてカスタマイズする必要があります。

例3: Oracle NoSQL Databaseに接続していくつかの分析機能を実行します。これを実行するには、次のようにします。

cd /opt/oracle/oracle-spatial-graph/property_graph/examples/pyopg
./pyopg.sh
 
connectONDB("mygraph", "kvstore", "localhost:5000")
print "vertices", countV()
print "edges", countE()
 
import pprint
 
analyzer = analyst()
print "# triangles in the graph", analyzer.countTriangles()
 
graph_communities = [{"commid":i.getId(),"size":i.size()} for i in analyzer.communities().iterator()]
 
import pandas as pd
import numpy as np
 
community_frame = pd.DataFrame(graph_communities)
community_frame[:5]
 
import matplotlib as mpl
import matplotlib.pyplot as plt
 
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(16,12));
community_frame["size"].plot(kind="bar", title="Communities and Sizes")
ax.set_xticklabels(community_frame.index);
plt.show()

前の例は、Oracle NoSQL Databaseに接続し、頂点およびエッジに関する基本情報を出力し、インメモリー・アナリストを取得し、トライアングル数を計算し、コミュニティ検出を実行して、最後にコミュニティとそのサイズを棒グラフに表します。

例4: Apache HBaseに接続していくつかの分析機能を実行します。これを実行するには、次のようにします。

cd /opt/oracle/oracle-spatial-graph/property_graph/examples/pyopg
./pyopg.sh
 
connectHBase("mygraph", "localhost", "2181")
print "vertices", countV()
print "edges", countE()
 
import pprint
 
analyzer = analyst()
print "# triangles in the graph", analyzer.countTriangles()
 
graph_communities = [{"commid":i.getId(),"size":i.size()} for i in analyzer.communities().iterator()]
import pandas as pd
import numpy as np
community_frame = pd.DataFrame(graph_communities)
community_frame[:5]
 
import matplotlib as mpl
import matplotlib.pyplot as plt
 
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(16,12));
community_frame["size"].plot(kind="bar", title="Communities and Sizes")
ax.set_xticklabels(community_frame.index);
plt.show()

前の例は、Apache HBaseに接続し、頂点およびエッジに関する基本情報を出力し、インメモリー・アナリストを取得し、トライアングル数を計算し、コミュニティ検出を実行して、最後にコミュニティとそのサイズを棒グラフに表します。

Pythonインタフェースの例の詳細は、インストール・ホームの下にある次のディレクトリを参照してください。

property_graph/examples/pyopg/doc/

¹ 名前は頂点ファイルには表示されませんが、ここではフィールド参照を単純にするために提供されています。

² 名前はエッジ・ファイルには表示されませんが、ここではフィールド参照を単純にするために提供されています。