シャーディング・キーとしてのJSONフィールドの使用

JSONコレクション用のシャード表の作成

SODA ID以外のシャーディング・キーを使用するシャード表を作成するには、次のようにします。

シャード・カタログにSharding/SODAユーザーとして接続します。
SHARD DDLを有効にします。
表領域セットを作成します。
次の例に示すように、CREATE SHARDED TABLEを実行します。

次の例では、顧客プロファイル・ドキュメント(CUSTPROFILE)のJSONコレクション用のシャード表(Customers)を作成します。

SODA ID (ID)の列によってJSONエントリが識別されます。SODAを使用してJSONエントリを表に作成すると、アプリケーションによってID列に一意の値が移入されます。

シャーディング・キー列(ZIP)は、JSONドキュメントから抽出された郵便番号値です。

その他の列は、SODAによって基礎となるコレクションを保持する表が作成されたときに指定されたデフォルトの列名です。これは、SODAコレクションを作成し、作成した表を調べると自分で確認できます。

ID列自体を主キーにはできません。主キーはシャーディング・キー(この場合はZIP)であるか、シャーディング・キーを含んでいる必要があります。アプリケーションの例では、IDとZIPの両方を使用して、シャード・データを操作します。前述の例では、ZIPは単独で一意の値にならないため、主キーはシャーディング・キーおよびSODA ID (ZIP、ID)で構成されます。

Oracle 21cでは、(ZIP, ID)または(ID, ZIP)を組み合せた主キーとして使用できます。通常、このテーブルへのアクセスは、組合せではなく、これらの値に対して個別にアクセスする必要があります。IDおよび顧客の問合せを検索するこれらの例のSODAアクセスは、JSONフィールド(この場合はZIP)を使用している可能性があるため、いずれの場合も個々の索引を作成します。

適切なシャーディング・キーの選択は、使用状況とアプリケーションの要件によって異なります。一意のシャーディング・キー(顧客IDなど)を使用できますが、その場合は、SODA IDを使用してシャーディング・キーを格納することもできます。

シャード表の作成: システム管理

/* Enable shard DDL */
ALTER SESSION ENABLE SHARD DDL;

/* Create a tablespace set */
CREATE TABLESPACE SET TSP_SET_1 USING TEMPLATE
 (datafile size 100m autoextend on next 10M maxsize unlimited
 extent management local segment space management auto); 

/* Create the sharded table */
CREATE SHARDED TABLE CUSTOMERS
(
"ID" VARCHAR2(255) NOT NULL,
"CREATED_ON" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"LAST_MODIFIED" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"VERSION" varchar2(255) NOT NULL,
"ZIP" VARCHAR2(60) NOT NULL,
"CUSTPROFILE" JSON,
PRIMARY KEY (ID,ZIP))
TABLESPACE SET TSP_SET_1
PARTITION BY CONSISTENT HASH (ZIP) PARTITIONS AUTO;

シャード表の作成: ユーザー定義

必要なすべての表領域およびシャード領域が作成されていることを確認します。

/* Enable shard DDL */
ALTER SESSION ENABLE SHARD DDL;

/* Create the sharded table */
CREATE SHARDED TABLE CUSTOMERS
(
"ID" VARCHAR2(255) NOT NULL,
"CREATED_ON" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"LAST_MODIFIED" timestamp default sys_extract_utc(SYSTIMESTAMP) NOT NULL,
"VERSION" varchar2(255) NOT NULL,
"ZIP" VARCHAR2(60) NOT NULL,
"CUSTPROFILE" JSON,
PRIMARY KEY (ID,ZIP))
PARTITION BY RANGE (ZIP)
(PARTITION p1 VALUES LESS THAN ('50000') TABLESPACE ts1,
PARTITION p2 VALUES LESS THAN ('99999') TABLESPACE ts2)

シャード表でのマップされたSODAコレクションの作成

マップを作成して、シャード表を操作する際に使用する列をSODAに認識させ、シャード表をコレクションのリストに追加します。

プロシージャを実行してマップを作成できますが、このプロシージャは、分散データベース内のすべてのシャードで実行する必要もあります。また、このプロシージャは、将来追加されるどのシャードでも実行する必要があります。これらの要件はいずれも、シャーディング固有のPL/SQLプロシージャsys.exec_shard_plsql()を使用して達成できます。

すべてのシャードにわたってSODAマップを作成するには、次のようにします。

Sharding/SODAユーザーとして、SHARD DDLを有効にして、次のコマンドをシャード・カタログで実行します。シャード・カタログは、自動的に処理されるように、すべてのシャードにプロシージャを伝播します。

create or replace procedure COLLECTION_PROC_CUSTOMERS AS
METADATA varchar2(8000);
COL SODA_COLLECTION_T;
begin
METADATA := '{"tableName":"CUSTOMERS",
"keyColumn":{"name":"ID"},
"contentColumn":{"name":"CUSTPROFILE","sqlType":"JSON"},
"versionColumn":{"name":"VERSION","method":"UUID"},
"lastModifiedColumn":{"name":"LAST_MODIFIED"},
"creationTimeColumn":{"name":"CREATED_ON"},
"readOnly":false}';
 -- Create a collection using "map" mode, based on
 -- the table you've created above and specified in 
 -- the custom metadata under "tableName" field.
COL := dbms_soda.create_collection('CUSTOMERS',METADATA,DBMS_SODA.CREATE_MODE_MAP);
end ;
/
 
exec sys.exec_shard_plsql('app_schema.collection_proc_customers()',4+1);

keyColumnはIDであり、コレクションを挿入および取得するためにSODAで使用されるキーです。マッピングでSODAが使用していないため、ZIP列への参照はありません。

この時点で、CREATE COLLECTIONコマンドを実行したかのように新しいコレクションが作成されました。

いくつかPL/SQLを実行してコレクションを表示できます。シャード・カタログで次のコマンドを実行し、Customers表が出力に表示されていることを確認します。

SET SERVEROUTPUT ON
DECLARE
l_coll_list SODA_COLLNAME_LIST_T;
BEGIN
l_coll_list := DBMS_SODA.list_collection_names;
 
IF l_coll_list.COUNT > 0 THEN
FOR i IN 1 .. l_coll_list.COUNT LOOP
DBMS_OUTPUT.put_line(i || ' : ' || l_coll_list(i));
END LOOP;
END IF;
END;
/
1 : CUSTOMERS
 
PL/SQL procedure successfully completed.
 
SQL>

シャーディング・キーを移入するためのトリガーの作成

SODAは、ドキュメントを挿入または更新すると、コレクション・メタデータ(ID、CUSTPROFILE、LAST_MODIFIED、CREATED_ONおよびVERSION)で説明されている基礎となる表の列を自動的に移入します。ただし、ZIP列にも移入する必要があり、値はJSONドキュメント内から取得する必要があります。これは、トリガーを使用して実行します。

これはBEFOREトリガーであるため、その列が主キーであっても列に移入できます。

シャード・カタログで、アプリケーション・スキーマ・ユーザーとして次の文を実行します。プロシージャsys.exec_shard_plsqlは、すべてのシャードおよび今後のすべてのシャードでも実行されるようにします。

alter session enable shard ddl
 
create or replace procedure COLLECTION_BF_ZIP_CUSTOMERS AS
begin
EXECUTE IMMEDIATE 'alter session enable shard operations';
EXECUTE IMMEDIATE q'%
Create or Replace TRIGGER CUST_BF_TRIG
BEFORE INSERT or UPDATE on CUSTOMERS
FOR EACH ROW
begin
:new.ZIP := JSON_VALUE(:NEW.CUSTPROFILE, '$.ZIP' error on error error on empty);
end;
%';
end;
/

exec sys.exec_shard_plsql('app_schema.collection_bf_zip_customers()',4+1+2);

前述の例では、ZIPはJSONドキュメントのトップレベル・フィールドであると想定されています。ネストされたフィールド(たとえば、ADDRESSフィールドの下)に値がある場合は、フィールド階層を含める('$.ADDRESS.ZIP'など)必要があります。

コード・サンプル

「シャーディング・キーとしてのJSONフィールドの使用」のJavaおよびPythonコード・サンプルは、分散データベース環境でSODAを使用してJSONオブジェクトを実際に操作する方法を示しています。

これらの例では、シャード・キーを使用してシャードに接続し、新しいドキュメントを挿入しています。

分散データベース環境でSODAを使用する場合は、シャード・カタログを使用せずに特定のシャードに接続して新しいドキュメントを作成する必要があることに注意してください。

Javaコード・サンプル

次のJavaコード・サンプルは、データがJSONフィールド(この例では郵便番号)によってシャーディングされるコレクションにJSONドキュメントを挿入する方法を示しています。

import java.sql.Connection;
import java.util.Properties;
import java.util.List;
 
// SODA specific imports
import oracle.soda.rdbms.OracleRDBMSClient;
import oracle.soda.OracleDatabase;
import oracle.soda.OracleCursor;
import oracle.soda.OracleCollection;
import oracle.soda.OracleDocument;
import oracle.soda.OracleException;
 
// Sharding and UCP imports
import oracle.jdbc.OracleShardingKey;
import oracle.jdbc.OracleType;
import oracle.jdbc.pool.OracleDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;
import oracle.ucp.jdbc.PoolDataSource;
 
 
/*
* The sample demonstrates connecting to a distributed database using
* Oracle JDBC driver and UCP as a client side connection pool.
*/
public class QuickInsertShardJSONField {
 
 public static void main(String args[]) throws Exception {
 
// TNS_ADMIN - Should be the path where the tnsnames.ora file resides
// dbshard_rw - It is the TNS alias present in tnsnames.ora.
// Note that the connection is to the Shard Director (GSM) and the service name is the shard RW service
final String DB_URL="jdbc:oracle:thin:@dbshard_rw?TNS_ADMIN=/home/opc/dbhome/";
 
// Update the Database Username and Password to the Shard User
final String DB_USER = "app_schema";
String DB_PASSWORD = "<user_password>" ;
 
 
// Get the PoolDataSource for UCP
PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
 
// Set the connection factory first before all other properties
pds.setConnectionFactoryClassName(OracleDataSource.class.getName());
pds.setURL(DB_URL);
pds.setUser(DB_USER);
pds.setPassword(DB_PASSWORD);
pds.setConnectionPoolName("JDBC_UCP_POOL");
 
// Default is 0. Set the initial number of connections to be created
// when UCP is started.
pds.setInitialPoolSize(10);
// Default is 0. Set the minimum number of connections
// that is maintained by UCP at runtime.
pds.setMinPoolSize(10);
// Instead of Max Pool Size, we can set the number of max connections per shard
pds.setMaxConnectionsPerShard(20);
 
// We cannot get the connection until we have the Shard key which is part of the SQL
//We first set the sharding key which in our case is the value of the ZIP code field
String shardingKeyVal="94065";
 
// Now we build the connection using this shard key
OracleShardingKey sdkey = pds.createShardingKeyBuilder().subkey(shardingKeyVal, OracleType.VARCHAR2).build();
System.out.println("Initiating UCP and Creating Connection...");
Connection conn = pds.createConnectionBuilder().shardingKey(sdkey).build();
 
// Enable the SODA Shared Metadata cache
Properties props = new Properties();
props.put("oracle.soda.sharedMetadataCache", "true");
OracleRDBMSClient cl = new OracleRDBMSClient(props);
 
// Get a DB Connection
OracleDatabase db = cl.getDatabase(conn);
 
// Print all the Collections in this DB
List<String> names =  db.admin().getCollectionNames();
for (String name : names)
 System.out.println ("Collection name: " + name);
 
// Open up the CUSTOMERS Collection
OracleCollection col = db.openCollection("CUSTOMERS");
 
//We do not provide an SODA ID column.
//This is provided by SODA when the document is created
// Note that the ZIP field MUST match what we have specified as the key
OracleDocument cDoc = db.createDocumentFromString("{\"name\": \"Matilda\", \"State\": \"CA\", \"ZIP\":\"94065\"}");
 
// Insert the document above
OracleDocument insertedDoc = col.insertAndGet(cDoc);
 
// Get the document key
String dockey = insertedDoc.getKey();
 
// Get the document back by key
// We are still connected to the same shard
OracleDocument doc = col.find().key(dockey).getOne();
String content = doc.getContentAsString();
System.out.println("Retrieved content is: " + content);
 
// We are done, so close the connection to the shard
conn.close();
 
// At this point we could open up a new shard connection using a different sharding key
 
 
 }} // End of QuickInsertShardJSONField

Pythonコード・サンプル

Pythonのこのコード・サンプルは、分散データベース環境でSODAを使用してJSONオブジェクトを実際に操作する方法を示しています。

このサンプル・コードを環境で使用するには、Python用のcx_Oracleモジュールをインストールする手順に従います: https://cx-oracle.readthedocs.io/en/latest/user_guide/installation.html

この例では、シャーディング・キーを使用してシャードに接続し、新しいドキュメントを挿入します。

分散データベース環境でSODAを使用する場合は、シャード・カタログを使用せずに特定のシャードに接続して新しいドキュメントを作成する必要があることに注意してください。

# import the cx_Oracle module for Python
import cx_Oracle
 
# Create a connection pool that will be used for connecting to all shards
# The components of the dsn are hostname (shard director),
# port (usually 1522), global service (created using GDSCTL)
# We also enable SODA metadata caching
dsn=cx_Oracle.makedsn("shard_director_host",1522,service_name="service_name")
pool=cx_Oracle.SessionPool("app_schema","password",dsn,soda_metadata_cache=True)

# Connect to a specific shard by using the shard key, a ZIP code. which in this
# example is set explicitly as '94065', but this might be passed in or part of a loop 
# You must know beforehand whether you are creating or working with a document
# with a specific ZIP code value.
connection=pool.acquire(shardingkey=["94065"])
 
# set autocommit and open the CUSTOMERS collection
connection.autocommit = True
soda = connection.getSodaDatabase()
collection = soda.openCollection("CUSTOMERS")
 
# Insert a document
# A system generated SODA key is created by default.
content = {'name': 'Matilda', 'STATE': 'CA', 'ZIP': '94065'}
doc = collection.insertOneAndGet(content)
 
# The SODA key can now be used to work with this document directly
# We can retrieve it immediately
key = doc.key
print('The key of the new SODA document is: ', key)
 
# Fetch the document back by this same SODA key. 
# This only works because we are still connected to the same shard
doc = collection.find().key(key).getOne()
content = doc.getContent()
print('Retrieved SODA document dictionary is:')
print(content)
 

# Next, add another customer. We are in the shard containing 94065,
# so we can add a customer with the same ZIP code '94065'
content = {'name': 'Mildred', 'STATE': 'CA', 'ZIP: '94065'}
doc = collection.insertOneAndGet(content)
 
# Now do a query.
# It is important to note that this query is ONLY executed within this one shard,
# the shard which contains the part of the sharded table with 94065 ZIP codes.
# In other words, the actual query has the additional bound of customers whose
# names start with 'M' in 94065
# and any other ZIPs stored on this shard. This is unlikely to be a useful query
# for system-managed sharding.
documents = collection.find().filter({'name': {'$like': 'M%'}}).getDocuments()
for d in documents:
    content = d.getContent()
    print(content["name"])
 
# After you have finished, release this connection back into the pool
pool.release(connection)
 
# If you want to add or work with more customers with a different
# shard key start with another connection
# For example: connection=pool.acquire(shardingkey=["10012"]) and so on.
 
# When you are completely finished working with customers, shut down the pool.
pool.close()

このコード・サンプルは、マルチシャード問合せを実行して、すべてのシャードで名前がMで始まるすべての顧客名を返す方法を示しています。

import cx_Oracle
 
# Create an unpooled connection to the shard catalog
# The connect string connects to the shard director, but uses the catalog service,
# e.g. GD$catalog.oradbcloud
connection = cx_Oracle.connect("app_schema","password","db_connect_string")

# Open the CUSTOMERS collection
connection.autocommit = True
soda = connection.getSodaDatabase()
collection = soda.openCollection("CUSTOMERS")
  
# Now query the collection
# It is important to note that this is a query across ALL shards
# In other words, you will get ALL users whose name starts with M across ALL Zip codes
documents = collection.find().filter({'name': {'$like': 'M%'}}).getDocuments()
for d in documents:
    content = d.getContent()
    print(content["name"])
  
#Close the connection
connection.close()