5.7.2 Text Queries on Property Graphs

If values of a property (vertex property or edge property) contain free text, then it might help performance to create an Oracle Text index on the V column.

Oracle Text can process text that is directly stored in the database. The text can be short strings (such as names or addresses), or it can be full-length documents. These documents can be in a variety of textual format.

The text can also be in many different languages. Oracle Text can handle any space-separated languages (including character sets such as Greek or Cyrillic). In addition, Oracle Text is able to handle the Chinese, Japanese and Korean pictographic languages)

Because the property graph feature uses NVARCHAR typed column for better support of Unicode, it is highly recommended that UTF8 (AL32UTF8) be used as the database character set.

To create an Oracle Text index on the vertices table (or edges table), the ALTER SESSION privilege is required. For example:

SQL> grant alter session to <YOUR_USER_SCHEMA_HERE>;

If customization is required, also grant the EXECUTE privilege on CTX_DDL:

SQL> grant execute on ctx_ddl to <YOUR_USER_SCHEMA_HERE>;

The following shows some example statements for granting these privileges to SCOTT.

SQL> conn / as sysdba
Connected.
SQL> -- This is a PDB setup -- 
SQL> alter session set container=orcl;
Session altered.

SQL> grant execute on ctx_ddl to scott;
Grant succeeded.

SQL> grant alter session to scott;
Grant succeeded.

Example 5-6 Create a Text Index

This example creates an Oracle Text index on the vertices table (V column) of the connections graph in the SCOTT schema. Note that the Oracle Text index created here is for all property keys, not just one or a subset of property keys. In addition, if a new property is added to the graph and the property value is of String data type, then it will automatically be included in the same text index.

The example uses the OPG_AUTO_LEXER lexer owned by MDSYS.

SQL> execute opg_apis.create_vertices_text_idx('scott', 'connections', pref_owner=>'MDSYS', lexer=>'OPG_AUTO_LEXER', dop=>2);

If customization is desired, you can use the ctx_ddl.create_preference API. For example:

SQL> -- The following requires access privilege to CTX_DDL
SQL> exec ctx_ddl.create_preference('SCOTT.OPG_AUTO_LEXER', 'AUTO_LEXER');

PL/SQL procedure successfully completed.

SQL> execute opg_apis.create_vertices_text_idx('scott', 'connections', pref_owner=>'scott', lexer=>'OPG_AUTO_LEXER', dop=>2);

PL/SQL procedure successfully completed.

You can now use a rich set of functions provided by Oracle Text to perform queries against graph elements.

Note:

If you no longer need an Oracle Text index, you can use the drop_vertices_text_idx or opg_apis.drop_edges_text_idx API to drop it. The following statements drop the text indexes on the vertices and edges of a graph named connections owned by SCOTT:

SQL> exec opg_apis.drop_vertices_text_Idx('scott', 'connections');
SQL> exec opg_apis.drop_edges_text_Idx('scott', 'connections');

Example 5-7 Find a Vertex that Has a Property Value

The following example find a vertex that has a property value (of string type) containing the keyword "Smith".

SQL> select vid, k, t, v 
       from connectionsVT$ 
      where t=1 
        and contains(v, 'Smith', 1) > 0 
      order by score(1) desc
      ;

The output and SQL execution plan from the preceding statement may appear as follows. Note that DOMAIN INDEX appears as an operation in the execution plan.

     1  name    1  Robert Smith

Execution Plan
----------------------------------------------------------
Plan hash value: 1619508090

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name            | Rows  | Bytes | Cost (%CPU) | Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                 |     1 |    56 |     5  (20) | 00:00:01 |       |       |
|   1 |  SORT ORDER BY                      |                 |     1 |    56 |     5  (20) | 00:00:01 |       |       |
|*  2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| CONNECTIONSVT$  |     1 |    56 |     4    (0)| 00:00:01 | ROWID | ROWID |
|*  3 |    DOMAIN INDEX                     | CONNECTIONSXTV$ |       |       |     4    (0)| 00:00:01 |       |       |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T"=1 AND INTERNAL_FUNCTION("K") AND INTERNAL_FUNCTION("V"))
   3 - access("CTXSYS"."CONTAINS"("V",'Smith',1)>0)

Example 5-8 Fuzzy Match

The following example finds a vertex that has a property value (of string type) containing variants of "ameriian" (a deliberate misspelling for this example) Fuzzy match is used.

SQL> select vid, k, t, v 
       from connectionsVT$ 
      where contains(v, 'fuzzy(ameriian,,,weight)', 1) > 0 
      order by score(1) desc;

The output and SQL execution plan from the preceding statement may appear as follows.

     8 role      1  american business man
     9 role      1  american business man
     4 role      1  american economist
     6 role      1  american comedian actor
     7 role      1  american comedian actor
     1 occupation 1 44th president of United States of America

6 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1619508090

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name            | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                 |     1 |    56 |     5  (20)| 00:00:01 |       |       |
|   1 |  SORT ORDER BY                      |                 |     1 |    56 |     5  (20)| 00:00:01 |       |       |
|*  2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| CONNECTIONSVT$  |     1 |    56 |     4   (0)| 00:00:01 | ROWID | ROWID |
|*  3 |    DOMAIN INDEX                     | CONNECTIONSXTV$ |       |       |     4   (0)| 00:00:01 |       |       |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(INTERNAL_FUNCTION("K") AND INTERNAL_FUNCTION("V"))

Example 5-9 Query Relaxation

The following example is a sophisticated Oracle Text query that implements query relaxation, which enables you to execute the most restrictive version of a query first, progressively relaxing the query until the required number of matches is obtained. Using query relaxation with queries that contain multiple strings, you can provide guidance for determining the “best” matches, so that these appear earlier in the results than other potential matches.

This example searchs for "american actor" with a query relaxation sequence.

SQL> select vid, k, t, v  
       from connectionsVT$ 
      where CONTAINS (v,
 '<query>
   <textquery lang="ENGLISH" grammar="CONTEXT">
     <progression>
       <seq>{american} {actor}</seq>
       <seq>{american} NEAR {actor}</seq>
       <seq>{american} AND {actor}</seq>
       <seq>{american} ACCUM {actor}</seq>
     </progression>
   </textquery>
   <score datatype="INTEGER" algorithm="COUNT"/>
  </query>') > 0;

The output and SQL execution plan from the preceding statement may appear as follows.

     7 role       1 american comedian actor
     6 role       1 american comedian actor
    44 occupation 1 actor
     8 role       1 american business man
    53 occupation 1 actor film producer
    52 occupation 1 actor
     4 role       1 american economist
    47 occupation 1 actor
     9 role       1 american business man

9 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2158361449

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                 |       1 |      56 |       4   (0)| 00:00:01 |         |         |
|*  1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| CONNECTIONSVT$  |       1 |      56 |       4   (0)| 00:00:01 | ROWID | ROWID |
|*  2 |   DOMAIN INDEX                     | CONNECTIONSXTV$ |         |         |       4   (0)| 00:00:01 |         |         |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(INTERNAL_FUNCTION("K") AND INTERNAL_FUNCTION("V"))
   2 - access("CTXSYS"."CONTAINS"("V",'<query>      <textquery lang="ENGLISH" grammar="CONTEXT">
          <progression>       <seq>{american} {actor}</seq>    <seq>{american} NEAR {actor}</seq>
          <seq>{american} AND {actor}</seq>        <seq>{american} ACCUM {actor}</seq>    </progression>
          </textquery>    <score datatype="INTEGER" algorithm="COUNT"/> </query>')>0)

Example 5-10 Find an Edge

Just as with vertices, you can create an Oracle Text index on the V column of the edges table (GE$) of a property graph. The following example uses the OPG_AUTO_LEXER lexer owned by MDSYS.

SQL> exec opg_apis.create_edges_text_idx('scott', 'connections', pref_owner=>'mdsys', lexer=>'OPG_AUTO_LEXER', dop=>4);

If customization is required, use the ctx_ddl.create_preference API.