24 JSON Data Guide

A JSON data guide lets you discover information about the structure and content of JSON documents stored in Oracle Database.

Some ways that you can use this information include:

  • Generating a JSON Schema document that describes a set of JSON documents.

  • Generating a JSON Schema document that you can use to validate JSON documents.

  • Creating views that you can use to perform SQL operations on the data in the documents.

  • Automatically adding or updating virtual columns that correspond to added or changed fields in the documents.

See Also:

JSON Schema

24.1 Overview of JSON Data Guide

A data guide is a summary of the structural and type information contained in a set of JSON documents. It records metadata about the fields used in those documents.

For example, for the JSON object presented in Example 1-1, a data guide specifies that the document has, among other things, an object ShippingInstructions with fields name, Address, and Phone, of types string, object, and array, respectively. The structure of object Address is recorded similarly, as are the types of the elements in array Phone.

JSON data-guide information can be saved persistently as part of a JSON search index infrastructure. This information is then updated automatically as new JSON content is added. You specify this optional data-guide support when you create a JSON search index.

You can use a data guide:

  • As a basis for developing applications that involve data mining, business intelligence, or other analysis of JSON documents.

  • As a basis for providing user assistance about requested JSON information, including search.

  • To check or manipulate new JSON documents before adding them to a document set (for example: validate, type-check, or exclude certain fields).

For such purposes you can:

  • Query a data guide directly for information about the document set, such as field lengths or which fields occur with at least a certain frequency.

  • Create views, or add virtual columns, that project particular JSON fields of interest, based on their significance according to a data guide.

Note:

  • The advantages of virtual columns over a view are that you can build an index on a virtual column and you can obtain statistics on it for the optimizer.

  • The number of virtual columns per table is limited by the value of initialization parameter MAX_COLUMNS. By default that value is STANDARD, which means 1000 columns maximum. See MAX_COLUMNS in Oracle Database Reference.

Note:

A data guide serves as a guide to the structure of an existing set of JSON documents. To validate JSON data, use a JSON schema. A data guide created with either FORMAT_SCHEMA or FORMAT_HIERARCHICAL is a JSON schema, but only a data guide created with FORMAT_SCHEMA is useful for validating.

The following data-guide capabilities apply:

Note:

  • Path length: 4000 bytes. A path longer than 4000 bytes is ignored by a data guide.

  • Number of children under a parent node: 5000. A node that has more than 5000 children is ignored by a data guide.

  • Field value length: 32767 bytes. If a JSON field has a value longer than 32767 bytes then the data guide reports the length as 32767.

  • Data-guide behavior is undefined for data that contains zero-length (empty) object field name ("").

See Also:

24.2 Persistent Data-Guide Information as Part of a JSON Search Index

JSON data-guide information can be saved persistently as part of the JSON search index infrastructure. This information is then updated automatically as new JSON content is added. You specify this optional behavior when you create a JSON search index.

You can use CREATE SEARCH INDEX with keywords FOR JSON to create a search index, a data guide, or both at the same time. The default behavior is to create a search index without data-guide support.

To create persistent data-guide information as part of a JSON search index you specify DATAGUIDE as ON in the PARAMETERS clause for CREATE SEARCH INDEX. You can enable data-guide support without enabling support for search, by also specifying SEARCH_ON NONE. Example 24-1 illustrates this.

You can use ALTER INDEX ... REBUILD to enable or disable data-guide support for an existing JSON search index. Example 24-2 illustrates this — it enables data-guide support for the search index of Example 30-24.

Note:

To create a data guide-enabled JSON search index, or to data guide-enable an existing JSON search index, you need database privilege CTXAPP and Oracle Database Release 12c (12.2.0.1) or later.

Note:

A data guide-enabled JSON search index can be built only on a column that is known to contain JSON data, which means that it is either of JSON data type or it has an is json check constraint. In the latter case, for the data-guide information in the index to be updated, the check constraint must be enabled.

If the check constraint becomes disabled for some reason then you must rebuild the data-guide information in the index and re-enable the check constraint, to resume automatic data-guide support updating, as follows:

ALTER INDEX index_name REBUILD ('dataguide off');
ALTER INDEX index_name REBUILD ('dataguide on');
ALTER TABLE table_name ENABLE CONSTRAINT is_json_check_constraint_name;

In particular, using SQL*Loader (sqlldr) disables is json check constraints.

When you enable persistent data-guide information it is part of the search index infrastructure, so it is always live: its content is automatically updated whenever the index is synchronized. Changes in the indexed data are reflected in the search index, including in its data-guide information, only after the index is synchronized.

In addition, update of data-guide information in a search index is always additive: none of it is ever deleted. This is another reason that the index often does not accurately reflect the data in its document set: deletions within the documents it applies to are not reflected in its data-guide information. If you need to ensure that such information accurately reflects the current data then you must drop the JSON search index and create it anew.

The persistent data-guide information in a search index can also include statistics, such as how frequently each JSON field is used in the document set. Statistics are present only if you explicitly gather them on the document set (gather them on the JSON search index, for example). They are not updated automatically — gather statistics anew if you want to be sure they are up to date. Example 24-3 gathers statistics on the JSON data indexed by JSON search index po_search_idx, which is created in Example 30-24.

Note:

When a local data guide-enabled JSON search index is created in a sharding environment, each individual shard contains the data-guide information for the JSON documents stored in that shard. For this reason, if you invoke data guide-related operations on the shard catalog database then they will have no effect.

Considerations for a Data Guide-Enabled Search Index on a Partitioned Table

The data-guide information in a data guide-enabled JSON search index that is local to a partitioned table is not partitioned. It is shared among all partitions.

Because the data-guide information in the index is only additive, dropping, merging, splitting, or truncating partitions has no impact on the index.

Exchanging a partitioned table with a table that is not partitioned updates the data-guide information in an index on the partitioned table, but any data guide-enabled index on the non-partitioned table must be rebuilt.

Avoid Persistent Data-Guide Information If Serializing Hash-Table Data

If you serialize Java hash tables or associative arrays (such as are found in JavaScript) as JSON objects, then avoid the use of persistent data-guide information.

The default hash-table serialization provided by popular libraries such as GSON and Jackson produces textual JSON documents with object field names that are taken from the hash-table key entries and with field values taken from the corresponding Java hash-table values. Serializing a single Java hash-table entry produces a new (unique) JSON field and value.

Persisted data-guide information reflects the shape of your data, and it is updated automatically as new JSON documents are inserted. Each hash-table key–value pair results in a separate entry in the JSON search index. Such serialization can thus greatly increase the size of the information maintained in the index. In addition to the large size, the many index updates affect performance negatively, making DML slow.

If you serialize a hash table or an associative array instead as a JSON array of objects, each of which includes a field derived from a hash-table key entry, then there are no such problems.

The default serialization of a hash table or associative array as a JSON object is indistinguishable from an object that has field names assigned by a developer. A JSON data guide cannot tell which (metadata-like) field names have been assigned by a developer and which (data-like) names might have been derived from a hash table or associative array. It treats all field names as essentially metadata, as if specified by a developer.

For example:

  • If you construct an application object using a hash table that has animalName as the hash key and sets of animal properties as values then the resulting default serialization is a single JSON object that has a separate field ("cat", "mouse",...) for each hash-table entry, with the field value being an object with the corresponding animal properties. This can be problematic in terms of data-guide size and performance because of the typically large number of fields ("cat", "mouse",...) derived from the hash key.

  • If you instead construct an application array of animal structures, each of which has a field animalName (with value "cat" or "mouse"...) then the resulting serialization is a JSON array of objects, each of which has the same field, animalName. The corresponding data guide has no size or performance problem.

Example 24-1 Enabling Persistent Support for a JSON Data Guide But Not For Search

CREATE SEARCH INDEX po_dg_only_idx
  ON j_purchaseorder (po_document) FOR JSON
    PARAMETERS ('DATAGUIDE ON SEARCH_ON NONE');

Example 24-2 Enabling JSON Data-Guide Support For an Existing JSON Search Index

ALTER INDEX po_search_idx REBUILD PARAMETERS ('DATAGUIDE ON');

Example 24-3 Gathering Statistics on JSON Data Using a JSON Search Index

EXEC DBMS_STATS.gather_index_stats(docuser, po_search_idx, NULL, 100);

See Also:

24.3 Data-Guide Formats and Ways of Creating a Data Guide

There are three formats for a data guide: flat, hierarchical, and schema. All are available in SQL and PL/SQL as CLOB JSON documents. Both hierarchical and schema data guides are JSON Schema documents (that is, JSON schemas), but only schema format is designed for validating JSON documents.

  • You can use a schema data guide to validate JSON documents.

  • You can use a hierarchical or schema data guide to create a view, or to add virtual columns, using particular document fields that you choose on the basis of data-guide information.

  • With a flat data guide you can more easily query data-guide information such as the types and the occurrence frequencies of fields used in your documents.

Flat data guides are represented in JSON as an array of objects, each of which represents the JSON data of a specific path in the document set. A Flat Data Guide For Purchase-Order Documents describes a flat data guide for the purchase-order data of Example 1-1.

Hierarchical and schema data guides are each represented in JSON as an object with nested JSON data, in the same format as that defined by JSON Schema. Although a hierarchical data guide is a JSON schema, do not use it for validating JSON data — use a schema data guide for that. A Hierarchical Data Guide For Purchase-Order Documents describes a hierarchical data guide for the purchase-order data of Example 1-1.

You can create a data guide of any format (flat, hierarchical, or schema) by scanning a set of JSON documents. You use SQL aggregate function json_dataguide to do this. This does not require a data guide-enabled JSON search index. The data guide accurately reflects the document set at the moment function json_dataguide is invoked.

You can create a flat or hierarchical data guide from the data-guide information stored in a JSON search index. You use PL/SQL function DBMS_JSON.get_index_dataguide to do this. (You cannot create a schema-format data guide from search-index data.)

A data guide can include statistical fields, such as how frequently each JSON field is used in the document set.

  • If you use SQL function json_dataguide then statistical fields are present only if you specify DBMS_JSON.gather_stats in the third argument. They are computed dynamically (up-to-date) at the time you invoke gather_stats.

  • If you use PL/SQL function DBMS_JSON.get_index_dataguide then statistical fields are present only if you have gathered them on the JSON search index. They are not updated automatically — gather them anew if you want to be sure they are up to date.

Table 24-1 SQL and PL/SQL Functions to Obtain a Data Guide

Uses Data Guide-Enabled Search Index? Flat Data Guide Hierarchical Data Guide Schema Data Guide
Yes PL/SQL function get_index_dataguide with format DBMS_JSON.FORMAT_FLAT PL/SQL function get_index_dataguide with format DBMS_JSON.FORMAT_HIERARCHICAL None
No SQL function json_dataguide, with no format argument or with DBMS_JSON.FORMAT_FLAT as the format argument SQL function json_dataguide, with DBMS_JSON.FORMAT_HIERARCHICAL as the format argument SQL function json_dataguide, with DBMS_JSON.FORMAT_SCHEMA as the format argument
Advantages of obtaining a data guide based on a data guide-enabled JSON search index include:
  • Additive updates to the document set are automatically reflected in the persisted data-guide information whenever the index is synced.

  • Because this data-guide information is persisted, obtaining a data guide based on it (using PL/SQL function get_index_dataguide) is typically faster than obtaining a data guide by analyzing the document set (using SQL function json_dataguide).

Advantages of obtaining a data guide without using a data guide-enabled JSON search index include assurance that the data guide is accurate and the lack of index maintenance overhead. In addition, a data guide that is not derived from an index is appropriate in these particular use cases:

  • The JSON data is in an external table. You cannot create an index on it.

  • The JSON column could be indexed, but the index would not be very useful. This can be the case, for example, if the column contains different kinds of documents. In this case, it can sometimes be helpful to add a column to the table that identifies the kind of document stored in the JSON column. You can then use the data guide with SQL aggregate functions and GROUP BY. See Multiple Data Guides Per Document Set.

See Also:

24.4 JSON Data-Guide Fields

The predefined fields of a JSON data guide are described. They include JSON Schema fields and Oracle-specific fields.

A given occurrence of a field in a data guide corresponds to a field that is present in one or more JSON documents of the document set.

JSON Schema Data-Guide Fields (Keywords)

A JSON Schema is a JSON document that contains a JSON object, which can itself contain child objects (subschemas). Fields that are defined by JSON Schema are called JSON Schema keywords. Table 24-2 describes the keywords that can be used in an Oracle JSON data guide. Keywords properties, items, and oneOf are used only in hierarchical and schema data guides (both of which are JSON schemas). Keyword type is used in all three kinds of data guides.

Table 24-2 JSON Schema Data-Guide Fields (Keywords)

Field (Keyword) Value Description
properties

An object whose members represent the properties of a JSON object used in JSON data that is represented by a hierarchical or schema data guide.

items

An object whose members represent the elements (items) of an array used in JSON data represented by a hierarchical or schema data guide.

oneOf

An array, each of whose items represents one or more occurrences of a JSON field in the JSON data represented by a hierarchical or schema data guide.

type

A string naming the type of some JSON data represented by a data guide (of any kind).

The possible values are: array, "boolean", "GeoJSON", "null", "number", "object", "string", and, for JSON type data, "binary", "date", "daysecondInterval", "double", "float", "timestamp", timestamp with time zone, and "yearmonthInterval".

If option DBMS_JSON.DETECT_DATETIME is passed as a flag parameter to function json_dataguide, then any string field value in your data that conforms to an Oracle-supported ISO date or time format is recorded as having type "timestamp" or timestamp with time zone, not "string".

Oracle-Specific Data-Guide Fields

In addition to JSON Schema keywords, a JSON data guide can contain Oracle data guide-specific fields. The field names all have the prefix o:. They are described in Table 24-3.

Table 24-3 Oracle-Specific Data-Guide Fields

Field Value Description
o:path

Path through the JSON documents to the JSON field. Used only in a flat data guide. The value is a simple SQL/JSON path expression (no filter expression), possibly with relaxation (implicit array wrapping and unwrapping), and possibly with a wildcard array step. It has no array steps with array indexes or range specifications, and it has no function step. See SQL/JSON Path Expression Syntax.

o:length

Maximum length of the JSON field value, in bytes. The value is always a power of two. For example, if the maximum length of all actual field values is 5 then the value of o:length is 8, the smallest power of two greater than or equal to 5.

o:preferred_column_name

An identifier, case-sensitive and unique to a given data guide, that you prefer as the name to use for a view column or a virtual column that is created using the data guide.

This field is absent if the data guide was obtained using SQL function json_dataguide with format parameter DBMS_JSON.FORMAT_FLAT or without any format parameter (DBMS_JSON.FORMAT_FLAT is the default).

o:frequency

Percentage of JSON documents that contain the given field. Duplicate occurrences of a field under the same array are ignored. (Available only if statistics were gathered on the document set.)

This field is absent if the data guide was obtained using SQL function json_dataguide, unless the third parameter specified DBMS_JSON.gather_stats.

If the data guide was created using PL/SQL function get_index_dataguide then all documents in the document set are taken into account. Otherwise, only the documents targeted by the json_dataguide query are considered.

o:num_nulls

Number of documents whose value for the targeted scalar field is JSON null. (Available only if statistics were gathered on the document set.)

This field is absent if the data guide was obtained using SQL function json_dataguide, unless the third parameter specified DBMS_JSON.gather_stats.

If the data guide was created using PL/SQL function get_index_dataguide then all documents in the document set are taken into account. Otherwise, only the documents targeted by the json_dataguide query are considered.

o:high_value

Highest value for the targeted scalar field, among all documents examined. (Available only if statistics were gathered on the document set.)

This field is absent if the data guide was obtained using SQL function json_dataguide, unless the third parameter specified DBMS_JSON.gather_stats.

If the data guide was created using PL/SQL function get_index_dataguide then all documents in the document set are taken into account. Otherwise, only the documents targeted by the json_dataguide query are considered.

o:low_value

Lowest value for the targeted scalar field, among all documents examined. (Available only if statistics were gathered on the document set.)

This field is absent if the data guide was obtained using SQL function json_dataguide, unless the third parameter specified DBMS_JSON.gather_stats.

If the data guide was created using PL/SQL function get_index_dataguide then all documents in the document set are taken into account. Otherwise, only the documents targeted by the json_dataguide query are considered.

o:last_analyzed

Date and time when statistics were last gathered on the document set. (Available only if statistics were gathered on the document set.)

This field is absent if the data guide was obtained using SQL function json_dataguide, unless the third parameter specified DBMS_JSON.gather_stats.

If the data guide was created using PL/SQL function get_index_dataguide then all documents in the document set are taken into account. Otherwise, only the documents targeted by the json_dataguide query are considered.

o:sample_size

Total number of JSON documents selected by a query that uses SQL function json_dataguide with its the third parameter specifying DBMS_JSON.gather_stats. You can use a SAMPLE clause in the query to further control the sample size.

This field is absent if the data guide was obtained in some other way.

For all data-guide formats, if a given field has the same type in all documents of the document set then that is the type reported for the field in the data guide.

If a given field has values of different types across the document set, then the types for that field are reported differently by the different data-guide formats, as follows.

A schema data guide reports each of the types used for a given field exactly. If there is more than one type for a field then the field is reported using keyword oneOf, whose value is an array of objects that specify the different types.

A hierarchical data guide reports field types similarly, except that if a field has different scalar values across the document set then the single scalar type "string" is reported for it — just as if all of its scalar values, across all documents, were strings.

A flat data guide reports the nonscalar types of a field using separate objects with different type values ("object", "array"). Just as for a hierarchical data guide, if a field has different scalar values across the document set then the single scalar type "string" is reported for it.

The types of array elements are handled similarly to the types of fields.

For flat and hierarchical data guides, different types of scalar array elements are not reported individually. Instead, a single subschema is recorded for all scalar element values: if the scalar elements, across all documents, have the same type then that type is used. Otherwise, type "string" is used.

For flat and hierarchical data guides, if across the document set an array has both a scalar element and a nonscalar element, or it has both an object element and an array element, then both the nonscalar type(s) and a scalar type are reported. The scalar type reported is as specified above (different scalar element types are reported as type "string").

For array elements, the o:path value (present only in a flat data guide) is the o:path value for the array, followed by an array with a wildcard ([*]), which indicates all array elements.

When present, the default value of field o:preferred_column_name depends on whether the data guide was obtained using SQL function json_dataguide (with format DBMS_JSON.FORMAT_HIERARCHICAL) or using PL/SQL function DBMS_JSON.get_index_dataguide:

  • get_index_dataguide — Same as the corresponding JSON field name, prefixed with the JSON column name followed by $, and with any non-ASCII characters removed. If the resulting field name already exists in the same data guide then it is suffixed with a new sequence number, to make it unique.

    The JSON column-name part is uppercase unless that column was defined using escaped lowercase letters (for example, 'PO_Column' instead of po_column).

    For example, the default value for field User for data in JSON column po_document is PO_DOCUMENT$User.

  • json_dataguide (hierarchical format) — Same as the corresponding JSON field name.

    You can, however, control column naming when you create a view or a virtual column based on the data guide, by specifying the following parameters to DBMS_JSON procedures create_view, get_view_sql, and add_virtual_columns:

    • colNamePrefix => prefix — Prefix the column names specified by o:preferred_column_name with prefix.

    • mixedCaseColumns => FALSE — Make column names be case-insensitive. (They are case-sensitive by default.)

    • resolveNameConflicts => TRUE (default) — Resolve any name conflicts: if the resulting field name already exists in the same data guide then it is suffixed with a new sequence number, to make it unique (same behavior that get_index_dataguide provides).

You can use PL/SQL procedure DBMS_JSON.rename_column to set the value of o:preferred_column_name for a given field and type. This procedure has no effect if data-guide information is not persisted as part of a JSON search index.

Field o:preferred_column_name is used to name a new, virtual column in the table that contains the JSON column, or it is used to name a column in a new view that also contains the other columns of the table. In either case, a name specified by field o:preferred_column_name must be unique with respect to the other column names of the table. In addition, the name must be unique across all JSON fields of any type in the document set. When you use DBMS_JSON.get_index_dataguide, the default name is guaranteed to be unique in these ways.

If the name you specify with DBMS_JSON.rename_column causes a name conflict then the specified name is ignored and a system-generated name is used instead.

See Also:

24.5 Data-Dictionary Views For Persistent Data-Guide Information

You can query static data-dictionary views to see which tables have JSON columns with data guide-enabled JSON search indexes and to extract JSON object field information that is recorded in dataguide-enabled JSON search indexes.

Tables that do not have JSON columns with data guide-enabled indexes are not present in the views.

You can use the following views to find columns that have data guide-enabled JSON search indexes. The views have columns TABLE_NAME (the table name), COLUMN_NAME (the JSON column name), and DATAGUIDE (a data guide).

  • USER_JSON_DATAGUIDES — tables owned by the current user

  • ALL_JSON_DATAGUIDES — tables accessible by the current user

  • DBA_JSON_DATAGUIDES — all tables

If the JSON column has a data guide-enabled JSON search index then the value of column DATAGUIDE is the data guide for the JSON column, in flat format as a CLOB instance. If it does not have a data guide-enabled index then there is no row for that column in the view.

You can use the following views to extract JSON field path and type information that is recorded in dataguide-enabled JSON search indexes. The views have columns TABLE_NAME, COLUMN_NAME, PATH, TYPE, and LENGTH. Columns PATH, TYPE, and LENGTH correspond to the values of data-guide fields o:path, type, and o:length, respectively.
  • USER_JSON_DATAGUIDE_FIELDS — tables owned by the current user

  • ALL_JSON_DATAGUIDE_FIELDS — tables accessible by the current user

  • DBA_JSON_DATAGUIDE_FIELDS — all tables

In the case of both types of view, a view whose name has the prefix ALL_ or DBA_ also has column OWNER, whose value is the table owner.

See Also:

24.6 Specifying a Preferred Name for a Field Column

You can project JSON fields from your data as non-JSON columns in a database view or as non-JSON virtual columns added to the same table that contains the JSON column. You can specify a preferred name for such a column.

The document fields are projected as columns when you use procedure DBMS_JSON.create_view, DBMS_JSON.create_view_on_path, or DBMS_JSON.add_virtual_columns.

A data guide obtained from your JSON document set is used to define this projection. The name of each projected column is taken from data-guide field o:preferred_column_name for the JSON data field to be projected.

If your JSON data has a data guide-enabled search index then you can use procedure DBMS_JSON.rename_column to set the value of o:preferred_column_name for a given document field and type. Example 24-4 illustrates this. It specifies preferred names for the columns to be projected from various fields, as described in Table 24-4.

A hierarchical or schema data guide is populated with field o:preferred_column_name. When you use procedure DBMS_JSON.create_view or DBMS_JSON.add_virtual_columns, you can pass parameters that further control the naming of projected columns:

  • colNamePrefix => prefix — Prefix the names specified by o:preferred_column_name with prefix.

  • mixedCaseColumns => FALSE — Make column names be case-insensitive. (They are case-sensitive by default.)

  • resolveNameConflicts => TRUE (default) — Resolve any name conflicts.

Table 24-4 Preferred Names for Some JSON Field Columns

Field JSON Type Preferred Column Name
PONumber number PONumber
Phone (phone as string, not object – just the number) string Phone
type (phone type) string PhoneType
number (phone number) string PhoneNumber
ItemNumber (line-item number) number ItemNumber
Description (part description) string PartDescription

See Also:

Example 24-4 Specifying Preferred Column Names For Some JSON Fields

BEGIN
  DBMS_JSON.rename_column(
    'J_PURCHASEORDER', 'PO_DOCUMENT',
    '$.PONumber',
    DBMS_JSON.TYPE_NUMBER, 'PONumber');
  DBMS_JSON.rename_column(
    'J_PURCHASEORDER', 'PO_DOCUMENT',
    '$.ShippingInstructions.Phone',
    DBMS_JSON.TYPE_STRING, 'Phone');
  DBMS_JSON.rename_column(
    'J_PURCHASEORDER', 'PO_DOCUMENT',
    '$.ShippingInstructions.Phone.type',
    DBMS_JSON.TYPE_STRING, 'PhoneType');
  DBMS_JSON.rename_column(
    'J_PURCHASEORDER', 'PO_DOCUMENT',
    '$.ShippingInstructions.Phone.number',
    DBMS_JSON.TYPE_STRING, 'PhoneNumber');
  DBMS_JSON.rename_column(
    'J_PURCHASEORDER', 'PO_DOCUMENT',
    '$.LineItems.ItemNumber',
    DBMS_JSON.TYPE_NUMBER, 'ItemNumber');
  DBMS_JSON.rename_column(
    'J_PURCHASEORDER', 'PO_DOCUMENT',
    '$.LineItems.Part.Description',
    DBMS_JSON.TYPE_STRING, 'PartDescription');
END;
/

24.7 Creating a View Over JSON Data Based on Data-Guide Information

Based on data-guide information, you can create a database view whose columns project particular scalar fields from a set of JSON documents.

You can choose the fields to project by editing a hierarchical or schema data guide or by specifying a SQL/JSON path expression and possibly a minimum frequency of field occurrence.

You can create multiple views based on the same JSON document set, projecting different fields. See Multiple Data Guides Per Document Set.

You can create a view by projecting JSON fields using SQL/JSON function json_table — see Creating a View Over JSON Data Using JSON_TABLE.

An alternative is to use PL/SQL procedure DBMS_JSON.create_view or DBMS_JSON.create_view_on_path, to create a view by projecting fields that you choose based on available data-guide information.

The data-guide information can come from either:

  • A hierarchical or schema data guide that includes the fields to project, and possibly a SQL/JSON path expression.

  • A data guide-enabled JSON search index, together with a SQL/JSON path expression, and possibly a minimum field frequency.

In the former case, use procedure create_view. You can edit a (hierarchical or schema) data guide to specify fields that you want included. In this case you do not need (and for a schema data guide you cannot use) a data guide-enabled search index,

In the latter case, use procedure create_view_on_path. In this case you need a data guide-enabled search index, but you do not need a data guide.

In either case, you can provide a SQL/JSON path expression, to specify a field to be expanded for the view. This is required for procedure create_view_on_path. To specify a path for procedure create_view, use optional parameter PATH. The path $ creates a view starting from the JSON document root.

For procedure create_view_on_path, you can also provide a minimum frequency of occurrence, using optional parameter FREQUENCY. The resulting view includes only JSON fields along the path whose frequency is greater than the specified frequency.

When you specify a path, all descendant fields under it are expanded. A view column is created for each scalar value in the resulting subtree. The fields in the document set that are projected include both:

  • All scalar fields present, at any level, in the data that is targeted by the path expression.

  • All scalar fields, anywhere in the document, that are not under an array.

The path argument you provide must be a simple SQL/JSON path expression (no filter expression), possibly with relaxation (implicit array wrapping and unwrapping), but with no array steps and no function step. See SQL/JSON Path Expression Syntax.

Regardless of whether you use procedure create_view or create_view_on_path, in addition to the JSON fields that are projected as columns, all non-JSON columns of the table are also columns of the view.

The data guide that serves as the basis for a given view definition is static; it does not necessarily faithfully continue to reflect the current data in the document set. The fields that are projected for the view are determined when the view is created.

In particular, if you use create_view_on_path (which requires a data guide-enabled search index) then what counts are the fields specified by the given path expression and that have at least the given frequency (default 0), based on the index data at the time of the view creation.

There is also PL/SQL function DBMS_JSON.get_view_sql, which does not create a view, but instead returns the SQL DDL code that would create a view. You can, for example, edit that DDL to create different views.

You can also optionally obtain only the SQL SELECT statement that the view-creation DDL would use. In this case, if more columns would be needed for the view than the maximum number allowed, then the SELECT statement would involve joins of multiple json_table expressions. (The maximum number of columns allowed in a table (default: 1000) is defined by initialization parameter MAX_COLUMNS. See MAX_COLUMNS in Oracle Database Reference.)

See Also:

24.7.1 Creating a View Over JSON Data Based on a Hierarchical or Schema Data Guide

You can use a hierarchical or schema data guide to create a database view whose columns project specified JSON fields from your documents. The fields projected are those in the data guide. You can edit the data guide to include only the fields that you want to project.

You can obtain a hierarchical or schema data guide using SQL function json_dataguide with argument DBMS_JSON.FORMAT_HIERARCHICAL or DBMS_JSON.FORMAT_SCHEMA, respectively.

You can edit the data guide obtained to include only specific fields, change the length of given types, or rename fields. The resulting data guide specifies which fields of the JSON data to project as columns of the view.

Note:

When you use a schema data guide to create a view that includes a column for a given field, if that field has values of different scalar types in the document set then a column is used for each of those scalar types. The names of such multiple columns other than the first have _N appended to the field name (N = 1, 2,…).

For example, if field a has a number value in one document and a string value in another document, then two columns are used in the view, one of type NUMBER and the other of type VARCHAR2. One column is named A; the other is named A_1.

You use PL/SQL procedure DBMS_JSON.create_view to create the view.

Example 24-5 illustrates this using a data guide obtained using Oracle SQL function json_dataguide with argument DBMS_JSON.FORMAT_HIERARCHICAL.

If you create a view using the data guide obtained using json_dataguide then GeoJSON data in your documents is supported. In this case the view column corresponding to the GeoJSON data has SQL data type SDO_GEOMETRY. For that you pass constant DBMS_JSON.GEOJSON or DBMS_JSON.GEOJSON+DBMS_JSON.PRETTY as the third argument to function json_dataguide.

Note:

Function json_dataguide cannot detect GeoJSON data if field coordinates or field geometries precedes field type in a GeoJSON object.

For example, this GeoJSON data is detected as such:

{"type"        : "Point",
 "coordinates" : [ 23.807, 7.121 ]}

This GeoJSON data is not detected as such (it is handled as arbitrary JSON data).

{"coordinates" : [ 23.807, 7.121 ]
 "type"        : "Point"}

See Also:

Example 24-5 Creating a View Using a Hierarchical Data Guide Obtained With JSON_DATAGUIDE

This example creates a view that projects all of the fields present in the hierarchical data guide that is obtained by invoking SQL function json_dataguide on po_document of table j_purchaseorder. The second and third arguments passed to json_dataguide are used, respectively, to specify that the data guide is to be hierarchical and pretty-printed.

The view column names come from the values of field o:preferred_column_name of the data guide that you pass to DBMS_JSON.create_view. By default, the view columns are thus named the same as the projected fields.

Because the columns must be uniquely named in the view, you must ensure that the field names themselves are unique. Optional parameter RESOLVENAMECONFLICTS does this by default (value true), but if you specify it as false then the names are not guaranteed to be unique. In this case (false), an alternative is to edit the data guide returned by json_dataguide to make the value of o:preferred_column_name unique. If parameter RESOLVENAMECONFLICTS is false, then an error is raised by DBMS_JSON.create_view if the names for the columns are not unique.

Although this example does not do so, you can provide a column-name prefix using DBMS_JSON.create_view with parameter colNamePrefix. For example, to get the same effect as that provided when you use a data guide obtained from the information in a data guide-enabled JSON search index, you could specify parameter colNamePrefix as 'PO_DOCUMENT$', that is, the JSON column name, PO_DOCUMENT followed by $. See Example 24-8.

DECLARE
  dg CLOB;
  BEGIN
    SELECT json_dataguide(po_document,
                          FORMAT DBMS_JSON.FORMAT_HIERARCHICAL,
                          DBMS_JSON.PRETTY)
      INTO dg
      FROM j_purchaseorder
      WHERE extract(YEAR FROM date_loaded) = 2014;
    DBMS_JSON.create_view('MYVIEW',
                          'J_PURCHASEORDER',
                          'PO_DOCUMENT',
                          dg);
  END;
/

DESCRIBE myview
 Name                 Null?    Type
 -------------------- -------- ---------------------------
 DATE_LOADED                   TIMESTAMP(6) WITH TIME ZONE
 ID                   NOT NULL RAW(16)
 User                          VARCHAR2(8)
 PONumber                      NUMBER
 UPCCode                       NUMBER
 UnitPrice                     NUMBER
 Description                   VARCHAR2(32)
 Quantity                      NUMBER
 ItemNumber                    NUMBER
 Reference                     VARCHAR2(16)
 Requestor                     VARCHAR2(16)
 CostCenter                    VARCHAR2(4)
 AllowPartialShipment          VARCHAR2(4)
 name                          VARCHAR2(16)
 Phone                         VARCHAR2(16)
 type                          VARCHAR2(8)
 number                        VARCHAR2(16)
 city                          VARCHAR2(32)
 state                         VARCHAR2(2)
 street                        VARCHAR2(32)
 country                       VARCHAR2(32)
 zipCode                       NUMBER
 Special Instructions          VARCHAR2(8)

Related Topics

24.7.2 Creating a View Over JSON Data Based on a Path Expression

You can use the information in a data guide-enabled JSON search index to create a database view whose columns project JSON fields from your documents. The fields projected are the scalar fields not under an array plus the scalar fields in the data targeted by a specified SQL/JSON path expression.

For example, if the path expression is $ then all scalar fields are projected, because the root (top) of the document is targeted. Example 24-6 illustrates this. If the path is $.LineItems.Part then only the scalar fields that are present (at any level) in the data targeted by $.LineItems.Part are projected (in addition to scalar fields elsewhere that are not under an array). Example 24-7 illustrates this.

If you gather statistics on your JSON document set then the data-guide information in a data guide-enabled JSON search index records the frequency of occurrence, across the document set, of each path to a field that is present in a document. When you create the view, you can specify that only the (scalar) fields with a given minimum frequency of occurrence (as a percentage) are to be projected as view columns. You do this by specifying a non-zero value for parameter FREQUENCY of procedure DBMS_JSON.create_view_on_path.

For example, if you specify the path as $ and the minimum frequency as 50 then all scalar fields (on any path, since $ targets the whole document) that occur in at least half (50%) of the documents are projected. Example 24-8 illustrates this.

The value of argument PATH is a simple SQL/JSON path expression (no filter expression), possibly with relaxation (implicit array wrapping and unwrapping), but with no array steps and no function step. See SQL/JSON Path Expression Syntax.

No frequency filtering is done in either of the following cases — targeted fields are projected regardless of their frequency of occurrence in the documents:
  • You never gather statistics information on your set of JSON documents. (No frequency information is included in the data guide-enabled JSON search index.)

  • The FREQUENCY argument of DBMS_JSON.create_view_on_path is zero (0).

Note:

When the FREQUENCY argument is non-zero, even if you have gathered statistics information on your document set, the index contains no statistical information for any documents added after the most recent gathering of statistics. This means that any fields added after that statistics gathering are ignored (not projected).

See Also:

Example 24-6 Creating a View That Projects All Scalar Fields

All scalar fields are represented in the view, because the specified path is $.

(Columns whose names are italic in the describe command output are those that have been renamed using PL/SQL procedure DBMS_JSON.rename_column. Underlined rows are missing from Example 24-8.)

EXEC DBMS_JSON.create_view_on_path('VIEW2',
                                   'J_PURCHASEORDER',
                                   'PO_DOCUMENT',
                                   '$');

DESCRIBE view2;
 Name                              Null?    Type
 --------------------------------  -------- ------------------
 ID                                NOT NULL RAW(16)
 DATE_LOADED                       TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT$User                  VARCHAR2(8)
 PONumber                          NUMBER
 PO_DOCUMENT$Reference             VARCHAR2(16)
 PO_DOCUMENT$Requestor             VARCHAR2(16)
 PO_DOCUMENT$CostCenter            VARCHAR2(4)
 PO_DOCUMENT$AllowPartialShipment  VARCHAR2(4)
 PO_DOCUMENT$name                  VARCHAR2(16)
 Phone                             VARCHAR2(16)
 PO_DOCUMENT$city                  VARCHAR2(32)
 PO_DOCUMENT$state                 VARCHAR2(2)
 PO_DOCUMENT$street                VARCHAR2(32)
 PO_DOCUMENT$country               VARCHAR2(32)
 PO_DOCUMENT$zipCode               NUMBER
 PO_DOCUMENT$SpecialInstructions   VARCHAR2(8)
 PO_DOCUMENT$UPCCode               NUMBER
 PO_DOCUMENT$UnitPrice             NUMBER
 PartDescription                   VARCHAR2(32)
 PO_DOCUMENT$Quantity              NUMBER
 ItemNumber                        NUMBER
 PhoneType                         VARCHAR2(8)
 PhoneNumber                       VARCHAR2(16)

Example 24-7 Creating a View That Projects Scalar Fields Targeted By a Path Expression

Fields Itemnumber, PhoneType, and PhoneNumber are not represented in the view. The only fields that are projected are those scalar fields that are not under an array plus those that are present (at any level) in the data that is targeted by $.LineItems.Part (that is, the scalar fields whose paths start with $.LineItems.Part). (Columns whose names are italic in the describe command output are those that have been renamed using PL/SQL procedure DBMS_JSON.rename_column.)

SQL> EXEC DBMS_JSON.create_view_on_path('VIEW4',
                                        'J_PURCHASEORDER',
                                        'PO_DOCUMENT',
                                        '$.LineItems.Part');


SQL> DESCRIBE view4;
 Name                             Null?    Type
 -------------------------------- -------- ------------------
 ID                               NOT NULL RAW(16)
 DATE_LOADED                      TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT$User                 VARCHAR2(8)
 PONumber                         NUMBER
 PO_DOCUMENT$Reference            VARCHAR2(16)
 PO_DOCUMENT$Requestor            VARCHAR2(16)
 PO_DOCUMENT$CostCenter           VARCHAR2(4)
 PO_DOCUMENT$AllowPartialShipment VARCHAR2(4)
 PO_DOCUMENT$name                 VARCHAR2(16)
 Phone                            VARCHAR2(16)
 PO_DOCUMENT$city                 VARCHAR2(32)
 PO_DOCUMENT$state                VARCHAR2(2)
 PO_DOCUMENT$street               VARCHAR2(32)
 PO_DOCUMENT$country              VARCHAR2(32)
 PO_DOCUMENT$zipCode              NUMBER
 PO_DOCUMENT$SpecialInstructions  VARCHAR2(8)
 PO_DOCUMENT$UPCCode              NUMBER
 PO_DOCUMENT$UnitPrice            NUMBER
 PartDescription                  VARCHAR2(32)

Example 24-8 Creating a View That Projects Scalar Fields Having a Given Frequency

All scalar fields that occur in all (100%) of the documents are represented in the view. Field AllowPartialShipment does not occur in all of the documents, so there is no column PO_DOCUMENT$AllowPartialShipment in the view. Similarly for fields Phone, PhoneType, and PhoneNumber.

(Columns whose names are italic in the describe command output are those that have been renamed using PL/SQL procedure DBMS_JSON.rename_column.)

SQL> EXEC DBMS_JSON.create_view_on_path('VIEW3',
                                        'J_PURCHASEORDER',
                                        'PO_DOCUMENT',
                                        '$',
                                        100);


SQL> DESCRIBE view3;
 Name                             Null?    Type
 -------------------------------- -------- ------------------
 ID                               NOT NULL RAW(16)
 DATE_LOADED                      TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT$User                 VARCHAR2(8)
 PONumber                         NUMBER
 PO_DOCUMENT$Reference            VARCHAR2(16)
 PO_DOCUMENT$Requestor            VARCHAR2(16)
 PO_DOCUMENT$CostCenter           VARCHAR2(4)
 PO_DOCUMENT$name                 VARCHAR2(16)
 PO_DOCUMENT$city                 VARCHAR2(32)
 PO_DOCUMENT$state                VARCHAR2(2)
 PO_DOCUMENT$street               VARCHAR2(32)
 PO_DOCUMENT$country              VARCHAR2(32)
 PO_DOCUMENT$zipCode              NUMBER
 PO_DOCUMENT$SpecialInstructions  VARCHAR2(8)
 PO_DOCUMENT$UPCCode              NUMBER
 PO_DOCUMENT$UnitPrice            NUMBER
 PartDescription                  VARCHAR2(32)
 PO_DOCUMENT$Quantity             NUMBER
 ItemNumber                       NUMBER

24.8 Adding and Dropping Virtual Columns For JSON Fields Based on Data-Guide Information

Based on data-guide information for a JSON column, you can project scalar fields from that JSON data as virtual columns in the same table. The scalar fields projected are those that are not under an array.

You can do all of the following with a virtual column, with the aim of improving performance:

  • Build an index on it.

  • Gather statistics on it for the optimizer.

  • Load it into the In-Memory Column Store (IM column store).

Note:

The number of virtual columns per table is limited by the value of initialization parameter MAX_COLUMNS. By default that value is STANDARD, which means 1000 columns maximum. See MAX_COLUMNS in Oracle Database Reference.

You use PL/SQL procedure DBMS_JSON.add_virtual_columns to add virtual columns based on data-guide information for a JSON column. Before it adds virtual columns, procedure add_virtual_columns first drops any existing virtual columns that were projected from fields in the same JSON column by a previous invocation of add_virtual_columns or by data-guide change-trigger procedure add_vc (in effect, it does what procedure DBMS_JSON.drop_virtual_columns does).

There are two alternative sources of the data-guide information that you provide to procedure add_virtual_columns:

  • It can come from a hierarchical or schema data guide that you pass as an argument. All scalar fields in the data guide that are not under an array are projected as virtual columns. All other fields in the data guide are ignored (not projected).

    In this case, you can edit the data guide before passing it, so that it specifies the scalar fields (not under an array) that you want projected. You do not need a data guide-enabled search index in this case.

  • It can come from a data guide-enabled JSON search index.

    In this case, you can specify, as the value of argument FREQUENCY to procedure add_virtual_columns, a minimum frequency of occurrence for the scalar fields to be projected. You need a data guide-enabled search index in this case, but you do not need a data guide.

You can also specify that added virtual columns be hidden. The SQL describe command does not list hidden columns.

  • If you pass a hierarchical or schema data guide to add_virtual_columns then you can specify projection of particular scalar fields (not under an array) as hidden virtual columns by adding "o:hidden": true to their descriptions in the data guide.

  • If you use a data guide-enabled JSON search index with add_virtual_columns then you can specify a PL/SQL TRUE value for argument HIDDEN, to make all of the added virtual columns be hidden. (The default value of HIDDEN is FALSE, meaning that the added virtual columns are not hidden.)

Related Topics

See Also:

24.8.1 Adding Virtual Columns For JSON Fields Based on a Hierarchical or Schema Data Guide

You can use a hierarchical or schema data guide to project scalar fields from JSON data as virtual columns in the same table.

All scalar fields in the data guide that are not under an array are projected as virtual columns. All other fields in the data guide are ignored (not projected).

You can obtain a hierarchical or schema data guide using Oracle SQL function json_dataguide with argument DBMS_JSON.FORMAT_HIERARCHICAL or DBMS_JSON.FORMAT_SCHEMA, respectively.

You can edit the data guide obtained, to include only specific scalar fields (that are not under an array), rename those fields, or change the lengths of their types. The resulting data guide specifies which such fields to project as new virtual columns. Any fields in the data guide that are not scalar fields not under an array are ignored (not projected).

Note:

When you use a schema data guide to add a virtual column for a given field, if that field has values of different scalar types in the document set then a column is added for each of those scalar types. The names of such multiple columns other than the first have _N appended to the field name (N = 1, 2,…).

For example, if field a has a number value in one document and a string value in another document, then two virtual columns are created, one of type NUMBER and the other of type VARCHAR2. One column is named A; the other is named A_1.

You use PL/SQL procedure DBMS_JSON.add_virtual_columns to add the virtual columns to the table that contains the JSON column containing the projected fields. That procedure first drops any existing virtual columns that were projected from fields in the same JSON column by a previous invocation of add_virtual_columns or by data-guide change-trigger procedure add_vc (in effect, it does what procedure DBMS_JSON.drop_virtual_columns does).

Example 24-9 illustrates this. It projects scalar fields that are not under an array, from the data in JSON column po_document of table j_purchaseorder. The fields projected are those that are indicated in the data guide.

Example 24-10 illustrates passing a data-guide argument that specifies the projection of two fields as virtual columns. Data-guide field o:hidden is used to hide one of these columns.

See Also:

Example 24-9 Adding Virtual Columns That Project JSON Fields Using a Data Guide Obtained With JSON_DATAGUIDE

This example uses a hierarchical data guide obtained using function json_dataguide with JSON column po_document.

The added virtual columns are all of the columns in table j_purchaseorder except for ID, DATE_LOADED, and PODOCUMENT.

  • Parameter resolveNameConflicts is TRUE, to ensure that any name conflicts get resolved. (Optional, for clarity; this is anyway the default value.)

  • Parameter colNamePrefix is 'PO_DOCUMENT$', to use that as the default prefix for column names.

  • Parameter mixedCaseColumns is TRUE, to make column names be case-sensitive, that is, to distinguish uppercase and lowercase letters.

DECLARE
  dg CLOB;
BEGIN
  SELECT json_dataguide(po_document, DBMS_JSON.FORMAT_HIERARCHICAL) INTO dg
    FROM j_purchaseorder;
  DBMS_JSON.add_virtual_columns('J_PURCHASEORDER',
                                'PO_DOCUMENT',
                                dg,
                                resolveNameConflicts=>TRUE,
                                colNamePrefix=>'PO_DOCUMENT$',
                                mixedCaseColumns=>TRUE);
END;
/

DESCRIBE j_purchaseorder;
 Name                             Null?    Type
 -------------------------------- -------- ---------------------------
 ID                               NOT NULL RAW(16)
 DATE_LOADED                               TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT                               CLOB
 PO_DOCUMENT$User                          VARCHAR2(8)
 PO_DOCUMENT$PONumber                      NUMBER
 PO_DOCUMENT$Reference                     VARCHAR2(16)
 PO_DOCUMENT$Requestor                     VARCHAR2(16)
 PO_DOCUMENT$CostCenter                    VARCHAR2(4)
 PO_DOCUMENT$AllowPartialShipment          VARCHAR2(4)
 PO_DOCUMENT$name                          VARCHAR2(16)
 PO_DOCUMENT$Phone                         VARCHAR2(16)
 PO_DOCUMENT$city                          VARCHAR2(32)
 PO_DOCUMENT$state                         VARCHAR2(2)
 PO_DOCUMENT$street                        VARCHAR2(32)
 PO_DOCUMENT$country                       VARCHAR2(32)
 PO_DOCUMENT$zipCode                       NUMBER
 PO_DOCUMENT$SpecialInstructions           VARCHAR2(8)

Example 24-10 Adding Virtual Columns, Hidden and Visible

In this example only two fields are projected as virtual columns: PO_Number and PO_Reference. The data guide is defined locally as a literal string. Data-guide field o:hidden is used here to hide the virtual column for PO_Reference. (For PO_Number the o:hidden: false entry is not needed, as false is the default value.)

DECLARE
  dg CLOB;
BEGIN
  dg := '{"type" : "object",
          "properties" :
            {"PO_Number"    : {"type" : "number",
                               "o:length" : 4,
                               "o:preferred_column_name" : "PO_Number",
                               "o:hidden" : false},
             "PO_Reference" : {"type" : "string",
                               "o:length" : 16,
                               "o:preferred_column_name" : "PO_Reference",
                               "o:hidden" : true}}}';
  DBMS_JSON.add_virtual_columns('J_PURCHASEORDER', 'PO_DOCUMENT', dg);
END;
/

DESCRIBE j_purchaseorder;
 Name        Null?    Type
 ----------- -------- ---------------------------
 ID          NOT NULL RAW(16)
 DATE_LOADED          TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT          CLOB
 PO_Number            NUMBER

SELECT column_name FROM user_tab_columns
  WHERE table_name = 'J_PURCHASEORDER' ORDER BY 1;
COLUMN_NAME
-----------
DATE_LOADED
ID
PO_DOCUMENT
PO_Number
PO_Reference

5 rows selected.

Related Topics

24.8.2 Adding Virtual Columns For JSON Fields Based on a Data Guide-Enabled Search Index

You can use a data guide-enabled search index for a JSON column to project scalar fields from that JSON data as virtual columns in the same table. Only scalar fields not under an array are projected. You can specify a minimum frequency of occurrence for the fields to be projected.

You use procedure DBMS_JSON.add_virtual_columns to add the virtual columns.

Example 24-11 illustrates this. It projects all scalar fields that are not under an array to table j_purchaseorder as virtual columns.

If you gather statistics on the documents in the JSON column where you want to project fields then the data-guide information in the data guide-enabled JSON search index records the frequency of occurrence, across that document set, of each field in a document.

When you add virtual columns you can specify that only those fields with a given minimum frequency of occurrence are to be projected.

You do this by specifying a non-zero value for parameter FREQUENCY of procedure add_virtual_columns. Zero is the default value, so if you do not include argument FREQUENCY then all scalar fields (not under an array) are projected. The frequency of a given field is the number of documents containing that field divided by the total number of documents in the JSON column, expressed as a percentage.

Example 24-12 projects all scalars (not under an array) that occur in all (100%) of the documents as virtual columns.

If you want to hide all of the added virtual columns then specify a TRUE value for argument HIDDEN. (The default value of parameter HIDDEN is FALSE, meaning that the added virtual columns are not hidden.)

Example 24-13 projects, as hidden virtual columns, the scalar fields (not under an array) that occur in all (100%) of the documents.

See Also:

Example 24-11 Projecting All Scalar Fields Not Under an Array as Virtual Columns

The added virtual columns are all of the columns in table j_purchaseorder except for ID, DATE_LOADED, and PODOCUMENT. This is because no FREQUENCY argument is passed to add_virtual_columns, so all scalar fields (that are not under an array) are projected.

(Columns whose names are italic in the describe command output are those that have been renamed using PL/SQL procedure DBMS_JSON.rename_column.)

EXEC DBMS_JSON.add_virtual_columns('J_PURCHASEORDER', 'PO_DOCUMENT');

DESCRIBE j_purchaseorder;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL RAW(16)
 DATE_LOADED                                        TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT                                        CLOB
 PO_DOCUMENT$User                                   VARCHAR2(8)
 PONumber                                           NUMBER
 PO_DOCUMENT$Reference                              VARCHAR2(16)
 PO_DOCUMENT$Requestor                              VARCHAR2(16)
 PO_DOCUMENT$CostCenter                             VARCHAR2(4)
 PO_DOCUMENT$AllowPartialShipment                   VARCHAR2(4)
 PO_DOCUMENT$name                                   VARCHAR2(16)
 Phone                                              VARCHAR2(16)
 PO_DOCUMENT$city                                   VARCHAR2(32)
 PO_DOCUMENT$state                                  VARCHAR2(2)
 PO_DOCUMENT$street                                 VARCHAR2(32)
 PO_DOCUMENT$country                                VARCHAR2(32)
 PO_DOCUMENT$zipCode                                NUMBER
 PO_DOCUMENT$SpecialInstructions                    VARCHAR2(8)

Example 24-12 Projecting Scalar Fields With a Minimum Frequency as Virtual Columns

All scalar fields that occur in all (100%) of the documents are projected as virtual columns. The result is the same as that for Example 24-11, except that fields AllowPartialShipment and Phone are not projected, because they do not occur in 100% of the documents.

(Columns whose names are italic in the describe command output are those that have been renamed using PL/SQL procedure DBMS_JSON.rename_column.)

EXEC DBMS_JSON.add_virtual_columns('J_PURCHASEORDER', 'PO_DOCUMENT', 100);

DESCRIBE j_purchaseorder;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL RAW(16)
 DATE_LOADED                                        TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT                                        CLOB
 PO_DOCUMENT$User                                   VARCHAR2(8)
 PONumber                                           NUMBER
 PO_DOCUMENT$Reference                              VARCHAR2(16)
 PO_DOCUMENT$Requestor                              VARCHAR2(16)
 PO_DOCUMENT$CostCenter                             VARCHAR2(4)
 PO_DOCUMENT$name                                   VARCHAR2(16)
 PO_DOCUMENT$city                                   VARCHAR2(32)
 PO_DOCUMENT$state                                  VARCHAR2(2)
 PO_DOCUMENT$street                                 VARCHAR2(32)
 PO_DOCUMENT$country                                VARCHAR2(32)
 PO_DOCUMENT$zipCode                                NUMBER
 PO_DOCUMENT$SpecialInstructions                    VARCHAR2(8)

Example 24-13 Projecting Scalar Fields With a Minimum Frequency as Hidden Virtual Columns

The result is the same as that for Example 24-12, except that all of the added virtual columns are hidden. (The query of view USER_TAB_COLUMNS shows that the virtual columns were in fact added.)

EXEC DBMS_JSON.add_virtual_columns('J_PURCHASEORDER', 'PO_DOCUMENT', 100, TRUE);

DESCRIBE j_purchaseorder;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL RAW(16)
 DATE_LOADED                                        TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT                                        CLOB

SELECT column_name FROM user_tab_columns
  WHERE table_name = 'J_PURCHASEORDER'
  ORDER BY 1;

COLUMN_NAME
-----------
DATE_LOADED
ID
PONumber
PO_DOCUMENT
PO_DOCUMENT$CostCenter
PO_DOCUMENT$Reference
PO_DOCUMENT$Requestor
PO_DOCUMENT$SpecialInstructions
PO_DOCUMENT$User
PO_DOCUMENT$city
PO_DOCUMENT$country
PO_DOCUMENT$name
PO_DOCUMENT$state
PO_DOCUMENT$street
PO_DOCUMENT$zipCode

24.8.3 Dropping Virtual Columns for JSON Fields Based on Data-Guide Information

You can use procedure DBMS_JSON.drop_virtual_columns to drop all virtual columns that were added for JSON fields in a column of JSON data.

Procedure DBMS_JSON.drop_virtual_columns drops all virtual columns that were projected from fields in a given JSON column by an invocation of add_virtual_columns or by data-guide change-trigger procedure add_vc. Example 24-14 illustrates this for fields projected from column po_document of table j_purchaseorder.

See Also:

Example 24-14 Dropping Virtual Columns Projected From JSON Fields

EXEC DBMS_JSON.drop_virtual_columns('J_PURCHASEORDER', 'PO_DOCUMENT');

24.9 Change Triggers For Data Guide-Enabled Search Index

When JSON data changes, some information in a data guide-enabled JSON search index is automatically updated. You can specify a procedure whose invocation is triggered whenever this happens. You can define your own PL/SQL procedure for this, or you can use the predefined change-trigger procedure add_vc.

The data-guide information in a data guide-enabled JSON search index records structure, type, and possibly statistical information about a set of JSON documents. Except for the statistical information, which is updated only when you gather statistics, relevant changes in the document set are automatically reflected in the data-guide information stored in the index.

You can define a PL/SQL procedure whose invocation is automatically triggered by such an index update. The invocation occurs when the index is updated. Any errors that occur during the execution of the procedure are ignored.

You can use the predefined change-trigger procedure add_vc to automatically add virtual columns that project JSON fields from the document set or to modify existing such columns, as needed. The virtual columns added by add_vc follow the same naming rules as those you add by invoking procedure DBMS_JSON.add_virtual_columns for a JSON column that has a data guide-enabled search index.

In either case, any error that occurs during the execution of the procedure is ignored.

Unlike DBMS_JSON.add_virtual_columns, add_vc does not first drop any existing virtual columns that were projected from fields in the same JSON column. To drop virtual columns projected from fields in the same JSON column by add_vc or by add_virtual_columns, use procedure DBMS_JSON.drop_virtual_columns.

You specify the use of a trigger for data-guide changes by using the keywords DATAGUIDE ON CHANGE in the PARAMETERS clause when you create or alter a JSON search index. Only one change trigger is allowed per index: altering an index to specify a trigger automatically replaces any previous trigger for it.

Example 24-15 alters existing JSON search index po_search_idx to use procedure add_vc.

Example 24-15 Adding Virtual Columns Automatically With Change Trigger ADD_VC

This example adds predefined change trigger add_vc to JSON search index po_search_idx.

It first drops any existing virtual columns that were projected from fields in JSON column po_document either by procedure DBMS_JSON.add_virtual_columns or by a pre-existing add_vc change trigger for the same JSON search index.

Then it alters the search index to add change trigger add_vc (if it was already present then this is has no effect).

Finally, it inserts a new document that provokes a change in the data guide. Two virtual columns are added to the table, for the two scalar fields not under an array.

EXEC DBMS_JSON.drop_virtual_columns('J_PURCHASEORDER', 'PO_DOCUMENT');

ALTER INDEX po_search_idx REBUILD
  PARAMETERS ('DATAGUIDE ON CHANGE add_vc');

INSERT INTO j_purchaseorder
  VALUES (
    SYS_GUID(),
    to_date('30-JUN-2015'),
    '{"PO_Number"     : 4230,
      "PO_Reference"  : "JDEER-20140421",
      "PO_LineItems"  : [ {"Part_Number"  : 230912362345,
                           "Quantity"     : 3.0} ]}');

DESCRIBE j_purchaseorder;
 Name                      Null?    Type
 ------------------------- -------- ----------------------------
 ID                        NOT NULL RAW(16)
 DATE_LOADED                        TIMESTAMP(6) WITH TIME ZONE
 PO_DOCUMENT                        CLOB
 PO_DOCUMENT$PO_Number              NUMBER
 PO_DOCUMENT$PO_Reference           VARCHAR2(16)

See Also:

24.9.1 User-Defined Data-Guide Change Triggers

You can define a procedure whose invocation is triggered automatically whenever a given data guide-enabled JSON search index is updated. Any errors that occur during the execution of the procedure are ignored.

Example 24-16 illustrates this.

A user-defined procedure specified with keywords DATAGUIDE ON CHANGE in a JSON search index PARAMETERS clause must accept the parameters specified in Table 24-5.

Table 24-5 Parameters of a User-Defined Data-Guide Change Trigger Procedure

Name Type Description
table_name VARCHAR2 Name of the table containing column column_name.
column_name VARCHAR2 Name of a JSON column that has a data guide-enabled JSON search index.
path VARCHAR2 A SQL/JSON path expression that targets a particular field in the data in column column_name. This path is affected by the index change that triggered the procedure invocation. For example, the index change involved adding this path or changing its type value or its type-length value.
new_type NUMBER A new type for the given path.
new_type_length NUMBER A new type length for the given path.

Example 24-16 Tracing Data-Guide Updates With a User-Defined Change Trigger

This example first drops any existing virtual columns projected from fields in JSON column po_document.

It then defines PL/SQL procedure my_dataguide_trace, which prints the names of the table and JSON column, together with the path, type and length fields of the added virtual column. It then alters JSON search index po_search_idx to specify that this procedure be invoked as a change trigger for updates to the data-guide information in the index.

It then inserts a new document that provokes a change in the data guide, which triggers the output of trace information.

Note that the TYPE argument to the procedure must be a number that is one of the DBMS_JSON constants for a JSON type. The procedure tests the argument and outputs a user-friendly string in place of the number.

EXEC DBMS_JSON.drop_virtual_columns('J_PURCHASEORDER', 'PO_DOCUMENT');

CREATE OR REPLACE PROCEDURE my_dataguide_trace(tableName VARCHAR2,
                                               jcolName  VARCHAR2,
                                               path      VARCHAR2,
                                               type      NUMBER,
                                               tlength   NUMBER)
  IS
    typename VARCHAR2(10);
  BEGIN
    IF    (type = DBMS_JSON.TYPE_NULL)    THEN typename := 'null';
    ELSIF (type = DBMS_JSON.TYPE_BOOLEAN) THEN typename := 'boolean';
    ELSIF (type = DBMS_JSON.TYPE_NUMBER)  THEN typename := 'number';
    ELSIF (type = DBMS_JSON.TYPE_STRING)  THEN typename := 'string';
    ELSIF (type = DBMS_JSON.TYPE_OBJECT)  THEN typename := 'object';
    ELSIF (type = DBMS_JSON.TYPE_ARRAY)   THEN typename := 'array';
    ELSE                                       typename := 'unknown';
    END IF;
    DBMS_OUTPUT.put_line('Updating ' || tableName || '(' || jcolName
                         || '): Path = ' || path || ', Type = ' || type
                         || ', Type Name = ' || typename
                         || ', Type Length = ' || tlength);
  END;
/

ALTER INDEX po_search_idx REBUILD
  PARAMETERS ('DATAGUIDE ON CHANGE my_dataguide_trace');

INSERT INTO j_purchaseorder
  VALUES (
    SYS_GUID(),
    to_date('30-MAR-2016'),
    '{"PO_ID"     : 4230,
      "PO_Ref"  : "JDEER-20140421",
      "PO_Items"  : [ {"Part_No"       : 98981327234,
                      "Item_Quantity" : 13} ]}');

COMMIT;
Updating J_PURCHASEORDER(PO_DOCUMENT):
  Path = $.PO_ID, Type = 3, Type Name = number, Type Length = 4
Updating J_PURCHASEORDER(PO_DOCUMENT):
  Path = $.PO_Ref, Type = 4, Type Name = string, Type Length = 16
Updating J_PURCHASEORDER(PO_DOCUMENT):
  Path = $.PO_Items, Type = 6, Type Name = array, Type Length = 64
Updating J_PURCHASEORDER(PO_DOCUMENT):
  Path = $.PO_Items.Part_No, Type = 3, Type Name = number, Type Length = 16
Updating J_PURCHASEORDER(PO_DOCUMENT):
  Path = $.PO_Items.Item_Quantity, Type = 3, Type Name = number, Type Length = 2

Commit complete.

See Also:

24.10 Multiple Data Guides Per Document Set

A data guide reflects the shape of a given set of JSON documents. If a JSON column contains different types of documents, with different structure or type information, you can create and use different data guides for the different kinds of documents.

Data Guides For Different Kinds of JSON Documents

JSON documents need not, and typically do not, follow a prescribed schema. This is true even for documents that are used similarly in a given application; they may differ in structural ways (shape), and field types may differ.

A JSON data guide summarizes the structural and type information of a given set of documents. In general, the more similar the structure and type information of the documents in a given set, the more useful the resulting data guide.

A data guide is created for a given column of JSON data. If the column contains very different kinds of documents (for example, purchase orders and health records) then a single data guide for the column is likely to be of limited use.

One way to address this concern is to put different kinds of JSON documents in different JSON columns. But sometimes other considerations decide in favor of mixing document types in the same column.

In addition, documents of the same general type, which you decide to store in the same column, can nevertheless differ in relatively systematic ways. This includes the case of evolving document shape and type information. For example, the structure of tax-information documents could change from year to year.

When you create a data guide you can decide which information to summarize. And you can thus create different data guides for the same JSON column, to represent different subsets of the document set.

An additional aid in this regard is to have a separate, non-JSON, column in the same table, which is used to label, or categorize, the documents in a JSON column.

In the case of the purchase-order documents used in our examples, let’s suppose that their structure can evolve significantly from year to year, so that column date_loaded of table j_purchaseorder can be used to group them into subsets of reasonably similar shape. Example 24-17 adds a purchase-order document for 2015, and Example 24-18 adds a purchase-order document for 2016. (Compare with the documents for 2014, which are added in Example 4-3.)

Using a SQL Aggregate Function to Create Multiple Data Guides

Oracle SQL function json_dataguide is in fact an aggregate function. An aggregate function returns a single result row based on groups of rows, rather than on a single row. It is typically used in a SELECT list for a query that has a GROUP BY clause, which divides the rows of a queried table or view into groups. The aggregate function applies to each group of rows, returning a single result row for each group. For example, aggregate function avg returns the average of a group of values.

Function json_dataguide aggregates JSON data to produce a summary, or specification, of it, which is returned in the form of a JSON document. In other words, for each group of JSON documents to which they are applied, they return a data guide.

If you omit GROUP BY then this function returns a single data guide that summarizes all of the JSON data in the subject JSON column.

Example 24-19 queries the documents of JSON column po_document, grouping them to produce three data guides, one for each year of column date_loaded.

Example 24-17 Adding a 2015 Purchase-Order Document

The 2015 purchase-order format uses only part number, reference, and line-items as its top-level fields, and these fields use prefix PO_. Each line item contains only a part number and a quantity.

INSERT INTO j_purchaseorder
  VALUES (
    SYS_GUID(),
    to_date('30-JUN-2015'),
    '{"PO_Number"     : 4230,
      "PO_Reference"  : "JDEER-20140421",
      "PO_LineItems"  : [ {"Part_Number"  : 230912362345,
                          "Quantity"     : 3.0} ]}');

Example 24-18 Adding a 2016 Purchase-Order Document

The 2016 format uses PO_ID instead of PO_Number, PO_Ref instead of PO_Reference, PO_Items instead of PO_LineItems, Part_No instead of Part_Number, and Item_Quantity instead of Quantity.

INSERT INTO j_purchaseorder
  VALUES (
    SYS_GUID(),
    to_date('30-MAR-2016'),
    '{"PO_ID"     : 4230,
      "PO_Ref"  : "JDEER-20140421",
      "PO_Items"  : [ {"Part_No"       : 98981327234,
                      "Item_Quantity" : 13} ]}');

Example 24-19 Creating Multiple Data Guides With Aggregate Function JSON_DATAGUIDE

This example uses aggregate SQL function json_dataguide to obtain three flatFoot 1 data guides, one for each year-specific format. The data guide for 2014 is shown only partially — it is the same as the data guide from A Flat Data Guide For Purchase-Order Documents, except that no statistics fields are present. (Data guides returned by functions json_dataguide do not contain any statistics fields.

SELECT extract(YEAR FROM date_loaded), json_dataguide(po_document)
  FROM j_purchaseorder
  GROUP BY extract(YEAR FROM date_loaded)
  ORDER BY extract(YEAR FROM date_loaded) DESC;

EXTRACT(YEARFROMDATE_LOADED)
----------------------------
JSON_DATAGUIDE(PO_DOCUMENT)
---------------------------
                        2016
[
  {
    "o:path" : "$.PO_ID",
    "type" : "number",
    "o:length" : 4
  },
  {
    "o:path" : "$.PO_Ref",
    "type" : "string",
    "o:length" : 16
  },
  {
    "o:path" : "$.PO_Items",
    "type" : "array",
    "o:length" : 64
  },
  {
    "o:path" : "$.PO_Items.Part_No",
    "type" : "number",
    "o:length" : 16
  },
  {
    "o:path" : "$.PO_Items.Item_Quantity",
    "type" : "number",
    "o:length" : 2
  }
]

                        2015
[
  {
    "o:path" : "$.PO_Number",
    "type" : "number",
    "o:length" : 4
  },
  {
    "o:path" : "$.PO_LineItems",
    "type" : "array",
    "o:length" : 64
  },
  {
    "o:path" : "$.PO_LineItems.Quantity",
    "type" : "number",
    "o:length" : 4
  },
  {
    "o:path" : "$.PO_LineItems.Part_Number",
    "type" : "number",
    "o:length" : 16
  },
  {
    "o:path" : "$.PO_Reference",
    "type" : "string",
    "o:length" : 16
  }
]

                        2014
[
  {
    "o:path" : "$.User",
    "type" : "string",
    "o:length" : 8
  },
  {
    "o:path" : "$.PONumber",
    "type" : "number",
    "o:length" : 4
  },
...
  {
    "o:path" : "$.\"Special Instructions\"",
    "type" : "string",
    "o:length" : 8
  }
]

3 rows selected.

See Also:

Oracle Database SQL Language Reference for information about SQL function json_dataguide

24.11 Querying a Data Guide

A data guide is information about a set of JSON documents. You can query it from a flat data guide that you obtain using either Oracle SQL function json_dataguide or PL/SQL function DBMS_JSON.get_index_dataguide. In the latter case, a data guide-enabled JSON search index must be defined on the JSON data.

See Also:

Example 24-20 Querying a Data Guide Obtained Using JSON_DATAGUIDE

This example uses SQL/JSON function json_dataguide to obtain a flat data guide. It then queries the relational columns projected on the fly by SQL/JSON function json_table from fields o:path, type, and o:length. It returns the projected columns ordered lexicographically by the path column created, jpath.

If DBMS_JSON.GATHER_STATS were included in a third argument to json_dataguide then the data guide returned would also include statistical fields.

WITH dg_t AS (SELECT json_dataguide(po_document) dg_doc
                FROM j_purchaseorder)
  SELECT jt.*
    FROM dg_t,
         json_table(dg_doc, '$[*]'
           COLUMNS
             jpath   VARCHAR2(40) PATH '$."o:path"',
             type    VARCHAR2(10) PATH '$."type"',
             tlength NUMBER       PATH '$."o:length"') jt
   ORDER BY jt.jpath;

JPATH                                    TYPE          TLENGTH
---------------------------------------- ------------- -------
$."Special Instructions"                 string              8
$.AllowPartialShipment                   boolean             4
$.CostCenter                             string              4
$.LineItems                              array             512
$.LineItems.ItemNumber                   number              1
$.LineItems.Part                         object            128
$.LineItems.Part.Description             string             32
$.LineItems.Part.UPCCode                 number             16
$.LineItems.Part.UnitPrice               number              8
$.LineItems.Quantity                     number              4
$.PONumber                               number              4
$.PO_LineItems                           array              64
$.Reference                              string             16
$.Requestor                              string             16
$.ShippingInstructions                   object            256
$.ShippingInstructions.Address           object            128
$.ShippingInstructions.Address.city      string             32
$.ShippingInstructions.Address.country   string             32
$.ShippingInstructions.Address.state     string              2
$.ShippingInstructions.Address.street    string             32
$.ShippingInstructions.Address.zipCode   number              8
$.ShippingInstructions.Phone             array             128
$.ShippingInstructions.Phone             string             16
$.ShippingInstructions.Phone.number      string             16
$.ShippingInstructions.Phone.type        string              8
$.ShippingInstructions.name              string             16
$.User                                   string              8

Example 24-21 Querying a Data Guide With Index Data For Paths With Frequency at Least 80%

This example uses PL/SQL function DBMS_JSON.get_index_dataguide with format value DBMS_JSON.FORMAT_FLAT to obtain a flat data guide from the data-guide information stored in a data guide-enabled JSON search index. It then queries the relational columns projected on the fly from fields o:path, type, o:length, and o:frequency by SQL/JSON function json_table.

The value of field o:frequency is a statistic that records the frequency of occurrence, across the document set, of each field in a document. It is available only if you have gathered statistics on the document set. The frequency of a given field is the number of documents containing that field divided by the total number of documents in the JSON column, expressed as a percentage.

WITH dg_t AS
  (SELECT DBMS_JSON.get_index_dataguide('J_PURCHASEORDER',
                                        'PO_DOCUMENT',
                                        DBMS_JSON.FORMAT_FLAT) dg_doc

    FROM DUAL)
 SELECT jt.*
   FROM dg_t,
        json_table(dg_doc, '$[*]'
          COLUMNS
            jpath     VARCHAR2(40) PATH '$."o:path"',
            type      VARCHAR2(10) PATH '$."type"',
            tlength   NUMBER       PATH '$."o:length"',
            frequency NUMBER       PATH '$."o:frequency"') jt
   WHERE jt.frequency > 80;

JPATH                                    TYPE          TLENGTH  FREQUENCY
---------------------------------------- ------------- -------- ---------
$.User                                   string              8        100
$.PONumber                               number              4        100
$.LineItems                              array             512        100
$.LineItems.Part                         object            128        100
$.LineItems.Part.UPCCode                 number             16        100
$.LineItems.Part.UnitPrice               number              8        100
$.LineItems.Part.Description             string             32        100
$.LineItems.Quantity                     number              4        100
$.LineItems.ItemNumber                   number              1        100
$.Reference                              string             16        100
$.Requestor                              string             16        100
$.CostCenter                             string              4        100
$.ShippingInstructions                   object            256        100
$.ShippingInstructions.name              string             16        100
$.ShippingInstructions.Address           object            128        100
$.ShippingInstructions.Address.city      string             32        100
$.ShippingInstructions.Address.state     string              2        100
$.ShippingInstructions.Address.street    string             32        100
$.ShippingInstructions.Address.country   string             32        100
$.ShippingInstructions.Address.zipCode   number              8        100
$."Special Instructions"                 string              8        100

Related Topics

24.12 A Flat Data Guide For Purchase-Order Documents

The fields of a sample flat data guide are described. It corresponds to a set of purchase-order documents.

The only JSON Schema keyword used in a flat data guide is type. The other fields are all Oracle data-guide fields, which have prefix o:.

Example 24-22 shows a flat data guide for the purchase-order documents in table j_purchaseorder. Things to note:

  • The values of o:preferred_column_name use prefix PO_DOCUMENT$. This prefix comes from using DBMS_JSON.get_index_dataguide to obtain this data guide.

  • The value of o:length is 8 for path $.User, for example, in spite of the fact that the actual lengths of the field values are 5. This is because the value of o:length is always a power of two.

  • The value of o:path for field Special Instructions is wrapped in double quotation marks ("Special Instructions") because of the embedded space character.

Example 24-22 Flat Data Guide For Purchase Orders

Paths are bold. JSON schema keywords are italic. Preferred column names that result from using DBMS_JSON.rename_column are also italic. The formatting used is similar to that produced by using SQL/JSON function json_dataguide with format arguments DBMS_JSON.FORMAT_FLAT and DBMS_JSON.PRETTY.

Note that fields o:frequency, o:low_value, o:high_value, o:num_nulls, and o:last_analyzed are present. This can only be because statistics were gathered on the document set. Their values reflect the state as of the last statistics gathering.

See Example 24-3 for an example of gathering statistics for this data.

In order for statistics to be gathered, either the data guide needs to be based on a JSON search index or it needs to be created using function json_dataguide, specifying DBMS_JSON.GATHER_STATS in the third argument.

[
  {
    "o:path": "$.User",
    "type": "string",
    "o:length": 8,
    "o:preferred_column_name": "PO_DOCUMENT$User",
    "o:frequency": 100,
    "o:low_value": "ABULL",
    "o:high_value": "SBELL",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.PONumber",
    "type": "number",
    "o:length": 4,
    "o:preferred_column_name": "PONumber",
    "o:frequency": 100,
    "o:low_value": "672",
    "o:high_value": "1600",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.LineItems",
    "type": "array",
    "o:length": 512,
    "o:preferred_column_name": "PO_DOCUMENT$LineItems",
    "o:frequency": 100,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.LineItems.Part",
    "type": "object",
    "o:length": 128,
    "o:preferred_column_name": "PO_DOCUMENT$Part",
    "o:frequency": 100,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.LineItems.Part.UPCCode",
    "type": "number",
    "o:length": 16,
    "o:preferred_column_name": "PO_DOCUMENT$UPCCode",
    "o:frequency": 100,
    "o:low_value": "13131092899",
    "o:high_value": "717951002396",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.LineItems.Part.UnitPrice",
    "type": "number",
    "o:length": 8,
    "o:preferred_column_name": "PO_DOCUMENT$UnitPrice",
    "o:frequency": 100,
    "o:low_value": "20",
    "o:high_value": "19.95",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.LineItems.Part.Description",
    "type": "string",
    "o:length": 32,
    "o:preferred_column_name": "PartDescription",
    "o:frequency": 100,
    "o:low_value": "Nixon",
    "o:high_value": "Eric Clapton: Best Of 1981-1999",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.LineItems.Quantity",
    "type": "number",
    "o:length": 4,
    "o:preferred_column_name": "PO_DOCUMENT$Quantity",
    "o:frequency": 100,
    "o:low_value": "5",
    "o:high_value": "9.0",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.LineItems.ItemNumber",
    "type": "number",
    "o:length": 1,
    "o:preferred_column_name": "ItemNumber",
    "o:frequency": 100,
    "o:low_value": "1",
    "o:high_value": "3",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.Reference",
    "type": "string",
    "o:length": 16,
    "o:preferred_column_name": "PO_DOCUMENT$Reference",
    "o:frequency": 100,
    "o:low_value": "ABULL-20140421",
    "o:high_value": "SBELL-20141017",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.Requestor",
    "type": "string",
    "o:length": 16,
    "o:preferred_column_name": "PO_DOCUMENT$Requestor",
    "o:frequency": 100,
    "o:low_value": "Sarah Bell",
    "o:high_value": "Alexis Bull",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.CostCenter",
    "type": "string",
    "o:length": 4,
    "o:preferred_column_name": "PO_DOCUMENT$CostCenter",
    "o:frequency": 100,
    "o:low_value": "A50",
    "o:high_value": "A50",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.AllowPartialShipment",
    "type": "boolean",
    "o:length": 4,
    "o:preferred_column_name": "PO_DOCUMENT$AllowPartialShipment",
    "o:frequency": 50,
    "o:low_value": "true",
    "o:high_value": "true",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions",
    "type": "object",
    "o:length": 256,
    "o:preferred_column_name": "PO_DOCUMENT$ShippingInstructions",
    "o:frequency": 100,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.name",
    "type": "string",
    "o:length": 16,
    "o:preferred_column_name": "PO_DOCUMENT$name",
    "o:frequency": 100,
    "o:low_value": "Sarah Bell",
    "o:high_value": "Alexis Bull",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Phone",
    "type": "string",
    "o:length": 16,
    "o:preferred_column_name": "Phone",
    "o:frequency": 50,
    "o:low_value": "983-555-6509",
    "o:high_value": "983-555-6509",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Phone",
    "type": "array",
    "o:length": 128,
    "o:preferred_column_name": "PO_DOCUMENT$Phone_1",
    "o:frequency": 50,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Phone.type",
    "type": "string",
    "o:length": 8,
    "o:preferred_column_name": "PhoneType",
    "o:frequency": 50,
    "o:low_value": "Mobile",
    "o:high_value": "Office",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Phone.number",
    "type": "string",
    "o:length": 16,
    "o:preferred_column_name": "PhoneNumber",
    "o:frequency": 50,
    "o:low_value": "415-555-1234",
    "o:high_value": "909-555-7307",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Address",
    "type": "object",
    "o:length": 128,
    "o:preferred_column_name": "PO_DOCUMENT$Address",
    "o:frequency": 100,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Address.city",
    "type": "string",
    "o:length": 32,
    "o:preferred_column_name": "PO_DOCUMENT$city",
    "o:frequency": 100,
    "o:low_value": "South San Francisco",
    "o:high_value": "South San Francisco",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Address.state",
    "type": "string",
    "o:length": 2,
    "o:preferred_column_name": "PO_DOCUMENT$state",
    "o:frequency": 100,
    "o:low_value": "CA",
    "o:high_value": "CA",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Address.street",
    "type": "string",
    "o:length": 32,
    "o:preferred_column_name": "PO_DOCUMENT$street",
    "o:frequency": 100,
    "o:low_value": "200 Sporting Green",
    "o:high_value": "200 Sporting Green",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Address.country",
    "type": "string",
    "o:length": 32,
    "o:preferred_column_name": "PO_DOCUMENT$country",
    "o:frequency": 100,
    "o:low_value": "United States of America",
    "o:high_value": "United States of America",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.ShippingInstructions.Address.zipCode",
    "type": "number",
    "o:length": 8,
    "o:preferred_column_name": "PO_DOCUMENT$zipCode",
    "o:frequency": 100,
    "o:low_value": "99236",
    "o:high_value": "99236",
    "o:num_nulls": 0,
    "o:last_analyzed": "2016-03-31T12:17:53"
  },
  {
    "o:path": "$.\"Special Instructions\"",
    "type": "string",
    "o:length": 8,
    "o:preferred_column_name": "PO_DOCUMENT$SpecialInstructions",
    "o:frequency": 100,
    "o:low_value": "Courier",
    "o:high_value": "Courier",
    "o:num_nulls": 1,
    "o:last_analyzed": "2016-03-31T12:17:53"
  }
]

See Also:

24.13 A Hierarchical Data Guide For Purchase-Order Documents

The fields of a sample hierarchical data guide are described. It corresponds to a set of purchase-order documents.

Example 24-23 shows a hierarchical data guide for the purchase-order documents in table j_purchaseorder. The data guide was created using procedure DBMS_JSON.get_index_dataguide.

Example 24-23 Hierarchical Data Guide For Purchase Orders

Field names are bold. JSON Schema keywords are italic. Preferred column names that result from using DBMS_JSON.rename_column are also italic. The formatting used is similar to that produced by using SQL/JSON function json_dataguide with format arguments DBMS_JSON.FORMAT_HIERARCHICAL and DBMS_JSON.PRETTY.

Note that statistical fields o:frequency, o:low_value, o:high_value, o:num_nulls, and o:last_analyzed are present in this example. This can only be because statistics were gathered on the document set. Their values reflect the state as of the last statistics gathering. See Example 24-3 for an example of gathering statistics for this data.

A hierarchical data guide created by SQL function json_dataguide would look similar to this example, but with these differences:

  • The values of field o:preferred_column_name would the same as the field names in your JSON documents. That is, they would not be prefixed with PO_DOCUMENT$.

  • Statistical fields would be present only if json_dataguide were invoked with DBMS_JSON.GATHER_STATS in its third argument. And in this case field o:sample_size would also be present, following field o:last_analyzed. (The value of o:sample_size would be 2 if there are two documents in the queried column of JSON data.)

{
  "type": "object",
  "properties": {
    "User": {
      "type": "string",
      "o:length": 8,
      "o:preferred_column_name": "PO_DOCUMENT$User",
      "o:frequency": 100,
      "o:low_value": "ABULL",
      "o:high_value": "SBELL",
      "o:num_nulls": 0,
      "o:last_analyzed": "2016-03-31T12:17:53"
    },
    "PONumber": {
      "type": "number",
      "o:length": 4,
      "o:preferred_column_name": "PONumber",
      "o:frequency": 100,
      "o:low_value": "672",
      "o:high_value": "1600",
      "o:num_nulls": 0,
      "o:last_analyzed": "2016-03-31T12:17:53"
    },
    "LineItems": {
      "type": "array",
      "o:length": 512,
      "o:preferred_column_name": "PO_DOCUMENT$LineItems",
      "o:frequency": 100,
      "o:last_analyzed": "2016-03-31T12:17:53",
      "items": {
        "properties": {
          "Part": {
            "type": "object",
            "o:length": 128,
            "o:preferred_column_name": "PO_DOCUMENT$Part",
            "o:frequency": 100,
            "o:last_analyzed": "2016-03-31T12:17:53",
            "properties": {
              "UPCCode": {
                "type": "number",
                "o:length": 16,
                "o:preferred_column_name": "PO_DOCUMENT$UPCCode",
                "o:frequency": 100,
                "o:low_value": "13131092899",
                "o:high_value": "717951002396",
                "o:num_nulls": 0,
                "o:last_analyzed": "2016-03-31T12:17:53"
              },
              "UnitPrice": {
                "type": "number",
                "o:length": 8,
                "o:preferred_column_name": "PO_DOCUMENT$UnitPrice",
                "o:frequency": 100,
                "o:low_value": "20",
                "o:high_value": "19.95",
                "o:num_nulls": 0,
                "o:last_analyzed": "2016-03-31T12:17:53"
              },
              "Description": {
                "type": "string",
                "o:length": 32,
                "o:preferred_column_name": "PartDescription",
                "o:frequency": 100,
                "o:low_value": "Nixon",
                "o:high_value": "Eric Clapton: Best Of 1981-1999",
                "o:num_nulls": 0,
                "o:last_analyzed": "2016-03-31T12:17:53"
              }
            }
          },
          "Quantity": {
            "type": "number",
            "o:length": 4,
            "o:preferred_column_name": "PO_DOCUMENT$Quantity",
            "o:frequency": 100,
            "o:low_value": "5",
            "o:high_value": "9.0",
            "o:num_nulls": 0,
            "o:last_analyzed": "2016-03-31T12:17:53"
          },
          "ItemNumber": {
            "type": "number",
            "o:length": 1,
            "o:preferred_column_name": "ItemNumber",
            "o:frequency": 100,
            "o:low_value": "1",
            "o:high_value": "3",
            "o:num_nulls": 0,
            "o:last_analyzed": "2016-03-31T12:17:53"
          }
        }
      }
    },
    "Reference": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "PO_DOCUMENT$Reference",
      "o:frequency": 100,
      "o:low_value": "ABULL-20140421",
      "o:high_value": "SBELL-20141017",
      "o:num_nulls": 0,
      "o:last_analyzed": "2016-03-31T12:17:53"
    },
    "Requestor": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "PO_DOCUMENT$Requestor",
      "o:frequency": 100,
      "o:low_value": "Sarah Bell",
      "o:high_value": "Alexis Bull",
      "o:num_nulls": 0,
      "o:last_analyzed": "2016-03-31T12:17:53"
    },
    "CostCenter": {
      "type": "string",
      "o:length": 4,
      "o:preferred_column_name": "PO_DOCUMENT$CostCenter",
      "o:frequency": 100,
      "o:low_value": "A50",
      "o:high_value": "A50",
      "o:num_nulls": 0,
      "o:last_analyzed": "2016-03-31T12:17:53"
    },
    "AllowPartialShipment": {
      "type": "boolean",
      "o:length": 4,
      "o:preferred_column_name": "PO_DOCUMENT$AllowPartialShipment",
      "o:frequency": 50,
      "o:last_analyzed": "2016-03-31T12:17:53"
    },
    "ShippingInstructions": {
      "type": "object",
      "o:length": 256,
      "o:preferred_column_name": "PO_DOCUMENT$ShippingInstructions",
      "o:frequency": 100,
      "o:last_analyzed": "2016-03-31T12:17:53",
      "properties": {
        "name": {
          "type": "string",
          "o:length": 16,
          "o:preferred_column_name": "PO_DOCUMENT$name",
          "o:frequency": 100,
          "o:low_value": "Sarah Bell",
          "o:high_value": "Alexis Bull",
          "o:num_nulls": 0,
          "o:last_analyzed": "2016-03-31T12:17:53"
        },
        "Phone": {
          "oneOf": [
            {
              "type": "string",
              "o:length": 16,
              "o:preferred_column_name": "Phone",
              "o:frequency": 50,
              "o:low_value": "983-555-6509",
              "o:high_value": "983-555-6509",
              "o:num_nulls": 0,
              "o:last_analyzed": "2016-03-31T12:17:53"
            },
            {
              "type": "array",
              "o:length": 128,
              "o:preferred_column_name": "PO_DOCUMENT$Phone_1",
              "o:frequency": 50,
              "o:last_analyzed": "2016-03-31T12:17:53",
              "items": {
                "properties": {
                  "type": {
                    "type": "string",
                    "o:length": 8,
                    "o:preferred_column_name": "PhoneType",
                    "o:frequency": 50,
                    "o:low_value": "Mobile",
                    "o:high_value": "Office",
                    "o:num_nulls": 0,
                    "o:last_analyzed": "2016-03-31T12:17:53"
                  },
                  "number": {
                    "type": "string",
                    "o:length": 16,
                    "o:preferred_column_name": "PhoneNumber",
                    "o:frequency": 50,
                    "o:low_value": "415-555-1234",
                    "o:high_value": "909-555-7307",
                    "o:num_nulls": 0,
                    "o:last_analyzed": "2016-03-31T12:17:53"
                  }
                }
              }
            }
          ]
        },
        "Address": {
          "type": "object",
          "o:length": 128,
          "o:preferred_column_name": "PO_DOCUMENT$Address",
          "o:frequency": 100,
          "o:last_analyzed": "2016-03-31T12:17:53",
          "properties": {
            "city": {
              "type": "string",
              "o:length": 32,
              "o:preferred_column_name": "PO_DOCUMENT$city",
              "o:frequency": 100,
              "o:low_value": "South San Francisco",
              "o:high_value": "South San Francisco",
              "o:num_nulls": 0,
              "o:last_analyzed": "2016-03-31T12:17:53"
            },
            "state": {
              "type": "string",
              "o:length": 2,
              "o:preferred_column_name": "PO_DOCUMENT$state",
              "o:frequency": 100,
              "o:low_value": "CA",
              "o:high_value": "CA",
              "o:num_nulls": 0,
              "o:last_analyzed": "2016-03-31T12:17:53"
            },
            "street": {
              "type": "string",
              "o:length": 32,
              "o:preferred_column_name": "PO_DOCUMENT$street",
              "o:frequency": 100,
              "o:low_value": "200 Sporting Green",
              "o:high_value": "200 Sporting Green",
              "o:num_nulls": 0,
              "o:last_analyzed": "2016-03-31T12:17:53"
            },
            "country": {
              "type": "string",
              "o:length": 32,
              "o:preferred_column_name": "PO_DOCUMENT$country",
              "o:frequency": 100,
              "o:low_value": "United States of America",
              "o:high_value": "United States of America",
              "o:num_nulls": 0,
              "o:last_analyzed": "2016-03-31T12:17:53"
            },
            "zipCode": {
              "type": "number",
              "o:length": 8,
              "o:preferred_column_name": "PO_DOCUMENT$zipCode",
              "o:frequency": 100,
              "o:low_value": "99236",
              "o:high_value": "99236",
              "o:num_nulls": 0,
              "o:last_analyzed": "2016-03-31T12:17:53"
            }
          }
        }
      }
    },
    "Special Instructions": {
      "type": "string",
      "o:length": 8,
      "o:preferred_column_name": "PO_DOCUMENT$SpecialInstructions",
      "o:frequency": 100,
      "o:low_value": "Courier",
      "o:high_value": "Courier",
      "o:num_nulls": 1,
      "o:last_analyzed": "2016-03-31T12:17:53"
    }
  }
}

See Also:

24.14 A Schema Data Guide For Purchase-Order Documents

The fields of a sample JSON-Schema data guide are described. It corresponds to a set of purchase-order documents.

Example 24-24 shows a schema data guide for the purchase-order documents in table j_purchaseorder. The data guide was created using SQL function json_dataguide.

Example 24-24 Schema Data Guide for Purchase-Order Documents

Field names are bold. JSON Schema keywords are italic.

{
  "type" : "object",
  "o:length" : 1,
  "properties" :
  {
    "User" :
    {
      "type" : "string",
      "o:length" : 8,
      "o:preferred_column_name" : "User"
    },
    "PONumber" :
    {
      "type" : "number",
      "o:length" : 2,
      "o:preferred_column_name" : "PONumber"
    },
    "LineItems" :
    {
      "type" : "array",
      "o:length" : 1,
      "o:preferred_column_name" : "LineItems",
      "items" :
      {
       "properties" :
       {
         "Part" :
         {
           "type" : "object",
           "o:length" : 1,
           "o:preferred_column_name" : "Part",
           "properties" :
           {
             "UPCCode" :
             {
              "type" : "number",
              "o:length" : 8,
              "o:preferred_column_name" : "UPCCode"
             },
             "UnitPrice" :
             {
              "type" : "number",
              "o:length" : 4,
              "o:preferred_column_name" : "UnitPrice"
             },
             "Description" :
             {
              "type" : "string",
              "o:length" : 32,
              "o:preferred_column_name" : "Description"
             }
           }
         },
         "Quantity" :
         {
           "type" : "number",
           "o:length" : 2,
           "o:preferred_column_name" : "Quantity"
         },
         "ItemNumber" :
         {
           "type" : "number",
           "o:length" : 2,
           "o:preferred_column_name" : "ItemNumber"
         }
       }
      }
    },
    "Reference" :
    {
      "type" : "string",
      "o:length" : 16,
      "o:preferred_column_name" : "Reference"
    },
    "Requestor" :
    {
      "type" : "string",
      "o:length" : 16,
      "o:preferred_column_name" : "Requestor"
    },
    "CostCenter" :
    {
      "type" : "string",
      "o:length" : 4,
      "o:preferred_column_name" : "CostCenter"
    },
    "AllowPartialShipment" :
    {
      "type" : "boolean",
      "o:length" : 8,
      "o:preferred_column_name" : "AllowPartialShipment"
    },
    "ShippingInstructions" :
    {
      "type" : "object",
      "o:length" : 1,
      "o:preferred_column_name" : "ShippingInstructions",
      "properties" :
      {
       "name" :
       {
         "type" : "string",
         "o:length" : 16,
         "o:preferred_column_name" : "name"
       },
       "Phone" :
       {
         "type" : "array",
         "o:length" : 1,
         "o:preferred_column_name" : "Phone",
         "items" :
         {
           "properties" :
           {
             "type" :
             {
              "type" : "string",
              "o:preferred_column_name" : "type"
             },
             "number" :
             {
              "type" : "string",
              "o:length" : 16,
              "o:preferred_column_name" : "number"
             }
           }
         }
       },
       "Address" :
       {
         "type" : "object",
         "o:length" : 1,
         "o:preferred_column_name" : "Address",
         "properties" :
         {
           "city" :
           {
             "type" : "string",
             "o:length" : 32,
             "o:preferred_column_name" : "city"
           },
           "state" :
           {
             "type" : "string",
             "o:length" : 2,
             "o:preferred_column_name" : "state"
           },
           "street" :
           {
             "type" : "string",
             "o:length" : 32,
             "o:preferred_column_name" : "street"
           },
           "country" :
           {
             "type" : "string",
             "o:length" : 32,
             "o:preferred_column_name" : "country"
           },
           "zipCode" :
           {
             "type" : "number",
             "o:length" : 4,
             "o:preferred_column_name" : "zipCode"
           }
         }
       }
      }
    },
    "Special Instructions" :
    {
      "type" : "null",
      "o:length" : 1,
      "o:preferred_column_name" : "Special Instructions"
    }
  }
}

See Also:



Footnote Legend

Footnote 1: If function json_dataguide were passed DBMS_JSON.FORMAT_HIERARCHICAL or DBMS_JSON.FORMAT_SCHEMA as optional second argument, then the result would be three hierarchical or schema data guides, respectively.