3 Overview of Storing and Managing JSON Data

This overview describes: (1) data types for JSON columns, (2) LOB storage considerations for JSON data, and (3) ensuring that JSON columns contain well-formed JSON data.

Data Types for JSON Columns

You can store JSON data in Oracle Database using columns whose data types are VARCHAR2, CLOB, or BLOB. The choice of which to use is typically motivated by the size of the JSON documents you need to manage:

  • Use VARCHAR2(4000) if you are sure that your largest JSON documents do not exceed 4000 bytes (or characters)Foot 1Footref 1.

    If you use Oracle Exadata then choosing VARCHAR2(4000) can improve performance by allowing the execution of some JSON operations to be pushed down to Exadata storage cells, for improved performance.

  • Use VARCHAR2(32767) if you know that some of your JSON documents are larger than 4000 bytes (or characters) and you are sure than none of the documents exceeds 32767 bytes (or characters)Footref 1.

    With VARCHAR2(32767), the first roughly 3.5K bytes (or characters) of a document is stored in line, as part of the table row. This means that the added cost of using VARCHAR2(32767) instead of VARCHAR2(4000) applies only to those documents that are larger than about 3.5K. If most of your documents are smaller than this then you will likely notice little performance difference from using VARCHAR2(4000).

    If you use Oracle Exadata then push-down is enabled for any documents that are stored in line.

  • Use BLOB (binary large object) or CLOB (character large object) storage if you know that you have some JSON documents that are larger than 32767 bytes (or characters)Footref 1.

The fact that you store JSON data in the database using standard SQL data types means that you can manipulate JSON data as you would manipulate any other data of those types. Storing JSON data using standard data types allows all features of Oracle Database, such as advanced replication, to work with tables containing JSON documents.

Considerations When Using LOB Storage for JSON Data

Oracle recommends that you use BLOB, not CLOB storage.

This is particularly relevant if the database character set is the Oracle-recommended value of AL32UTF8. In AL32UTF8 databases CLOB instances are stored using the UCS2 character set, which means that each character requires two bytes. This doubles the storage needed for a document if most of its content consists of characters that are represented using a single byte in character set AL32UTF8.

Even in cases where the database character set is not AL32UTF8, choosing BLOB over CLOB storage has the advantage that it avoids the need for character-set conversion when storing the JSON document  (see Character Sets and Character Encoding for JSON Data).

When using large objects (LOBs), Oracle recommends that you do the following:

  • Use the clause LOB (COLUMN_NAME) STORE AS (CACHE) in your CREATE TABLE statement, to ensure that read operations on the JSON documents are optimized using the database buffer cache.

  • Use SecureFiles LOBs. Consider also using Oracle Advanced Compression, to reduce the storage space needed for your JSON data. If you use compression then Oracle recommends option Medium Compression, which provides a good balance between space savings and performance.

SQL/JSON functions and conditions work with JSON data without any special considerations, whether the data is stored as BLOB or CLOB. From an application-development perspective, the API calls for working with BLOB content are nearly identical to those for working with CLOB content.

A downside of choosing BLOB storage over CLOB (for JSON or any other kind of data) is that it is sometimes more difficult to work with BLOB content using command-line tools such as SQL*Plus. For instance:

  • When selecting data from a BLOB column, if you want to view it as printable text then you must use SQL function to_clob.

  • When performing insert or update operations on a BLOB column, you must explicitly convert character strings to BLOB format using SQL function rawtohex.Foot 2

Ensure That JSON Columns Contain Well-Formed JSON Data

You can use SQL/JSON condition is json to check whether or not some JSON data is well formed. Oracle strongly recommends that you apply an is json check constraint to any JSON column, unless you expect some rows to contain something other than well-formed JSON data.

The overhead of parsing JSON is such that evaluating the condition should not have a significant impact on insert and update performance, and omitting the constraint means you cannot use the simple dot-notation syntax to query the JSON data.

What constitutes well-formed JSON data is a gray area. In practice, it is common for JSON data to have some characteristics that do not strictly follow the standard definition. You can control which syntax you require a given column of JSON data to conform to: the standard definition (strict syntax) or a JavaScript-like syntax found in common practice (lax syntax). The default SQL/JSON syntax for Oracle Database is lax. Which kind of syntax is used is controlled by condition is json. Applying an is json check constraint to a JSON column thus enables the use of lax JSON syntax, by default.

See Also:

Oracle Database SQL Language Reference for information about SQL function rawtohex

Footnote Legend

Footnote 1: Whether the limit is expressed in bytes or characters is determined by session parameter NLS_LENGTH_SEMANTICS.
Footnote 2: The return value of SQL function rawtohex is limited to 32767 bytes. The value is truncated to remove any converted data beyond this length.