Overview of Storing and Managing JSON Data

3 Overview of Storing and Managing JSON Data

You can store JSON data in one or more columns of a table, alone or with relational columns. JSON data type is recommended, but you can also store JSON textually. If you store textual JSON data then use SQL/JSON condition is json to ensure that the data is well-formed.

If your use case is mainly document-centric, then consider using JSON collections. A JSON collection is a special table or view that provides JSON documents in a single JSON-type column named DATA.^Foot 1

JSON collections are convenient for use with document APIs, such as Oracle AI Database API for MongoDB, Simple Oracle Document Access (SODA), and REST, along with your usual drivers, frameworks, tools, and development methods. See JSON Collections.

Data Types for JSON Columns

You can store JSON data in Oracle AI Database using columns whose data types are JSON, VARCHAR2, CLOB, or BLOB. Whichever type you use, you can manipulate JSON data as you would manipulate any other data of those types. Storing JSON data using standard data types allows all features of Oracle AI Database, such as advanced replication, to work with tables containing JSON documents.

Oracle recommends that you use JSON data type, which stores JSON data in a native binary format.

If you instead use one of the other types, the choice of which one to use is typically motivated by the size of the JSON documents you need to manage:

Use VARCHAR2(4000) if you are sure that your largest JSON documents do not exceed 4000 bytes (or characters)^Foot 2 ^Foot 2.

If you use Oracle Exadata then choosing VARCHAR2(4000) can improve performance by allowing the execution of some JSON operations to be pushed down to Exadata storage cells, for improved performance.
Use VARCHAR2(32767) if you know that some of your JSON documents are larger than 4000 bytes (or characters) and you are sure than none of the documents exceeds 32767 bytes (or characters)^Foot 2.

With VARCHAR2(32767), the first roughly 3.5K bytes (or characters) of a document is stored in line, as part of the table row. This means that the added cost of using VARCHAR2(32767) instead of VARCHAR2(4000) applies only to those documents that are larger than about 3.5K. If most of your documents are smaller than this then you will likely notice little performance difference from using VARCHAR2(4000).

If you use Oracle Exadata then push-down is enabled for any documents that are stored in line.
Use BLOB (binary large object) or CLOB (character large object) storage if you know that you have some JSON documents that are larger than 32767 bytes (or characters)^Foot 2.

Ensure That JSON Columns Contain Well-Formed JSON Data

If you use JSON data type to store your JSON data (recommended) then the data is guaranteed to be well-formed JSON data — you cannot store it otherwise.

If you do not use JSON data type to store your JSON data then you can use SQL/JSON condition is json to check whether or not some JSON data is well formed. In this case Oracle strongly recommends that you apply an is json check constraint to any JSON column, unless you expect some rows to contain something other than well-formed JSON data.

The overhead of parsing JSON is such that evaluating the condition should not have a significant impact on insert and update performance, and omitting the constraint means you cannot use the simple dot-notation syntax to query the JSON data.

What constitutes well-formed JSON data is a gray area. In practice, it is common for JSON data to have some characteristics that do not strictly follow the standard definition. You can control which syntax you require a given column of JSON data to conform to: the standard definition (strict syntax) or a JavaScript-like syntax found in common practice (lax syntax). The default SQL/JSON syntax for Oracle AI Database is lax. Which kind of syntax is used is controlled by condition is json. Applying an is json check constraint to a JSON column thus enables the use of lax JSON syntax, by default.

Related Topics

Footnote Legend

Footnote 1: With a collection table, each document automatically has a unique document-identifier field _id. With a collection view, documents can, but need not, have a document-identifier field, and it need not be named _id.
Footnote 2: Whether the limit is expressed in bytes or characters is determined by session parameter NLS_LENGTH_SEMANTICS.