MySQL 8.0 Reference Manual Including MySQL NDB Cluster 8.0
This section contains information about JSON functions that
convert JSON data to tabular data. In MySQL 8.0.4 and later, one
such function—JSON_TABLE()
—is
supported.
JSON_TABLE(
expr
,
path
COLUMNS
(column_list
) [AS]
alias
)
Extracts data from a JSON document and returns it as a relational table having the specified columns. The complete syntax for this function is shown here:
JSON_TABLE(expr
,path
COLUMNS (column_list
) ) [AS]alias
column_list
:column
[,column
][, ...]column
:name
FOR ORDINALITY |name
type
PATHstring path
[on_empty
] [on_error
] |name
type
EXISTS PATHstring path
| NESTED [PATH]path
COLUMNS (column_list
)on_empty
: {NULL | DEFAULTjson_string
| ERROR} ON EMPTYon_error
: {NULL | DEFAULTjson_string
| ERROR} ON ERROR
expr
: This is an expression that
returns JSON data. This can be a constant
('{"a":1}'
), a column
(t1.json_data
, given table
t1
specified prior to
JSON_TABLE()
in the FROM
clause), or a function call
(JSON_EXTRACT(t1.json_data,'$.post.comments')
).
path
: A JSON path expression, which
is applied to the data source. We refer to the JSON value
matching the path as the row source; this
is used to generate a row of relational data. The
COLUMNS
clause evaluates the row source,
finds specific JSON values within the row source, and returns
those JSON values as SQL values in individual columns of a row
of relational data.
The alias
is required. The usual
rules for table aliases apply (see
Section 9.2, “Schema Object Names”).
JSON_TABLE()
supports four types of
columns, described in the following list:
: This type enumerates rows in the
name
FOR
ORDINALITYCOLUMNS
clause; the column named
name
is a counter whose type is
UNSIGNED INT
, and whose initial value
is 1. This is equivalent to specifying a column as
AUTO_INCREMENT
in a
CREATE TABLE
statement, and
can be used to distinguish parent rows with the same value
for multiple rows generated by a NESTED
[PATH]
clause.
: Columns
of this type are used to extract values specified by
name
type
PATH
string_path
[on_empty
]
[on_error
]string_path
.
type
is a MySQL scalar data
type (that is, it cannot be an object or array).
JSON_TABLE()
extracts data as JSON then
coerces it to the column type, using the regular automatic
type conversion applying to JSON data in MySQL. A missing
value triggers the on_empty
clause. Saving an object or array triggers the optional
on error
clause; this also
occurs when an error takes place during coercion from the
value saved as JSON to the table column, such as trying to
save the string 'asd'
to an integer
column.
: This column
returns 1 if any data is present at the location specified
by name
type
EXISTS PATH
path
path
, and 0 otherwise.
type
can be any valid MySQL
data type, but should normally be specified as some
variety of INT
.
NESTED [PATH]
: This
flattens nested objects or arrays in JSON data into a
single row along with the JSON values from the parent
object or array. Using multiple path
COLUMNS
(column_list
)PATH
options allows projection of JSON values from multiple
levels of nesting into a single row.
The path
is relative to the
parent path row path of JSON_TABLE()
,
or the path of the parent NESTED [PATH]
clause in the event of nested paths.
on empty
, if specified, determines
what JSON_TABLE()
does in the event that
data is missing (depending on type). This clause is also
triggered on a column in a NESTED PATH
clause when the latter has no match and a
NULL
complemented row is produced for it.
on empty
takes one of the following
values:
NULL ON EMPTY
: The column is set to
NULL
; this is the default behavior.
DEFAULT
: the provided
json_string
ON
EMPTYjson_string
is parsed as JSON,
as long as it is valid, and stored instead of the missing
value. Column type rules also apply to the default value.
ERROR ON EMPTY
: An error is thrown.
If used, on_error
takes one of the
following values with the corresponding result as shown here:
NULL ON ERROR
: The column is set to
NULL
; this is the default behavior.
DEFAULT
: The
json string
ON
ERRORjson_string
is parsed as JSON
(provided that it is valid) and stored instead of the
object or array.
ERROR ON ERROR
: An error is thrown.
Prior to MySQL 8.0.20, a warning was thrown if a type
conversion error occurred with NULL ON
ERROR
or DEFAULT ... ON ERROR
was
specified or implied. In MySQL 8.0.20 and later, this is no
longer the case. (Bug #30628330)
Previously, it was possible to specify ON
EMPTY
and ON ERROR
clauses in
either order. This runs counter to the SQL standard, which
stipulates that ON EMPTY
, if specified,
must precede any ON ERROR
clause. For this
reason, beginning with MySQL 8.0.20, specifying ON
ERROR
before ON EMPTY
is
deprecated; trying to do so causes the server to issue a
warning. Expect support for the nonstandard syntax to be
removed in a future version of MySQL.
When a value saved to a column is truncated, such as saving
3.14159 in a DECIMAL(10,1)
column, a warning is issued independently of any ON
ERROR
option. When multiple values are truncated in
a single statement, the warning is issued only once.
Prior to MySQL 8.0.21, when the expression and path passed to
this function resolved to JSON null,
JSON_TABLE()
raised an error. In MySQL
8.0.21 and later, it returns SQL NULL
in
such cases, in accordance with the SQL standard, as shown here
(Bug #31345503, Bug #99557):
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[ {"c1": null} ]',
->'$[*]' COLUMNS( c1 INT PATH '$.c1' ERROR ON ERROR )
->) as jt;
+------+ | c1 | +------+ | NULL | +------+ 1 row in set (0.00 sec)
The following query demonstrates the use of ON
EMPTY
and ON ERROR
. The row
corresponding to {"b":1}
is empty for the
path "$.a"
, and attempting to save
[1,2]
as a scalar produces an error; these
rows are highlighted in the output shown.
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"a":"3"},{"a":2},{"b":1},{"a":0},{"a":[1,2]}]',
->"$[*]"
->COLUMNS(
->rowid FOR ORDINALITY,
->ac VARCHAR(100) PATH "$.a" DEFAULT '111' ON EMPTY DEFAULT '999' ON ERROR,
->aj JSON PATH "$.a" DEFAULT '{"x": 333}' ON EMPTY,
->bx INT EXISTS PATH "$.b"
->)
->) AS tt;
+-------+------+------------+------+ | rowid | ac | aj | bx | +-------+------+------------+------+ | 1 | 3 | "3" | 0 | | 2 | 2 | 2 | 0 | | 3 | 111 | {"x": 333} | 1 | | 4 | 0 | 0 | 0 | | 5 | 999 | [1, 2] | 0 | +-------+------+------------+------+ 5 rows in set (0.00 sec)
Column names are subject to the usual rules and limitations governing table column names. See Section 9.2, “Schema Object Names”.
All JSON and JSON path expressions are checked for validity; an invalid expression of either type causes an error.
Each match for the path
preceding
the COLUMNS
keyword maps to an individual
row in the result table. For example, the following query
gives the result shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"x":2,"y":"8"},{"x":"3","y":"7"},{"x":"4","y":6}]',
->"$[*]" COLUMNS(
->xval VARCHAR(100) PATH "$.x",
->yval VARCHAR(100) PATH "$.y"
->)
->) AS jt1;
+------+------+ | xval | yval | +------+------+ | 2 | 8 | | 3 | 7 | | 4 | 6 | +------+------+
The expression "$[*]"
matches each element
of the array. You can filter the rows in the result by
modifying the path. For example, using
"$[1]"
limits extraction to the second
element of the JSON array used as the source, as shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"x":2,"y":"8"},{"x":"3","y":"7"},{"x":"4","y":6}]',
->"$[1]" COLUMNS(
->xval VARCHAR(100) PATH "$.x",
->yval VARCHAR(100) PATH "$.y"
->)
->) AS jt1;
+------+------+ | xval | yval | +------+------+ | 3 | 7 | +------+------+
Within a column definition, "$"
passes the
entire match to the column; "$.x"
and
"$.y"
pass only the values corresponding to
the keys x
and y
,
respectively, within that match. For more information, see
JSON Path Syntax.
NESTED PATH
(or simply
NESTED
; PATH
is
optional) produces a set of records for each match in the
COLUMNS
clause to which it belongs. If
there is no match, all columns of the nested path are set to
NULL
. This implements an outer join between
the topmost clause and NESTED [PATH]
. An
inner join can be emulated by applying a suitable condition in
the WHERE
clause, as shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[ {"a": 1, "b": [11,111]}, {"a": 2, "b": [22,222]}, {"a":3}]',
->'$[*]' COLUMNS(
->a INT PATH '$.a',
->NESTED PATH '$.b[*]' COLUMNS (b INT PATH '$')
->)
->) AS jt
->WHERE b IS NOT NULL;
+------+------+ | a | b | +------+------+ | 1 | 11 | | 1 | 111 | | 2 | 22 | | 2 | 222 | +------+------+
Sibling nested paths—that is, two or more instances of
NESTED [PATH]
in the same
COLUMNS
clause—are processed one
after another, one at a time. While one nested path is
producing records, columns of any sibling nested path
expressions are set to NULL
. This means
that the total number of records for a single match within a
single containing COLUMNS
clause is the sum
and not the product of all records produced by NESTED
[PATH]
modifiers, as shown here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"a": 1, "b": [11,111]}, {"a": 2, "b": [22,222]}]',
->'$[*]' COLUMNS(
->a INT PATH '$.a',
->NESTED PATH '$.b[*]' COLUMNS (b1 INT PATH '$'),
->NESTED PATH '$.b[*]' COLUMNS (b2 INT PATH '$')
->)
->) AS jt;
+------+------+------+ | a | b1 | b2 | +------+------+------+ | 1 | 11 | NULL | | 1 | 111 | NULL | | 1 | NULL | 11 | | 1 | NULL | 111 | | 2 | 22 | NULL | | 2 | 222 | NULL | | 2 | NULL | 22 | | 2 | NULL | 222 | +------+------+------+
A FOR ORDINALITY
column enumerates records
produced by the COLUMNS
clause, and can be
used to distinguish parent records of a nested path,
especially if values in parent records are the same, as can be
seen here:
mysql>SELECT *
->FROM
->JSON_TABLE(
->'[{"a": "a_val",
'>"b": [{"c": "c_val", "l": [1,2]}]},
'>{"a": "a_val",
'>"b": [{"c": "c_val","l": [11]}, {"c": "c_val", "l": [22]}]}]',
->'$[*]' COLUMNS(
->top_ord FOR ORDINALITY,
->apath VARCHAR(10) PATH '$.a',
->NESTED PATH '$.b[*]' COLUMNS (
->bpath VARCHAR(10) PATH '$.c',
->ord FOR ORDINALITY,
->NESTED PATH '$.l[*]' COLUMNS (lpath varchar(10) PATH '$')
->)
->)
->) as jt;
+---------+---------+---------+------+-------+ | top_ord | apath | bpath | ord | lpath | +---------+---------+---------+------+-------+ | 1 | a_val | c_val | 1 | 1 | | 1 | a_val | c_val | 1 | 2 | | 2 | a_val | c_val | 1 | 11 | | 2 | a_val | c_val | 2 | 22 | +---------+---------+---------+------+-------+
The source document contains an array of two elements; each of
these elements produces two rows. The values of
apath
and bpath
are the
same over the entire result set; this means that they cannot
be used to determine whether lpath
values
came from the same or different parents. The value of the
ord
column remains the same as the set of
records having top_ord
equal to 1, so these
two values are from a single object. The remaining two values
are from different objects, since they have different values
in the ord
column.