Overview of Inner Join
An inner join is an operation that produces new rows by combining rows from two or more tables, based on the join predicates applied to related columns or fields between them. The result-set contains only those combined rows that satisfy the join predicates.
Conceptually, an inner join works as follows:
Consider that you need to perform an inner join of three tables A, B and C. The tables A and B are first joined. That is, if table A has N rows and n columns, and table B has M rows and m columns, every row in table A is joined with every row in table B. The resultant table AB would thus have (N * M) rows and (n + m) columns. Similarly table AB is now joined with table C, to form table ABC. The join predicates in the WHERE clause are then applied to the table ABC. Note that the join predicates must include equality predicates between all the shard key columns of the joined tables. The final result-set contains only the matching rows from the participating tables.
You specify the tables to be joined in the FROM clause of the SELECT statement and the join predicates in the WHERE clause. A join predicate is a predicate that references the columns or fields from one or more tables that are to be joined and specifies the filter conditions that need to be applied on them. In the case of inner join, the WHERE clause must include the equality predicate on all shard keys of the participating tables.
If you use a '*' with the ‘SELECT’ clause, wherein all the fields in the tables are returned, the order of fields in the result-set depends on the order in which you specify the tables in the FROM clause. If you provide a list of fields in the SELECT clause, then the order of the fields in the result-set is as specified in the SELECT clause.
- Only joins among tables in the same table hierarchy are allowed.
- Supports joining of tables that are in an ancestor-descendant relationship as well as tables that are not in an ancestor-descendant relationship.
- The join predicates must include equality predicates between all the shard key columns of the joined tables. To know more about shard keys, see CREATE TABLE. That is, for any pair of joined tables, a row from one table matches with a row from the other table only if they both have the same values on their shard key columns. You can use the DESCRIBE TABLE statement to identify the shard keys.
- The rest of the predicates in the WHERE clause are applied to these matching rows.
- An inner join is based on matching the shard keys of the participating tables, whereas NESTED TABLES and left outer join are based on matching the primary keys of the participating tables.
- The result-set of an inner join contains only the matching rows. Whereas, in the case of NESTED TABLES and left outer join, the unmatched row in the left table is also returned in the result-set with a corresponding NULL row in the right table.
- Inner join can be used to join tables that are not in an ancestor-descendant relationship. This is not possible in the case of left outer join and NESTED_TABLES. For more details, see Inner Join vs LOJ vs NESTED TABLES.
In essence, tables having an ancestor-descendant relationship between them can be joined using any of the three types of join. You can choose to use one of them based on your use case. If the tables to be joined are not in an ancestor-descendant relationship, then inner join must be used.