JOIN clauses allow records from multiple statements and/or named states to be combined, based on a relationship between certain attributes in these statements.
JOIN clauses, which conform to a subset of the SQL standard, do a join with the specified join condition. The join condition may be an arbitrary Boolean expression referring to the attributes in the
FROM statement. The expression must be enclosed in parentheses.
JOIN clause always modifies a
FROM clause. Two named sources (one or both of which can be named states) can be indicated in the
FROM clause. Fields must be dot-qualified to indicate which source they come from, except in queries from a single table.
Self-join is supported. Statement aliasing is required for self-join.
Both input tables must result from
RETURN statements (that is, from intermediate results).
Any number of joins can be performed in a single statement.
JOINis as follows:
FROM <statement1> [alias] [INNER,CROSS,LEFT,RIGHT,FULL] JOIN <statement2> [alias] ON (Boolean-expression) [JOIN <statementN> [alias] ON (Boolean-expression)]*where statement is either a statement or a named state. Note that you can put multiple
JOINclauses under a
FROMclause, but there must be exactly one
FROMclause in any statement.
Types of joins
EQL supports the following types of joins:
INNER JOINjoins records on the left and right sides, then filters the result records by the join condition. That means that only rows for which the join condition is
TRUEare included. If you do not specify the join type,
INNER JOIN. Note that the
INNERkeyword can be used only with
JOIN, and EQL will throw an error if it is used with the other join types.
RIGHT JOIN, and
FULL JOIN(collectively called outer joins) extend the result of an
INNER JOINwith records from a side for which no record on the other side matched the join condition. When such an additional record is included from one side, the record in the join result contains NULLs for all attributes from the other side.
LEFT JOINincludes all such rows from the left side,
RIGHT JOINincludes all such rows from the right side, and
FULL JOINincludes all such rows from either side.
CROSS JOINis the Cartesian product of the left and right sides. Each result record has the assignments from both of the corresponding records from the two sides.
Keep in mind that if not used correctly, joins can cause the Dgraph to grow beyond available RAM because they can easily create very large results. For example, a
CROSS JOIN of a result with 100 records and a result with 200 records would contain 20,000 records. Two best practices are to avoid
CROSS JOIN if possible and to be careful with
ON conditions so that the number of results are reasonable.
INNER JOIN example
INNER JOINexample finds employees whose sales in a particular subcategory account for more than 10% of that subcategory's total:
DEFINE EmployeeTotals AS SELECT ARB(DimEmployee_FullName) AS Name, SUM(FactSales_SalesAmount) AS Total FROM SaleState GROUP BY DimEmployee_EmployeeKey, ProductSubcategoryName; DEFINE SubcategoryTotals AS SELECT SUM(FactSales_SalesAmount) AS Total FROM SaleState GROUP BY ProductSubcategoryName; RETURN Stars AS SELECT EmployeeTotals.Name AS Name, EmployeeTotals.ProductSubcategoryName AS Subcategory, 100 * EmployeeTotals.Total / SubcategoryTotals.Total AS Pct FROM EmployeeTotals INNER JOIN SubcategoryTotals ON (EmployeeTotals.ProductSubcategoryName = SubcategoryTotals.ProductSubcategoryName) HAVING Pct > 10
INNER JOINcomputes cumulative daily sales totals per employee:
DEFINE Days AS SELECT FactSales_OrderDateKey AS DateKey, DimEmployee_EmployeeKey AS EmployeeKey, ARB(DimEmployee_FullName) AS EmployeeName, SUM(FactSales_SalesAmount) AS DailyTotal FROM SaleState GROUP BY DateKey, EmployeeKey; RETURN CumulativeDays AS SELECT SUM(PreviousDays.DailyTotal) AS CumulativeTotal, Day.DateKey AS DateKey, Day.EmployeeKey AS EmployeeKey, ARB(Day.EmployeeName) AS EmployeeName FROM Days Day JOIN Days PreviousDays ON (PreviousDays.DateKey <= Day.DateKey) GROUP BY DateKey, EmployeeKey
LEFT JOIN examples
LEFT JOINexample computes the top 5 subcategories along with an Other bucket, for use in a pie chart:
DEFINE Totals AS SELECT SUM(FactSales_SalesAmount) AS Total FROM SaleState GROUP BY ProductSubcategoryName; DEFINE Top5 AS SELECT ARB(Total) AS Total FROM Totals GROUP BY ProductSubcategoryName ORDER BY Total DESC PAGE(0,5); RETURN Chart AS SELECT COALESCE(Top5.ProductSubcategoryName, 'Other') AS Subcategory, SUM(Totals.Total) AS Total FROM Totals LEFT JOIN Top5 ON (Totals.ProductSubcategoryName = Top5.ProductSubcategoryName) GROUP BY Subcategory
LEFT JOINcomputes metrics for each product in a particular region, ensuring all products appear in the list even if they have never been sold in that region:
DEFINE Product AS SELECT ProductAlternateKey AS Key, ARB(ProductName) AS Name FROM SaleState GROUP BY Key; DEFINE RegionTrans AS SELECT ProductAlternateKey AS ProductKey, FactSales_SalesAmount AS Amount FROM SaleState WHERE DimSalesTerritory_SalesTerritoryRegion='United Kingdom'; RETURN Results AS SELECT Product.Key AS ProductKey, ARB(Product.Name) AS ProductName, COALESCE(SUM(RegionTrans.Amount), 0) AS SalesTotal, COUNT(RegionTrans.Amount) AS TransactionCount FROM Product LEFT JOIN RegionTrans ON (Product.Key = RegionTrans.ProductKey) GROUP BY ProductKey
FULL JOIN example
FULL JOINcomputes the top 10 employees' sales totals for the top 10 products, ensuring that each employee and each product appears in the result:
DEFINE TopEmployees AS SELECT DimEmployee_EmployeeKey AS Key, ARB(DimEmployee_FullName) AS Name, SUM(FactSales_SalesAmount) AS SalesTotal FROM SaleState GROUP BY Key ORDER BY SalesTotal DESC PAGE (0,10); DEFINE TopProducts AS SELECT ProductAlternateKey AS Key, ARB(ProductName) AS Name, SUM(FactSales_SalesAmount) AS SalesTotal FROM SaleState GROUP BY Key ORDER BY SalesTotal DESC PAGE (0,10); DEFINE EmployeeProductTotals AS SELECT DimEmployee_EmployeeKey AS EmployeeKey, ProductAlternateKey AS ProductKey, SUM(FactSales_SalesAmount) AS SalesTotal FROM SaleState GROUP BY EmployeeKey, ProductKey HAVING [EmployeeKey] IN TopEmployees AND [ProductKey] IN TopProducts; RETURN Results AS SELECT TopEmployees.Key AS EmployeeKey, TopEmployees.Name AS EmployeeName, TopEmployees.SalesTotal AS EmployeeTotal, TopProducts.Key AS ProductKey, TopProducts.Name AS ProductName, TopProducts.SalesTotal AS ProductTotal, EmployeeProductTotals.SalesTotal AS EmployeeProductTotal FROM EmployeeProductTotals FULL JOIN TopEmployees ON (EmployeeProductTotals.EmployeeKey = TopEmployees.Key) FULL JOIN TopProducts ON (EmployeeProductTotals.ProductKey = TopProducts.Key)
CROSS JOIN example
CROSS JOINexample finds the percentage of total sales each product subcategory represents:
DEFINE GlobalTotal AS SELECT SUM(FactSales_SalesAmount) AS GlobalTotal FROM SaleState GROUP; DEFINE SubcategoryTotals AS SELECT SUM(FactSales_SalesAmount) AS SubcategoryTotal FROM SaleState GROUP BY ProductSubcategoryName; RETURN SubcategoryContributions AS SELECT SubcategoryTotals.ProductSubcategoryName AS Subcategory, SubcategoryTotals.SubcategoryTotal / GlobalTotal.GlobalTotal AS Contribution FROM SubcategoryTotals CROSS JOIN GlobalTotal