1 Basic Concepts

SQL Dialogs are skills that can translate a user's natural language utterances into SQL queries, send the queries to a backend data source, and display the response.

When writing skills that provide database information to end users, developers typically need to define the use cases, write custom components to retrieve the data, create intents for the use cases, map user utterances to intents, and write the dialog flow to handle every intent. With SQL skills, you don't need to do these steps. Instead, you map the users' mental models of the data to the physical data source, and the skill uses the map to dynamically generate SQL from natural language utterances.

For example, users might know that employees belong to departments that are in various locations, and they have employee IDs, job titles, hire dates, and sometimes commissions. Given their mental model, they can retrieve the data by asking the skill "What is James Smith's job?", "When was James Smith hired?", "How many employees are in New York?", and "Who has the highest commission?

This version of SQL Dialogs supports integration with Oracle database services, such as Oracle Enterprise Database Service.

Note:

The current version doesn't support more than one language per skill.

How SQL Dialogs Work

To implement a SQL Dialogs, you create a visual dialog skill and import information about the physical model (database schema) from the data service. Oracle Digital Assistant uses this information to create a query entity for each table you selected to import (the logical model). The query entities contain attributes that model the table columns.

If a table in the physical model has a foreign key, then the query entity as an attribute that links to the related query entity. For example, if an Emp table has a foreign key to the Depttable, then the Emp query entity has a dept attribute, which links to the Dept entity.

As soon as you create the query entities and identify their primary keys, you can train the skill and it's ready to use in a rudimentary way. That is, you can use free form utterances, but, for now, the query must use the exact entity and attribute primary names, which are initially derived from the physical model's names (the canonical names). This will change as you enhance the logical model to more closely reflect natural language.

To enable the end users to inquire about the data by using natural language, you map end-user terminology to the physical model by changing the primary names and adding synonyms for both the query entities and their attributes. For example, you might change the primary name for the Emp table to "employee" and add the "staff member" synonym. Adding primary names and synonyms is one of the ways that you train the natural language parser (NLP) to resolve utterances into Oracle meaning representation query language (OMRQL) queries. OMRQL queries are like SQL queries but are based on the canonical names of the object models (the query entities). For example, if you change the primary name for empno to "employee number" then "what is Joe Smith's employee number" resolves to SELECT empno FROM emp WHERE ename = 'Joe Smith'. To even further improve the natural language processor (NLP) resolution, you also can associate the attributes with value lists, which are automatically populated from the data service upon creation.

For example, let's say that you import Emp and Dept tables from a data service, which results in Emp and Dept query entities. Immediately after you import the tables and train the skill, you can query the query entities using utterances like following:

Show all Emp in dept 10

After you change the primary names for the entities and attributes to more natural language terms, such as Employees for the entity and department for the attribute, you can use utterances like this one:

Show all the employees in department 10

You can further add synonyms to model all the ways people typically refer to the entity or attribute. For example, you might add the synonyms district and territory for department so that the NLP recognizes this utterance:

Show all employees in district 10

With more complex utterances, you can teach the skill how to resolve the queries to OMRQL by adding custom data that associates the utterances with specific OMRQL statements.

When the skill outputs a query result, it lets the user give a thumbs up or thumbs down to indicate whether the result is correct. The Insights page shows the thumbs up (correct queries) and thumbs down (incorrect queries) counts so you can see how well the skill is doing.

Supported Queries

Users can ask questions relating to the database and receive the relevant information. Using the example of an employees/departments database, this table shows the kinds of queries that a user can make.

Category Behavior Examples
Request an entity Returns a table with all the rows in the table that correspond to the request. When the attributes aren't named, the columns are the default attributes defined in the logical model.
  • show me all the employees
  • show details about each employee
Request an attribute Returns a table with the requested attribute as well as the minimum attributes of the entity that the attribute belongs to.
  • salary of employees
  • what is the salary of all employees?
Request distinct attribute Returns a table with only the unique values of the given attribute.
  • Show the different jobs of employees:
  • what are the unique jobs that an employee can have?
  • return distinct jobs
Request an aggregation Users can choose to request a summary of the information in the database.
  • how many employees
  • what is the total salary of all employees
  • what is the average salary of employees
  • what is the lowest salary of employees
  • what is the maximum salary of employees
Filter by attribute equal to a value This can be used for all attributes.
  • show me employees whose hire date is 10 Dec 2020
  • who are the employees with hire date 10 Dec 2020
Filter by implicit value reference Can be used for values that are present in an associated value list (including synonyms).
  • salary of everyclerk
  • what's the salary of all employees who are clerks
Filter by attribute containing a value Can be used for both text and numeric attributes
  • Employees with name containing Jo
  • employees where the ID includes 0153
Filter by value comparison Compares a numerical attribute to a value.
  • Employees whose salary is less than 1000
  • employees who earn over 100000
  • employees who earn at most 50000
  • employees who earn 1000 or more in salary
  • employees who earn between 1000 and 1500 in salary
Filter by aggregate Returns only the attributes that satisfy criteria based on an aggregation functions.
  • Show jobs where average salary is more than 70000
Combined filters

Users can combine filters with either

  • AND: both conditions are true
  • OR: one of the conditions is true

You can't have more than 2 filters. For example, you can't ask "Employees whose first name is John and belongs to the Sales dept and their salary is more than 70000"

  • employees whose job is clerk and salary is above 1000
  • employees who earn a salary above 100000 or commission above 300
Filter or group across entities Users can filter or group by attributes from linked entities.
  • salary of employees located in New York City
  • average salary per department name
  • Department names with average salary above 70000
Order by an attribute Returns the rows sorted by the specified attribute.
  • show employees sorted by their names
  • employees ordered by name from Z to A
Limit the number of rows Users can order and restrict the number of rows returned .

If you don't specify the attribute to use for comparison, then the entity's Measure By value determines which attribute to sort on. See Define Query Rules. However the NLU doesn't always predict ORDER BY in the OMRQL for these types of queries.

  • what are the 10 highest salaries of all employees
  • employee with the lowest salary
Group by attributes Users can group the results based on attributes, and then request summary statistics for each group.
  • how many employees per job
Queries with dates The queries can contain absolute and relative dates and times as well as date intervals. They also can include natural language terms that are associated with date operators. For example, you can use these natural terms to refer to the relative future and past:
  • Past (<): before, earlier than
  • Future (>): after,later than
  • what invoices are due after 11/10/2022
  • what invoices are due in the next 10 days
  • what invoices were paid last year
  • what installments are due after 5th August 3 pm
  • which vulnerabilities were discovered yesterday?
  • which vulnerabilities were discovered last Saturday?

Here are examples of queries that aren't currently supported:

  • User pronouns. For example: what is my salary?

  • Implicit date and time, date, or time values. For example: what is the next event?

  • Date durations which need to be coerced to dates. For example: due in 2 days.

  • Date intervals with both date and time values. "show meetings between 9 am July 5 and 5 pm July 6".

  • A date interval where the start or end date does not contain all the information. For example "show all invoices due between July and December 2022" (the start date is missing the year).

  • A date utterance where the natural language phrase for a date operator can't be resolved correctly. For example, for "what installments have due date more than 40 days ago", "more than" implies ">" but "more than 40 days ago" implies "<".

  • Resolving ambiguous dates based on the context. For example: "invoices that were due on Wednesday" (ambiguious) versus "invoices that were due last Wednesday" (not ambiguous). All ambiguous dates are resolved based on the Resolve Date & Time as setting (Default, Past, Future, Nearest) for the DATE_TIME entity. If Resolve Date & Time as is set to Future, then "invoices that were due on Wednesday" resolves to the upcoming Wednesday.

  • Limited performance for domain specific entity/attribute synonyms. When two words are synonyms in the context of the domain and only in the context of the domain, then the model may not accurately map the synonym to the correct attribute. For example, in the mobile phone domain, IP address and device are synonymous, but that's not true outside the domain.

  • Implicit distinct. Show the cities of all employees (currently shows multiple rows with repeating cities).

  • Implicit aggregations. For example: how much do we pay to all employees in Accounting dept?

  • Implying an attribute by referring only to the entity. For example: Show invoices containing 1234.

  • Group by attributes across multiple entities with aggregations. For example: Show number of employees for every department location.

  • Group by attributes across multiple entities with aggregations. For example: Show number of employees for every department location.

  • Negation. For example: Which employees are not in the accounting department?.

  • Disambiguation. For example: Show amount for all invoices (not clear if invoice amount or gross amount).

  • Implicit attribute reference for values not in the value list entity. For example: Show salary of all employees in HM (where HM is not in the value list for department names, nor is it a synonym for one of the values in the list).

  • Group by attributes across multiple entities with aggregations. For example: Show number of employees for every department location.

  • Queries in any language other than English.

  • Limited resolution of word names for numbers. You can use the following word names for numbers. All other word names are not supported.

    • one, two, three, four, five, six, seven, eight, nine

    • ten

    • twenty

    • fifty

  • Typographical errors. For example: Employees in acounting department (accounting is misspelled).

  • Fuzzy matching for value list entities. For example: Show invoices for Amazon (where the value list contains "Amazon LLC").

  • Yes/no question. For example: Do we have any employees who don't belong to any department?

  • Query containing "starts with" or "ends with". For example: Employees whose name ends with Smith.

  • Subqueries. For example: Show employees whose salaries are more than the highest salary of the Sales dept.

  • Set operations. There is no support for queries which require the use of INTERSECT, UNION, EXCEPT, or NONE.

  • Arithmetic operations. For example: How much money is left for project VMON.

  • The EXISTS and NOT EXISTS keywords.

  • Order-by superlatives. NLU does not consistently predict the order by clause in the OMRQL for superlative queries.. For example: Show top 10 employees.