6.6 Learn About MySQL HeatWave AutoML with Oracle Cloud Infrastructure Generative AI

Available as of MySQL 9.4.0, the NL2ML routine enables you to learn about MySQL HeatWave AutoML by leveraging Oracle Cloud Infrastructure Generative AI (OCI) and asking questions in natural language.

This topic has the following sections.

Before You Begin

This feature is only available on MySQL HeatWave on OCI. It is not available for MySQL HeatWave on AWS.
Oracle Cloud Infrastructure Generative AI is available in specific regions. To confirm if your region has Generative AI, see Regions with Generative AI.
Complete the steps to Authenticate OCI Generative AI Service.
Review the required tasks to Create a Machine Learning Model.

Overview

MySQL HeatWave AutoML is simple to use, and the introduction of NL2ML (natural language to machine learning) makes it even easier to use. NL2ML provides step-by-step guidelines on how to use MySQL HeatWave AutoML for a particular business problem.

You can use NL2ML to do the following:

Ask general questions about using MySQL HeatWave AutoML.
Use NL2ML to generate AutoML queries that you can copy and run. To generate AutoML queries, NL2ML leverages the databases, tables, and table columns in the DB System.

NL2ML uses pretrained foundational models supported by OCI Generative AI. MySQL HeatWave in-database LLMs are currently not supported. To review available OCI Generative AI models, see Pretrained Foundational Models in Generative AI.

For every major MySQL release, NL2ML creates a new vector store from the MySQL HeatWave User Guide documentation. After creating each new vector store, the old vector store gets deleted. Due to the creation of vector stores, some NL2ML calls may take longer to complete.

MySQL 9.4.1 adds the following updates to NL2ML:

Additional data preparation information for training queries. This includes detailed instructions to do the following:
- Join your dataset's various tables into a single table.
- Exclude unnecessary columns.
- Run the TRAIN_TEST_SPLIT routine to automatically prepare training and testing datasets.
If table and column names are non-descriptive, the ability to scan table and column comments in the table to retrieve additional contextual information for preparing queries.

Use Oracle Cloud Infrastructure Generative AI to Learn About MySQL HeatWave AutoML

Run the NL2ML routine.
```
mysql> CALL sys.NL2ML (query, response);
```
Replace query with your question in natural language, and response with the name of the session variable that has the generated answer to your question.
The following example asks about the size limit of the training table and sets the response session variable to @output.
```
mysql> CALL sys.NL2ML("Is there any limit on the size of training table?", @output);
Query OK, 0 rows affected (11.0086 sec)
```

Query the response session variable to view the response. The following example queries the response from the previous question.

mysql> SELECT @output;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "Yes, there are limits on the size of the training table for HeatWave AutoML models. The table used to train a model cannot exceed 10 GB in size, 100 million rows, or 1017 columns."} |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0003 sec)

Review additional Syntax Examples.

View Chat History and Change Model

To view the chat history and current Generative AI model, query the @nl2ml)options session variable. Optionally, use JSON_PRETTY to view the output in an easily readable format.

mysql> SELECT JSON_PRETTY(@nl2ml_options);

Use one of the following commands to reset the chat history:

mysql> SET @nl2ml_options = NULL;

mysql> SET @nl2ml_options = JSON_REMOVE(@nl2ml_options, '$.chat_history');

You have the option to change the current Generative AI model. To review available options for model_id, see Pretrained Foundational Models in Generative AI.

To change the current Generative AI model and maintain the current chat history, use the following command:

mysql> SET @nl2ml_options = JSON_SET(@nl2ml_options, '$.mode_id', "model_id");

For example:

mysql> SET @nl2ml_options = JSON_SET(@nl2ml_options, '$.mode_id', "meta.llama-3.1-405b-instruct");

To change the current Generative AI model and reset the chat history, use the following command:

mysql> SET @nl2ml_options = JSON_OBJECT("model_id", "model_id");

For example:

mysql> SET @nl2ml_options = JSON_OBJECT("model_id", "meta.llama-3.1-405b-instruct");

NL2ML Use Cases

The following use case uses NL2ML to request AutoML queries to train, use, and evaluate a model for data related to a food delivery service. The examples show how you can use follow-up questions based on previous questions and responses.

The additional information for training queries is available as of MySQL 9.4.1.

Refer to the following questions and responses. The output for each question is similar to the following:

mysql> CALL sys.NL2ML("Give me a training query to predict how long it takes to deliver food", @output);
Query OK, 0 rows affected (6.8284 sec)

mysql> SELECT @output;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                                                                        |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "**Data Preparation: **\n\n To prepare the dataset for the machine learning task of regression, we need to follow the GENERAL INSTRUCTIONS and the SPECIFIC INSTRUCTIONS for REGRESSION.                                                              |
|           \n\nFirst, let's integrate the tables into one full table using a `CREATE TABLE ... AS SELECT ... JOIN ... ;` SQL query.                                                                                                                             |
|           \n\nFrom the provided dataset, we can see that the `orders` table contains information about orders, and the `restaurants` table contains information about restaurants. We can join these tables based on the `restaurant_id` column.               |
|           \n\nWe can also join the `orders` table with the `delivery_logs` table based on the `order_id` column to get information about the delivery status.\n\nHere's an example SQL query to integrate the tables:                                          |
|           \n\n```sql\nCREATE TABLE integrated_table AS\nSELECT o.order_id, o.customer_id, o.restaurant_id, o.rider_id, o.order_time, o.assigned_time, o.pickup_time, o.delivery_time, o.status, o.estimated_delivery_time,                                     |
|           \n       r.x1 AS restaurant_name, r.x2 AS cuisine_type, r.average_prep_time, r.rating,\n       dl.log_time, dl.status AS delivery_status, dl.latitude, dl.longitude\nFROM orders o\nJOIN restaurants r ON o.restaurant_id = r.restaurant_id          |
|           \nJOIN delivery_logs dl ON o.order_id = dl.order_id;\n```\n\nNext, we need to identify columns that should be excluded from training. These columns are likely to be the ones that are not relevant to the regression task or are redundant.         |
|           \n\nFrom the integrated table, we can exclude the following columns:\n\n* `order_id` (unique identifier for each order)\n* `customer_id` (unique identifier for each customer)\n* `restaurant_id` (unique identifier for each restaurant)            |
|           \n* `rider_id` (unique identifier for each rider)\n* `log_time` (timestamp for each delivery log)\n* `status` (status of each order)\n* `delivery_status` (status of each delivery log)\n\nHere's a JSON array of the columns to be excluded:        |
|           \n\n```json\nJSON_ARRAY('order_id', 'customer_id', 'restaurant_id', 'rider_id', 'log_time', 'status', 'delivery_status')\n```\n\nNow, we can use the `sys.TRAIN_TEST_SPLIT` stored procedure to do an 80-20 train-test split on the integrated table.|
|           \n\nHere's an example SQL query to split the data:\n\n```sql\nCALL sys.TRAIN_TEST_SPLIT('food_delivery_db.integrated_table', NULL, JSON_OBJECT('task', 'regression'));                                                                               |
|           \n```\n\nThis will split the data into training and testing sets, with 80% of the data used for training and 20% used for testing.\n\n```sql\nCALL sys.ML_TRAIN('food_delivery_db.integrated_train', 'estimated_delivery_time',                      |
|           \nJSON_OBJECT('task', 'regression', 'exclude_column_list', \nJSON_ARRAY('order_id', 'customer_id', 'restaurant_id', 'rider_id', 'log_time', 'status', 'delivery_status')), @delivery_time_model);\n```"}                                             |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0003 sec)
mysql> CALL sys.NL2ML("how can I deploy this model?", @output);
Query OK, 0 rows affected (11.7221 sec)

mysql> SELECT @output;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                          |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "To deploy a trained machine learning model for predicting food delivery duration, you need to load the model into HeatWave AutoML using the ML_MODEL_LOAD routine.                                     |
|            \n\nFirst, you need to train the model using the ML_TRAIN routine and get the model handle.\n\n```mysql\nCALL sys.ML_TRAIN('mlcorpus.food_delivery', 'delivery_duration', NULL, @food_delivery_model);|
|            \n```\n\nThen, you can load the model into HeatWave AutoML using the ML_MODEL_LOAD routine.\n\n```mysql\nCALL sys.ML_MODEL_LOAD(@food_delivery_model, NULL);                                          |
|            \n```\n\nAfter loading the model, you can use it to make predictions using the ML_PREDICT_TABLE or ML_PREDICT_ROW routines.                                                                           |
|            \n\nFor example, to predict the delivery duration for a new food delivery order, you can use the ML_PREDICT_ROW routine.                                                                              |
|            \n\n```mysql\nSELECT sys.ML_PREDICT_ROW(@new_order, @food_delivery_model);                                                                                                                            |
|            \n```\n\nReplace `@new_order` with the actual data for the new order.\n\nNote: Make sure to replace `@food_delivery_model` with the actual model handle returned by the ML_TRAIN routine.             |
|            \n\nAlso, ensure that the model is loaded into HeatWave AutoML before making predictions. You can check the active models using the ML_MODEL_ACTIVE routine.                                          |
|            \n\nIf you need to unload the model, you can use the ML_MODEL_UNLOAD routine.\n\n```mysql\nCALL sys.ML_MODEL_UNLOAD(@food_delivery_model);\n```"}                                                     |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0002 sec)
mysql> CALL sys.NL2ML("how can I evaluate this model?", @output);
Query OK, 0 rows affected (16.4552 sec)

mysql> SELECT @output;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                                                                                                                                 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "To evaluate a trained machine learning model for predicting food delivery duration, you can use the ML_SCORE function provided by HeatWave AutoML. Since the task is regression (predicting delivery duration), you can use regression metrics such as neg_mean_absolute_error, neg_mean_squared_error, or r2.|
|            \n\nHere's an example SQL query to evaluate the model:\n\n```sql\nCALL sys.ML_SCORE('mlcorpus.food_delivery_validate', 'delivery_duration', @delivery_duration_model, 'neg_mean_absolute_error', @score, NULL);\nSELECT @score;                                                                              |
|            \n```\n\nIn this query:\n\n- 'mlcorpus.food_delivery_validate' is the table used for validation, which should have the same columns as the training dataset but with different data.                                                                                                                         |
|            \n- 'delivery_duration' is the target column containing the actual delivery durations.\n- @delivery_duration_model is the model handle or a session variable containing the model handle.                                                                                                                    |
|            \n- 'neg_mean_absolute_error' is the metric used to evaluate the model.\n- @score is the user-defined variable name for the computed score.\n- NULL is used for the options parameter since it's not required for regression tasks.                                                                          |
|            \n\nNote that you should replace '@delivery_duration_model' with the actual model handle or session variable containing the model handle. Also, make sure to create a validation table 'mlcorpus.food_delivery_validate' with the same columns as the training dataset but with different data."}            |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0002 sec)
mysql> CALL sys.NL2ML("How can I call predictions for a new row??", @output);
Query OK, 0 rows affected (16.8089 sec)

mysql> SELECT @output;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                                                                              |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "To predict the delivery duration for a new food delivery order, you can use the ML_PREDICT_ROW function. First, ensure that the model you want to use is loaded.                                                                                           |
|            \n\nAssuming that your model handle is stored in a session variable @delivery_duration_model, you can use the following SQL query to make a prediction for a new row:                                                                                     |
|            \n\n```sql\nSET @new_order = JSON_OBJECT(\n  \"order_id\", \"new_order_id\",\n  \"customer_id\", \"new_customer_id\",\n  \"restaurant_id\", \"new_restaurant_id\",                                                                                        |
|            \n  \"food_item\", \"new_food_item\",\n  \"order_time\", \"new_order_time\",\n  \"delivery_time\", NULL,\n  \"estimated_delivery_time\", \"new_estimated_delivery_time\",                                                                                 |
|            \n  \"delivery_distance\", \"new_delivery_distance\",\n  \"delivery_method\", \"new_delivery_method\",\n  \"traffic_conditions\", \"new_traffic_conditions\",                                                                                             |
|            \n  \"weather_conditions\", \"new_weather_conditions\",\n  \"order_priority\", \"new_order_priority\",\n  \"payment_method\", \"new_payment_method\"\n);                                                                                                  |
|            \n\nSELECT sys.ML_PREDICT_ROW(@new_order, @delivery_duration_model, NULL);\n```\n\nReplace the values in the JSON_OBJECT with the actual values for the new order. The column names must match the feature column names in the training dataset.          |
|            \n\nNote that the delivery_time is set to NULL, assuming that it is the target variable that you are trying to predict. The result will be a JSON object containing the predicted delivery duration and the feature values used to make the prediction."} |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0003 sec)
mysql> CALL sys.NL2ML("How can I generate explanation for a new table?", @output);
Query OK, 0 rows affected (8.7374 sec)

mysql> SELECT @output;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "```sql\nCALL sys.ML_EXPLAIN_TABLE('mlcorpus.new_food_delivery', @food_delivery_model, 'mlcorpus.food_delivery_explanations', JSON_OBJECT('prediction_explainer', 'permutation_importance'));\n```"} |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0003 sec)

The following use case has non-descriptive table columns and shows how NL2ML can scan column comments to retrieve the contextual information needed to prepare queries.

Refer to the following questions and responses. The output for each question is similar to the following:

mysql> SHOW TABLES;
+----------------------------+
| Tables_in_food_delivery_db |
+----------------------------+
| table_1                    |
| table_10                   |
| table_2                    |
| table_3                    |
| table_4                    |
| table_5                    |
| table_6                    |
| table_7                    |
| table_8                    |
| table_9                    |
+----------------------------+
10 rows in set (0.001 sec)
 
mysql> DESC table_3;
+-------+--------------------------------+------+-----+---------+----------------+
| Field | Type                           | Null | Key | Default | Extra          |
+-------+--------------------------------+------+-----+---------+----------------+
| x0    | int                            | NO   | PRI | NULL    | auto_increment |
| x1    | varchar(255)                   | NO   |     | NULL    |                |
| x2    | varchar(20)                    | YES  | UNI | NULL    |                |
| x3    | enum('bike','motorbike','car') | YES  |     | NULL    |                |
| x4    | decimal(9,6)                   | YES  |     | NULL    |                |
| x5    | decimal(9,6)                   | YES  |     | NULL    |                |
| x6    | decimal(2,1)                   | YES  |     | NULL    |                |
| x7    | tinyint(1)                     | YES  |     | 1       |                |
+-------+--------------------------------+------+-----+---------+----------------+
8 rows in set (0.001 sec)
 
mysql> SELECT COLUMN_COMMENT FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'food_delivery_db' AND TABLE_NAME = 'table_3';
+----------------------------+
| COLUMN_COMMENT             |
+----------------------------+
| Rider id                   |
| Rider name                 |
| Rider phone number         |
| Rider vehicle type         |
| Current latitude           |
| Current logitude           |
| rating                     |
| Whether rider is available |
+----------------------------+
8 rows in set (0.001 sec)
 
mysql> SET @input = 'Give me a training query to predict how long it takes to deliver food';
Query OK, 0 rows affected (0.000 sec)
 
mysql> CALL sys.NL2ML(@input, @output);
Query OK, 0 rows affected (1 min 35.274 sec)
 
mysql> SELECT @output;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                                                                                                                                               |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "**Data Preparation:**\n\n1. Integrate the tables into one full table using a SQL query.                                                                                                                                                                                                                                     |
|                              \n\n```sql\nCREATE TABLE integrated_table AS\nSELECT \n  t5.*,\n  t1.x1 AS restaurant_name,\n  t1.x2 AS cuisine_type,                                                                                                                                                                                    |
|                              \n  t1.x3 AS restaurant_latitude,\n  t1.x4 AS restaurant_longitude,\n  t1.x5 AS average_preparation_time,\n  t1.x6 AS restaurant_rating,                                                                                                                                                                 |
|                              \n  t2.x1 AS customer_name,\n  t2.x2 AS customer_email,\n  t2.x3 AS customer_phone_number,\n  t2.x4 AS customer_latitude,                                                                                                                                                                                |
|                              \n  t2.x5 AS customer_longitude,\n  t3.x1 AS rider_name,\n  t3.x2 AS rider_phone_number,\n  t3.x3 AS rider_vehicle_type,                                                                                                                                                                                 |
|                              \n  t3.x4 AS rider_current_latitude,\n  t3.x5 AS rider_current_longitude,\n  t3.x6 AS rider_rating,\n  t3.x7 AS rider_availability,                                                                                                                                                                      |
|                              \n  t4.x1 AS food_item_name,\n  t4.x2 AS food_category,\n  t4.x3 AS food_item_price,\n  t4.x4 AS food_average_preparation_time,                                                                                                                                                                          |
|                              \n  t6.x1 AS order_item_quantity,\n  t6.x2 AS order_item_price,\n  t7.x1 AS log_time,\n  t7.x2 AS log_status,\n  t7.x4 AS log_latitude,                                                                                                                                                                  |
|                              \n  t7.x5 AS log_longitude,\n  t8.x1 AS weather_temperature,\n  t8.x2 AS weather_precipitation,\n  t8.x3 AS weather_wind_speed,                                                                                                                                                                          |
|                              \n  t8.x4 AS weather_traffic_level,\n  t9.x1 AS origin_latitude,\n  t9.x2 AS origin_longitude,\n  t9.x3 AS destination_latitude,                                                                                                                                                                         |
|                              \n  t9.x4 AS destination_longitude,\n  t9.x5 AS distance_in_kilometers,\n  t9.x6 AS estimated_travel_duration,\n  t10.x1 AS shift_start_time,                                                                                                                                                            |
|                              \n  t10.x2 AS shift_end_time,\n  t10.x3 AS total_orders_in_shift,\n  t10.x4 AS total_traveled_distance\nFROM \n  table_5 t5                                                                                                                                                                              |
|                              \n  LEFT JOIN table_1 t1 ON t5.table_1_id = t1.x0\n  LEFT JOIN table_2 t2 ON t5.table_2_id = t2.x0\n  LEFT JOIN table_3 t3 ON t5.table_3_id = t3.x0                                                                                                                                                      |
|                              \n  LEFT JOIN table_4 t4 ON t5.table_4_id = t4.x0\n  LEFT JOIN table_6 t6 ON t5.x0 = t6.table_5_id\n  LEFT JOIN table_7 t7 ON t5.x0 = t7.table_5_id                                                                                                                                                      |
|                              \n  LEFT JOIN table_8 t8 ON t5.x0 = t8.x0\n  LEFT JOIN table_9 t9 ON t5.x0 = t9.x0\n  LEFT JOIN table_10 t10 ON t5.table_3_id = t10.table_3_id;                                                                                                                                                          |
|                              \n```\n\n2. Identify columns that should be excluded from training and provide a JSON array.\n\n```json                                                                                                                                                                                                  |
|                              \nJSON_ARRAY('x0', 'table_1_id', 'table_2_id', 'table_3_id', 'table_4_id', 'table_5_id', 'table_6_id', 'table_7_id', 'table_8_id', 'table_9_id', 'table_10_id', 'created_at', 'recorded_at', 'registered_at', 'log_time', 'shift_start_time', 'shift_end_time')                                          |
|                              \n```\n\n3. Use the available utility SQL stored procedure `sys.TRAIN_TEST_SPLIT` to do an 80-20 train-test split on the integrated full table.\n\n```sql                                                                                                                                                |
|                              \nCALL sys.TRAIN_TEST_SPLIT('food_delivery_db.integrated_table', NULL, JSON_OBJECT('task', 'regression'));\n```                                                                                                                                                                                          |
|                              \n\n**Training Query:**\n\nTo predict how long it takes to deliver food, we will use a regression model. The target column will be `estimated_travel_duration`. The `ML_TRAIN` query is as follows:                                                                                                      |
|                              \n\n```sql\nCALL sys.ML_TRAIN('food_delivery_db.integrated_table_train', 'estimated_travel_duration',\n              JSON_OBJECT('task', 'regression',                                                                                                                                                   |
|                              \n              'exclude_column_list', JSON_ARRAY('x0', 'table_1_id', 'table_2_id', 'table_3_id', 'table_4_id', 'table_5_id', 'table_6_id', 'table_7_id', 'table_8_id', 'table_9_id', 'table_10_id', 'created_at', 'recorded_at', 'registered_at', 'log_time', 'shift_start_time', 'shift_end_time')),   |
|                              \n              @food_delivery_model);\n```"}                                                                                                                                                                                                                                                            |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
mysql> SET @input = 'How can I evaluate this model?';
Query OK, 0 rows affected (0.000 sec)
 
mysql> CALL sys.NL2ML(@input, @output);
Query OK, 0 rows affected (15.196 sec)
 
mysql> SELECT @output;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                                |
+-------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------+
| {"text": "To evaluate the regression model trained using the sys.ML_TRAIN procedure, you can use the sys.ML_SCORE procedure with the test data. Here's an example:                                                     |
|          \n\n```sql\nCALL sys.ML_SCORE('food_delivery_db.integrated_table_test', 'estimated_travel_duration', @food_delivery_model, 'neg_mean_squared_error', @score, NULL);                                           |
|          \n\nSELECT @score;\n```\n\nThis will return the mean squared error of the model on the test data, which can be used as a metric to evaluate the model's performance.                                          |
|          \n\nYou can replace 'neg_mean_squared_error' with other regression metrics available in the sys.ML_SCORE procedure, such as 'neg_mean_absolute_error', 'neg_median_absolute_error', 'r2', etc.                |
|          \n\nFor example, to evaluate the model using R-squared metric:\n\n```sql\nCALL sys.ML_SCORE('food_delivery_db.integrated_table_test', 'estimated_travel_duration', @food_delivery_model, 'r2', @score, NULL); |
|          \n\nSELECT @score;\n```\n\nThis will return the R-squared value of the model on the test data."}                                                                                                              |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
mysql> SET @input = 'How can I call predictions for a new row?';
Query OK, 0 rows affected (0.000 sec)
 
mysql> CALL sys.NL2ML(@input, @output);
Query OK, 0 rows affected (15.651 sec)
 
mysql> SELECT @output;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                                 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "To make predictions for a new row of data using the trained model, you can use the `sys.ML_PREDICT_ROW` function. Here's an example of how you can do it:                                                     |
|          \n\n```sql\nSET @new_row = JSON_OBJECT(\n  \"estimated_travel_duration\", NULL,  # This will be predicted\n  # Add your new row's column values here\n  # Example:                                             |
|          \n  \"origin_latitude\", 40.7128,\n  \"origin_longitude\", -74.0060,\n  \"destination_latitude\", 34.0522,\n  \"destination_longitude\", -118.2437,                                                            |
|          \n  # Add more column values as needed\n);\n\nSELECT sys.ML_PREDICT_ROW(@new_row, @food_delivery_model, NULL);                                                                                                 |
|          \n```\n\nReplace the `# Add your new row's column values here` comment with the actual column values for your new row. Make sure to match the column names and data types with the ones in your training data. |
|          \n\nThis will return a JSON object containing the predicted value for `estimated_travel_duration`."}                                                                                                           |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)
 
mysql> SET @input = 'How can I generate explanation for a new table?';
Query OK, 0 rows affected (0.000 sec)
 
mysql> CALL sys.NL2ML(@input, @output);
Query OK, 0 rows affected (22.206 sec)
 
mysql> SELECT @output;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| @output                                                                                                                                                                                                                                                                                     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"text": "To generate explanations for predictions made on a new table using the trained food delivery model, you can use the `sys.ML_EXPLAIN_TABLE` procedure.                                                                                                                             |
|          \n\nFirst, make sure to load the model using the `sys.ML_MODEL_LOAD` procedure:\n\n```sql\nCALL sys.ML_MODEL_LOAD(@food_delivery_model, NULL);                                                                                                                                     |
|          \n```\n\nThen, create a new table with the data you want to generate explanations for. This table should have the same column names as the training data, but without the target column (`estimated_travel_duration` in this case).                                                |
|          \n\nLet's assume the new table is named `new_table` in the `food_delivery_db` schema. \n\nYou can then use the `sys.ML_EXPLAIN_TABLE` procedure to generate explanations for the new table:                                                                                        |
|          \n\n```sql\nCALL sys.ML_EXPLAIN_TABLE('food_delivery_db.new_table', \n                          @food_delivery_model, \n                          'food_delivery_db.new_table_explanations',                                                                                       |
|          \n                          JSON_OBJECT('prediction_explainer', 'permutation_importance'));\n```\n\nThis will generate explanations for the predictions made on the new table and store the results in a new table named `new_table_explanations` in the `food_delivery_db` schema.| 
|          \n\nYou can replace `'permutation_importance'` with `'shap'` to use the SHAP prediction explainer instead.\n\nTo view the explanations, you can query the `new_table_explanations` table:                                                                                          |
|          \n\n```sql\nSELECT * FROM food_delivery_db.new_table_explanations;\n```"} |                                                                                                                                                                                                        |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)

What's Next

Review Machine Learning Use Cases.
Review the syntax and examples for the NL2ML routine.