Model Monitoring Example
Model monitoring plays an important role in MLOps by allowing users to monitor the performance of production machine learning models. Monitoring gives you the ability to identify when the performance of a model is no longer satisfactory to meet business objectives and needs to be replaced with an updated model.
- Data drift: This occurs when there is a change in the profile of the data on which predictions are made.
- Concept drift: This happens when the expectations of what constitutes correct predictions change over time. For example, the statistical properties of a target variable, such as stocking inventory, may change over time as consumer preferences change.
The large variety of factors that contribute to data and concept drift make model monitoring an important task. It enables users to be aware when these changes compromise the prediction quality of production models.
- Accuracy: Calculates the proportion of correctly classified cases, both Positive and Negative. For example, if there are a total of TP (True Positives) + TN (True Negatives) correctly classified cases out of TP+TN+FP+FN (True Positives + True Negatives + False Positives + False Negatives) cases, then the formula is:
Accuracy = (TP+TN)/(TP+TN+FP+FN)
- Balanced Accuracy: Evaluates how good a binary classifier is. It is especially useful when the classes are imbalanced, that is, when one of the two classes appears much more often than the other. This often happens in settings such as anomaly detection.
- Recall: Calculates the proportion of actual Positives that are correctly classified.
- Precision: Calculates the proportion of predicted Positives that are True Positives.
- F1 Score: Combines precision and recall into a single number. The F1 score is the harmonic mean of precision and recall, calculated by the formula:
F1-score = 2*(precision*recall)/(precision+recall)
- AUC (Area under the ROC Curve): Provides an aggregate measure of discrimination regardless of the decision threshold. The ROC curve measures classification performance at various threshold settings, and AUC summarizes it in a single number.
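As an illustration of how these classification metrics relate, here is a small Python sketch that computes them from raw confusion-matrix counts. The counts are made up for illustration; this is not OML Services code.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the classification metrics described above from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)                      # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    precision = tp / (tp + fp)
    balanced_accuracy = (recall + specificity) / 2
    f1 = 2 * (precision * recall) / (precision + recall)
    return {"accuracy": accuracy, "balanced_accuracy": balanced_accuracy,
            "recall": recall, "precision": precision, "f1": f1}

# Hypothetical counts: 80 true positives, 50 true negatives,
# 10 false positives, 20 false negatives
m = classification_metrics(tp=80, tn=50, fp=10, fn=20)
```

With imbalanced counts, note how balanced accuracy averages the per-class rates rather than weighting by class size, which is why it is preferred when one class dominates.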
- R2: A statistical measure that calculates how close the data are to the fitted regression line. In general, the higher the value of R-squared, the better the model fits your data. The value of R2 is always between 0 and 1, where:
  - 0 indicates that the model explains none of the variability of the response data around its mean.
  - 1 indicates that the model explains all the variability of the response data around its mean.
- Mean Squared Error: This is the mean of the squared difference of predicted and true targets.
- Mean Absolute Error: This is the mean of the absolute difference of predicted and true targets.
- Median Absolute Error: This is the median of the absolute difference between predicted and true targets.
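The regression metrics above can likewise be sketched in Python from paired true and predicted targets (illustrative code with made-up values, not OML Services code):

```python
from statistics import median

def regression_metrics(y_true, y_pred):
    """Compute the regression metrics described above for paired true/predicted targets."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n            # Mean Squared Error
    mae = sum(abs(e) for e in errors) / n           # Mean Absolute Error
    medae = median(abs(e) for e in errors)          # Median Absolute Error
    mean_t = sum(y_true) / n
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true) # total sum of squares
    r2 = 1 - ss_res / ss_tot                        # R-squared
    return {"mse": mse, "mae": mae, "medae": medae, "r2": r2}

m = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
```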
Model Monitoring Workflow
- Deploy a model through AutoML UI
- Obtain the access token
- Get the Model ID of the model to be used for monitoring
- Create a model monitoring job
- View the details of a model monitoring job
- Update a model monitoring job (optional)
- Enable a model monitoring job
- View and understand the output of a model monitoring job
1: Deploy a Model
- If you opt for the automated way to build machine learning models, create an AutoML experiment, and then deploy a model.
2: Obtain the Access Token
You must obtain an authentication token by using your Oracle Machine Learning (OML) account credentials to send requests to OML Services. To authenticate and obtain a token, use cURL with the -d option to pass the credentials for your Oracle Machine Learning account against the Oracle Machine Learning user management cloud service REST endpoint /oauth2/v1/token. Run the following command to obtain the access token:
$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' \
-d '{"grant_type":"password", "username":"<yourusername>", "password":"<yourpassword>"}' \
"<oml-cloud-service-location-url>/omlusers/api/oauth2/v1/token"

In this command:
- -X POST specifies to use a POST request when communicating with the HTTP server.
- --header defines the headers required for the request (application/json).
- -d sends the username and password authentication credentials as data in a POST request to the HTTP server.
- Content-Type defines the request format (JSON).
- Accept defines the response format (JSON).
- yourusername is the user name of an Oracle Machine Learning user with the default OML_DEVELOPER role.
- yourpassword is the password for the user name.
- oml-cloud-service-location-url is a URL containing the REST server portion of the Oracle Machine Learning User Management Cloud Service instance URL, which includes the tenancy ID and database name. You can obtain the URL from the Development tab in the Service Console of your Oracle Autonomous AI Database instance.
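If you prefer a scripting client over cURL, the token request can be sketched in Python. The helper below only assembles the URL, headers, and body described above; the commented-out network call and the accessToken response field are assumptions to verify against your OML Services instance.

```python
import json

def token_request(oml_url, username, password):
    """Assemble the URL, headers, and body for the /oauth2/v1/token POST request."""
    url = f"{oml_url}/omlusers/api/oauth2/v1/token"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    body = json.dumps({"grant_type": "password",
                       "username": username,
                       "password": password})
    return url, headers, body

# Hypothetical base URL and credentials, for illustration only
url, headers, body = token_request("https://example.adb.oraclecloud.com",
                                   "OMLUSER", "secret")

# To actually obtain the token (requires network access and valid credentials):
# import requests
# token = requests.post(url, headers=headers, data=body).json()["accessToken"]
```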
3: Get the Model ID of the Model to be Used for Monitoring
To get the modelId, send a GET request to the deployment endpoint and specify the model URI. In this example, the model URI is HousePowerNN.

$ curl -X GET "<oml-cloud-service-location-url>/omlmod/v1/deployment/HousePowerNN" \
--header "Authorization: Bearer ${token}" | jq '.modelId'

Note: The model URI is provided by the user when deploying the model using the AutoML UI or when deploying the model through a REST client.

The GET request returns the following:
"modelId": "0bf13d1f-86a6-465d-93d1-8985afd1bbdb"
4: Create a Model Monitoring Job
After obtaining the access token, you can create a model monitoring job by sending a POST request to the deployment endpoint and by specifying the model URI. To create a model monitoring job, you require the model IDs for the models that you want to monitor. The request body may include a single model, or a list of up to 20 models identified by their model IDs.
Example of a POST Request for Model Monitoring Job Creation
- In the jobSchedule parameter, specify the job start date, job end date, job frequency, and maximum number of runs.
- In the jobProperties parameter, specify the model monitoring details, such as:
  - Model monitoring job name and job type
  - Autonomous AI Database service level
  - Table where the model monitoring details will be saved
  - Drift alert trigger
  - Threshold
  - Maximum number of runs
  - Baseline and new data to be used
  - Performance metric to be used
  - Start date (optional) and end date (optional), which correspond to the DATE or TIMESTAMP column in the table or view denoted by newData and contained in the timeColumn field. If the start and end dates are not specified, the earliest and latest dates and times in the timeColumn are used.
$ curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/jobs" \
--header "Authorization: Bearer ${token}" \
--header 'Content-Type: application/json' \
--data '{
"jobSchedule": {
"jobStartDate": "2023-03-25T00:30:07Z", # job start date and time
"repeatInterval": "FREQ=DAILY", # job frequency
"jobEndDate": "2023-03-30T20:50:06Z", # job end date and time
"maxRuns": "5" # max runs within the schedule
},
"jobProperties": {
"jobName": "MY_MODEL_MONITOR1", # job name
"jobType": "MODEL_MONITORING", # job type; MODEL_SCORING
"disableJob": false, # flag to disable the job at submission
"jobServiceLevel": "LOW", # Autonomous AI Database service level; either LOW, MEDIUM, and HIGH
"inputSchemaName": "OMLUSER", # database schema that owns the input table/view
"outputSchemaName": "OMLUSER", # database schema that owns the output table
"outputData": "Global_Active_Power_Monitor", # table where the job results will be saved in the format {jobID}_{outputData}
"jobDescription": "Global active power monitoring job", # job description
"baselineData": "HOUSEHOLD_POWER_BASE", # table/view containing baseline data
"newData": "HOUSEHOLD_POWER_NEW", # table/view with new data to compare against baseline
"frequency": "Year", # time window unit that the monitoring is done on in the new data
"threshold": 0.15, # threshold to trigger drift alert
"timeColumn": "DATES", # date or timestamp column in newData
"startDate": "2008-01-01T00:00:00Z", # the start date and time of monitoring in the new data
"endDate": "2010-11-26T00:00:00Z", # the end date and time of monitoring in the new data
"caseidColumn": null, # case identifier column in the baseline and new data
"performanceMetric": "MEAN_SQUARED_ERROR", # metric used to measure model performance
"modelList": [ # model ID or list of model IDs to be monitored
"0bf13d1f-86a6-465d-93d1-8985afd1bbdb"
],
"recompute": false # flag to determine whether to overwrite the results table
}
}' | jq

The parameters to run this job are:
- jobType: Specifies the type of job to be run; set to MODEL_MONITORING for model monitoring jobs.
- outputData: The output data identifier. The results of the job are written to a table named {jobId}_{outputData}.
- baselineData: The table or view that contains baseline data to monitor. At least 50 rows per period are required for model monitoring; otherwise the analysis is skipped.
- newData: The table or view with new data to be compared against the baseline. At least 50 rows per period are required for model monitoring; otherwise the analysis is skipped.
- modelList: The list of models to be monitored, identified by their modelIds. By default, up to 20 models can be monitored by a single job.
- disableJob: A flag to disable the job at submission. If not set, the default is false and the job is enabled at submission.
- timeColumn: The name of the date or timestamp column in the new data. If not provided, the entire newData is treated as one period.
- frequency: Indicates the unit of time for which the monitoring is done on the new data. The frequency can be "day", "week", "month", or "year". If not provided, the entire new data is used as a single time period.
- threshold: The threshold to trigger a drift alert.
- recompute: A flag on whether to update the already computed periods. The default is false, meaning that only time periods not present in the output result table are computed.
- performanceMetric: The metric used to measure model performance. Note: For regression models, the default is MEAN_SQUARED_ERROR. For classification models, the default is BALANCED_ACCURACY.
- caseidColumn: A case identifier column in the baseline and new data. Providing it improves the reproducibility of results.
- startDate: The start date or timestamp of monitoring in the newData. The timeColumn column is mandatory for startDate. If startDate is not provided, the behavior depends on whether frequency is provided: if frequency is not provided, the earliest date in timeColumn is used as the startDate; if frequency is provided, the later of the earliest date in timeColumn and the starting date of the 10th most recent cycle is used. Note: The supported date and time format is the ISO-8601 date and time format, for example 2022-05-12T02:33:16Z.
- endDate: The end date or timestamp of monitoring in the newData. The timeColumn column is mandatory for endDate. If endDate is not provided, the most recent date in timeColumn is used. The same ISO-8601 date and time format applies.
- jobDescription: A text description of the job.
- outputSchemaName: The database schema that owns the output table. If not specified, the output schema is the same as the input schema.
- inputSchemaName: The database schema that owns the input table or view. If not specified, the input schema is the same as the username in the request token.
- jobServiceLevel: The service level for the job, which can be LOW, MEDIUM, or HIGH.
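The startDate fallback rules can be sketched as a small Python function. This is an illustration of the documented behavior, not OML Services code, and it assumes each frequency value maps to a fixed-length cycle, which is an approximation for months and years.

```python
from datetime import datetime, timedelta

# Approximate cycle lengths for the supported frequency values (illustrative only)
CYCLE = {"day": timedelta(days=1), "week": timedelta(weeks=1),
         "month": timedelta(days=30), "year": timedelta(days=365)}

def effective_start_date(start_date, frequency, time_column_dates):
    """Resolve the monitoring start date following the documented fallback rules."""
    if start_date is not None:
        return start_date                        # an explicit startDate always wins
    earliest, latest = min(time_column_dates), max(time_column_dates)
    if frequency is None:
        return earliest                          # whole newData treated as one period
    # With a frequency but no startDate: the later of the earliest date
    # and the start of the 10th most recent cycle.
    tenth_cycle_start = latest - 10 * CYCLE[frequency.lower()]
    return max(earliest, tenth_cycle_start)

# Hypothetical weekly observations spanning roughly three years
dates = [datetime(2008, 1, 1) + timedelta(days=i) for i in range(0, 1000, 7)]
```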
Response of the POST Request for Job Creation
Here is an example of a model monitoring job creation response:
{
"jobId": "OML$736F509B_FC1A_400A_AC75_553F1D6C5D97",
"links": [
{
"rel": "self",
"href": "<OML Service URL>/v1/jobs/OML%24736F509B_FC1A_400A_AC75_553F1D6C5D97"
}
]
}
When the job is successfully submitted, you will receive a response with the job ID. Note the jobId for future reference to submit requests for retrieving job details or to perform any action on the job.
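As a scripting alternative to the cURL request above, the request body can be assembled in Python. The helper below mirrors the example payload, covers only a subset of the documented parameters, and enforces the 20-model limit; the commented-out requests call is an assumption to adapt to your environment.

```python
import json

def build_monitoring_job(job_name, output_data, baseline_data, new_data,
                         model_ids, time_column, threshold=0.15,
                         performance_metric="MEAN_SQUARED_ERROR"):
    """Assemble a request body for POST /omlmod/v1/jobs, mirroring the cURL example."""
    if not 1 <= len(model_ids) <= 20:
        raise ValueError("a job may monitor between 1 and 20 models")
    return {
        "jobSchedule": {"repeatInterval": "FREQ=DAILY", "maxRuns": "5"},
        "jobProperties": {
            "jobName": job_name,
            "jobType": "MODEL_MONITORING",
            "outputData": output_data,       # results land in {jobId}_{outputData}
            "baselineData": baseline_data,
            "newData": new_data,
            "timeColumn": time_column,
            "threshold": threshold,
            "performanceMetric": performance_metric,
            "modelList": list(model_ids),
        },
    }

payload = build_monitoring_job("MY_MODEL_MONITOR1", "Global_Active_Power_Monitor",
                               "HOUSEHOLD_POWER_BASE", "HOUSEHOLD_POWER_NEW",
                               ["0bf13d1f-86a6-465d-93d1-8985afd1bbdb"], "DATES")

# To submit the job (requires a valid token):
# import requests
# job_id = requests.post(f"{oml_url}/omlmod/v1/jobs",
#                        headers={"Authorization": f"Bearer {token}",
#                                 "Content-Type": "application/json"},
#                        data=json.dumps(payload)).json()["jobId"]
```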
5: View Details of the Submitted Job
To view the details of your submitted job, send a GET request to the /omlmod/v1/jobs/{jobId} endpoint, where jobId is the ID provided in response to the successful submission of your model monitoring job.
$ export jobid='OML$736F509B_FC1A_400A_AC75_553F1D6C5D97' # define the Job ID as a single-quoted variable
$ curl -X GET "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}" \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" | jqHere is a sample output of the job details request. The jobStatus CREATED indicates that the job has been created. If your job has already run once, you will see information returned about the last job run.
returns:
{
"jobId": "OML$736F509B_FC1A_400A_AC75_553F1D6C5D97",
"jobRequest": {
"jobSchedule": {
"jobStartDate": "2023-03-25T00:30:07Z",
"repeatInterval": "FREQ=DAILY",
"jobEndDate": "2023-03-30T00:30:07Z",
"maxRuns": 5
},
"jobProperties": {
"jobType": "MODEL_MONITORING",
"inputSchemaName": "OMLUSER",
"outputSchemaName": "OMLUSER",
"outputData": "Global_Active_Power_Monitor",
"jobDescription": "Global active power monitoring job",
"jobName": "MY_MODEL_MONITOR1",
"disableJob": false,
"jobServiceLevel": "LOW",
"baselineData": "HOUSEHOLD_POWER_BASE",
"newData": "HOUSEHOLD_POWER_NEW",
"timeColumn": "DATES",
"startDate": "2008-01-01T00:00:00Z",
"endDate": "2010-11-26T00:00:00Z",
"frequency": "Year",
"threshold": 0.15,
"recompute": false,
"caseidColumn": null,
"modelList": [
"0bf13d1f-86a6-465d-93d1-8985afd1bbdb"
],
"performanceMetric": "MEAN_SQUARED_ERROR"
}
},
"jobStatus": "CREATED",
"dateSubmitted": "2023-03-25T00:26:16.127906Z",
"links": [
{
"rel": "self",
"href": "<OML Service URL>/omlmod/v1/jobs/OML%24736F509B_FC1A_400A_AC75_553F1D6C5D97"
}
],
"jobFlags": [],
"state": "SCHEDULED",
"enabled": true,
"runCount": 0,
"nextRunDate": "2023-03-25T00:30:07Z"
}

6: Update the Model Monitoring Job (Optional)
After an asynchronous job is submitted, you have the option to update the job. Send a POST request to the /omlmod/v1/jobs/{jobID} endpoint to update a job.
The following job properties can be updated: startDate, endDate, threshold, recompute, and modelList.

In this example, the threshold and recompute properties are changed in the updateProperties field. The trigger for the drift alert is updated to 0.20, and the recompute flag is set to update the already computed periods, so that each job run recalculates all time periods present in the specified timeColumn in the data.

$ curl -i -X POST "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}" \
--header "Authorization: Bearer ${token}" \
--header 'Content-Type: application/json' \
--data '{
"updateProperties": {
"threshold": 0.20,
"recompute": "true"
}
}'

Note: A successful update will return an HTTP 204 response with no content.

7: View the Model Monitoring Job Output
Once your job has run, either according to its schedule or by the RUN action, you can view its output in the table you specified in your job request with the outputData parameter. The full name of the table is {jobId}_{outputData}. You can check if your job is complete by sending a request to view its details. If your job has run at least once you should see the lastRunDetail parameter with information on that run.
%sql
SELECT IS_BASELINE, MODEL_ID, round(METRIC, 4), HAS_DRIFT, round(DRIFT, 4), MODEL_TYPE,
THRESHOLD, MODEL_METRICS
FROM OML$736F509B_FC1A_400A_AC75_553F1D6C5D97_Global_Active_Power_Monitor

The command returns a table with the columns IS_BASELINE, MODEL_ID, ROUND(METRIC, 4), HAS_DRIFT, ROUND(DRIFT, 4), MODEL_TYPE, THRESHOLD, and MODEL_METRICS. Note that the first row of results is the baseline time period. Because drift is not calculated on data in the baseline time period, the columns HAS_DRIFT, ROUND(DRIFT, 4), and THRESHOLD are empty for this row.
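Two conventions used here, the {jobId}_{outputData} table name and the threshold-based drift alert, can be sketched in Python. This reflects our reading of the documented behavior, not OML Services code.

```python
def output_table(job_id, output_data):
    """Full name of the results table, per the {jobId}_{outputData} convention."""
    return f"{job_id}_{output_data}"

def drift_alert(drift, threshold):
    """A drift alert fires when measured drift exceeds the configured threshold
    (our interpretation of the documented 'threshold to trigger drift alert')."""
    return drift > threshold

table = output_table("OML$736F509B_FC1A_400A_AC75_553F1D6C5D97",
                     "Global_Active_Power_Monitor")
```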
8: Perform an Action on a Model Monitoring Job (Optional)
When your job has been successfully submitted, its state is set to ENABLED by default. This means that it will run as per the schedule you specified when submitting the job, unless it is updated to another state, such as DISABLED. You can change a job's state by sending a request to the /omlmod/v1/jobs/{jobid}/action endpoint.
The endpoint uses DBMS_SCHEDULER to perform actions on jobs. There are four actions that can be sent to this endpoint:
- DISABLE: Disables the job. The force property can be used with this action to forcefully interrupt any running job. Note: Jobs can also be set to DISABLED at submission by setting the disableJob flag to true.
- ENABLE: Enables a job. After a disabled job is enabled, the scheduler begins to automatically run the job according to its schedule.
- RUN: Immediately runs the job, as a way to test a job or run it outside of its schedule.
- STOP: Stops a currently running job.
The following example disables the job, setting its state to DISABLED.

$ curl -i -X POST "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}/action" \
--header "Authorization: Bearer ${token}" \
--header 'Content-Type: application/json' \
--data '{
"action": "DISABLE",
"force": "false"
}'

Note: The force parameter is set to false by default. You can use it with the DISABLE action to interrupt a running job.

When the action is successfully submitted, you will receive a 204 response with no body.
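The action request can be sketched in Python. The helper below validates the action name against the four documented options and assembles the endpoint path and request body; this is illustrative client code, not part of OML Services.

```python
ALLOWED_ACTIONS = {"DISABLE", "ENABLE", "RUN", "STOP"}

def action_request(job_id, action, force=False):
    """Assemble the path and body for POST /omlmod/v1/jobs/{jobId}/action."""
    action = action.upper()
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    body = {"action": action, "force": str(force).lower()}
    return f"/omlmod/v1/jobs/{job_id}/action", body

path, body = action_request("OML$736F509B_FC1A_400A_AC75_553F1D6C5D97", "disable")
```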
9: Delete a Model Monitoring Job
To delete a previously submitted job, send a DELETE request along with the jobid to the /omlmod/v1/jobs/{jobId} endpoint:

$ curl -X DELETE "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}" \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" | jq
Recreate the Example (Optional)
The example here uses the tables HOUSEHOLD_POWER_BASE and HOUSEHOLD_POWER_NEW. These tables are created using the Individual Household Electric Power Consumption data from the UCI Machine Learning Repository. To recreate the example described here, follow these steps:
- Run this command in an R paragraph in an OML notebook to drop the tables, if they exist, and suppress warnings:

%r
options(warn=-1)
try(ore.drop(table="HOUSEHOLD_POWER_BASE"))
try(ore.drop(table="HOUSEHOLD_POWER_NEW"))

- Run the following command in an R paragraph to read and transform the data:

%r
test <- read.csv("https://objectstorage.us-sanjose-1.oraclecloud.com/n/adwc4pm/b/OML_Data/o/household_power_consumption.txt", sep=";")
test <- transform(test, Date = as.Date(Date, format = "%d/%m/%Y"))
test <- transform(test, Global_active_power = as.numeric(Global_active_power))
test <- transform(test, Global_reactive_power = as.numeric(Global_reactive_power))
test <- transform(test, Voltage = as.numeric(Voltage))
test <- transform(test, Global_intensity = as.numeric(Global_intensity))
test <- transform(test, Sub_metering_1 = as.numeric(Sub_metering_1))
test <- transform(test, Sub_metering_2 = as.numeric(Sub_metering_2))
test <- transform(test, Sub_metering_3 = as.numeric(Sub_metering_3))
colnames(test) <- c("DATES", "TIMES", "GLOBAL_ACTIVE_POWER", "GLOBAL_REACTIVE_POWER", "VOLTAGE", "GLOBAL_INTENSITY", "SUB_METERING_1", "SUB_METERING_2", "SUB_METERING_3")

- Create the baseline data HOUSEHOLD_POWER_BASE and the new data HOUSEHOLD_POWER_NEW by running this R script:

%r
test_base <- test[test$DATES < "2008-01-01",]
test_new <- test[test$DATES > "2007-12-31",]
# Create OML proxy objects
ore.create(test_base, table="HOUSEHOLD_POWER_BASE")
ore.create(test_new, table="HOUSEHOLD_POWER_NEW")

- To view the baseline data HOUSEHOLD_POWER_BASE, run this SQL command in a SQL paragraph in the notebook:

%sql
SELECT * FROM HOUSEHOLD_POWER_BASE FETCH FIRST 5 ROWS ONLY;

- View the new data HOUSEHOLD_POWER_NEW by running this SQL command:

%sql
SELECT * FROM HOUSEHOLD_POWER_NEW FETCH FIRST 5 ROWS ONLY;
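For readers who prefer Python, the baseline/new split performed by the R script above can be sketched with pandas. The frame below is a small synthetic stand-in; the real example downloads the UCI file as shown in the R code.

```python
import pandas as pd

# Synthetic stand-in for the household power data (illustrative values only)
df = pd.DataFrame({
    "DATES": pd.to_datetime(["2007-06-01", "2007-12-31",
                             "2008-01-01", "2009-05-20"]),
    "GLOBAL_ACTIVE_POWER": [1.2, 0.8, 2.4, 1.9],
})

# Same split as the R script: baseline before 2008, new data from 2008 onward
base = df[df["DATES"] < "2008-01-01"]
new = df[df["DATES"] > "2007-12-31"]
```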