Model Monitoring Example
Model monitoring plays an important role in MLOps by allowing users to track the performance of production machine learning models. Monitoring gives you the ability to identify when a model's performance is no longer satisfactory for business objectives and the model needs to be replaced with an updated one. Two common causes of degraded model performance are data drift and concept drift:
- Data drift: This occurs when there is a change in the profile of the data on which predictions are made.
- Concept drift: This happens when the expectations of what constitutes correct predictions change over time. For example, the statistical properties of a target variable, such as stocking inventory, may change over time as consumer preferences change.
The wide variety of factors that contribute to data and concept drift makes model monitoring an important task. It enables users to know when these changes compromise the prediction quality of production models. A model monitoring job evaluates model performance using metrics such as the following:
- Accuracy: Calculates the proportion of correctly classified cases, both Positive and Negative. With TP (True Positives), TN (True Negatives), FP (False Positives), and FN (False Negatives), the formula is:
Accuracy = (TP+TN)/(TP+TN+FP+FN)
- Balanced Accuracy: Evaluates how good a binary classifier is. It is especially useful when the classes are imbalanced, that is, when one of the two classes appears much more often than the other. This often happens in settings such as anomaly detection.
- Recall: Calculates the proportion of actual Positives that are correctly classified.
- Precision: Calculates the proportion of predicted Positives that are true Positives.
- F1 Score: Combines precision and recall into a single number, computed as their harmonic mean:
F1-score = 2*(precision*recall)/(precision+recall)
- AUC (Area Under the ROC Curve): Provides an aggregate measure of a classifier's performance across all decision threshold settings, regardless of any one threshold.
- R2: A statistical measure of how close the data are to the fitted regression line. In general, the higher the R-squared value, the better the model fits your data. The value of R2 is always between 0 and 1, where:
 - 0 indicates that the model explains none of the variability of the response data around its mean.
 - 1 indicates that the model explains all the variability of the response data around its mean.
- Mean Squared Error: This is the mean of the squared difference of predicted and true targets.
- Mean Absolute Error: This is the mean of the absolute difference of predicted and true targets.
- Median Absolute Error: This is the median of the absolute difference between predicted and true targets.
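For reference, the metric definitions above can be sketched in a few lines of code. This is a standard-library-only illustration of the formulas, not the OML Services implementation:

```python
# Illustrative, from-scratch versions of the metrics described above.
# Labels are assumed to be 1 (Positive) and 0 (Negative).
from statistics import mean, median

def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics from true and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)               # proportion of actual Positives found
    specificity = tn / (tn + fp)          # recall of the Negative class
    precision = tp / (tp + fp)            # proportion of predicted Positives that are correct
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    balanced_accuracy = (recall + specificity) / 2       # robust to class imbalance
    return {"accuracy": accuracy, "balanced_accuracy": balanced_accuracy,
            "recall": recall, "precision": precision, "f1": f1}

def regression_metrics(y_true, y_pred):
    """Compute the regression error metrics from true and predicted targets."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return {"mse": mean(e ** 2 for e in errors),
            "mae": mean(abs(e) for e in errors),
            "median_ae": median(abs(e) for e in errors)}
```

For example, with two correct and two incorrect predictions in a balanced sample, every classification metric above evaluates to 0.5.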
Model Monitoring Workflow
- Deploy a model through the AutoML UI
- Obtain the access token
- Get the Model ID of the model to be used for monitoring
- Create a model monitoring job
- View the details of a model monitoring job
- Update a model monitoring job (optional)
- Enable a model monitoring job
- View and understand the output of a model monitoring job
1: Deploy a Model
- Create an AutoML experiment if you opt for the automated approach to building machine learning models, and deploy the resulting model.
2: Obtain the Access Token
You must obtain an authentication token by using your Oracle Machine Learning (OML) account credentials to send requests to OML Services. To authenticate and obtain a token, use cURL with the -d option to pass the credentials for your Oracle Machine Learning account to the Oracle Machine Learning user management cloud service REST endpoint /oauth2/v1/token. Run the following command to obtain the access token:
$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' \
-d '{"grant_type":"password", "username":"<yourusername>", "password":"<yourpassword>"}' \
"<oml-cloud-service-location-url>/omlusers/api/oauth2/v1/token"
- -X POST specifies the use of a POST request when communicating with the HTTP server.
- --header defines the headers required for the request.
- -d sends the username and password authentication credentials as data in a POST request to the HTTP server.
- Content-Type defines the format of the request body (JSON).
- Accept defines the response format (JSON).
- yourusername is the user name of an Oracle Machine Learning user with the default OML_DEVELOPER role.
- yourpassword is the password for the user name.
- oml-cloud-service-location-url is a URL containing the REST server portion of the Oracle Machine Learning User Management Cloud Service instance URL, which includes the tenancy ID and database name. You can obtain the URL from the Development tab in the Service Console of your Oracle Autonomous Database instance.
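The same token request can also be sketched outside cURL. Below is a minimal Python illustration using only the standard library. The endpoint path and headers mirror the cURL example above; the accessToken response field name is an assumption and may differ in your OML Services version.

```python
# A hedged sketch of the Step 2 token request using the standard library only.
# The URL placeholder and JSON body mirror the cURL command in this section;
# "accessToken" is an assumed response field name.
import json
import urllib.request

def build_token_request(base_url, username, password):
    """Build a POST request equivalent to the cURL token command above."""
    payload = json.dumps({"grant_type": "password",
                          "username": username,
                          "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/omlusers/api/oauth2/v1/token",
        data=payload,
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST")

def parse_token_response(body):
    """Extract the bearer token from the JSON response body (assumed field name)."""
    return json.loads(body)["accessToken"]

# Usage sketch (requires a live OML Services instance, so it is commented out):
# req = build_token_request("<oml-cloud-service-location-url>", "OMLUSER", "<password>")
# with urllib.request.urlopen(req) as resp:
#     token = parse_token_response(resp.read())
```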
3: Get the Model ID of the Model to be Used for Monitoring
To get the modelId, send a GET request to the deployment endpoint and specify the model URI. Here is an example of a cURL command to obtain the modelId:
$ curl -X GET "<oml-cloud-service-location-url>/omlmod/v1/deployment/HousePowerNN" \
--header "Authorization: Bearer ${token}" | jq '.modelId'
In this example, the model URI is HousePowerNN.
Note: The model URI is provided by the user when deploying the model through the AutoML UI or through a REST client.
The GET request returns the following:
"modelId": "0bf13d1f-86a6-465d-93d1-8985afd1bbdb"
4: Create a Model Monitoring Job
After obtaining the access token, you can create a model monitoring job by sending a POST request to the /omlmod/v1/jobs endpoint. To create a model monitoring job, you need the model IDs of the models that you want to monitor. The request body may include a single model or a list of up to 20 models, identified by their model IDs.
Example of a POST Request for Model Monitoring Job Creation
- In the jobSchedule parameter, specify the job start date, job end date, job frequency, and maximum number of runs.
- In the jobProperties parameter, specify the model monitoring details, such as:
 - Model monitoring job name and job type
 - Autonomous Database service level
 - Table where the model monitoring details will be saved
 - Drift alert trigger
 - Threshold
 - Maximum number of runs
 - Baseline and new data to be used
 - Chosen performance metric, such as balanced accuracy
 - Start date (optional) and end date (optional), which correspond to the DATE or TIMESTAMP column in the table or view denoted by newData and contained in the timeColumn field. If the start and end dates are not specified, the earliest and latest dates and times in the timeColumn are used.
$ curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/jobs" \
--header "Authorization: Bearer ${token}" \
--header 'Content-Type: application/json' \
--data '{
"jobSchedule": {
"jobStartDate": "2023-03-25T00:30:07Z", # job start date and time
"repeatInterval": "FREQ=DAILY", # job frequency
"jobEndDate": "2023-03-30T20:50:06Z", # job end date and time
"maxRuns": "5" # max runs within the schedule
},
"jobProperties": {
"jobName": "MY_MODEL_MONITOR1", # job name
"jobType": "MODEL_MONITORING", # job type; MODEL_SCORING
"disableJob": false, # flag to disable the job at submission
"jobServiceLevel": "LOW", # Autonomous Database service level; either LOW, MEDIUM, and HIGH
"inputSchemaName": "OMLUSER", # database schema that owns the input table/view
"outputSchemaName": "OMLUSER", # database schema that owns the output table
"outputData": "Global_Active_Power_Monitor", # table where the job results will be saved in the format {jobID}_{outputData}
"jobDescription": "Global active power monitoring job", # job description
"baselineData": "HOUSEHOLD_POWER_BASE", # table/view containing baseline data
"newData": "HOUSEHOLD_POWER_NEW", # table/view with new data to compare against baseline
"frequency": "Year", # time window unit that the monitoring is done on in the new data
"threshold": 0.15, # threshold to trigger drift alert
"timeColumn": "DATES", # date or timestamp column in newData
"startDate": "2008-01-01T00:00:00Z", # the start date and time of monitoring in the new data
"endDate": "2010-11-26T00:00:00Z", # the end date and time of monitoring in the new data
"caseidColumn": null, # case identifier column in the baseline and new data
"performanceMetric": "MEAN_SQUARED_ERROR", # metric used to measure model performance
"modelList": [ # model ID or list of model IDs to be monitored
"0bf13d1f-86a6-465d-93d1-8985afd1bbdb"
],
"recompute": false # flag to determine whether to overwrite the results table
}
}' | jq
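As an illustration, the request body above can also be assembled programmatically before being sent. This is a hedged sketch, not an official OML client: the helper builds only a subset of the documented fields and enforces the stated limit of 20 models per job.

```python
# Hypothetical helper that assembles a model monitoring job request body
# using field names taken from the cURL example above. Only a subset of the
# documented fields is included; defaults here are illustrative.
MAX_MODELS_PER_JOB = 20  # documented limit for a single monitoring job

def build_monitoring_job(job_name, output_data, baseline_data, new_data,
                         model_ids, threshold=0.15,
                         repeat_interval="FREQ=DAILY", max_runs=5):
    """Return a dict matching the jobSchedule/jobProperties request structure."""
    if not model_ids:
        raise ValueError("at least one model ID is required")
    if len(model_ids) > MAX_MODELS_PER_JOB:
        raise ValueError(f"a single job can monitor at most {MAX_MODELS_PER_JOB} models")
    return {
        "jobSchedule": {
            "repeatInterval": repeat_interval,
            "maxRuns": str(max_runs),
        },
        "jobProperties": {
            "jobName": job_name,
            "jobType": "MODEL_MONITORING",
            "outputData": output_data,
            "baselineData": baseline_data,
            "newData": new_data,
            "threshold": threshold,
            "modelList": list(model_ids),
        },
    }

body = build_monitoring_job("MY_MODEL_MONITOR1", "Global_Active_Power_Monitor",
                            "HOUSEHOLD_POWER_BASE", "HOUSEHOLD_POWER_NEW",
                            ["0bf13d1f-86a6-465d-93d1-8985afd1bbdb"])
```

The resulting dict can be serialized with json.dumps and passed as the --data argument of the cURL request shown above.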
The parameters in this request include:
- jobType: Specifies the type of job to be run, and is set to MODEL_MONITORING for model monitoring jobs.
- outputData: The output data identifier. The results of the job are written to a table named {jobId}_{outputData}.
- baselineData: The table or view that contains the baseline data to monitor. At least 50 rows per period are required for model monitoring; otherwise, the analysis is skipped.
- newData: The table or view with new data to be compared against the baseline. At least 50 rows per period are required for model monitoring; otherwise, the analysis is skipped.
- modelList: The list of models to be monitored, identified by their modelIds. By default, up to 20 models can be monitored by a single job.
- disableJob: A flag to disable the job at submission. If not set, the default is false and the job is enabled at submission.
- timeColumn: The name of the date or timestamp column in the new data. If not provided, the entire newData is treated as one period.
- frequency: The unit of time over which monitoring is done on the new data. The frequency can be "day", "week", "month", or "year". If not provided, the entire new data is used as a single time period.
- threshold: The threshold to trigger a drift alert.
- recompute: A flag that determines whether to update already computed periods. The default is false, which means that only time periods not present in the output result table are computed.
- performanceMetric: The metric used to measure model performance. Note: For regression models, the default is MEAN_SQUARED_ERROR. For classification models, the default is BALANCED_ACCURACY.
- caseidColumn: A case identifier column in the baseline and new data. Providing it improves the reproducibility of results.
- startDate: The start date or timestamp of monitoring in newData. The timeColumn is mandatory for startDate. If startDate is not provided, its value depends on whether frequency is provided: if frequency is not provided, the earliest date in timeColumn is used as the startDate; if frequency is provided, the more recent of the earliest date in timeColumn and the starting date of the 10th most recent cycle is used as the startDate. Note: The supported date and time format is the ISO-8601 date and time format, for example 2022-05-12T02:33:16Z.
- endDate: The end date or timestamp of monitoring in newData. The timeColumn is mandatory for endDate. If endDate is not provided, the most recent date in timeColumn is used. Note: The supported date and time format is the ISO-8601 date and time format, for example 2022-05-12T02:33:16Z.
- jobDescription: A text description of the job.
- outputSchemaName: The database schema that owns the output table. If not specified, the output schema is the same as the input schema.
- inputSchemaName: The database schema that owns the input table or view. If not specified, the input schema is the same as the username in the request token.
- jobServiceLevel: The service level for the job, which can be LOW, MEDIUM, or HIGH.
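The startDate and endDate defaulting rules described above can be summarized in a short sketch. This is an illustration only: it assumes timeColumn values are Python date objects, and the cycle lengths used to approximate the "10th most recent cycle" are assumptions, not OML's exact computation.

```python
# Illustrative sketch of the documented startDate/endDate defaulting rules.
# CYCLE_DAYS is an approximation of cycle lengths; OML's internal cycle
# computation may differ.
from datetime import date, timedelta

CYCLE_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}  # assumed approximation

def resolve_start_date(time_column, start_date=None, frequency=None):
    """Apply the startDate defaulting rules to a list of dates from timeColumn."""
    if start_date is not None:
        return start_date
    if frequency is None:
        return min(time_column)  # earliest date in timeColumn
    # frequency given: the more recent of the earliest date and the
    # starting date of the 10th most recent cycle
    tenth_cycle_start = max(time_column) - timedelta(days=10 * CYCLE_DAYS[frequency])
    return max(min(time_column), tenth_cycle_start)

def resolve_end_date(time_column, end_date=None):
    """If endDate is not provided, the most recent date in timeColumn is used."""
    return end_date if end_date is not None else max(time_column)
```

For example, with data spanning 2008-01-01 to 2010-11-26 and no frequency, monitoring defaults to the full range; with frequency "day", the default start moves up to ten daily cycles before the latest date.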
Response of the POST Request for Job Creation
Here is an example of a model monitoring job creation response:
{
"jobId": "OML$736F509B_FC1A_400A_AC75_553F1D6C5D97",
"links": [
{
"rel": "self",
"href": "<OML Service URL>/v1/jobs/OML%24736F509B_FC1A_400A_AC75_553F1D6C5D97"
}
]
}
When the job is successfully submitted, you receive a response with the job ID. Note the jobId for future reference, to submit requests that retrieve job details or perform any action on the job.
5: View Details of the Submitted Job
To view the details of your submitted job, send a GET request to the /omlmod/v1/jobs/{jobId} endpoint, where jobId is the ID provided in response to the successful submission of your model monitoring job.
$ export jobid='OML$736F509B_FC1A_400A_AC75_553F1D6C5D97' # define the Job ID as a single-quoted variable
$ curl -X GET "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}" \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" | jq
Here is a sample response to the job details request. The jobStatus CREATED indicates that the job has been created. If your job has already run at least once, the response also includes information about the last job run:
{
"jobId": "OML$736F509B_FC1A_400A_AC75_553F1D6C5D97",
"jobRequest": {
"jobSchedule": {
"jobStartDate": "2023-03-25T00:30:07Z",
"repeatInterval": "FREQ=DAILY",
"jobEndDate": "2023-03-30T00:30:07Z",
"maxRuns": 5
},
"jobProperties": {
"jobType": "MODEL_MONITORING",
"inputSchemaName": "OMLUSER",
"outputSchemaName": "OMLUSER",
"outputData": "Global_Active_Power_Monitor",
"jobDescription": "Global active power monitoring job",
"jobName": "MY_MODEL_MONITOR1",
"disableJob": false,
"jobServiceLevel": "LOW",
"baselineData": "HOUSEHOLD_POWER_BASE",
"newData": "HOUSEHOLD_POWER_NEW",
"timeColumn": "DATES",
"startDate": "2008-01-01T00:00:00Z",
"endDate": "2010-11-26T00:00:00Z",
"frequency": "Year",
"threshold": 0.15,
"recompute": false,
"caseidColumn": null,
"modelList": [
"0bf13d1f-86a6-465d-93d1-8985afd1bbdb"
],
"performanceMetric": "MEAN_SQUARED_ERROR"
}
},
"jobStatus": "CREATED",
"dateSubmitted": "2023-03-25T00:26:16.127906Z",
"links": [
{
"rel": "self",
"href": "<OML Service URL>/omlmod/v1/jobs/OML%24736F509B_FC1A_400A_AC75_553F1D6C5D97"
}
],
"jobFlags": [],
"state": "SCHEDULED",
"enabled": true,
"runCount": 0,
"nextRunDate": "2023-03-25T00:30:07Z"
}
6: Update the Model Monitoring Job (Optional)
After an asynchronous job is submitted, you have the option to update the job. Send a POST request to the /omlmod/v1/jobs/{jobID}
endpoint to update a job.
You can update the following job properties:
- startDate
- endDate
- threshold
- recompute
- modelList
In this example, the threshold and recompute properties are updated in the updateProperties field. The trigger for the drift alert is updated to 0.20, and the recompute flag is set to update the already computed periods, so that each job run recalculates all time periods present in the specified timeColumn in the data.
$ curl -i -X POST "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}" \
--header "Authorization: Bearer ${token}" \
--header 'Content-Type: application/json' \
--data '{
"updateProperties": {
"threshold": 0.20,
"recompute": "true"
}
}'
Note: A successful update returns an HTTP 204 response with no content.
7: View the Model Monitoring Job Output
Once your job has run, either according to its schedule or by the RUN action, you can view its output in the table you specified in your job request with the outputData parameter. The full name of the table is {jobId}_{outputData}. You can check whether your job is complete by sending a request to view its details. If your job has run at least once, you should see the lastRunDetail parameter with information on that run.
%sql
SELECT IS_BASELINE, MODEL_ID, round(METRIC, 4), HAS_DRIFT, round(DRIFT, 4), MODEL_TYPE,
THRESHOLD, MODEL_METRICS
FROM OML$736F509B_FC1A_400A_AC75_553F1D6C5D97_Global_Active_Power_Monitor
The command returns a table with the columns IS_BASELINE, MODEL_ID, ROUND(METRIC, 4), HAS_DRIFT, ROUND(DRIFT, 4), MODEL_TYPE, THRESHOLD, and MODEL_METRICS. Note that the first row of results is the baseline time period. Because drift is not calculated on data in the baseline time period, the columns HAS_DRIFT, ROUND(DRIFT, 4), and THRESHOLD are empty for this row.
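To make the flagging behavior concrete, here is a purely hypothetical sketch of how HAS_DRIFT relates to DRIFT and THRESHOLD in the output rows. How OML computes the DRIFT score itself is not described in this section; the sketch only illustrates the flagging rule and the empty baseline row.

```python
# Purely illustrative: relate the HAS_DRIFT, DRIFT, and THRESHOLD columns of
# the output table. The drift score computation itself is not reproduced here.
def flag_rows(rows, threshold):
    """rows: list of dicts with 'is_baseline' and 'drift' keys."""
    out = []
    for row in rows:
        if row["is_baseline"]:
            # Drift is not calculated for the baseline period, so the
            # HAS_DRIFT, DRIFT, and THRESHOLD columns are left empty.
            out.append({**row, "drift": None, "has_drift": None, "threshold": None})
        else:
            # A drift alert is triggered when the drift score exceeds the threshold.
            out.append({**row, "has_drift": row["drift"] > threshold,
                        "threshold": threshold})
    return out
```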
8: Perform an Action on a Model Monitoring Job (Optional)
When your job has been successfully submitted, its state is set to ENABLED by default. This means that it runs as per the schedule you specified when submitting the job, unless it is updated to another state, such as DISABLED. You can do this by sending a request to the /omlmod/v1/jobs/{jobid}/action endpoint.
Model monitoring jobs use the database scheduler DBMS_SCHEDULER to perform actions on jobs. There are four actions that can be sent to this endpoint:
- DISABLE: Disables the job. The force property can be used with this action to forcefully interrupt a running job. Note: Jobs can also be set to DISABLED at submission by setting the disableJob flag to true.
- ENABLE: Enables a job. After a disabled job is enabled, the scheduler begins to automatically run the job according to its schedule.
- RUN: Immediately runs the job, as a way to test it or run it outside of its schedule.
- STOP: Stops a currently running job.
In this example, the job is set to DISABLED:
$ curl -i -X POST "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}/action" \
--header "Authorization: Bearer ${token}" \
--header 'Content-Type: application/json' \
--data '{
"action": "DISABLE",
"force": "false"
}'
Note: The force parameter is set to false by default. You can use it with the DISABLE action to interrupt a running job.
When the action is successfully submitted, you receive a 204 response with no body.
9: Delete a Model Monitoring Job
To delete a previously submitted job, send a DELETE request with the jobid to the /omlmod/v1/jobs endpoint. Here is an example:
$ curl -X DELETE "<oml-cloud-service-location-url>/omlmod/v1/jobs/${jobid}" \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" | jq
Recreate the Example (Optional)
The examples here use the tables HOUSEHOLD_POWER_BASE and HOUSEHOLD_POWER_NEW. These tables are created using the Individual Household Electric Power Consumption data set from the UCI Machine Learning Repository. To recreate the example described here, follow these steps:
- Run this command in an R paragraph in an OML notebook to drop the tables, if they exist, and suppress warnings:
%r
options(warn=-1)
try(ore.drop(table="HOUSEHOLD_POWER_BASE"))
try(ore.drop(table="HOUSEHOLD_POWER_NEW"))
- Run the following command in an R paragraph to read and transform the data:
%r
test <- read.csv("https://objectstorage.us-sanjose-1.oraclecloud.com/n/adwc4pm/b/OML_Data/o/household_power_consumption.txt", sep=";")
test <- transform(test, Date = as.Date(Date, format = "%d/%m/%Y"))
test <- transform(test, Global_active_power = as.numeric(Global_active_power))
test <- transform(test, Global_reactive_power = as.numeric(Global_reactive_power))
test <- transform(test, Voltage = as.numeric(Voltage))
test <- transform(test, Global_intensity = as.numeric(Global_intensity))
test <- transform(test, Sub_metering_1 = as.numeric(Sub_metering_1))
test <- transform(test, Sub_metering_2 = as.numeric(Sub_metering_2))
test <- transform(test, Sub_metering_3 = as.numeric(Sub_metering_3))
colnames(test) <- c("DATES", "TIMES", "GLOBAL_ACTIVE_POWER", "GLOBAL_REACTIVE_POWER", "VOLTAGE", "GLOBAL_INTENSITY", "SUB_METERING_1", "SUB_METERING_2", "SUB_METERING_3")
- Now create the baseline data HOUSEHOLD_POWER_BASE and the new data HOUSEHOLD_POWER_NEW by running this R script:
%r
test_base <- test[test$DATES < "2008-01-01",]
test_new <- test[test$DATES > "2007-12-31",]
# Create OML proxy objects
ore.create(test_base, table="HOUSEHOLD_POWER_BASE")
ore.create(test_new, table="HOUSEHOLD_POWER_NEW")
- To view the baseline data HOUSEHOLD_POWER_BASE, run this SQL command in a SQL paragraph in the notebook:
%sql
SELECT * FROM HOUSEHOLD_POWER_BASE FETCH FIRST 5 ROWS ONLY;
- View the new data HOUSEHOLD_POWER_NEW by running this SQL command:
%sql
SELECT * FROM HOUSEHOLD_POWER_NEW FETCH FIRST 5 ROWS ONLY;