Expectation Maximization for Anomaly Detection
Expectation Maximization (EM) identifies anomalies based on probability density: records that fall in low-density regions of the modeled data distribution are the ones most likely to be flagged as anomalous.
The EM technique models the underlying data distribution of a data set and translates the probability density of each data record into an anomaly probability. A record is identified as an outlier in an EM anomaly model if its anomaly probability is greater than 0.5. A prediction label of 1 denotes a normal record, while a label of 0 denotes an anomaly.
The following example shows a code snippet for anomaly detection using the Expectation Maximization algorithm. Specify the EMCS_OUTLIER_RATE setting to capture the desired rate of outliers in the training data set.
-- Set the outlier rate in the setting list; the default is 0.05
--
-- Drop any existing model of the same name; errors (for example, model not found) are ignored
BEGIN
  DBMS_DATA_MINING.DROP_MODEL('CUSTOMERS360MODEL_AD');
EXCEPTION WHEN OTHERS THEN NULL;
END;
/
DECLARE
  v_setlst DBMS_DATA_MINING.SETTING_LIST;
BEGIN
  v_setlst('ALGO_NAME')         := 'ALGO_EXPECTATION_MAXIMIZATION';
  v_setlst('PREP_AUTO')         := 'ON';
  v_setlst('EMCS_OUTLIER_RATE') := '0.1'; -- override the 0.05 default
  DBMS_DATA_MINING.CREATE_MODEL2(
    MODEL_NAME          => 'CUSTOMERS360MODEL_AD',
    MINING_FUNCTION     => 'CLASSIFICATION',
    DATA_QUERY          => 'SELECT * FROM CUSTOMERS360_V',
    CASE_ID_COLUMN_NAME => 'CUST_ID',
    SET_LIST            => v_setlst,
    TARGET_COLUMN_NAME  => NULL); -- NULL target indicates anomaly detection
END;
/
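Once the model exists, the labels and probabilities described earlier can be inspected with the SQL scoring functions. The following is a minimal sketch, not part of the original sample. It assumes that the CUSTOMERS360MODEL_AD model was built as shown above and that CUSTOMERS360_V is still available for scoring. The column aliases (pred_label, prob_anomalous) and the choice of the top 10 rows are illustrative only.

```sql
-- Sketch only: assumes CUSTOMERS360MODEL_AD was created as shown above.

-- Confirm the outlier rate recorded for the model.
SELECT setting_name, setting_value
FROM   user_mining_model_settings
WHERE  model_name   = 'CUSTOMERS360MODEL_AD'
AND    setting_name = 'EMCS_OUTLIER_RATE';

-- Rank customers by anomaly probability.
-- Class 0 denotes an anomaly; class 1 denotes a normal record.
SELECT cust_id,
       PREDICTION(CUSTOMERS360MODEL_AD USING *) AS pred_label,
       ROUND(PREDICTION_PROBABILITY(CUSTOMERS360MODEL_AD, 0 USING *), 4) AS prob_anomalous
FROM   CUSTOMERS360_V
ORDER  BY prob_anomalous DESC
FETCH  FIRST 10 ROWS ONLY;
```

Rows with prob_anomalous greater than 0.5 should carry a pred_label of 0, which matches the outlier rule stated earlier.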
To view the complete example, see https://github.com/oracle-samples/oracle-db-examples/blob/main/machine-learning/sql/23ai/oml4sql-anomaly-detection-em.sql.