Validation Workflow
The MARM validation process is a systematic, six-stage flow designed to ensure model integrity—from initial data profiling to final report generation.
- Initiation Methods: Validation is launched from the Model Inventory:
- Model Inventory: Perform the following:
- Search for the model in the model inventory list
- Click on the model’s Name to open the Model Details View page.
- Click the Start New Validation button for a fresh validation, or Resume Validation / Discard Draft if a validation is already in progress.
- Managing Validation States
- Start New Validation: Begins a fresh validation run for the selected model.
- Resume Validation: If a validation is already in progress, the system identifies the existing draft, allowing you to pick up exactly where you left off.
- Discard Draft: If the current analysis is no longer relevant, you can discard the active draft to enable a fresh validation launch.
- Operational Example: RETAIL_PD_MODEL_V1. To begin the workflow for our PD model:
- Navigate to the Model Inventory and search for RETAIL_PD_MODEL_V1. If multiple versions exist under the same name, the validator must select the appropriate version.
- Click the name RETAIL_PD_MODEL_V1 in the list view to open the Model Details View page for this model.
- Select Start New Validation, or Resume Validation if a validation is already in progress. The system automatically pulls the latest artifacts (code/data) to commence validation.
Data Input and Profiling
This stage provides a deep dive into the data health and stability of the model, ensuring the input features remain consistent with the data used during the initial training phase.
- Drift and Distribution Analysis: The system automatically executes Python-based statistical profiling to identify Data Drift over time. It assesses the skewness and normality of the current dataset to determine whether the model’s original assumptions still hold.
- Comparison Configuration: To perform the analysis, configure the following:
- Current Dataset: Define the period by entering a specific Date Range or selecting from presets: Last 7 days, 30 days, or 90 days.
- Baseline Dataset: Select the reference point for comparison, either the original Training data or the Last Validation Run (which automatically fetches the relevant historical dates and data).
- Execution and Visualization: Upon clicking Compute, the engine compares the datasets, automatically plotting distribution graphs for every feature. The system identifies "Drifted Features" where the statistical variance has breached system-defined thresholds.
- Operational Insights: These profiles provide the technical justification for model adjustments, such as tweaking specific parameters or adding new data transformations to handle outliers or noise.
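The drift check described above can be sketched with a common drift statistic such as the Population Stability Index (PSI). This is an illustrative example only: the actual metric, binning, and thresholds MARM applies are system-defined, and the data below is synthetic.

```python
# Illustrative PSI-based drift check (not the MARM engine itself).
import numpy as np

def psi(baseline, current, bins=10):
    """Population Stability Index between baseline and current distributions.
    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 significant."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    c_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Floor the proportions to avoid log(0) / division by zero in sparse bins
    b_pct = np.clip(b_pct, 1e-6, None)
    c_pct = np.clip(c_pct, 1e-6, None)
    return float(np.sum((c_pct - b_pct) * np.log(c_pct / b_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(680, 50, 5000)  # e.g. credit scores at training time
current = rng.normal(650, 55, 5000)   # current applicant pool, skewed lower
print(f"PSI: {psi(baseline, current):.3f}")  # exceeds the 0.1 stability threshold
```

A feature whose PSI breaches the configured threshold would be flagged as a "Drifted Feature" and surfaced in the distribution graphs.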
Operational Example: RETAIL_PD_MODEL_V1
- Selection: The validator selects the Last 90 days as the Current Dataset and checks the Training dataset option in the Baseline section to compare against the model's inception logic.
- Computation: After clicking Compute, the system generates drift graphs for all input variables.
- Observation: The system flags that the Average Credit Score in the applicant pool has drifted significantly (skewed lower) compared to the baseline. This insight is automatically captured in the drift table.
- Action: The validator notes in the comments that the model may need a recalibration of weightings to remain accurate for the current economic climate and proceeds to the next validation step.
Outcome Analysis & Backtesting
- Core Functionality: Comparison of model predictions against actual outcomes for a selected production period, with reference to historical benchmarks (training or previous validation results).
- Workflow Process:
- Current Dataset Selection: Define the period by inputting a specific Date Range or selecting from presets (Last 7 days, 30 days, or 90 days).
- Baseline Selection: Choose the reference benchmark—either the Training dataset or the Last Validation Run—to automatically fetch comparison dates & data.
- Dynamic Metric Filtering: Upon selecting the datasets, the system automatically filters and displays relevant statistical techniques (for example Gini, AUC, F1-Score) based on the Model Type (Regression or Classification).
- Execution and Validation: Clicking Compute triggers the engine to calculate and display a side-by-side metric comparison across the current, baseline, and last-validation datasets. The Validator reviews these results, adds step-specific comments, and proceeds. The system supports a comprehensive set of statistical metrics:
- For regression models: MAE, MSE, RMSE, MAPE, R² Score, Median AE, Max Error, Percentiles, Mean Error, Std. Error.
- For classification models: True Positive, True Negative, False Positive, False Negative, Accuracy, Precision, Recall, F1 Score, AUC ROC, PR AUC, Log Loss, Brier Score.
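To make the classification metrics concrete, the confusion-matrix family listed above can be computed from raw predictions as follows. This is a minimal pure-Python sketch with made-up labels, not MARM's internal implementation.

```python
# Illustrative computation of the listed confusion-matrix metrics.
def classification_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "TP": tp, "TN": tn, "FP": fp, "FN": fn,
        "Accuracy": (tp + tn) / len(y_true),
        "Precision": precision,
        "Recall": recall,
        "F1": 2 * precision * recall / (precision + recall) if precision + recall else 0.0,
    }

# Actual outcomes (1 = default) vs. model predictions for a small scoring window
m = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0])
print(m)  # Accuracy, Precision, Recall and F1 all equal 0.75 on this toy sample
```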
Operational Example: RETAIL_PD_MODEL_V1
- Define Scope: The validator selects the Last 90 days as the Current Dataset. Under the Baseline section, they select Training and Last Validation Run to enable a three-way performance comparison.
- Technical Selection: Because the system recognizes the PD model as a Classification type, it automatically populates metrics such as Gini and AUC ROC.
- Compute: The validator clicks Compute. The system generates a report comparing the current Gini coefficient (e.g., 0.62) against the Training Gini (e.g., 0.65) and the Last Validation Gini (e.g., 0.64).
- Results and Documentation: The system displays the performance graphs and metric tables. The validator observes that while there is a slight dip, the performance remains within acceptable limits.
- Finalize: The validator enters a comment noting the stability of the AUC/ROC curves and saves the step to proceed.
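As a quick sanity check on figures like those above, the Gini coefficient of a binary classifier relates to AUC by the standard identity Gini = 2·AUC − 1. The AUC input of 0.81 below is hypothetical, chosen only to reproduce the example's current Gini of 0.62.

```python
# Standard identity for binary classifiers: Gini = 2 * AUC - 1.
def gini_from_auc(auc: float) -> float:
    return 2 * auc - 1

print(round(gini_from_auc(0.81), 2))  # an AUC of 0.81 corresponds to a Gini of 0.62
```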
Conceptual Soundness
- Questionnaire-Based Assessment: The validator completes a targeted questionnaire by selecting the appropriate options for each question.
- Final Review and Commentary: After answering the questions, the validator provides an overall summary comment for the stage before moving to the next step.
- Questionnaire Configurability: To align with specific bank standards, the entire questionnaire is fully configurable. Users can update the questionnaire by modifying the JSON configuration file (questionnaire.json) in the MARM backend, located at configuration\configuration\marm\conceptual_soundness, without requiring code changes. A template file named questionnaire.template.json in the same folder can be used to design a new questionnaire.
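The authoritative schema is the one in questionnaire.template.json; the fragment below is a purely hypothetical illustration of what a configurable question entry might look like, and its field names are assumptions rather than the real schema.

```json
{
  "questions": [
    {
      "id": "CS-01",
      "text": "Is the modelling methodology appropriate for the model's intended use?",
      "options": ["Yes", "Partially", "No"]
    },
    {
      "id": "CS-02",
      "text": "Are the model's key assumptions documented and still valid?",
      "options": ["Yes", "Partially", "No"]
    }
  ]
}
```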
Operational Example: For RETAIL_PD_MODEL_V1, the validator answers the questions as per the bank's standards and provides a final assessment of the model's theoretical fitness for credit risk.
Model Tiering Review
- Automated Computation: The system computes the final Risk Tier based on standard business logic. The validator selects scores for specific business parameters, such as Financial Impact and Regulatory Scope.
- Tiering Criteria Configurability: All tiering parameters and scoring weights are fully configurable via the back end to align with internal organizational policies. An administrator can update the criteria by modifying the JSON configuration file criteria.json in the MARM back end, located at \configuration\configuration\marm\tiering, without requiring code changes. A template file named criteria.template.json in the same folder can be used to design new criteria.
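The weighted scoring behind the automated tier computation can be sketched as follows. The parameter names, class scores, weights, and tier thresholds here are hypothetical stand-ins; the real values live in criteria.json.

```python
# Illustrative weighted tier scoring (all criteria, weights and thresholds
# are hypothetical; the actual logic is driven by criteria.json).
CRITERIA = {
    "Financial Impact": {"weight": 0.4, "classes": {"Low": 1, "Medium": 2, "High": 3}},
    "Regulatory Scope": {"weight": 0.6, "classes": {"Low": 1, "Medium": 2, "High": 3}},
}

def compute_tier(selections):
    """Sum weighted class scores, then map the total to a risk tier."""
    total = sum(CRITERIA[name]["weight"] * CRITERIA[name]["classes"][cls]
                for name, cls in selections.items())
    if total >= 2.5:
        return total, "Tier 1 (High Risk)"
    if total >= 1.5:
        return total, "Tier 2 (Medium Risk)"
    return total, "Tier 3 (Low Risk)"

# Validator selects "High" for every parameter, as in the example below
score, tier = compute_tier({"Financial Impact": "High", "Regulatory Scope": "High"})
print(score, tier)
```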
Operational Example: For RETAIL_PD_MODEL_V1, the validator reviews each criterion, reads its description and the classes within it, and assigns a score. For this model, the validator selects the "High" class for "Regulatory Scope" and the other parameters. The system then automatically calculates the model's final Tier based on the configuration.
Final Review
- Findings Consolidation: The system presents section-wise findings from prior validation stages (Data Profiling and Data Analysis, Outcome Analysis and Backtesting, Conceptual Soundness, and Model Tiering Review), including comment counts and full comment history (author, timestamp, and message), so validators can complete closure with full context.
- Validation Schedule Decision: Validators can explicitly choose whether to set the next validation date. If enabled, a date is selected in-line; if Do Not Set Now is selected, no date is recorded for this cycle.
- Tier Outcome Summary: The page displays the computed model tier result from Model Tiering Review, including tier, risk level, total score, and recommendation text for final reference.
- Finding Summary By Section: A consolidated log that maps the validator's specific comments and qualitative insights to their corresponding workflow stages, providing a centralized audit trail of all identified issues and comments.
- Final Decision and Reviewer Notes: Validators select a final decision (Approve / Require Changes / Reject) and can add final reviewer comments to document closure rationale, residual risk, and sign-off notes.
- Completion and Submission: The Finish action is enabled after final decision selection, and submission completes the run and returns the user to Model Details with confirmation.