Adding data in sandbox

5.1.1 Adding data in sandbox

Introduction

After creating the sandbox workspace, users must populate it with relevant data from the production schema. This data population step ensures that the sandbox contains representative and consistent data needed for configuring and executing stress testing projects.

The data added typically includes source data from the atomic schema such as instrument-level exposures, reference dimensions, time series variables, lookup tables, and any other information required to compute metrics, run models, and analyze portfolios. Populating the sandbox with accurate and complete data is essential for meaningful simulations and valid output results.

Although the system provides a default data ingestion pipeline as part of sandbox creation, user intervention is still required to initiate the population process. Users can define whether to overwrite existing data (truncate and insert) or append to it. They can also apply global or table-level filters, specify SQL conditions (for example, based on MISDATE), and optimize performance using JDBC properties and rejection thresholds.

This activity is not fully automated and is not triggered automatically when a stress testing project is initiated. It must be explicitly executed by the user. However, once a sandbox is populated, it can be reused for multiple stress testing cycles, as long as the data remains relevant. If fresh data is required for a new reporting period or simulation scenario, users must manually re-populate the sandbox with the updated dataset.

Performing this step ensures that the sandbox environment mirrors the necessary production data landscape, enabling accurate testing, validation, and analysis.

After creating the sandbox workspace, migrate the data from the production schema to sandbox schema.

Login to STSA.
The Workspace Summary page is displayed.
To populate (or add) data in the workspace, click action next to the corresponding workspace and select Populate .
The Populate Workspace page is displayed.
The first section displays the Workspace name, purpose, data and time when the workspace was created and the source of the data.
Depending on the type of data loading job, select one of the following options:
You can either overwrite the existing data (truncate and insert) or to append to the existing data.
- Overwrite: In this mode, the underlying tables are truncated (overwritten on existing data) followed by an insert operation.
- Append: In this mode, the underlying tables will be populated (added to the existing data) in the append mode.
Depending on what data you want to copy to the sandbox, select one of the following options:
- Selective: In this mode, only the tables filtered (selected in the Table level Data Filters) are populated.
- ALL: In this mode, all the underlying tables mapped to the workspace are populated along with the filters mentioned below for specific tables.
In the Data Filters - Global level, enter the data filter that you want to apply on all the tables selected for data sourcing.
For example: If MISDATE is equal to Today, then it is applied to all tables (wherever it is available) for selected Data Sources during population. If this field is not found (MISDATE) in the tables, it is not updated.

Else, click Use Template to select a json file.
After selecting the template, any pre-filled values is overridden with the template provided values.

For more information, see the Populate a Workspace section in the Oracle Financial Services Model Management and Governance User Guide .
In the Data Filters - Table level, provide the data filters individually on the tables. Select the table and then enter the SQL filter.

Note:
You can provide multiple table names for the same SQL filter.

Note:
Global filters are not applicable for those tables on which filters have been applied individually.
In the Data Filter - Hint, you can provide database Hints at table-level and SQL prescripts at schema level for data load performance improvement during workspace population.
To filter data from a large database, you can provide additional parameters in the Additional Parameters section.
Provide a source and target prescript mentioning the fetch size of JDBC properties and the batch commit size of JDBC properties for data upload.
In the Select Unlimited or Customize the Rejection Threshold section, select one of the following options:
- Unlimited - Here, all the errors will be ignored during the data population.
- Custom Rejection Threshold - Enter the maximum of number of inserts that may fail for any of the selected tables. You can provide the maximum number of inserts that can fail while loading data to a given table from all the sources. In case of threshold breach, all the inserts into the particular target schema will be rolled back. However, it will continue with populating the next target schema.
To load the data from production to sandbox schema, click Populate Workspace and then click Create and Execute batch.
The status of the job is displayed in the Execution History of the sandbox.