6 Creating Validation Checks

This section contains the following topics:

"About Validation Checks"
"Creating a Validation Check Batch"
"Creating a Validation Check"
"Installing Validation Checks"
"Running a Validation Check Batch"
"Modifying a Validation Check"
"Creating a Custom Validation Check"

See "Viewing and Running Validation Check Batches".

About Validation Checks

Validation checks—also called edit checks—are programs designed to identify flawed data: discrepancies (called queries in InForm). Each one must check for a single problem and apply the same text, state, and action to each discrepancy created. Validation checks can test any combination of data—including lab and InForm data and data from different CRFs—contained in a single data model.

The same flawed data point can have several discrepancies created against it raised by different validation checks; for example, if the value is both out of range and does not make sense compared to a related data point.

The first time a validation check is executed, it automatically creates a target table with columns corresponding to source table columns you select and a row for each discrepancy identified. Each time it runs it updates the table with new or changed data. The data in this table is displayed in the VC (Validation Check) Listings page.

Validation checks are part of a study or study template. When you apply a study template to a study, the validation checks from the study template are copied to the target study.

Automated Discrepancy Closure You can set up a validation check so that it closes any discrepancies it previously created if the underlying data point has been updated in such a way that it is no longer discrepant.

Manual Discrepancy Closure Alternatively, you can set up a validation check so that, when the underlying data point is no longer discrepant, the validation check marks the discrepancy as "Answered" but a data manager must review the discrepancy and data change and manually close it.

Custom Programs You can define most validation checks, including comparing related data points across tests, visits, or sources, directly in the Validation Check window. However, if you need a more complex program, you can write a program in SAS or PL/SQL and upload it to Oracle Life Sciences Data Hub (Oracle LSH) and reference the program from a validation check; see "Creating a Custom Validation Check" for more information.

Creating a Validation Check Batch

Validation checks are executed in batches of one or more. Before you create a validation check you must create the batch in which to execute it. You can use batches to group validation checks in logical ways, for example:

Checks that have dependencies on each other so they must be executed in a particular order.
Standard checks kept in a set for ease of reuse in many studies.
Checks that should all be executed manually, or triggered by the same event, or scheduled for the same frequency.

To create a validation check batch:

Select your study in the Home page and click Study Configuration.
Select Validation Checks at the bottom of the left pane.
Select the clinical data model whose data you want to check.
Click Create a New Batch.
Enter a name for the batch that is unique within the model. If it is not unique within the study, the system automatically appends "_1" to its name.
Enter a description (optional). The description is displayed in the places where you can run the batch.
If the validation checks in the batch should be run in a particular order select the Ordered Execution? check box. You specify the order when you create the validation checks.

Validation checks in unordered batches run in parallel.
Check Can be Triggered to allow the successful completion of a transformation or data load writing to the clinical data model that this batch runs against to trigger the execution of this batch.
Click OK.
Create the validation checks to run in the batch; see "Creating a Validation Check".

Creating a Validation Check

Before you create a validation check, you must create a batch in which to execute it; see "Creating a Validation Check Batch".
Select the batch. If it is not already checked out, click Check Out.
Click Create a New VC. Then see:
- "Building the Validation Check Query"
- "Defining Validation Check Details"

Building the Validation Check Query

To create the query for the validation check:

Enter a name and description for the query. The name is displayed on the VC Listings page; the description is not. The Listings page can display about 25 characters of the name without requiring scrolling to see the whole name. You may want to use a naming convention to make it easier to find queries later.
Select Authorize access to this listing for users without Blind Break rights if you know that only nonblinded data will be displayed in the listing. This option is available only if at least one source table contains blinded data and if you have the required blinding-related privileges.

If any source table is blinded at any level and this attribute is not selected, the system automatically blinds the entire target table, so that by default no data is displayed on the VC Listings page for this validation check. Only a user with the required Blind Break privileges can view the data.
Select Create VC Using a Custom Program if you need more complex logic than you can create here and you have already created the program in Oracle LSH; see "Creating a Custom Program in Oracle Life Sciences Data Hub". Then click the Edit icon to select the custom program.

If you select Create VC Using a Custom Program, select the source table(s) and skip the next steps. Your program must take care of selecting columns for display in the target table on the VC Listings page, creating joins, and specifying criteria if required.
Specify the columns to display; see "Selecting Columns to Display in VC Listings".
If needed, create joins between tables; see "Creating Joins".
Specify query criteria; see "Specifying Criteria".

Selecting Columns to Display in VC Listings

In the Select Columns tab, identify the columns you want to display for each record retrieved by the query. You can give columns a different header for display and combine or write expressions on one or more columns as required. The system creates a SELECT clause for the query based on your specifications.

In the Source pane, expand the node for the table or tables whose data you want to display in the VC Listings page for each record retrieved.
Select the columns you want to display. You can use Ctrl+click and Shift+click to select multiple columns at a time. You can also select a table to add all its columns and then remove the ones you do not want to display.
Click the Add (+) icon. The system displays the columns you selected under Selected Columns.
If you want to display a heading for the column different from the column name, enter it in the Alias field.
Expression: If you want to write an expression on a column, click its Edit (pencil) icon; see "Using the Expression Builder".

To write an expression that operates on multiple columns you must add each column to the same row:
1. After moving one column into Selected Columns, highlight it there.
2. Select the second column in the Source pan and click the arrow icon in the Source pane. The system adds the second column to the same row. If you need a third column for your expression, add it the same way.
3. Enter an alias for the column that will display the results of the expression.
For example, you may want to display the red blood cell count collected three times in a visit and then display the average of the three results. To do this, select the column for red blood cell count in the Source pane four times and add it to Selected Columns. Then select the last one under Selected Columns, select it again under Source and click the arrow icon, and then repeat. All three are now in the same row. Enter an alias such as "RBC Avg" in the row with all three. Then write an expression for each row to populate the first three columns with data from each sample and the last to calculate and display the average. See "Using the Expression Builder".

Creating Joins

To compare columns in different tables, you must define a join between the tables.

To create a join, select and add the tables to be joined in the Source pane and then:

Go to the Joins tab and click the Add (+) icon in the Joins pane. The system adds a row.
Click in the row. The system displays two drop-down lists containing the tables in the join.
Select the tables to be joined.
Specify if it is to be an outer join on either the left or right side. Leave the checkbox unselected to create an inner join.

An inner join (sometimes called a simple join) returns only those rows that satisfy the join condition.

An outer join extends the result of a simple join. An outer join returns all rows that satisfy the join condition and also returns some or all of those rows from one table for which no rows from the other satisfy the join condition:
- A left outer join returns returns all rows from Table 1 and only rows meeting the join condition from Table 2. Rows in Table 1 that do not have a corresponding row in Table 2 have null values in the columns from Table 2.
- A right outer join returns returns all rows from Table 2 and only rows meeting the join condition from Table 1. Rows in Table 2 that do not have a corresponding row in Table 1 have null values in the columns from Table 1.
- A full outer join returns all rows from both or all tables. Rows in either table that do not have a corresponding row in the other table have null values in the columns from the other table.
Click the Add (+) icon in the Join Details pane. The system adds a row.
Click in the row. The system displays two drop-down lists containing the columns for each table and another for operators.
Create the Join condition for the Where clause by selecting a column from each table and the operator required. For example, where both tables have a column called USUBJID, select those columns and select an operator of equals (=).

To specify additional Join conditions, click the Add (+) icon in the Join Details pane again as many times as required.
Click OK.

Specifying Criteria

In the Criteria tab, identify the table(s) whose columns you want to operate on to determine which rows to retrieve. The system creates a WHERE clause based on your specifications.

Select the table or tables you need in the Source pane and click the Add icon. If you need to operate on columns from two or more tables in the same expression, add them to the same row:
1. After moving one table into the Criteria tab, highlight it there.
2. Select the second table in the Source pan and click the arrow icon in the Source pane. The system adds the second table to the same row in the Criteria tab. If you need a third table for your expression, add it the same way.
Click the edit icon to open the Expression Builder. See "Using the Expression Builder".
Save. The system generates a system PL/SQL program.

Defining Validation Check Details

Validation checks must be associated with a batch of validation checks that are executed together.

Select the batch. If it is not already checked out, click Check Out.
Click Create a New VC.
Enter a name for the validation check that is unique within the batch. If it is not unique within the batch, the system automatically appends "_1" to its name. Avoid using special characters except underscore (_) and reserved words; see "Naming Objects".
All the other fields affect the discrepancies identified by the validation check:
- Allow Auto Close: If selected, rerunning the validation check automatically closes any discrepancy it created if the underlying data point is modified so that it no longer satisfies the criteria of the validation check.
  
  If not selected, rerunning the validation check changes the state of discrepancies not satisfying the criteria to Answered, so that they can be manually closed after review.
- Category: Select a category for the validation check. The validation check applies this category to each discrepancy it creates. Users can filter and count discrepancies by category.
  
  See "Viewing and Creating Categories".
- Discrepancy Initial State: Select a state to be applied to discrepancies created by this validation check: Open or Candidate. You can require review in Oracle DMW or send discrepancies to InForm immediately in either state.
- Discrepancy Text: Enter the text to be displayed for the discrepancy. This is the message that will be used for communication with InForm, lab, or Oracle DMW users. Describe the problem with the data and/or the action required.
- Initial Discrepancy Action: Select the action you want the validation check to immediately perform on the new discrepancies, if any; see "Actions".
- Primary Source Table: You must designate one data point as the primary one against which discrepancies are created. Select the table that contains the primary data point. If the validation check logic processes only one data point, select its table. If the logic processes two or more data points, you must designate one of them as primary.
  
  Note:
  You can ensure that all data points are displayed with the discrepancy in the VC Listings page by adding their columns in the Selected Columns tab.
- Primary Source Column: Select the column that contains the primary data point.
- Execution Order: Enter a number to indicate the order in which this validation check should be run, in relation to other validation checks in the same batch. The system runs the lowest numbered check first, then the next highest, and so on. The numbers do not need to be consecutive, and you may want to use numbers divisible by 10, for example, so that you can add a new validation check at any point. This setting is available only in batches with ordered execution. The default value is 100. Up to 6 digits are allowed.
- Continue on Error: If selected, the batch continues to execute subsequent validation checks even after one fails. This setting is available only in batches with ordered execution. Unordered batches always run all validation checks.

Installing Validation Checks

Before you can run a validation check, you must select its batch and select Install. This creates database packages for the generated program. If any source tables are not installed, the Installation job tries to install the whole clinical data model.

The batch must be Installable to be installed. It is not Installable if:

It does not have any validation checks.
One or more source tables are not installable.
If it has a custom program, the program is not installable, which may be due to not having source code or table descriptors.

If the batch does not install successfully, the system displays an error message with a Job ID. You can view the log file on the Home page's Validation Checks tab.

Running a Validation Check Batch

After you have installed the validation check batch, you can run it in the Home page, Validation Check tab.

You can run the job at any time on demand or schedule it to run on a regular basis or link it to other jobs as part of a scheduled or triggered forward-chained execution; see "Viewing and Running Validation Check Batches".

The resulting data is displayed in the VC (Validation Checks) Listings page.

Modifying a Validation Check

To make any change to a validation check other than enabling or disabling it, you must check out the batch. To make most changes, click Update and follow instructions for "Creating a Validation Check".

The system also supports the following types of changes:

"Reordering Validation Checks"
"Disabling a Validation Check"
"Upgrading a Validation Check Batch"

Reordering Validation Checks

To change the execution order of an ordered batch, click Re-order. The system displays all validation checks in the batch with their Execution Order number. Edit the numbers as required. They do not need to be consecutive. The system runs the validation check with the lowest number first, then the next higher number, and so on. For example, it would run checks with these numbers in the following order: 20, 30, 35, 36, 40, 100, 200.

Note:

If a validation check has Enabled set to No, the system ignores its execution order number and does not run it.

Disabling a Validation Check

You can prevent a validation check from being executed when the batch is executed by selecting it and then selecting Disable. To include in the batch execution after disabling it, select it and then select Enable.

You can change this setting when the validation check batch is checked in or out.

Upgrading a Validation Check Batch

If there have been metadata changes in the source clinical data model—for example, change of column length—that affect any validation check in a batch, the batch becomes upgradable.

Go to the Validation Check Batch list in either the Home or the Study Configuration page.
Select the batch and click Upgrade. The system updates the table and column mappings behind the scenes.

Creating a Custom Validation Check

If you want to write a program that is too complex for the user interface, you can write a PL/SQL or SAS program in an editor and upload it to an Oracle Life Sciences Data Hub (Oracle LSH) program, then associate it with a validation check. For example, if you need to refer to a table outside the data model such as shared codelist tables, lab reference ranges, or SI conversion factors, you must create a custom program and use a static reference within it.

You also need a custom program to call APIs for assigning flags to records or applying actions to records.

If you have already written an appropriate SAS or PL/SQL program outside of Oracle LSH, you can upload it to an Oracle LSH program. However, to support data lineage tracing, you will need to add columns to the target table and populate them as described below.

Note:

To create and use a SAS program, you must purchase SAS separately and integrate it with Oracle LSH. See the Oracle Life Sciences Data Hub Installation Guide for instructions.

Create the program the same way you do for a transformation; see "Creating a Custom Program in Oracle Life Sciences Data Hub" except for this additional task:

Each validation check should have a single target table. To support data lineage tracing, this target table must have one column per source table to store the surrogate key for the source table, which is a concatenation of the primary key values for each record; see "How the System Tracks Data Lineage".
1. In the Oracle LSH Program page, create a new Table Descriptor and Table instance for the target table.
2. In addition to the columns you need to display the results of the validation check, add one column for each source source table with a name like source_table_SKEY.
3. In the source code, populate each of these columns with the value from the CDR$SKEY column of each source table.
Define the validation check in a batch and enter values for all required fields.
Select Create VC using a Custom Program. The system displays a window where you can select the source table(s) for the custom validation check program.
When the Select a Program icon appears, click it and select the program. You can use the Query By Example fields above any column to search for all or part of a value in that column; for example, enter %cardio% to search for a program in the Cardiology category your company has set up; see "Setting Up Custom Program and Function Categories".