7 Batch Loading Data


The Batch Data Loader can load large amounts of data from sources such as labs, avoiding manual data entry.

The Batch Data Loader is designed only to update response data, not header keys. To update the header keys, you must perform a hard delete and reload the data. See "Batch-Deleting Data" for instructions and cautions about this procedure.

The Batch Data Loader cannot handle the following types of data:

  • DCI Modules that specify clinical planned events

  • DCI/DCM combinations that have different collect time flags; that is, both the DCI and the DCM must collect time data, or neither must.

  • DCI/DCM combinations with different dates and times

  • Optional DCMs

In a flexible study, after completing the batch data load, the system automatically runs the incremental expectedness calculation job for all patients whose data changed. See Oracle Clinical Creating a Study for more information.

7.1 Setting Up for Batch Data Loading


7.1.1 Defining a Rollback Segment

To prevent exceeding the rollback segment space and causing the entire job to fail, you can choose to run the batch data loader in its own rollback segment. Define the rollback segment as follows:

  1. Define and tune a rollback segment and bring it online, following instructions in the Oracle SQL Reference manual.

  2. Enable the segment for use in Oracle Clinical by adding the entry BDL_ROLLBACK_SEG to the reference codelist OCL_STATE. For further information on this reference codelist, see the Oracle Clinical Administrator's Guide.

7.1.2 Defining and Using Format Masks

To batch-load data into Oracle Clinical, you must create or select a format mask with which you specify how the data in your files corresponds to fields in Oracle Clinical. Masks achieve this correspondence in two different ways:

  • Positional masks define a length for each column in the data file; for example, the STANDARD format mask (described below) assigns the first ten characters to the Investigator, the following ten characters to the site name, and so on.

  • Delimited masks have no set column length and separate column values with a user-defined delimiter character such as a comma (,). Flexibility in field length can save space.

Note:

You cannot load multibyte data using positional format masks, including the STANDARD format mask supplied by Oracle. Use delimited masks for this purpose.

Oracle Clinical ships with a predefined format mask of the fixed or positional type. You can define your own format masks as required.

7.1.2.1 STANDARD Mask Format

The STANDARD mask is the format used by default when a new data file is created. Table 7-1 lists the expected order of column values for each record, with the first ten characters (1-10) allotted to the first column value, the second ten characters (11-20) allotted to the second column value, and so on. Certain columns are mandatory.

Table 7-1 Default Batch Data Load Fields

Each entry below gives the field name, its position and type, its description, and whether it is mandatory.

Investigator (01:10 CHAR): Investigator. Mandatory: No

Site (11:20 CHAR): Site. Mandatory: No

patient (21:30 CHAR): Patient. Mandatory: Yes

document_number (31:50 CHAR): Document Number. Leave document_number blank if your location uses document number derivation (see "User-Defined Identifier Handling" in the Oracle Clinical Application Programming Interface Guide). Otherwise, leaving it blank causes errors when loading multiple DCMs/DCIs at once. If a document number is given, it is used even when document number derivation is defined. Mandatory: No

clin_plan_event_name (51:70 CHAR): Clinical Planned Event. Mandatory: Yes

subevent_number (71:72 NUM): Subevent Number. Mandatory: No, unless there is no value for dci_date.

dci_date (73:80 CHAR): Visit Date (YYYYMMDD). dci_date = dcm_date, and dci_time = dcm_time; if dci_time = null, both dci_time and dcm_time = null. Note: Do not enter zero to represent an unknown part of the date (for example, 20110000). Instead, leave the date partial (for example, 2011). Mandatory: No, unless there is no value for subevent_number.

dci_time (81:86 CHAR): Time of Visit (HHMMSS). Mandatory: No

dci_name (87:116 CHAR): DCI Name. Mandatory: Yes

dcm_name (117:132 CHAR): DCM Name. Mandatory: Yes

dcm_subset_name (133:140 CHAR): DCM Subset Name. Mandatory: Yes

dcm_question_grp_name (141:170 CHAR): Question Group Name. Mandatory: No

dcm_question_name (171:190 CHAR): Question Name. Note: Before adding a question, make sure that it is not derived. You can check by navigating to Glib, then Questions, then Questions, and querying for the question. Mandatory: Yes

dcm_que_occ_sn (191:193 NUM): Question Occurrence Sequence Number (0 if none). Mandatory: Yes

repeat_sn (194:196 NUM): Repeat Sequence Number (1 if no repeats). Mandatory: Yes

value_text (197:396 CHAR): Value Text for the question. If the question has a default value, repeating or non-repeating, provide the user data here if it is not otherwise provided. Mandatory: No

data_comment_text (397:596 CHAR): Data Comment Text. Mandatory: No

qualifying_value (597:666 CHAR): Qualifying Question Value Text, if applicable. The qualifying value is mandatory if required by the DCM definition; it is ignored otherwise. Mandatory: No

Study (667:681 CHAR): Study. Mandatory: No
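The fixed-width layout above lends itself to scripted file generation. The following sketch (illustrative, not part of Oracle Clinical; the helper name and sample values are hypothetical) pads each field of Table 7-1 to its positional width:

```python
# Sketch: build one STANDARD-format record (positional mask, Table 7-1).
# Field widths follow the table; blank fields become runs of spaces.

FIELDS = [  # (name, width)
    ("investigator", 10), ("site", 10), ("patient", 10),
    ("document_number", 20), ("clin_plan_event_name", 20),
    ("subevent_number", 2), ("dci_date", 8), ("dci_time", 6),
    ("dci_name", 30), ("dcm_name", 16), ("dcm_subset_name", 8),
    ("dcm_question_grp_name", 30), ("dcm_question_name", 20),
    ("dcm_que_occ_sn", 3), ("repeat_sn", 3), ("value_text", 200),
    ("data_comment_text", 200), ("qualifying_value", 70), ("study", 15),
]

def standard_record(values):
    """Pad each value to its fixed width; unspecified fields become spaces."""
    parts = []
    for name, width in FIELDS:
        text = str(values.get(name, ""))
        if len(text) > width:
            raise ValueError(f"{name} exceeds {width} characters")
        parts.append(text.ljust(width))
    return "".join(parts)

rec = standard_record({
    "patient": "2", "clin_plan_event_name": "VISIT1",
    "subevent_number": "0", "dci_date": "20110101",
    "dci_name": "HEMA", "dcm_name": "HEMA", "dcm_subset_name": "HEMA",
    "dcm_question_name": "HCT", "dcm_que_occ_sn": "0",
    "repeat_sn": "1", "value_text": "43",
})
assert len(rec) == 681            # total width from Table 7-1
assert rec[20:30].strip() == "2"  # patient occupies columns 21-30
```

The total record width (681 characters) matches the end position of the Study field in the table.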


7.1.2.2 User-Specified Formats

You can specify your own input formats to stipulate different column positions and different lengths from the defaults. You can also specify the delimiter to separate fields, removing the restriction of the fixed field format.

To define a format:

  1. From the Conduct menu, select Batch Data Load, then select Batch Load Formats.

  2. Enter the name of the format in the Name field.

    STANDARD is the name of the format mask provided by Oracle.

  3. Specify a delimiter if you are creating a delimited mask or leave the Delimiter field blank if you are creating a positional mask.

  4. The Valid box is not enterable. See Step 9.

  5. Press Enter or Tab. The system enters values under Format Components: Seq, Field Name, Mandatory, and Field Length.

  6. Start indicates the start column for this field in the data file. An entry of zero means the field is not provided, so the field is ignored. If the entry is not zero, this field is filled automatically with a number and the delimiter.

  7. Default Value is not used by the system.

  8. Change the status of the format if you want, and if Valid=Y. The default is P for provisional.

  9. Save. The system runs a check on your format definition. If it is valid, the system checks the Valid box. If the format definition is not valid, the system does not check the Valid box and the format does not appear in the LOV for Format Mask in the Batch Data Load window.

Table 7-2 Batch Load Format Fields: Upper Part

Each entry below gives Name (Size; Data Type): Description.

Name (16; free text): Name of the format.

Status (1; CHAR): Status of the format. The default is P for provisional; other values are A for active and R for retired.

Delimiter (1; CHAR): Special symbol you choose to separate field values in the data file if you are not using a positional mask such as STANDARD.

Valid (box): Whether the mask is valid; a checked box denotes validity.


Table 7-3 Batch Load Format Fields: Format Components

Each entry below gives Name: Description.

Seq: Sequence of the format components in the database table. This order is provided automatically by the system. You can change the order, which is particularly useful for delimited masks.

Field Name: Name of the field to which the format applies; system-provided.

Mandatory: A box indicating whether the field is required. The information is provided automatically by the system.

Start: The position in the data file where the field to which you are applying the format starts. Either specify a delimiter in the upper part of this window to indicate position or, if the Delimiter field is blank, enter the start position here. If you leave this field blank, it remains null. A zero means the field is not used, because a delimiter exists.

Field Length: Number of characters or digits allowed for the field to which you are applying the format. The defaults are entered automatically and cannot be changed.

Default Value: The system does not use the value in this field.


7.1.2.3 Normalized Lab Data

You can identify data as normalized lab data by defining a repeating Question Group with a single pair of questions, one to collect the lab test name and the other to collect the lab test value; see the chapter on DCMs in Oracle Clinical Creating a Study.

When you batch load lab data in a normalized format, the system automatically generates two records from each incoming normalized lab data record, for both inserts and updates. The lab test name is then stored as a Question response and you can use it in data extract and in validation Procedures to identify the test name for each test result.

For example, from records with a normalized input file format as follows:

,,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,HCT,0,1,43 
,,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,HGB,0,2,17 

where the lab test names are HCT and HGB, the respective lab test values for the patient are 43 and 17, the lab test Question name is LTEST and the lab test result name is LVALUE, the Batch Data Load process converts the preceding two input records to the following four records:

,,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,LTEST,0,1,HCT 
,,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,LVALUE,0,1,43 
,,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,LTEST,0,2,HGB 
,,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,LVALUE,0,2,17 
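The expansion above can be sketched in a few lines. The field indexes follow the delimited example (dcm_question_name is the thirteenth field, value_text the sixteenth), and the function name is hypothetical:

```python
# Sketch of the normalized-lab expansion performed during batch load:
# each incoming record (question name = lab test name, value = result)
# becomes two records, one carrying the test name, one the result.

QUESTION = 12   # index of dcm_question_name in the delimited record
VALUE = 15      # index of value_text

def expand(record, test_q="LTEST", value_q="LVALUE"):
    f = record.split(",")
    test_name, result = f[QUESTION], f[VALUE]
    name_rec = f[:QUESTION] + [test_q] + f[QUESTION + 1:VALUE] + [test_name]
    value_rec = f[:QUESTION] + [value_q] + f[QUESTION + 1:VALUE] + [result]
    return ",".join(name_rec), ",".join(value_rec)

rec = ",,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,HCT,0,1,43"
name_rec, value_rec = expand(rec)
assert name_rec == ",,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,LTEST,0,1,HCT"
assert value_rec == ",,2,,VISIT1,0,20110101,,HEMA,HEMA,HEMA,LAB,LVALUE,0,1,43"
```

The two assertions reproduce the first pair of output records shown above.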

7.2 Batch Data Load Stages

The batch load process includes several stages: Load, Prepare, and Transfer. You can process data one stage at a time, checking the error files after each stage before proceeding to the next, or perform a combination of stages; see "Using the Batch Data Loader".

7.2.1 Load

Moves the data in the data files from the server to the Oracle Clinical database table Temporary Batch Data Items (TBDI).

7.2.2 Prepare

Updates the TBDI table, and creates the Temporary Batch DCI and Temporary Batch DCM tables. During this stage the system looks for completeness and consistency of the records, whether the keys are valid, and checks for locked RDCMs.

Also during this stage, records from the data files are divided into INSERT and UPDATE records. INSERT records are those that do not already exist in Oracle Clinical, and need to be created; UPDATE records are those that have updated information for Received DCIs and Received DCMs that already exist in Oracle Clinical. These records are termed pending inserts and pending updates, respectively.

You can run the Detail Load Impact Report, or the Summary version of this report, after the Prepare stage to get a list of how many RDCMs have been updated or inserted.

7.2.3 Transfer

Moves the data from the temporary database tables (Temporary Batch Data Items, Temporary Batch DCIs, and Temporary Batch DCMs) to the permanent Oracle Clinical tables (RDCMs, RDCIs, Responses, and Actual Events) if necessary.

7.3 Standard Batch Data Load and Out of Study Batch Data Load

Oracle Clinical offers two batch data load processes. Both processes can handle multiple data files at a time.

Standard Batch Data Load This option can handle only a single DCM in a single DCI in each data file; this usually corresponds to a single, relatively simple CRF (see Oracle Clinical Creating a Study for information on DCMs and DCIs). All data must be for a single study.

To use this process, select Batch Data Load from the Conduct menu.

Out of Study Batch Data Load This option can handle multiple DCMs in a DCI, and multiple DCIs in a data file and can process data from multiple studies at the same time. You may choose to use the Out of Study option even for a single study because it allows multiple DCMs and DCIs per data file.

To use this process, select Out of Study Load from the Conduct menu.

7.4 Using the Batch Data Loader


7.4.1 Preprocessing Data Files

Before you start loading data into Oracle Clinical, some preprocessing may be necessary to bring data into conformity with required format standards.

If you are using a positional or fixed format, including the shipped STANDARD mask, each value occupies a fixed column position, as listed in Table 7-1. If there is no data in a field, that field must contain spaces for the expected number of characters to make the record conform to the fixed format. However, you can leave the fields for subevent number, investigator, site, document number, or DCM question group name blank, in which case the individual field is calculated automatically. (See "Subevent Number Not Supplied" for a discussion on what occurs when the subevent number is left blank.) If the data is not in the correct fixed format, batch data load will not succeed.

If you have not already done so, check the format mask to make sure that the target fields do not include derived questions; loading into a derived field can cause the job to fail, corrupt data, or prevent derivation for other documents. See "Defining and Using Format Masks" for information about format masks, and see "Loading Data into Derived Question Fields" for details about risks and remediation.

Note:

Do not leave any empty lines in the load file. The system treats each line in the load file as a record; an empty line is taken as an empty record and raises an error.
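As a sketch of this preprocessing, assuming the full-width STANDARD layout of Table 7-1, a script could reject empty lines and wrong-width records before submission. The checks are illustrative only and do not replace the loader's own validation; your site's conventions may differ (for example, trailing blanks may be handled differently):

```python
# Illustrative pre-check for a positional (STANDARD-mask) load file:
# every line should be exactly 681 characters of fixed-width fields
# (blank fields padded with spaces), and empty lines are not allowed
# because the loader treats each line as a record.

RECORD_WIDTH = 681  # total width of the STANDARD mask (Table 7-1)

def precheck(lines):
    errors = []
    for n, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            errors.append(f"line {n}: empty line (would load as an empty record)")
        elif len(line) != RECORD_WIDTH:
            errors.append(f"line {n}: {len(line)} chars, expected {RECORD_WIDTH}")
    return errors

good = "X".ljust(RECORD_WIDTH)
assert precheck([good]) == []
assert precheck([good, ""])[0].startswith("line 2: empty")
```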

7.4.1.1 Subevent Number Not Supplied

When a subevent number is not supplied in the data file, the system derives one by matching existing data for the patient against data being processed in the current batch load. The algorithm comprises these steps:

  1. For each patient/clinical planned event, the system derives a table of existing subevent numbers, and the minimum/maximum dates/times associated with them.

    All data files being loaded as a group, as well as data in the Received DCMs and Received DCIs tables, are subject to the algorithm's actions.

  2. The system processes each null subevent number in the incoming data file(s) in ascending date/time order. This is achieved by checking, in the order of existing or already allocated subevents, whether the DCM date and DCM time of the new data falls within the minimum/maximum dates/times of the existing subevents.

    For the purpose of matching, null DCM times are treated as 00:00:00, or the earliest time for that date.

  3. If the new data falls within the range of an existing subevent or one already allocated, it is assigned that subevent number.

  4. If the new data does not fall within the range of an existing subevent, a new one is allocated, and the date and time of the new data is associated with it. Subsequently processed records that match are therefore allocated the same subevent number.
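The matching steps above can be sketched as follows. The data structures and names are hypothetical; the real processing also consults the Received DCMs and Received DCIs tables and all files in the load group:

```python
# Sketch of the subevent-derivation algorithm described above.
# `existing` maps subevent number -> (min_datetime, max_datetime) for one
# patient/clinical planned event.
from datetime import datetime

def assign_subevent(existing, dcm_date, dcm_time=None):
    # Null DCM times are treated as 00:00:00, the earliest time that day.
    ts = datetime.strptime(dcm_date + (dcm_time or "000000"), "%Y%m%d%H%M%S")
    # Step 3: reuse a subevent whose min/max range covers the new timestamp.
    for subevent, (lo, hi) in sorted(existing.items()):
        if lo <= ts <= hi:
            return subevent
    # Step 4: otherwise allocate a new subevent anchored at this timestamp,
    # so later records in the same load that match get the same number.
    new = max(existing, default=-1) + 1
    existing[new] = (ts, ts)
    return new

events = {0: (datetime(1999, 12, 1), datetime(1999, 12, 1))}
assert assign_subevent(events, "19991201", None) == 0      # falls in range
assert assign_subevent(events, "19991202", "080000") == 1  # new subevent
```

Note that, as described below, this matching deliberately contains no collision check against existing Received DCMs; that check happens in later phases of the Prepare stage.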

You cannot use this matching algorithm to automatically allocate subevent numbers for duplicate DCMs, because it does not include a check on whether the new subevent assignment will cause a collision with existing Received DCMs. For example, if data for subevent 0 already exists at 01-DEC-99 00:00:00, and new data for an existing DCM is processed with the same date, it will be assigned subevent 0 and will be detected by later phases in the Prepare stage as an update of the existing data. If the new data has a different document number, the document will be rejected during Prepare as a mismatched update.

Similarly, new data with time information that falls within the range of existing data will be matched to the existing subevent and will later be rejected as a mismatched update on time. For example, if data for subevent 1 exists at 01-DEC-99 00:00:00 and 02-DEC-99 00:00:00, and new data for a DCM that exists (at 01-DEC-99 08:00:00, for instance) is processed at 01-DEC-99 12:00:00, it will be assigned subevent 1 and be detected by later phases in the Prepare stage as an update of the existing data. Since the times differ, it will be rejected.

7.4.1.2 Loading Data into Derived Question Fields

Batch Data Load does not prevent you from loading data into a derived field. However, this can cause problems. This section covers a few examples.

Target field is a derived question that is associated with a Thesaurus DVG

If Batch Data Load completes successfully and adds data for a derived question that is associated with a Thesaurus DVG and the loaded response is not valid in the DVG, this raises a thesaurus univariate discrepancy.

Subsequently, when the derivation procedure updates the response, the response record is not versioned, DISCREPANCY_INDICATOR = U for the derived question response record, VALIDATION_STATUS = CNN, and the Thesaurus univariate discrepancy is obsoleted with the resolution DATA CHANGE, but the discrepancy indicator does not update.

Lastly, if a second Batch Data Load targets the question, the Transfer stage fails with the error:

Updating discrepancy entries - rows to update
Fatal error: No rows in Discrepancy Entries table updated.

Target field is a TMS-validated derived question that is associated with a Thesaurus DVG

If Batch Data Load completes successfully and adds data for a TMS-validated derived question that is associated with a Thesaurus DVG, Batch Validation fails and the .out file contains the following:

Derived questions exist for TMS. Calling TMS_OCL_DERV.DERIVATION second time
value of :last_tms_deriv_ts 16-SEP-2016 08:17:42
Error during TMS Thesaurus Validation
Batch Validation aborted due to an error
SQL error during TMS Thesaurus Validation: SQLERROR ORA-00001: unique constraint (RXC.RESPONSE_PK_IDX) violated
ORA-06512:

Running Batch Validation a second time completes successfully.

This unique constraint error interrupts TMS derivation, which can lead to data not being derived back for other documents that should have been processed in the same Batch Validation run. A Force Rederivation is required to correct this.

7.4.2 Creating and Viewing Data File Entries

You must log information about the files into Oracle Clinical—that is, create entries for the data files so that Oracle Clinical can access and process them.

7.4.2.1 Creating Data File Entries for Out of Study Batch Data Loads

  1. From the Conduct menu, select Out of Study Load, then select Batch Load Data Files.

    Note:

    For test mode, from the Definition menu, select Test a Study, choose Out of Study Load, and then select Batch Data Load Files.

    The Maintain Out of Study Batch Load Data File window opens. The system automatically populates the Creation TS field with the current date and time and the Status field with RECEIVED. You cannot change these fields; they change automatically as the data files are processed.

  2. Enter a data file name in the Name field.

  3. The OS File Name is prepopulated with the appropriate directory path. Add a slash or backslash, depending on your operating system, and the filename.

    The system validates the directory path when you submit the job.

  4. If your data is lab data, enter the lab name in the Lab Name field.

  5. If you have defined a position and size for your data in the database table from the Batch Load Formats menu entry, enter the name of the format mask in the Mask Name field.

7.4.2.2 Creating Data File Entries for Standard Batch Data Loads

If your data files have multiple DCMs per DCI or you want to process files from different studies at the same time, use Out of Study Batch Data Load instead; see "Creating Data File Entries for Out of Study Batch Data Loads".

To specify the file for a standard Batch Data Load:

  1. From the Conduct menu, select Batch Data Load, then select Batch Load Data Files.

    The Maintain Load Data File window opens. The system automatically populates the Creation TS field for the data file name with the current date and time and the Status field with LOAD. You cannot change these fields; they change automatically as the data files are processed.

    Note:

    For test mode, from the Definition menu, select Test a Study, choose Batch Data Load, and then select Batch Data Load Files.
  2. Set your study context to the study you want to work with.

  3. Enter a data file group name in the Name field.

  4. Enter a separate record for each data file to be loaded.

    To select files from a directory, see "Selecting Data Files".

  5. Enter the appropriate Format Mask for the files.

  6. Save your input.

To continue, go to "Loading Data".

Table 7-4 Maintain Batch Load Data File Field Details

Each entry below gives Name (Size; Data Type): Description.

Name (30; CHAR): Data file group name. Mandatory.

Creation TS (20; DATE): Timestamp.

Status (15; CHAR): Whether the group file has completed all transfer of data.

Data File Name (30; CHAR): Name of the data file, used internally by Oracle Clinical. The same data file name can be used subsequently for a different Received date. The name controls subsequent processing and therefore should be meaningful; follow your company's naming policies. Not updatable once the data file is processed.

Received (18; DATE): Date Oracle Clinical received the data file. Default value: current date and time. Not updatable once the data file is processed.

OS File Name (200; CHAR): Operating system file name used to load this data file. Specify the full file name and path (including disk name). The directory path must match the one entered in the BDL_ROOT setting of the OCL_STATE local reference codelist, plus the user-specific subdirectory if BDL_USERDIRS in OCL_STATE is set to Y. The subdirectory name must be your username minus "OPS$" (if your username includes "OPS$"). The path cannot include quotation marks (" "). Not updatable once the data file is processed.

Data File Status (15; CHAR): Current status of the data file during load. It is system generated, but you can change the value to repeat an unsuccessful step of the batch load process. The following changes are allowed: from BAD LOAD to RECEIVED; from BAD REPAIR to DATA LOADED or RECEIVED; and from BAD TRANSFER/INSERT/UPDATE to PREPARED, DATA LOADED, or RECEIVED. Statuses include:

  RECEIVED = Data file is received and is ready for the load operation.

  DATA LOADED = Load stage successfully completed.

  BAD LOAD = Load stage unsuccessful, processing stopped.

  PREPARED = Prepare stage successfully completed.

  PART-PREPARED = Prepare stage successfully completed for some documents within the file.

  BAD PREPARE = Prepare stage unsuccessful, processing stopped.

  INSERTED = Transfer stage (insert) successfully completed.

  BAD INSERT = Transfer stage (insert) unsuccessful, processing stopped.

  UPDATED = Transfer stage (update) successfully completed.

  BAD UPDATE = Transfer stage (update) unsuccessful, processing stopped.

  BAD TRANSFER = Transfer unsuccessful.

  COMPLETE = Transfer successful.

Lab (10; CHAR): Code identifying the lab that is the data source, from the electronic file. Not updatable once the data file is processed.

Format Mask: Specifies the format for this data file load. The LOV displays all active masks.

Load TS (18; DATE): Date and time the file is loaded into the system. Not updatable once the data file is processed.

Insert Status (15; CHAR): Status of the insert. For example: INSERTED = Transfer insert successful; BAD INSERT = Insert failed; PENDING = Prepare complete, awaiting Transfer; NONE = Nothing to insert; UNKNOWN = Initial setting.

Insert TS (18; DATE): Date and time the insert data in the file was transferred to the Oracle tables.

Update Status (15; CHAR): Status of the update. For example: UPDATED = Transfer update successful; BAD UPDATE = Update failed; PENDING = Prepare complete, awaiting Transfer; NONE = Nothing to update; UNKNOWN = Initial setting.

Update TS (18; DATE): Date and time the update data in the file was transferred to the Oracle tables.

Comment Text (200; CHAR): Free-form comment text. Not updatable once the data file is processed.
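The directory rule for OS File Name can be sketched as follows. The function name and sample values are hypothetical; the real settings come from BDL_ROOT and BDL_USERDIRS in the OCL_STATE local reference codelist:

```python
# Sketch: compute the expected server directory for a user's load files,
# following the BDL_ROOT / BDL_USERDIRS rules described for OS File Name.
import posixpath

def bdl_directory(bdl_root, username, userdirs="N"):
    if userdirs.upper() != "Y":
        return bdl_root
    # User subdirectory is the username minus any leading "OPS$".
    sub = username[4:] if username.upper().startswith("OPS$") else username
    return posixpath.join(bdl_root, sub)

assert bdl_directory("/u01/bdl", "OPS$JSMITH", "Y") == "/u01/bdl/JSMITH"
assert bdl_directory("/u01/bdl", "JSMITH", "N") == "/u01/bdl"
```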


7.4.3 Selecting Data Files

To select data files from a directory to include in a data file group:

  1. Click the Select Server Files button in the Maintain Batch Load Data File window to reach the Select Server Files window.

  2. Enter the directory that contains your data files.

  3. Specify a query filter, if necessary. The wildcard % is allowed.

  4. Click the Get Directory Information button for information on this directory.

    Note:

    If information is retrieved a second time for the same directory in the same session, select Refresh? to send another query to the operating system; otherwise, you see the previously retrieved (possibly outdated) information.
  5. Click in the Select column next to each data file you want to include. You can use the Select All and Deselect All buttons and the navigation buttons at the bottom of the window.

  6. Save your input.

7.4.4 Loading Data

To successfully load data from an external source into Oracle Clinical tables you must perform all three batch data load stages, in order: Load, Prepare, and Transfer (see "Batch Data Load Stages"). You can choose to perform a single stage or multiple stages at a time. Performing a single stage allows you to check the error file just for that stage and takes less time.

  1. Navigate to Conduct and, depending on the files you are loading, select either Batch Data Load or Out of Study Load. See Standard Batch Data Load and Out of Study Batch Data Load for details.

  2. Select one of the following menu options to execute one or more of the Batch Load stages:

    • Load Batch Data

    • Prepare Batch Data

    • Transfer Batch Data

    • Load/Prepare Batch Data

    • Load/Prepare/Transfer Batch Data

    • Load/Prepare/Privileged Transfer. This option allows you to update locked data and is available only if you have special privileges. If you select this option, the system updates locked data automatically, without prompting, when updates to that data are loaded.

    The appropriate Parameterized Submission (PSUB) form is displayed with the parameters required for the stage or stages you selected.

  3. Enter values for the job's parameters. The parameters displayed depend on the stage or stages you selected. The complete list of parameters includes:

    Table 7-5 Standard and Out of Study Batch Data Load Parameters

    Each entry below gives Parameter Name (Stage): Description.

    Datafile Group (Load): ID of the data file group you want to load. A list of values is available.

    Max No. of Bad Data Allowed During Load (Load): Maximum number of bad records allowed before the system ends the entire process. If fewer than the maximum number of bad records are found, the batch load continues; the bad records are not loaded, but are reported in the log file.

    OS File ID for Prepare (Prepare): ID of the data file group you want to prepare. A list of values is available.

    Max No. of Errors Allowed During Prepare (Prepare): Maximum number of bad records allowed before the system ends the entire process. If fewer than the maximum number of bad records are found, the batch load continues; the bad records are not loaded, but are reported in the log file.

    Prepare to Completion? (Prepare): Whether to complete the Prepare phase or to accept a partial completion.

    Spool Directory for Reloadable Data Files (Prepare): If Prepare to Completion is Y, the system populates this field with the appropriate value for the user, according to the values of BDL_ROOT and BDL_USERDIRS in the OCL_STATE local reference codelist and the user's account name (minus "OPS$" if it contains "OPS$").

    Data File Group ID for Transfer (Transfer): ID of the data file group you want to transfer. A list of values is available.

    Transfer Mode (Transfer): The mode in which the transfer should take place. Values include:

      INSERT - Transfer only pending insert records.

      UPDATE - Transfer only pending update records.

      BOTH - Transfer both pending insert and pending update records. If you are using Out of Study load and are loading DCIs with multiple DCMs, select BOTH for proper processing. The process treats the records in the first Received DCM in a Received DCI as inserts and the records in subsequent Received DCMs as updates to the existing Received DCI.


  4. Click the Job Details button to display the Submission Details window and set the appropriate fields; see Oracle Clinical Getting Started for details. Then click Back to return to the Submission of Module window.

  5. Click Submit Job.

  6. Click the Job Status button to monitor progress of the batch job.

    If there is a problem loading the data, the load stops and an error message appears. To review details about why the load was aborted, look at the log file created during the batch load. See "Troubleshooting Batch Validation".

    Note:

    After the Transfer stage you must indicate if you want to purge pending or bad updates or inserts by displaying the Maintain Batch Load Data File form.

    To delete inserted (new) records included in the batch file, select Purge Inserts from the Special menu. Update Status should be UPDATED, and Insert Status should be PENDING or BAD INSERT.

    To delete updates (additions to existing records), select Purge Updates from the Special menu. Insert Status should be INSERTED, and Update Status should be PENDING or BAD UPDATE.

  7. If you have selected one of the steps that does not execute all three stages, repeat steps 2 through 6 for the remaining stages until the batch data load process is completed successfully.

You can use the Batch Job window to query on your job ID to see when it completes, or you can use the Maintain Batch Load Data File window or the Maintain Out of Study Batch Data Load File window to view the status of the batch job; the system automatically updates the Data File Status (DF Status) depending on your menu selection and the success of the current batch load stage.

In the Maintain Batch Load Data File window you can also change the comment text (only before processing starts) for a data file or purge either the pending inserts or pending updates, which are records of the data files not successfully inserted into the Temporary Batch Data Items, Temporary Batch DCIs, or Temporary Batch DCMs tables.

Purging Pending Inserts and Updates in the Maintain Batch Data Load Window In the bottom part of this window, place the cursor on the data file group name with the appropriate data file status and then select either Purge Updates or Purge Inserts from the Special menu.

Purging Pending Inserts and Updates for Out of Study Load Files To purge pending inserts or pending updates of Out of Study Load files:

  1. Query for the data file in the Maintain Out of Study Batch Load Data File window.

    In the bottom half of the window the last part of each data file group name contains the study name for which the data was loaded. Note the study name and the data file group names for which you want to purge inserts and/or updates.

  2. Navigate to the Maintain Batch Data Load Files window, select the appropriate study and query for the data file group name.

  3. In the bottom part of this window, place the cursor on the data file group name with the appropriate data file status and then select either Purge Updates or Purge Inserts from the Special menu.

7.5 Troubleshooting Batch Validation

Batch Validation completes with a status of FAILURE when it encounters any error or warning. A failed job, however, may either have aborted or have completed with warnings. The job does not abort if there were only warning messages, which occur when a custom procedure or a TMS derivation procedure fails. Any other error is fatal and causes the job to abort.

Check if a failed Batch Validation aborted or completed with warnings

If the log file displays the message Batch Validation aborted due to an error, the job aborted. If the log file displays the message NOTE: Batch Validation Completed with warnings due to TMS or Procedure execution or generation, either the job returned only warnings and completed, or the job aborted and also returned warnings. In the latter case, the job aborted if the log file contains any SQL error that does not relate to TMS execution (TMS SQL errors return only warnings).
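A first-pass triage of the log file along these lines can be scripted; the message strings are the ones quoted above, and the function name is hypothetical. As noted, distinguishing an abort hidden behind the warnings message requires reading the log for non-TMS SQL errors:

```python
# Sketch: coarse classification of a Batch Validation log per the rules above.
ABORT_MSG = "Batch Validation aborted due to an error"
WARN_MSG = ("NOTE: Batch Validation Completed with warnings "
            "due to TMS or Procedure execution or generation")

def classify(log_text):
    if ABORT_MSG in log_text:
        return "aborted"
    if WARN_MSG in log_text:
        # May still have aborted; check the log for SQL errors
        # unrelated to TMS execution before trusting this status.
        return "completed with warnings (check for non-TMS SQL errors)"
    return "completed"

assert classify("... " + ABORT_MSG + " ...") == "aborted"
assert classify("all checks passed") == "completed"
```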

Effects of an aborted job

If Batch Validation aborts, updates related to univariate discrepancies and procedures that executed successfully before encountering the fatal error are not rolled back. The Current view will show any updates made prior to the fatal error. However, the values for Last_Batch_TS and Mark_rdcm are not updated. For this reason, stable views will not reflect the updates that occurred before the job aborted.

Effects of a job that completed with warnings

When Batch Validation continues to run despite warnings caused by procedures, edit checks are performed as expected for all validation procedures that did not fail and data is populated for derivation procedures that did not fail. The Current view shows updated data and the Stable view data is refreshed.

Note:

The default error log is overwritten at each stage of the batch data load. To retain the error log you must enter a new name in the Output File field of the Submission Details window of PSUB.

7.5.1 Troubleshooting the Prepare Phase

The batch data loader also performs error checking during the Prepare phase. To diagnose these errors, follow these steps:

  1. Open the .log file to get a general overview of the errors. When a data file fails, you can check the log file for the status codes RECEIVED or PREPARED; a failed data file does not have these status codes.

  2. Open the .out file to see specific details about the errors.

    • First review the error messages, and use Table 7-6 to diagnose them.

    • Then review the Error Reporting Table that follows each reported error to get a more detailed analysis of the problems. Table 7-7 shows an example of the Error Reporting Table.
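As a first pass over the .out file, you can tally the messages by the severity prefixes used in Table 7-6 (Warning:, Error:, Fatal error:) to see at a glance whether a run contains only warnings or also fatal errors. The prefixes come from this section; the helper itself is an illustrative sketch, not an Oracle Clinical utility.

```python
from collections import Counter

# Severity prefixes as used by the Prepare-phase messages in Table 7-6.
SEVERITIES = ("fatal error:", "error:", "warning:")

def tally_out_messages(out_text: str) -> Counter:
    """Count messages in .out file text by severity prefix (case-insensitive).

    Checking 'fatal error:' first ensures those lines are not also
    counted under plain 'error:'.
    """
    counts = Counter()
    for line in out_text.splitlines():
        low = line.lstrip().lower()
        for sev in SEVERITIES:
            if low.startswith(sev):
                counts[sev.rstrip(":")] += 1
                break
    return counts
```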

Table 7-6 Error Messages During Prepare Phase

Error Explanation/Action

Warning: The following patient(s) have freeze_flag set.

This message is generated if the freeze flag is set on any patient in the data file. To remove the warning, unfreeze the patient.

Warning: The following patient(s) are either not assigned to a site, have site assignments inconsistent with that in the data file, or they are not from the present location.

Correct the patients and site assignments displayed in the message.

If patients are not from the current location, this is a fatal error.

Fatal error: Aborting due to the presence of patients and/or Investigators from locations other than the present location in the data file.

This message and the previous one are generated if patients in the data file do not have a study site assignment or have a study site assignment that is incompatible with the Investigator's. To remove the warning, correct the patient's study site assignment.

Warning: The following Investigator(s) are either not assigned to a site or have site assignments inconsistent with that in the data file, or they are not from the present location.

This message is generated if there are Investigators in the data file who are not assigned to a site, have inconsistent site assignments, or do not belong to the current location. To remove the warning, correct the Investigator's site assignment or location.

If they are not from the current location, this is a fatal error.

Warning: Locked DCI(s) present in data file.

The contents of the data file will update data already in RECEIVED_DCIS, but the DCI indicates that it is locked for data update. The row in the data file is uniquely identified by the reported sets of keys. Use the Privileged Transfer task, if you have access to it, to update locked data.

Warning: Locked DCM(s) present in data file.

The contents of the data file will update data already in RECEIVED_DCMS, but the DCM indicates that it is locked for data update. The row in the data file is uniquely identified by the reported sets of keys. Use the Privileged Transfer task, if you have access to it, to update locked data.

Error: The following candidate(s) for insert(s) have document numbers that are present in RECEIVED_DCIS.

The data file contains data to insert into RECEIVED_DCIS, but the document number in the data file is already present in RECEIVED_DCIS.

Correct the document number.

This error may also be caused, in the case of manual Batch Data Load, by a key change in the document. Use Log-In to change keys.

Error: Mismatch between RDCI and TBDCI in visit date and/or time.

The contents of the data file will update data already in RECEIVED_DCIS, but the visit date or visit time in the data file do not match data in RECEIVED_DCIS.

Correct the visit date and/or time, or use Log-In to update the existing data.

Error: Mismatch between RDCM and TBDCM in visit date and/or time.

The contents of the data file will update data already in RECEIVED_DCMS, but the visit date or visit time in the data file do not match data in RECEIVED_DCMS.

Correct the visit date and/or time.

Error: Mismatch between RDCI and TBDCI in document number.

The contents of the data file will update data already in RECEIVED_DCIS, but the document number in the data file does not match data in RECEIVED_DCIS.

Correct the document number.

Error: Row(s) in data file have same patient, event, and subevent but different document number, visit date, and/or time.

There is duplicate data in the data file corresponding to the same combination of identifying keys.

Correct the document number, visit date, or visit time in the data file.

Error: Question name and occurrence number combination is not collected in the Question Group.

The question name and occurrence number combination in the data file does not match the question group in the same row of the data file.

Correct the question name, occurrence number, or question group name.

Error: Question name and occurrence number combination is not collected in any Question group.

The question name and the occurrence number combination in the data file does not match any question groups.

Match the question name and the occurrence number combination with the correct question group.

Error: The following Investigator(s) are not valid.

Check the validity of Investigator names.

Error: The following site(s) are not valid.

Check the validity of site names.

Error: The following patient(s) are not valid.

Check the validity of patient positions.

Error: The following clinical planned event(s) are not valid.

Check the validity of the clinical planned events.

Fatal error: Aborting due to presence of invalid keys.

The data file contains data for which unique identifiers cannot be found for the corresponding character keys. The system aborts immediately after reporting the character keys causing this error.

Correct the invalid keys.

Error: Invalid date or time in the data file.

The data file contains invalid dates or times.

Correct the dates and times.

Error: Question group q_grp_name either doesn't exist or is not collected in the DCM.

The data file contains invalid question groups.

Use valid question groups.

Error occurred while preparing the data file.

Updating Data_Files with status: BAD PREPARE

Errors occurred during the Prepare phase, and the database was updated accordingly.

Updating Data_Files with status: PREPARED

No errors occurred during the Prepare phase, and the database was updated accordingly.

Error: Question name and occurrence number combination occurs in multiple collected Question groups.

The question name and occurrence sequence number combination appears in more than one collected question group.

Correct the definitions so that each question name and occurrence number combination is collected in only one question group.

Fatal Error: Internal inconsistency in DCI/DCM definition.

DCIs and DCMs are not set up correctly in the system.

Check the setup of DCIs and DCMs.

Error: The following qualifying value(s) are not valid for the qualifying Question.

There is a mismatch between the qualifying values in the data file and the discrete values defined for the qualifying question.

Either update the discrete values or change the data file to match the discrete values.

Error: Mismatch between RDCI and TBDCI in Investigator.

The contents of the data file will update data already in RECEIVED_DCIS, but the Investigator in the data file does not match data in RECEIVED_DCIS. To remove this error, correct the Investigator.

Error: Mismatch between RDCI and TBDCI in site.

The contents of the data file will update data already in RECEIVED_DCIS, but the site in the data file does not match data in RECEIVED_DCIS.

Correct the site.

Error: Mismatch between RDCM and TBDCM in Investigator.

The contents of the data file will update data already in RECEIVED_DCMS, but the Investigator in the data file does not match data in RECEIVED_DCMS.

Correct the Investigator.

Error: Mismatch between RDCM and TBDCM in site.

The contents of the data file will update data already in RECEIVED_DCMS, but the site in the data file does not match data in RECEIVED_DCMS.

Correct the site.

Error: Candidate update of RDCM matches candidate insert of RDCI.

Data in the file would cause an update of RECEIVED_DCMS (data already exists for this row) but an insert into RECEIVED_DCIS (data is not present for this row).

Look at the identifying keys reported for this row, and correct the identifying keys.

Warning: The following Questions are in the DCM but are missing in the data file.

Questions associated with the DCM are not reported in the data file.

Report data for all DCM questions.

Warning: The following non-repeating Question groups have repeat_sn greater than 1 in the data file.

Rows in the data file have nonrepeating question groups with repeat sequence numbers greater than 1.

Remove the occurrences where a repeat sequence number greater than one exists for non-repeating question groups.

Error: The following Investigator(s) do not have a corresponding valid site.

No site is entered in the data file, and the system fails to generate a site from that of the Investigator, as the Investigator has no valid site.

Change the Investigator, or assign a valid site to the Investigator.


Table 7-7 shows an example of an Error Reporting Table that is displayed in the .out file if there are errors associated with the Load or Prepare phase. The example reports errors for Investigators. An equivalent Error Reporting Table is created for errors associated with patients.

Table 7-7 Error Reporting Table for Investigators

Investigator  Wrong Study  No Site  Wrong Site  Wrong Location  TRDCI Site  Investigator Site  Investigator Location
3260          No           No       Yes         No              Site 2      Site 1             -


In Table 7-7, the "Yes" in the Wrong Site column indicates that Oracle Clinical was expecting a different site from the one it found in the data file. In this case, it was expecting Site 1 and found Site 2.

To correct this error, either:

  • change all occurrences of Site 2 to Site 1 for Investigator 3260 in the data file

  • change the site assignment for Investigator 3260 to Site 1 in Oracle Clinical

Table 7-8 Other Examples of Possible Errors

If "Yes" Under.... Then...

No Site

No site was found. Oracle Clinical shows the correct site under "Investigator Site."

Wrong Location

The wrong location was given. Oracle Clinical shows the correct location under "Investigator Location."

Wrong Study

The wrong study was given. In this case, Oracle Clinical cannot determine the correct study.


After the Transfer stage, the log shows the number of records updated and the number of records inserted.

7.5.2 Troubleshooting the Transfer Phase

The log file keeps track of the status of every discrete job during the Transfer process, including the names of the data files being loaded, summary information, and errors. If the transfer is successful, the log shows the status at the beginning and at the end.

7.6 Batch-Deleting Data

You can use a batch job to delete study data when appropriate. For example:

  • After loading an Out of Study file, you can delete all data associated with a bad data file or data file group.

  • You can also delete batch data loads of externally maintained data where the audit history is maintained in an external system.

  • There are also two types of manually entered data that can be deleted from Oracle Clinical without causing problems with the audit trail:

    • First-pass data entered into a study but never made accessible.

    • Data in development databases where audit is not required.

You can navigate to the batch job for deleting study data from the Conduct menu by selecting Security, and then selecting Delete Study Information or Delete Study Information with Audit. The PSUB window appears. If you do not specify any values for DCI, document number, data file, or data file group, all data in the study is deleted.

If the study is a flexible study, after completing the batch data delete job, the system automatically runs the full expectedness calculation job. See Oracle Clinical Creating a Study for more information.

Note:

The batch data delete that is not audited is a hard delete: you cannot roll back the changes made by such a delete. Use this function with extreme care.

Note:

Most users have access only to the Batch Data Deletion with Audit job. However, some users may have access to the Batch Data Deletion job, which has an additional Audit parameter that lets the user choose whether to audit the data deletion.

You can change data entry information and maintain an audit trail by selecting Update from the Data Entry menu and making changes in Data Entry Update mode. You can also use the batch job Delete Study Information with Audit.

7.6.1 Auditing Data Deletion

The Oracle Clinical table STUDY_DATA_DELETE_AUDIT records each time a batch data delete job is run. The table records:

  • the parameters used to invoke the deletion

  • the actual number of each object that is deleted by the run

This audit table enables you to document the use and impact of the data deletion facility, and may help you diagnose whether a problem has been caused by a hard data delete.

When you execute a batch data delete with audit, the records in the deleted CRFs are backed up in a set of audit tables. For each source table from which records are deleted, a corresponding audit table is created with the suffix $BDD (for instance, RECEIVED_DCIS$BDD). Each audit table contains a batch_job_id column followed by the same columns as its source table; for example, RESPONSES$BDD contains batch_job_id followed by the same columns as RESPONSES.
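The audit table names listed in Tables 7-9 through 7-13 follow a consistent pattern: append $BDD, and drop underscores when the result would exceed Oracle's traditional 30-character identifier limit (compare DISCREPANCY_ENTRY_REVIEW_HIST with DISCREPANCYENTRYREVIEWHIST$BDD). A sketch of that naming rule, inferred from the listed names rather than taken from an Oracle Clinical API:

```python
ORACLE_MAX_IDENTIFIER = 30  # traditional Oracle identifier length limit

def audit_table_name(source_table: str) -> str:
    """Derive the $BDD audit table name for a source table.

    Pattern inferred from Tables 7-9 through 7-13: append $BDD, and
    drop underscores when the result would exceed the 30-character
    identifier limit.
    """
    name = source_table.upper() + "$BDD"
    if len(name) > ORACLE_MAX_IDENTIFIER:
        name = source_table.upper().replace("_", "") + "$BDD"
    return name
```

For example, RESPONSES maps to RESPONSES$BDD unchanged, while DISCREPANCY_ENTRY_REVIEW_HIST loses its underscores to fit within the limit.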

The job generates audit tables corresponding to the following Oracle Clinical tables:

7.6.1.1 Data File Tables

Depending on parameter values specified for data file and data file group id during batch data deletion, data from the source table may be deleted and therefore audited.

  • If you specify a data file, the corresponding record in DATA_FILES is deleted.

  • If you specify a data file group and no data file, the corresponding record in DATA_FILE_GROUPS and the related records in DATA_FILES are deleted.

  • If you accept all default parameters, all records for the study in DATA_FILE_GROUPS and DATA_FILES are deleted.

The DATA_FILES table stores the names of the files used in a batch data load.

The DATA_FILE_GROUPS table stores groups of files that are submitted together.

Table 7-9 DATA FILE Tables Used in Batch Data Deletion

Source Table Corresponding Audit Table

DATA_FILES

DATA_FILES$BDD

DATA_FILE_GROUPS

DATA_FILE_GROUPS$BDD


7.6.1.2 Data Tables

Table 7-10 DATA Tables Used in Batch Data Deletion

Source Table Corresponding Audit Table

RESPONSES

RESPONSES$BDD

RESPONSE_LOBS

RESPONSE_LOBS$BDD

RECEIVED_DCIS

RECEIVED_DCIS$BDD

RECEIVED_DCMS

RECEIVED_DCMS$BDD

DISCREPANCY_ENTRIES

DISCREPANCY_ENTRIES$BDD

VALIDATION_REPORTED_VALUES

VALIDATION_REPORTED_VALUES$BDD

ACTUAL_EVENTS

ACTUAL_EVENTS$BDD

RDCI_HISTORY

RDCI_HISTORY$BDD

DISCREPANCY_ENTRY_REVIEW_HIST

DISCREPANCYENTRYREVIEWHIST$BDD

RECEIVED_PAGE_HISTORY

RECEIVED_PAGE_HISTORY$BDD

RECEIVED_PAGES

RECEIVED_PAGES$BDD


7.6.1.3 Definition Tables

Data from definition tables is not deleted during a batch data delete. However, definitions related to the CRFs being deleted are audited in order to preserve identifying information, since you may later delete or modify definitional data in the tables below, especially if all related patient data is deleted. These audit tables are used to generate reports on the audited data.

Table 7-11 DEFINITION Tables Used in Batch Data Deletion

Source Table Corresponding Audit Table

DCMS

DCMS$BDD

DCIS

DCIS$BDD

DCM_QUESTION_GROUPS

DCM_QUESTION_GROUPS$BDD

DCM_QUESTIONS

DCM_QUESTIONS$BDD


7.6.1.4 Layout Tables

As with definitional data, the form layout tables are audited in case users delete or modify relevant layouts after CRFs are deleted. Data from the source table is not deleted during a batch data delete, but you may be inclined to retire or modify layouts if all related patient data is deleted. You can use this layout information to reconstruct the patient data report for hard-deleted patient data.

Table 7-12 LAYOUT Tables Used in Batch Data Deletion

Source Table Corresponding Audit Table

DCI_FORM_VERSIONS

DCI_FORM_VERSIONS$BDD

DCM_LAYOUT_GRAPHICS

DCM_LAYOUT_GRAPHICS$BDD

DCM_LAYOUT_TEXT

DCM_LAYOUT_TEXT$BDD

DCM_LAYOUT_PAGES

DCM_LAYOUT_PAGES$BDD

DCM_LAYOUT_ABS_PAGES

DCM_LAYOUT_ABS_PAGES$BDD


7.6.1.5 Other Reference Tables

Records from the tables listed below are not deleted during batch data delete. However, these tables will also be audited to preserve status at the time the patient data was deleted. You may be inclined to delete data from these tables if all related patient data is deleted. With auditing, if you delete all the data for a patient and then delete the patient, you can still find the patient enrollment data from the PATIENT_POSITIONS$BDD table.

Table 7-13 Reference Tables Used in Batch Data Deletion

Source Table Corresponding Audit Table

CLINICAL_PLANNED_EVENTS

CLINICAL_PLANNED_EVENTS$BDD

PATIENT_POSITIONS

PATIENT_POSITIONS$BDD

OCL_STUDY_SITES

OCL_STUDY_SITES$BDD

OCL_SITES

OCL_SITES$BDD

STUDY_SITE_PATIENT_POSITIONS

STUDYSITEPATIENTPOSITIONS$BDD
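
The enrollment-recovery scenario described above can be illustrated with a minimal query sketch. The table below is a stand-in for PATIENT_POSITIONS$BDD with only a batch_job_id column plus hypothetical source columns (the real table carries all PATIENT_POSITIONS columns); sqlite3 is used here purely to show the query shape, not as the Oracle Clinical database.

```python
import sqlite3

# Illustrative stand-in for the PATIENT_POSITIONS$BDD audit table:
# batch_job_id followed by the source table's columns. The columns
# beyond batch_job_id and patient are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "PATIENT_POSITIONS$BDD" '
             '(batch_job_id INTEGER, patient TEXT, reported_birth_date TEXT)')
conn.execute('INSERT INTO "PATIENT_POSITIONS$BDD" '
             "VALUES (101, 'P0001', '1970-01-01')")

# Recover the enrollment rows captured by a given batch delete job.
rows = conn.execute(
    'SELECT patient, reported_birth_date '
    'FROM "PATIENT_POSITIONS$BDD" WHERE batch_job_id = ?', (101,)
).fetchall()
```

Filtering on batch_job_id is the key design point: it ties each audited row back to the specific deletion run recorded in STUDY_DATA_DELETE_AUDIT.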


7.6.2 Reports on Audited Data Deletion

You can run two scripts to generate reports on audited deletions of data and discrepancies. For information, see the section about running reports on deleted data and discrepancies in the "Utilities" chapter of the Oracle Clinical Administrator's Guide.

7.7 Batch Data Load Reports

From the Conduct menu, select Conduct Reports, then select Batch Data Load to find several reports on batch loaded data.

Note:

The Processed Data Report, despite its name, displays prepared data (not processed data) in the temporary tables following the successful completion of the Prepare stage of a batch data load. Data that has gone through the Transfer stage (which follows the Prepare stage) does not appear in the report.