Subject Data Extract

Data managers and statisticians can extract cumulative subject data for data analysis or regulatory reporting of a given study.

The Subject Data Extract provides sponsors tabular data organized by form for downstream clinical programming including generation of study results and regulatory reporting.

Note:

In Oracle Clinical One Platform, SAS Transport File Formats are currently running on Version 8, so make sure you use a SAS client that's on version 9.3 or later. For more details, see the SAS documentation platform.Foot 1

Modes

Available for all 3 modes: Testing, Training, and Production.

Roles that can extract data

Users with the Run Subject Data Extract permission can generate this report and only see data they are authorized to view.

Note:

Depending on your permissions, the Subject Data Extract may include unblinding data.

The Subject Data Extract can contain blinded or unblinded data, depending on your permissions within the study. Hidden data is only accessible if it has been data classified and the user's role includes the appropriate data classification permission. A user may have access to certain hidden, data classified information, but not necessarily all of it.

Output details

Here is a list of details on how data may be displayed in the Subject Data Extract:
  • When you generate the Subject Data Extract, you can choose between four main types of output formats: sas7bdat, XPORT, CPORT, and CSV. For simplicity, throughout this document, the term SAS format refers collectively to sas7bdat, XPORT, and CPORT. The term CSV format refers to CSV files.
  • If no subject data was entered for a form, the form dataset includes only a header row; the rest of the file is blank.
  • Records of cleared forms are not included. This also applies to deleted repeating forms.

    Similarly, if the data is cleared from the parent question of a dynamic form, which results in the dynamic form being hidden, no record of that form is included in the extract.

  • When a question uses a code list for its options, the extract includes the option’s value, label, and code.
  • When the Subject Data Extract is generated in SAS format, numeric values are displayed without leading zeros.

    For example, if a site user collects a value of "0001.12345", the data extracted in a SAS format displays that value as "1.12345".

  • Make sure you define SAS properties. These required fields default to the Question Reference Code and Label but can be updated by Study Designers as needed. The Subject Data Extract will use the values set for SAS properties.
  • In a two-section form, the questions from the Questions Before the Table section are combined with the questions in the table, providing you with a single, unified record of data to analyze.
  • For a lab form, the Sample Collection Date and the Fasting questions are merged with the results, as well, for the same reasons.
  • If a site user does not answer a question or they apply a data flag to it, the corresponding field in the extract is populated with a code specific to each data flag. For more details about how data flags are mapped in the four columns corresponding to each question (QUESTIONLABEL, QUESTIONLABEL_R, QUESTIONLABEL_F, and QUESTIONLABEL_D) see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07.
    The following codes can be displayed:
    • Not Applicable (NA): C48660
    • Not Done (ND): C49484
    • Unknown (UNK): C17998
    • Not Answered: -99999

      Note:

      The report excludes forms where all questions are flagged Not Answered.
  • When generating the Subject Data Extract Report, a dataset called FORMASSOCIATION is included, which provides details about all of the form associations that are present in the study.

    Note:

    The Subject Data Extract includes form association data for all Link & Show Form rules defined in the study design.
  • The Subject Data extract includes one record per form; each record is a snapshot of that form’s data at the time the extract was generated.

Settings

Filter Description
File Name Select the appropriate naming convention for your data extract. These naming conventions are only available for data extracted in a CSV format:
  • <Form Name>
  • <Study Name>_Form Name>_<Time stamp>
  • <Study Name>_<Form Name>
  • <Form Name>_<Time stamp>
Site Choose one or more sites that you're assigned to. This drop-down includes sites in all states: New, Active, Retired.
File Type Select the output type for the report:

Tip: For regulatory submissions, use XPORT. For analysis in SAS, use sas7bdat. For general viewing and flexibility, use CSV.

CSV if you want an easy-to-use format for reviewing, sharing, or analyzing data across a wide range of programs, including spreadsheets like Excel and various data analysis tools.

If you choose the CSV format, for each form designed in the study, you receive a separate CSV file. Each question or item in the form has four corresponding columns, as described in the Field Descriptions section below.

Transport (XPORT) if you need to submit data for regulatory purposes (such as FDA submissions) or need the most widely compatible SAS format.

If you generate output in a SAS format (CPORT, Transport (XPORT), or sas7bdat), individual datasets for each form are included, each with SAS labels and variable names.

CPORT if you want to archive your data or transfer it efficiently between SAS environments. This SAS proprietary format is especially useful for backup and migration tasks within SAS.

Note: CPORT is designed for use with SAS software and offers flexibility within SAS environments.

sas7bdat if you want a official SAS format that’s easy to open in Excel or other programs, or if you just want to quickly review the data.

Note: When using sas7bdat, datasets are provided in a ZIP file. Inside the ZIP, each form has its own dataset in the native SAS format (.sas7bdat), whichD includes all variable names and labels for each data element.

Field descriptions

Key Column SAS Label Description
TENANTID Tenant Identifier Indicates the organization identifier.
STUDYID Study Identifier A study ID as specified by the study manager when they created the study.
COUNTRY Country of Investigator Site A site's country as specified by the site manager when they created the sites in Oracle Clinical One Platform.
SITEID Site Identifier Indicates the site's ID at the time of data collection.
INVID Investigator Identifier (DEA Number) A principal investigator's ID as specified by the site manager when they created the sites in Oracle Clinical One Platform.
INVNAM Investigator First and Last Name A principal investigator's name as specified by the site manager when they created the sites in Oracle Clinical One Platform.
USUBJID Unique Subject ID (GUID) Unique subject ID across all studies for all applications or submissions involving the product.
SCRNID Screening Number

The number assigned to a subject after screening or enrollment.

Note: This matches the SUBJID column unless you are Replacing the Subject Number with the Randomization Number.

SUBJID Subject Identifier (Subject Number) Indicates the unique subject number within the study.
VISITNUM Visit Identifier Indicates a visit's ID as specified by the study designer.

For the completion or withdrawal actions, this field is blank if the study design does not include a Completion or Withdrawal event.

VISIT Visit Title Indicates a visit or event title as specified by the study designer.
UNSCHED Unscheduled/Cycle Visit Instance Number Indicates the unscheduled visit instance number or cycle number for cycle visits.
SVSTDTC Visit Start Date (Start Date/Time of Visit) Indicates a visit's start date, represented in ISO 8601 character format.

For the completion or withdrawal actions, this field is blank if the study design does not include a Completion or Withdrawal event.

DOMAIN Form Reference Code (Source Form) Indicates the form's reference code as a the Study designer defined it.
LABID Lab ID The Lab ID.

Note: For VISIT_SYS dataset, this field is blank, as the visit-level and form or section fields do not apply.

NAM Lab Name The Lab name.

Note: For VISIT_SYS dataset, this field is blank, as the visit-level and form or section fields do not apply.

REPEATNUMBER Row number in a repeating form.
For different types of forms, this column displays data in the following ways:
  • For one-section (flat) forms, these fields are not populated.
  • For repeating forms, this column is populated with the number of repeating form instances used in a study.
  • For lab forms and two-section forms, this column is populated with the instance number of each lab form.

Note: For VISIT_SYS dataset, this field is blank, as the visit-level and form or section fields do not apply.

SREPEATID Repeating Section Unique Identifier

This is a unique number assigned to the Questions in the Table section of a two-section forms, including a lab form.

SREPEATNUMBER Row number in a repeating section

This is a row's unique number (whether located in a lab form or a two-section form). This number is unique for every section (in a two-section form), form, and event for a subject.

ENTEREDBY The user who initially entered data into the form. Indicates the user name of the user who initially entered data into the form or answered a question.
ENTEREDDATE The date when the user entered data into the form. Date is UTC Timezone. The UTC time and date of when the user entered data into the form or answered a question, represented in ISO 8601 character format.
LASTCHANGEDBY The latest user or system user who modified any form item. Indicates the user name of the latest user to have modified any questions on the form. Queries are not taken into account as form updates.
LASTCHANGEDDATE The latest date of any form item that is modified. Date is UTC Timezone. The UTC time and date of when a question was last modified, represented in ISO 8601 character format.
Form actions like sign, verify, freeze, lock, or unanswered flags are not considered as a last update.
  • Deleting forms (such as repeating forms, two-section forms, or a repeating section row) show in this field.

  • Only clearing data on a dynamic form affects the date; publishing or hiding it without data changes does not.

  • For the completion or withdrawal actions, this field is blank if the study design does not include a Completion or Withdrawal event.

REFCODE

Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM column outputs as AGE.

The answer to the question label. Indicates the answer to a question as entered by a site user.

Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07.

REFCODE_R

Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM_R column outputs as AGE_R.

The question label's raw value. Indicates the raw value: alphanumeric value as entered in Oracle Clinical One Platform with no conversions. This includes data entry flags.

Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07.

REFCODE_F

Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM_F column outputs as AGE_F.

The question label's data type. Indicates the formatted value: value as entered in Oracle Clinical One Platform converted to the question data type as per form design.

Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07.

REFCODE_D

Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM_D column outputs as AGE_D.

The question label's data type for the answer. Indicates the decoded raw value, with additional considerations according to data type. If the question has a code value, it is populated in this field.

Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07.

DeletedQn Deleted Questions

If a form item was deleted in a previous version of the study, but data had been collected for it, this data is displayed under this column.

Note: The SAS Label of the deleted form item is displayed as D<GUID>, using the deleted form's GUID.

VISSTAT Visit Status

Indicates the current progress or completion state of a visit or event as recorded in the study. Can be one of the following values:

Note: This field only exists in the VISIT_SYS dataset.

  • INPROGRESS
  • INCOMPLETE
  • INCOMPLETE_ERROR
  • COMPLETE
  • COMPLETE_ERROR
  • NOT STARTED

For the screening, screen fail, completion or withdrawal actions, this field is blank.

For more information about the visit statuses, see Visit status and icon descriptions.

SKIPYN Visit Skipped

Indicates a visit or event that was not conducted or attended, as recorded by the study coordinator.

Note: This field only exists in the VISIT_SYS dataset.

  • Y for skipped visits
  • N for not skipped visits

    For the screening, screen fail, completion or withdrawal actions, this field is blank.

DSDECOD Disposition Term

This indicates the event that occurred for the subject. Can be one of the following values:

Note: This field only exists in the VISIT_SYS dataset.

  • Screened: Indicates an event where a subject is successfully screened.

    If a subject fails the initial screening, you may see multiple instances of events such as Screen and Screen Failure.

  • Screen Failure: Indicates an event where a subject has failed screening.

    If the event is undone, it's no longer included in the output.

  • Randomized: Indicates an event where a subject was randomized in a study.
  • Completed: Indicates an event where a subject completed a study.

    If the event is undone, it's no longer included in the output.

  • Withdrawal: Indicates an event where a subject is withdrawn.

    If the event is undone, it's no longer included in the output.

This field is blank if no event occurs.

For more information about the subject statuses, see Available subject statuses.

DSSTDAT Disposition Date

Indicates the date the event occurred.

Date is UTC Timezone in the format YYYY-MM-DD.

Note: This field only exist in the VISIT_SYS dataset.

  • For the Screened, Randomization, Completed, Withdrawal, and Screen Failure events, the date shows when the event occurred in Oracle Clinical One Platform.
  • For the Screen Failure, Withdrawal, and Completion events, the date reflects the date entered by the site user.

This field is blank if no event occurs.

REASON Reason

Specifies the explanation or justification provided for an action or outcome (such as skipping a visit), as entered by the user.

Note: This field only exists in the VISIT_SYS dataset.

For the screening, screen fail, completion or withdrawal actions, this field is blank.

SOURCEFORMID Source Form ID Indicates the ID of the source form, that is, the form containing the question that launches a form association.

Note This field only exist in the Form Associations dataset.

SOURCEITEMID Source Item ID Indicates the source item ID. The source item is the question/item that launches a form association.

Note: This field only exist in the Form Associations dataset.

RVISITNUM Related Visit Identifier Indicates the ID of the visit that contains the form with which the source form is related/linked.

Note: This field only exist in the Form Associations dataset.

RVISIT Related Visit Title Indicates the name of the visit that contains the form with which the source form is related/linked.

Note: This field only exist in the Form Associations dataset.

RVISITINST Related Unscheduled/Cycle Instance Indicates the instance number of the unscheduled visit or cycle visit that contains the form with which the source form is related/linked.

Note: This field only exist in the Form Associations dataset.

RSVSTDTC Related Visit Start Date Indicates the start date of the visit that contains the form with which the source form is related/linked.

Note: This field only exist in the Form Associations dataset.

RDOMAIN Related Form Reference Code Indicates the name of the form with which the source form is related/linked.

Note: This field only exist in the Form Associations dataset.

RFORMID Related Form ID Indicates the name of the form with which the source form is related/linked.

Note: This field only exist in the Form Associations dataset.

RREPEATNUMBER Related Repeat Number

Indicates the number of the repeating form instance with which the source form is linked.

Note: This field only exist in the Form Associations dataset.

RSREPEATID Related Repeating Section Unique Identifier Indicated the instance number of the flat section with which the source form is linked.

Note: This field only exists in the Form Associations dataset.

RSREPEATNUMBER Related row number in a repeating section Indicated the instance number of the section with which the source form is linked.

Note: This field only exist in the Form Associations dataset.



Footnote Legend

Footnote 1: SAS is a registered trademark or trademark of SAS Institute, Inc. in the USA and other countries. Other brand and product names are registered trademarks of their respective companies.