Subject Data Extract
Data managers and statisticians can extract cumulative subject data for data analysis or regulatory reporting of a given study.
Note:
In Oracle Clinical One Platform, SAS Transport File Formats are currently running on Version 8, so make sure you use a SAS client that's on version 9.3 or later. For more details, see the SAS documentation platform.Foot 1Modes
Available for all 3 modes: Testing, Training, and Production.
Roles that can extract data
Note:
Depending on your permissions, the Subject Data Extract may include unblinding data.The Subject Data Extract can contain blinded or unblinded data, depending on your permissions within the study. Hidden data is only accessible if it has been data classified and the user's role includes the appropriate data classification permission. A user may have access to certain hidden, data classified information, but not necessarily all of it.
Output details
- When you generate the Subject Data Extract, you can choose between four main types of output formats: sas7bdat, XPORT, CPORT, and CSV. For simplicity, throughout this document, the term SAS format refers collectively to sas7bdat, XPORT, and CPORT. The term CSV format refers to CSV files.
- If no subject data was entered for a form, the form dataset includes only a header row; the rest of the file is blank.
- Records of cleared forms are not included. This also applies to deleted repeating forms.
Similarly, if the data is cleared from the parent question of a dynamic form, which results in the dynamic form being hidden, no record of that form is included in the extract.
- When a question uses a code list for its options, the extract includes the option’s value, label, and code.
- When the Subject Data Extract is generated in SAS format, numeric values are displayed without leading zeros.
For example, if a site user collects a value of "0001.12345", the data extracted in a SAS format displays that value as "1.12345".
- Make sure you define SAS properties. These required fields default to the Question Reference Code and Label but can be updated by Study Designers as needed. The Subject Data Extract will use the values set for SAS properties.
- In a two-section form, the questions from the Questions Before the Table section are combined with the questions in the table, providing you with a single, unified record of data to analyze.
- For a lab form, the Sample Collection Date and the Fasting questions are merged with the results, as well, for the same reasons.
- If a site user does not answer a question or they apply a data flag to it, the corresponding field in the extract is populated with a code specific to each data flag. For more details about how data flags are mapped in the four columns corresponding to each question (QUESTIONLABEL, QUESTIONLABEL_R, QUESTIONLABEL_F, and QUESTIONLABEL_D) see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07.
The following codes can be displayed:
- Not Applicable (NA): C48660
- Not Done (ND): C49484
- Unknown (UNK): C17998
- Not Answered: -99999
Note:
The report excludes forms where all questions are flagged Not Answered.
- When generating the Subject Data Extract Report, a dataset called FORMASSOCIATION is included, which provides details about all of the form associations that are present in the study.
Note:
The Subject Data Extract includes form association data for all Link & Show Form rules defined in the study design. - The Subject Data extract includes one record per form; each record is a snapshot of that form’s data at the time the extract was generated.
Settings
| Filter | Description | |
|---|---|---|
| File Name | Select the appropriate naming convention for your data extract. These naming conventions are only available for data extracted in a CSV format:
|
|
| Site | Choose one or more sites that you're assigned to. This drop-down includes sites in all states: New, Active, Retired. | |
| File Type | Select the output type for the report:
Tip: For regulatory submissions, use XPORT. For analysis in SAS, use sas7bdat. For general viewing and flexibility, use CSV. |
CSV if you want an easy-to-use format for reviewing, sharing, or analyzing data across a wide range of programs, including spreadsheets like Excel and various data analysis tools. If you choose the CSV format, for each form designed in the study, you receive a separate CSV file. Each question or item in the form has four corresponding columns, as described in the Field Descriptions section below. |
| Transport (XPORT) if you need to submit data for regulatory purposes (such as FDA submissions) or need the most widely compatible SAS format.
If you generate output in a SAS format (CPORT, Transport (XPORT), or sas7bdat), individual datasets for each form are included, each with SAS labels and variable names. |
||
|
CPORT if you want to archive your data or transfer it efficiently between SAS environments. This SAS proprietary format is especially useful for backup and migration tasks within SAS. Note: CPORT is designed for use with SAS software and offers flexibility within SAS environments. |
||
|
sas7bdat if you want a official SAS format that’s easy to open in Excel or other programs, or if you just want to quickly review the data. Note: When using sas7bdat, datasets are provided in a ZIP file. Inside the ZIP, each form has its own dataset in the native SAS format (.sas7bdat), whichD includes all variable names and labels for each data element. |
||
Field descriptions
| Key Column | SAS Label | Description |
|---|---|---|
| TENANTID | Tenant Identifier | Indicates the organization identifier. |
| STUDYID | Study Identifier | A study ID as specified by the study manager when they created the study. |
| COUNTRY | Country of Investigator Site | A site's country as specified by the site manager when they created the sites in Oracle Clinical One Platform. |
| SITEID | Site Identifier | Indicates the site's ID at the time of data collection. |
| INVID | Investigator Identifier (DEA Number) | A principal investigator's ID as specified by the site manager when they created the sites in Oracle Clinical One Platform. |
| INVNAM | Investigator First and Last Name | A principal investigator's name as specified by the site manager when they created the sites in Oracle Clinical One Platform. |
| USUBJID | Unique Subject ID (GUID) | Unique subject ID across all studies for all applications or submissions involving the product. |
| SCRNID | Screening Number |
The number assigned to a subject after screening or enrollment. Note: This matches the SUBJID column unless you are Replacing the Subject Number with the Randomization Number. |
| SUBJID | Subject Identifier (Subject Number) | Indicates the unique subject number within the study. |
| VISITNUM | Visit Identifier | Indicates a visit's ID as specified by the study designer.
For the completion or withdrawal actions, this field is blank if the study design does not include a Completion or Withdrawal event. |
| VISIT | Visit Title | Indicates a visit or event title as specified by the study designer. |
| UNSCHED | Unscheduled/Cycle Visit Instance Number | Indicates the unscheduled visit instance number or cycle number for cycle visits. |
| SVSTDTC | Visit Start Date (Start Date/Time of Visit) | Indicates a visit's start date, represented in ISO 8601 character format.
For the completion or withdrawal actions, this field is blank if the study design does not include a Completion or Withdrawal event. |
| DOMAIN | Form Reference Code (Source Form) | Indicates the form's reference code as a the Study designer defined it. |
| LABID | Lab ID | The Lab ID.
Note: For VISIT_SYS dataset, this field is blank, as the visit-level and form or section fields do not apply. |
| NAM | Lab Name | The Lab name.
Note: For VISIT_SYS dataset, this field is blank, as the visit-level and form or section fields do not apply. |
| REPEATNUMBER | Row number in a repeating form. |
For different types of forms, this column displays data in the following ways:
Note: For VISIT_SYS dataset, this field is blank, as the visit-level and form or section fields do not apply. |
| SREPEATID | Repeating Section Unique Identifier |
This is a unique number assigned to the Questions in the Table section of a two-section forms, including a lab form. |
| SREPEATNUMBER | Row number in a repeating section |
This is a row's unique number (whether located in a lab form or a two-section form). This number is unique for every section (in a two-section form), form, and event for a subject. |
| ENTEREDBY | The user who initially entered data into the form. | Indicates the user name of the user who initially entered data into the form or answered a question. |
| ENTEREDDATE | The date when the user entered data into the form. Date is UTC Timezone. | The UTC time and date of when the user entered data into the form or answered a question, represented in ISO 8601 character format. |
| LASTCHANGEDBY | The latest user or system user who modified any form item. | Indicates the user name of the latest user to have modified any questions on the form. Queries are not taken into account as form updates. |
| LASTCHANGEDDATE | The latest date of any form item that is modified. Date is UTC Timezone. | The UTC time and date of when a question was last modified, represented in ISO 8601 character format.
Form actions like sign, verify, freeze, lock, or unanswered flags are not considered as a last update.
|
| REFCODE
Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM column outputs as AGE. |
The answer to the question label. | Indicates the answer to a question as entered by a site user.
Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07. |
| REFCODE_R
Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM_R column outputs as AGE_R. |
The question label's raw value. | Indicates the raw value: alphanumeric value as entered in Oracle Clinical One Platform with no conversions. This includes data entry flags.
Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07. |
| REFCODE_F
Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM_F column outputs as AGE_F. |
The question label's data type. | Indicates the formatted value: value as entered in Oracle Clinical One Platform converted to the question data type as per form design.
Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07. |
| REFCODE_D
Note: Each question displays in the extract using the question's reference code specified in form design. For example, if the reference code of a question is AGE, the ITEM_D column outputs as AGE_D. |
The question label's data type for the answer. | Indicates the decoded raw value, with additional considerations according to data type. If the question has a code value, it is populated in this field.
Form questions get represented in four different columns: Raw (ITEM_R), Formatted (ITEM_F), Decode (ITEM_D) and the item itself (ITEM). For more details about mappings of specific data types, see form-item-output-mapping-subject-data-extract.html#GUID-37036FC3-983B-4C8F-81CC-C5CE5F53EE07. |
| DeletedQn | Deleted Questions |
If a form item was deleted in a previous version of the study, but data had been collected for it, this data is displayed under this column. Note: The SAS Label of the deleted form item is displayed as D<GUID>, using the deleted form's GUID. |
| VISSTAT | Visit Status |
Indicates the current progress or completion state of a visit or event as recorded in the study. Can be one of the following values: Note: This field only exists in the VISIT_SYS dataset.
For the screening, screen fail, completion or withdrawal actions, this field is blank. For more information about the visit statuses, see Visit status and icon descriptions. |
| SKIPYN | Visit Skipped |
Indicates a visit or event that was not conducted or attended, as recorded by the study coordinator. Note: This field only exists in the VISIT_SYS dataset.
|
| DSDECOD | Disposition Term |
This indicates the event that occurred for the subject. Can be one of the following values: Note: This field only exists in the VISIT_SYS dataset.
This field is blank if no event occurs. For more information about the subject statuses, see Available subject statuses. |
| DSSTDAT | Disposition Date |
Indicates the date the event occurred. Date is UTC Timezone in the format YYYY-MM-DD. Note: This field only exist in the VISIT_SYS dataset.
This field is blank if no event occurs. |
| REASON | Reason |
Specifies the explanation or justification provided for an action or outcome (such as skipping a visit), as entered by the user. Note: This field only exists in the VISIT_SYS dataset. For the screening, screen fail, completion or withdrawal actions, this field is blank. |
| SOURCEFORMID | Source Form ID | Indicates the ID of the source form, that is, the form containing the question that launches a form association.
Note This field only exist in the Form Associations dataset. |
| SOURCEITEMID | Source Item ID | Indicates the source item ID. The source item is the question/item that launches a form association.
Note: This field only exist in the Form Associations dataset. |
| RVISITNUM | Related Visit Identifier | Indicates the ID of the visit that contains the form with which the source form is related/linked.
Note: This field only exist in the Form Associations dataset. |
| RVISIT | Related Visit Title | Indicates the name of the visit that contains the form with which the source form is related/linked.
Note: This field only exist in the Form Associations dataset. |
| RVISITINST | Related Unscheduled/Cycle Instance | Indicates the instance number of the unscheduled visit or cycle visit that contains the form with which the source form is related/linked.
Note: This field only exist in the Form Associations dataset. |
| RSVSTDTC | Related Visit Start Date | Indicates the start date of the visit that contains the form with which the source form is related/linked.
Note: This field only exist in the Form Associations dataset. |
| RDOMAIN | Related Form Reference Code | Indicates the name of the form with which the source form is related/linked.
Note: This field only exist in the Form Associations dataset. |
| RFORMID | Related Form ID | Indicates the name of the form with which the source form is related/linked.
Note: This field only exist in the Form Associations dataset. |
| RREPEATNUMBER | Related Repeat Number |
Indicates the number of the repeating form instance with which the source form is linked. Note: This field only exist in the Form Associations dataset. |
| RSREPEATID | Related Repeating Section Unique Identifier | Indicated the instance number of the flat section with which the source form is linked.
Note: This field only exists in the Form Associations dataset. |
| RSREPEATNUMBER | Related row number in a repeating section | Indicated the instance number of the section with which the source form is linked.
Note: This field only exist in the Form Associations dataset. |
- Form question output mapping in the Subject Data Extracts
As clinical data is extracted from Oracle Clinical One Platform to display in the Subject Data Extract, form questions get represented in four different columns: Raw, Formatted, Decode and the question itself.
Related Topics
Parent topic: Data extract descriptions
Footnote Legend
Footnote 1: SAS is a registered trademark or trademark of SAS Institute, Inc. in the USA and other countries. Other brand and product names are registered trademarks of their respective companies.