Skip Headers
Oracle® Health Sciences Omics Data Bank Programmer's Guide
Release 1.0.1

Part Number E27509-02
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
View PDF

5 Model Dictionary

This chapter describes all entities in all the tables of Oracle Health Sciences Omics Data Bank. It contains the following topics:

System Columns

System fields common to all tables are described in the following table:

Table 5-1 System Columns

Table Column Data Type Description Required (Not null)

Any table

ROW_WID

NUMBER

System. Field will be set by sequence (not mapped)

Y

Any table

W_INSERT_DT

DATE

System. This column stores the date on which the record was inserted in the data warehouse table.

N

Any table

W_UPDATE_DT

DATE

System. This column stores the date on which the record was last updated in the data warehouse table.

N

Any table

ETL_PROC_WID

NUMBER

System. This column is the unique identifier for the specific ETL process used to create or update this data.

Y

Any table

ENTERPRISE_ID

NUMBER

System. Unique identifier for an enterprise in a multi-enterprise environment.

N


Reference Data Tables

The following table describes entities in the reference data tables:

Table 5-2 Reference Entities

Table Column Data Type Description Required (Not null)

W_EHA_HUGO_INFO

APPROVED_SYMBOL

VARCHAR2(200)

Approved Symbol for a gene in HUGO (Human Genome Organization). Also foreign key to W_EHA_GENE as HUGO_NAME.

Y

W_EHA_HUGO_INFO

HGNC_ID

VARCHAR2(100)

HUGO Gene Nomenclature Committee Identifier

N

W_EHA_HUGO_INFO

APPROVED_NAME

VARCHAR2(200)

Approved Name

N

W_EHA_HUGO_INFO

STATUS

VARCHAR2(50)

Status

N

W_EHA_HUGO_INFO

LOCUS_TYPE

VARCHAR2(50)

Locus Type e.g. RNA, complex locus constituent, fragile site

N

W_EHA_HUGO_INFO

LOCUS_GROUP

VARCHAR2(50)

Grouping locus types into supertypes, e.g protein-coding gene, non-coding RNA etc

N

W_EHA_HUGO_INFO

PREVIOUS_SYMBOLS

VARCHAR2(200)

Previous Symbols

N

W_EHA_HUGO_INFO

PREVIOUS_NAMES

VARCHAR2(1000)

Previous Names

N

W_EHA_HUGO_INFO

SYNONYMS

VARCHAR2(200)

Synonyms

N

W_EHA_HUGO_INFO

NAME_SYNONYMS

VARCHAR2(500)

Name Synonyms

N

W_EHA_HUGO_INFO

CHROMOSOME

VARCHAR2(20)

Chromosome

N

W_EHA_HUGO_INFO

APPROVED_DT

DATE

Date Approved

N

W_EHA_HUGO_INFO

MODIFIED_DT

DATE

Date Modified

N

W_EHA_HUGO_INFO

SYMBOL_CHANGED_DT

DATE

Date Symbol Changed

N

W_EHA_HUGO_INFO

NAME_CHANGED_DT

DATE

Date Name Changed

N

W_EHA_HUGO_INFO

ACCESSION_NUMBERS

VARCHAR2(200)

Accession Numbers

N

W_EHA_HUGO_INFO

ENZYME_IDS

VARCHAR2(200)

Enzyme IDs

N

W_EHA_HUGO_INFO

ENTREZ_GENE_ID

VARCHAR2(100)

Entrez database Gene ID

N

W_EHA_HUGO_INFO

ENSEMBL_GENE_ID

VARCHAR2(100)

Ensembl database Gene ID

N

W_EHA_HUGO_INFO

MGI_ID

VARCHAR2(100)

Mouse Genome Database Gene ID

N

W_EHA_HUGO_INFO

SPECIALIST_DB_LINKS

VARCHAR2(1000)

Specialist Database Links

N

W_EHA_HUGO_INFO

SPECIALIST_DB_IDS

VARCHAR2(500)

Specialist Database IDs

N

W_EHA_HUGO_INFO

PUBMED_IDS

VARCHAR2(200)

Pubmed database IDs

N

W_EHA_HUGO_INFO

REFSEQ_IDS

VARCHAR2(200)

RefSeq database IDs

N

W_EHA_HUGO_INFO

GENE_FAMILY_TAG

VARCHAR2(100)

Gene Family Tag

N

W_EHA_HUGO_INFO

GENE_FAMILY_DESC

VARCHAR2(200)

Gene family description

N

W_EHA_HUGO_INFO

RECORD_TYPE

VARCHAR2(50)

Record Type

N

W_EHA_HUGO_INFO

PRIMARY_IDS

VARCHAR2(200)

Primary Indentifiers separated by comma

N

W_EHA_HUGO_INFO

SECONDARY_IDS

VARCHAR2(200)

Secondary Identifiers separated by comma

N

W_EHA_HUGO_INFO

CCDS_IDS

VARCHAR2(400)

Consensus Coding Sequence (CCDS) Identifers

N

W_EHA_HUGO_INFO

VEGA_IDS

VARCHAR2(200)

Vertebrae Genome Annotation database IDs

N

W_EHA_HUGO_INFO

LOCUS_SPECIFIC_DBS

VARCHAR2(1000)

Locus Specific Databases

N

W_EHA_HUGO_INFO

MAPPED_GDB_ID

VARCHAR2(100)

Genome Database ID (mapped data)

N

W_EHA_HUGO_INFO

MAPPED_ENTREZ_GENE_ID

VARCHAR2(100)

Entrez Gene ID (mapped data supplied by NCBI)

N

W_EHA_HUGO_INFO

MAPPED_OMIM_ID

VARCHAR2(100)

Online Mendelian Inheritance in Man ID (mapped data supplied by NCBI)

N

W_EHA_HUGO_INFO

MAPPED_REFSEQ_ID

VARCHAR2(100)

National Center for Biotechnology Information (NCBI) Reference Sequence database (mapped data supplied by NCBI)

N

W_EHA_HUGO_INFO

MAPPED_UNIPROT_ID

VARCHAR2(100)

Universal Protein Resource database ID (mapped data supplied by UniProt)

N

W_EHA_HUGO_INFO

MAPPED_ENSEMBL_ID

VARCHAR2(100)

Ensembl genome database ID (mapped data supplied by Ensembl)

N

W_EHA_HUGO_INFO

MAPPED_UCSC_ID

VARCHAR2(100)

University of California, Santa Cruz database ID (mapped data supplied by UCSC)

N

W_EHA_HUGO_INFO

MAPPED_MGI_ID

VARCHAR2(100)

Mouse Genome Database ID (mapped data supplied by MGI)

N

W_EHA_HUGO_INFO

MAPPED_RGD_ID

VARCHAR2(100)

Rat Genome Database ID (mapped data supplied by RGD)

N

W_EHA_DNA_SOURCE

SPECIES_WID

NUMBER

Foreign Key to W_EHA_SPECIES table

Y

W_EHA_DNA_SOURCE

CHROMOSOME

VARCHAR2(50)

Chromosome identifier e.g. X; parsed out of each line in EMBL source file; 3rd section after 2nd colon.

N

W_EHA_DNA_SOURCE

ACCESSION

VARCHAR2(200)

Accession identifier for sequence, parsed out of each line in EMBL source file; the whole AC line contents, except for the 'chromosome:' prefix.

N

W_EHA_DNA_SOURCE

START_POSITION

NUMBER

Start location of sequence, parsed out of each line in EMBL source file; 4th section after 3rd colon.

N

W_EHA_DNA_SOURCE

END_POSITION

NUMBER

End position of sequence, parsed out of each line in EMBL source file; 5th section after 4th colon.

N

W_EHA_DNA_SOURCE

RELEASED_DT

DATE

Released Date which is text that needs to be converted to DATE column.

N

W_EHA_DNA_SOURCE

VERSION

VARCHAR2(30)

Version of sequence, parsed out of each line in EMBL source file; 2nd section of AC line after 1st colon.

N

W_EHA_DNA_SOURCE

DNA_SEQUENCE

CLOB

DNA sequence corresponding to all bases in the Start-End position (should match difference in length) This should end up with proper DNA base pairs as well as other letters for special meaning.

N

W_EHA_DNA_SOURCE

DNA_SOURCE_COMMENT

CLOB

Comment lines concatenated

N

W_EHA_PROT_INFO

PROTEIN_WID

NUMBER

Foreign key W_EHA_PROTEIN

Y

W_EHA_PROT_INFO

PROTEIN_NAME

VARCHAR2(200)

Name of Protein, set from each line using the keyword portion e.g. INS_HUMAN.

N

W_EHA_PROT_INFO

ACCESSION

VARCHAR2(200)

Accession Identifier(s), set from the AC line. For this example the field would be set to P01308; Q5EEX2;

N

W_EHA_PROT_INFO

RELEASED_DT

DATE

Released date, set from the DT line of the SwissProt file, while often there are multiple dates for different revisions, only the last date is used for update.

N

W_EHA_PROT_INFO

PROT_INFO_COMMENT

CLOB

Comments applicable to individual protein, as specified in the SwissProt reference file

N

W_EHA_PROTEIN

SPECIES_WID

NUMBER

Foreign key to W_EHA_SPECIES. Note that there is a SPECIES FK in this table and also in the DNA_SOURCE table. This is because the user can load SwissProt information separately from EMBL information. The SwissProt data would have no link to the DNA_SOURCE directly and the FK to SPECIES is needed for PROTEIN records as well.

Y

W_EHA_PROTEIN

SEQUENCE_CHECKSUM

VARCHAR2(32)

Checksum derieved from AMINO ACID SEQUENCE in order to quickly match up protein in SwissProt with Ensembl record

N

W_EHA_PROTEIN

SEQUENCE_LENGTH

NUMBER

Length of AMINO ACID SEQUENCE

N

W_EHA_PROTEIN

AMINO_ACID_SEQUENCE

CLOB

Sequence of Amino Acids; This field is set from the SQ section of the SwissProt file or the “/translation” qualifier for coding segment (CDS) gene components.

N

W_EHA_GENE

GENE_NAME

VARCHAR2(200)

Ensembl recognized gene identifier

Y

W_EHA_GENE

HUGO_NAME

VARCHAR2(200)

Recognized HUGO gene name which can be used to look up proper chromosome information in the HUGO_INFO table.

N

W_EHA_GENE_COMP_SEGMENT

GENE_COMPONENT_WID

NUMBER

Foreign key to the parent gene component that has this segment definition.

Y

W_EHA_GENE_COMP_SEGMENT

SOURCE_WID

NUMBER

Foreign key to the DNA source buffer that this segment refers to. Note that external source references will create DNA_SOURCE placeholder records that do not have the real sequence data loaded.

Y

W_EHA_GENE_COMP_SEGMENT

NUMBER_IN_SEQUENCE

NUMBER

Number to keep track of the order of each segment.

N

W_EHA_GENE_COMP_SEGMENT

START_POSITION

NUMBER

Stores the starting position of the segment.

N

W_EHA_GENE_COMP_SEGMENT

END_POSITION

NUMBER

Stores the end position of the segment.

N

W_EHA_GENE_COMP_SEGMENT

COMPLEMENT

NUMBER(1)

This column Is used as a flag to indicate reverse transcription. Either 1 or 0.

N

W_EHA_GENE_COMPONENT

STRUCTURE_WID

NUMBER

This is the foreign key to the parent STRUCTURE record which is created for each gene component that has the same TRANSCRIPT_ID.

Y

W_EHA_GENE_COMPONENT

COMPONENT_TYPE

VARCHAR2(100)

This field stores the keyword described for the gene component. Possible values are CDS (coding sequence), exon, intron, etc.

Y

W_EHA_GENE_COMPONENT

PROTEIN_WID

NUMBER

Foreign key to the protein record. Most gene components are grouped together that are involved with some protein product. For example, it can link to a PROTEN record created for the Ensembl reference.

N

W_EHA_GENE_SEGMENT

GENE_WID

NUMBER

Foreign key to the parent GENE record. For example, FK can link to a GENE record with a name of “ENSG00000237037”.

Y

W_EHA_GENE_SEGMENT

SOURCE_WID

NUMBER

Foreign key to the W_EHA_DNA_SOURCE for each segment.

Y

W_EHA_GENE_SEGMENT

NUMBER_IN_SEQUENCE

NUMBER

This field is used to number each gene segment as reported to construct the correct sequence of DNA bases for the gene definition.

N

W_EHA_GENE_SEGMENT

START_POSITION

NUMBER

This field is the starting offset into the parent DNA_SOURCE buffer.

N

W_EHA_GENE_SEGMENT

END_POSITION

NUMBER

This field is the ending offset into the parent DNA_SOURCE buffer.

N

W_EHA_GENE_SEGMENT

COMPLEMENT

NUMBER(1)

This field is used to indicate reverse transcription. For the example above this field will be set to 0.

N

W_EHA_GENE_STRUCTURE

GENE_WID

NUMBER

Foreign key to the parent GENE record.

Y

W_EHA_GENE_STRUCTURE

TRANSCRIPT_ID

VARCHAR2(1000)

This field is used to store the value for TRANSCRIPT_ID used as the anchor for all of the gene components.

N

W_EHA_PROT_COMPONENT

PROTEIN_INFO_WID

NUMBER

Foreign key to the parent PROTEIN INFO record.

Y

W_EHA_PROT_COMPONENT

COMPONENT_TYPE

VARCHAR2(100)

This field is used to store the keyword used for each component of the protein molecule. Component types are detemined by what types exist in SwissProt file. Example types include: SITE, CHAIN, MOD_RES

Y

W_EHA_PROT_COMPONENT

START_POSITION

VARCHAR2(100)

This field is used to indicate the start amino acid position in the protein molecule which identifies this component.

N

W_EHA_PROT_COMPONENT

END_POSITION

VARCHAR2(100)

This field is used to indicate the end amino acid position in the protein molecule which identifies this component.

N

W_EHA_PROT_COMPONENT

COMPONENT_DESC

VARCHAR2(2000)

This field is used to capture any remaining data on the line about the feature. There is often a description text which would simply be stored in this field.

N

W_EHA_SPECIES

SPECIES_NAME

VARCHAR2(200)

This field is derived from the OS line in the file. This will not include common names in parenthesis.

N

W_EHA_SPECIES

COMMON_NAME

VARCHAR2(200)

This field has the common names of an organism that would be parsed in the parenthesis from the OS line. For example, human.

N

W_EHA_SPECIES

PROMOTER_OFFSET

NUMBER

This field will be set on load of the database to allow for a default value to be used to compute promoter regions for each gene of this organism. Note that not every gene has documented promoter features in Ensembl. So each lab usually has some offset to compute an acceptable region before the start of the first coding segment to be used as a promoter region. Usually there is a different offset for Eukaryote and Prokaryote organisms.

N

W_EHA_VARIANT

SOURCE_WID

NUMBER

Foreign key to the W_EHA_DNA_SOURCE buffer used for reporting this information.

Y

W_EHA_VARIANT

START_POSITION

NUMBER

This field will store the starting position of which DNA base is referenced in the variant.

Y

W_EHA_VARIANT

END_POSITION

NUMBER

This field will store the ending position of which DNA base is referenced in the variant.

Y

W_EHA_VARIANT

STATUS

VARCHAR2(100)

Status on variant, if downloaded from Ensembl then should be set to KNOWN, if uploaded from result files b/c found that it does not exist in Ensembl then it would be set to NOVEL

N

W_EHA_VARIANT

REPLACE_TAG

VARCHAR2(1000)

The sequence of the reference. Used with W_EHA_ALLELE_SEQ to indicate the changes in base pairs, deletions, or insertions. If insertion, it is set to '-'.

N

W_EHA_VARIANT

STRAND

VARCHAR2(1)

Strand indicator, '+' for forward strand and '-' for reverse-complement strand

N

W_EHA_VARIANT

CHROMOSOME

VARCHAR2(50)

Chromosome identifier e.g. X; parsed out of each line in gvf file for Known variants, for unknown, it is deduced based on position

N

W_EHA_VARIANT

ABSOLUTE_POSITION

NUMBER

Absolute position on the chromosome, obtained by adding the DNA buffer source position on the chromosome and START_POSITION

N

W_EHA_PATHWAY

PATHWAY_SOURCE_ID

VARCHAR2(50)

This names the source of pathway as specified in pathwaycommons file. Example sources are: REACTOME, NCI_NATURE, CELLMAP

Y

W_EHA_PATHWAY

PATHWAY_NAME

VARCHAR2(200)

As specified in pathwaycommons.com file, 1st column, example: Nef Mediated CD4 Down-regulation

N

W_EHA_PATHWAY_PROTEIN

HUGO_SYMBOL

VARCHAR2(200)

Foreign key into W_EHA_HUGO_INFO table used to look up gene information.

Y

W_EHA_PATHWAY_PROTEIN

PATHWAY_WID

NUMBER

Foreign key into W_EHA_PATHWAY table

Y

W_EHA_SOURCE_LIT_REF

SOURCE_WID

NUMBER

Foreign key to parent record. Note that the name is different for each literature reference table.

Y

W_EHA_SOURCE_LIT_REF

REFERENCE_NUMBER

NUMBER

Number of the reference source from Ensembl file.

Y

W_EHA_SOURCE_LIT_REF

REFERENCE_POSITION

VARCHAR2(2000)

RP line (Reference Position)

N

W_EHA_SOURCE_LIT_REF

REFERENCE_TITLE

VARCHAR2(2000)

RT line (Reference Title)

N

W_EHA_SOURCE_LIT_REF

REFERENCE_AUTHOR

VARCHAR2(2000)

RA line (Reference Author)

N

W_EHA_SOURCE_LIT_REF

CROSS_REF

VARCHAR2(2000)

RX line (Reference Cross Reference)

N

W_EHA_SOURCE_LIT_REF

REFERENCE_GROUP

VARCHAR2(2000)

RG line (Reference Group)

N

W_EHA_SOURCE_LIT_REF

REFERENCE_LOCATION

VARCHAR2(2000)

RL line (Reference Location).

N

W_EHA_PROT_LIT_REF

PROTEIN_INFO_WID

NUMBER

Foreign key to parent record. Note that the name is different for each literature reference table.

Y

W_EHA_PROT_LIT_REF

REFERENCE_NUMBER

NUMBER

Number of the reference source from Ensembl file.

Y

W_EHA_PROT_LIT_REF

REFERENCE_POSITION

VARCHAR2(2000)

RP line (Reference Position)

N

W_EHA_PROT_LIT_REF

REFERENCE_TITLE

VARCHAR2(2000)

RT line (Reference Title)

N

W_EHA_PROT_LIT_REF

REFERENCE_AUTHOR

VARCHAR2(2000)

RA line (Reference Author)

N

W_EHA_PROT_LIT_REF

CROSS_REF

VARCHAR2(2000)

RX line (Reference Cross Reference)

N

W_EHA_PROT_LIT_REF

REFERENCE_GROUP

VARCHAR2(2000)

RG line (Reference Group)

N

W_EHA_PROT_LIT_REF

REFERENCE_LOCATION

VARCHAR2(2000)

RL line (Reference Location).

N

W_EHA_VARIANT_QLFR

VARIANT_WID

NUMBER

Foreign key to parent record. Note that the name is different for each qualifier table.

Y

W_EHA_VARIANT_QLFR

QUALIFIER_TAG

VARCHAR2(100)

Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note”

Y

W_EHA_VARIANT_QLFR

QUALIFIER_VALUE

VARCHAR2(4000)

Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value.

N

W_EHA_SOURCE_QLFR

SOURCE_WID

NUMBER

Foreign key to parent record. Note that the name is different for each qualifier table.

Y

W_EHA_SOURCE_QLFR

QUALIFIER_TAG

VARCHAR2(100)

Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note”

Y

W_EHA_SOURCE_QLFR

QUALIFIER_VALUE

VARCHAR2(4000)

Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value.

N

W_EHA_GENE_QLFR

GENE_WID

NUMBER

Foreign key to parent record. Note that the name is different for each qualifier table.

Y

W_EHA_GENE_QLFR

QUALIFIER_TAG

VARCHAR2(100)

Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note”

Y

W_EHA_GENE_QLFR

QUALIFIER_VALUE

VARCHAR2(4000)

Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value.

N

W_EHA_GENE_COMP_QLFR

GENE_COMPONENT_WID

NUMBER

Foreign key to parent record. Note that the name is different for each qualifier table.

Y

W_EHA_GENE_COMP_QLFR

QUALIFIER_TAG

VARCHAR2(100)

Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note”

Y

W_EHA_GENE_COMP_QLFR

QUALIFIER_VALUE

VARCHAR2(4000)

Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value.

N

W_EHA_PROT_COMP_QLFR

PROT_COMPONENT_WID

NUMBER

Foreign key to parent record. Note that the name is different for each qualifier table.

Y

W_EHA_PROT_COMP_QLFR

QUALIFIER_TAG

VARCHAR2(100)

Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note”

Y

W_EHA_PROT_COMP_QLFR

QUALIFIER_VALUE

VARCHAR2(4000)

Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value.

N

W_EHA_VARIANT_XREF

VARIANT_WID

NUMBER

Foreign key to parent record. Note that this is different for each cross reference table.

Y

W_EHA_VARIANT_XREF

REFERENCE_ID

VARCHAR2(200)

This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file.

Y

W_EHA_VARIANT_XREF

DATABASE

VARCHAR2(100)

This data is the first part of the /DB_XREF and the DR line. For example, EMBL.

Y

W_EHA_VARIANT_XREF

REFERENCE_SUFFIX

VARCHAR2(200)

Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database.

Y

W_EHA_SOURCE_XREF

SOURCE_WID

NUMBER

Foreign key to parent record. Note that this is different for each cross reference table.

Y

W_EHA_SOURCE_XREF

REFERENCE_ID

VARCHAR2(200)

This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file.

Y

W_EHA_SOURCE_XREF

DATABASE

VARCHAR2(100)

This data is the first part of the /DB_XREF and the DR line. For example, EMBL.

Y

W_EHA_SOURCE_XREF

REFERENCE_SUFFIX

VARCHAR2(200)

Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database.

Y

W_EHA_PROT_COMP_XREF

PROT_COMPONENT_WID

NUMBER

Foreign key to parent record. Note that this is different for each cross reference table.

Y

W_EHA_PROT_COMP_XREF

REFERENCE_ID

VARCHAR2(200)

This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file.

Y

W_EHA_PROT_COMP_XREF

DATABASE

VARCHAR2(100)

This data is the first part of the /DB_XREF and the DR line. For example, EMBL.

Y

W_EHA_PROT_COMP_XREF

REFERENCE_SUFFIX

VARCHAR2(200)

Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database.

Y

W_EHA_GENE_XREF

GENE_WID

NUMBER

Foreign key to parent record. Note that this is different for each cross reference table.

Y

W_EHA_GENE_XREF

REFERENCE_ID

VARCHAR2(200)

This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file.

Y

W_EHA_GENE_XREF

DATABASE

VARCHAR2(100)

This data is the first part of the /DB_XREF and the DR line. For example, EMBL.

Y

W_EHA_GENE_XREF

REFERENCE_SUFFIX

VARCHAR2(200)

Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database.

Y

W_EHA_GENE_COMPONENT_XREF

GENE_COMPONENT_WID

NUMBER

Foreign key to parent record. Note that this is different for each cross reference table.

Y

W_EHA_GENE_COMPONENT_XREF

REFERENCE_ID

VARCHAR2(200)

This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file.

Y

W_EHA_GENE_COMPONENT_XREF

DATABASE

VARCHAR2(100)

This data is the first part of the /DB_XREF and the DR line. For example, EMBL.

Y

W_EHA_GENE_COMPONENT_XREF

REFERENCE_SUFFIX

VARCHAR2(200)

Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database.

Y

W_EHA_PROT_XREF

PROTEIN_WID

NUMBER

Foreign key to parent record. Note that this is different for each cross reference table.

Y

W_EHA_PROT_XREF

REFERENCE_ID

VARCHAR2(200)

This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file.

Y

W_EHA_PROT_XREF

DATABASE

VARCHAR2(100)

This data is the first part of the /DB_XREF and the DR line. For example, EMBL.

Y

W_EHA_PROT_XREF

REFERENCE_SUFFIX

VARCHAR2(200)

Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database.

Y


Result Data Tables

The following table describes entities from the result data tables:

Table 5-3 Result Entity Descriptions

Table Column Data Type Description Required (Not Null)

W_EHA_PROBE

PRIMARY_HUGO_NAME

VARCHAR2(200)

Hugo name for a gene corresponding to probe

N

W_EHA_PROBE

PROBE_NAME

VARCHAR2(200)

Name of gene probe, single probe should match to one gene only

N

W_EHA_PROBE

START_POSITION

NUMBER

Start position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_PROBE

SEQUENCE

VARCHAR2(200)

Sequence of bases for the particular probe

N

W_EHA_PROBE

ACCESSION

VARCHAR2(200)

List of accession Ids associated with the probe

N

W_EHA_PROBE

PROBE_DESC

VARCHAR2(2000)

Description of the gene for probe

N

W_EHA_PROBE_XREF

PROBE_WID

NUMBER(38)

Foreign key to W_EHA_PROBE

Y

W_EHA_PROBE_XREF

REFERENCE_ID

VARCHAR2(200)

Gene ID associated to probe

Y

W_EHA_PROBE_XREF

DATABASE

VARCHAR2(100)

Database for Gene ID

Y

W_EHA_PROBE_XREF

REFERENCE_SUFFIX

VARCHAR2(200)

Suffix applied to Gene IDs in a particular database, For example, _HUMAN

N

W_EHA_RSLT_GENE_EXP

SPECIMEN_WID

NUMBER(38)

Foreign key to Specimen table in CDM1.1

Y

W_EHA_RSLT_GENE_EXP

SPEC_DATASRC_WID

NUMBER(38)

Foreign key to W_EHA_DATASOURCE table

Y

W_EHA_RSLT_GENE_EXP

PROBE_WID

NUMBER(38)

Foreign key to W_EHA_PROBE

Y

W_EHA_RSLT_GENE_EXP

HYBRIDIZATION_NAME

VARCHAR2(200)

Hybridization experiment identifier in gene expression file

N

W_EHA_RSLT_GENE_EXP

INTENSITY

NUMBER

Intensity value associated with particular probe at given hybridization

N

W_EHA_RSLT_GENE_EXP

P_VALUE

NUMBER

P-value associated with particular probe at a given hybridization

N

W_EHA_RSLT_GENE_EXP

CALL

VARCHAR2(200)

Call value associated with particular probe at a given hybridization, P for present, A for absent

N

W_EHA_DATASOURCE

DATASOURCE_CD

VARCHAR2(80)

Data source for Specimen

Y

W_EHA_DATASOURCE

DATASOURCE_NM

VARCHAR2(80)

Name of datasource for Specimen

N

W_EHA_DATASOURCE

DATASOURCE_DESC

VARCHAR2(2000)

Description of datasource for specimen

N

W_EHA_DATASOURCE

SCHEMA_NAME

VARCHAR2(30)

Schema name for datasource for specimen

N

W_EHA_DATASOURCE

DB_LINK_NAME

VARCHAR2(30)

Link to schema for specimen datasource

N

W_EHA_RSLT_TYPE

RESULT_TYPE_NAME

VARCHAR2(200)

Type of result For example, sequencing, no-call, gene expression, cnv

Y

W_EHA_RSLT_TYPE

RESULT_TYPE_DESC

VARCHAR2(2000)

Result type description

N

W_EHA_RSLT_FILE

FILE_TYPE_WID

NUMBER(38)

Foreign key to W_EHA_FILE_TYPE

Y

W_EHA_RSLT_FILE

FILE_STORAGE_FLG

VARCHAR2(1)

Type of file storage, S - Secure Files, E - External

N

W_EHA_RSLT_FILE

FILE_PATH

VARCHAR2(2000)

Path to file including filename

N

W_EHA_RSLT_FILE

VENDOR_NAME

VARCHAR2(200)

Name of vendor which supplied results in file, For example, Complete Genomics

N

W_EHA_RSLT_FILE

FILE_CONTENT_ID

RAW(128)

File identifier used in Secure File system

N

W_EHA_RSLT_FILE_TYPE

FILE_TYPE_CODE

VARCHAR2(80)

Code for type of file

Y

W_EHA_RSLT_FILE_TYPE

FILE_TYPE_NAME

VARCHAR2(200)

Full name of type of file

N

W_EHA_RSLT_FILE_TYPE

FILE_TYPE_DESC

VARCHAR2(2000)

Description of file

N

W_EHA_RSLT_FILE_TYPE

FILE_TYPE_VERSION

VARCHAR2(30)

Version of file format

Y

W_EHA_RSLT_SEQUENCING

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_SEQUENCING

RESULT_SPEC_WID

NUMBER(38)

Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table

Y

W_EHA_RSLT_SEQUENCING

RESULT_TYPE_WID

NUMBER(38)

Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table

Y

W_EHA_RSLT_SEQUENCING

RESULT_FILE_WID

NUMBER(38)

Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table

Y

W_EHA_RSLT_SEQUENCING

VARIANT_WID

NUMBER(38)

Foreign key to W_EHA_VARIANT table

Y

W_EHA_RSLT_SEQUENCING

CHROMOSOME_WID

NUMBER(38)

Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table

Y

W_EHA_RSLT_SEQUENCING

START_POSITION

NUMBER

Start position of variant, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_SEQUENCING

VARIANT_TYPE

VARCHAR2(100)

Type of variant including snp, insertion, or deletion

N

W_EHA_RSLT_SEQUENCING

ZYGOSITY

VARCHAR2(100)

Similarity of alleles in an organism to reference. If both same - hom etc.

N

W_EHA_RSLT_SEQUENCING

SCORE_VAF

NUMBER

Positive integer representing confidence in the call from CGI masterVar file

N

W_EHA_RSLT_SEQUENCING

SCORE_EAF

NUMBER

Positive or negative integer representing confidence in the call from CGI masterVar file

N

W_EHA_RSLT_SEQUENCING

ALLELE_READ_COUNT

NUMBER

Number of reads that support the given allele

N

W_EHA_RSLT_SEQUENCING

REFERENCE_READ_COUNT

NUMBER

Number of reads that support the reference sequence

N

W_EHA_RSLT_SEQUENCING_X

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_SEQUENCING_X

ALLELE

VARCHAR2(1000)

Sequence of allele if less than or equal to 500 char

N

W_EHA_RSLT_SEQUENCING_X

ALLELE_CLOB

CLOB

Sequence of allele if greater than 500 char

N

W_EHA_RSLT_SEQUENCING_X

HAPLINK

VARCHAR2(100)

Integer ID that links the allele to alleles of other loci that are known to be on same haplotype

N

W_EHA_RSLT_SEQUENCING_X

FILTER

VARCHAR2(200)

Indicates PASS if position has passed all specified filters, otherwise lists FAIL, filters and results

N

W_EHA_RSLT_SEQUENCING_X

GENOTYPE_FILTER

VARCHAR2(200)

Indicates PASS if position has passed all genotype filters, otherwise lists FAIL, filters and results

N

W_EHA_RSLT_COPY_NBR_VAR

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_COPY_NBR_VAR

RESULT_SPEC_WID

NUMBER(38)

Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table

Y

W_EHA_RSLT_COPY_NBR_VAR

RESULT_TYPE_WID

NUMBER(38)

Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table

Y

W_EHA_RSLT_COPY_NBR_VAR

RESULT_FILE_WID

NUMBER(38)

Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table

Y

W_EHA_RSLT_COPY_NBR_VAR

CHROMOSOME_WID

NUMBER(38)

Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table

Y

W_EHA_RSLT_COPY_NBR_VAR

START_POSITION

NUMBER

Start position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_COPY_NBR_VAR

END_POSITION

NUMBER

End position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_COPY_NBR_VAR

AVG_NORMALIZED_CVG

NUMBER

Baseline-normalized average coverage over the interval from START to END position

N

W_EHA_RSLT_COPY_NBR_VAR

GC_CORRECTED_CVG

NUMBER

GC-corrected average coverage of a window width based on START and END positions

N

W_EHA_RSLT_COPY_NBR_VAR

FRACTION_UNIQUE

NUMBER

Fraction of coverage due to unique mappings

N

W_EHA_RSLT_COPY_NBR_VAR

RELATIVE_CVG

NUMBER

AVG_NORMALIZED_CVG divided by estimate of diploid median average adjusted coverage, can be null if source is N

N

W_EHA_RSLT_COPY_NBR_VAR

CALLED_PLOIDY

NUMBER

Called ploidy for the segment, usually an integer from 0 to maximum ploidy value, if N in source file, leave blank

N

W_EHA_RSLT_COPY_NBR_VAR

CALLED_CNV_TYPE

VARCHAR2(100)

Classification of CALLED_PLOIDY into one of six categories

N

W_EHA_RSLT_COPY_NBR_VAR

PLOIDY_SCORE

NUMBER

Phred-like confidence that the segment has the CALLED_PLOIDY correct

N

W_EHA_RSLT_COPY_NBR_VAR

CNV_TYPE_SCORE

NUMBER

Phred-like confidence that CALLED_CNV_TYPE is correct

N

W_EHA_RSLT_CNV_X

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_CNV_X

REPEATS

VARCHAR2(2000)

Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters less than or equal to 2000.

N

W_EHA_RSLT_CNV_X

REPEATS_CLOB

CLOB

Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters greater than 2000.

N

W_EHA_RSLT_NOCALL

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_NOCALL

RESULT_SPEC_WID

NUMBER(38)

Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table

Y

W_EHA_RSLT_NOCALL

RESULT_TYPE_WID

NUMBER(38)

Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table

Y

W_EHA_RSLT_NOCALL

RESULT_FILE_WID

NUMBER(38)

Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table

Y

W_EHA_RSLT_NOCALL

CHROMOSOME_WID

NUMBER(38)

Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table

Y

W_EHA_RSLT_NOCALL

START_POSITION

NUMBER

Start position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_NOCALL

END_POSITION

NUMBER

End position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_NOCALL

NOCALL_TYPE

VARCHAR2(100)

Type of no-call

N

W_EHA_RSLT_NOCALL_X

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_NOCALL_X

ALLELE

varchar2(500)

Sequence of allele if less than or equal to 500 char

N

W_EHA_RSLT_NOCALL_X

ALLELE_CLOB

CLOB

Sequence of allele if greater than 500 char

N

W_EHA_RSLT_NOCALL_X

HAPLINK

VARCHAR2(100)

Integer ID that links the allele to alleles of other loci that are known to be on same haplotype

N

W_EHA_PROBE_ALT_LINK

HUGO_NAME

VARCHAR2(200)

Hugo name for a gene corresponding to probe

Y

W_EHA_PROBE_ALT_LINK

PROBE_WID

NUMBER(38)

Foreign key to W_EHA_PROBE

Y

W_EHA_RSLT_STUDY

RESULT_STUDY_NAME

VARCHAR2(200)

Name of the study under which results will be loaded

Y

W_EHA_RSLT_STUDY

RESULT_STUDY_DESC

VARCHAR2(2000)

Description of the study under which results will be loaded

N

W_EHA_RSLT_SPECIMEN

SPECIMEN_WID

NUMBER(38)

Foreign key to Specimen table in CDM1.1

Y

W_EHA_RSLT_SPECIMEN

SPEC_DATASRC_WID

NUMBER(38)

Identifier for specimen datasource

Y

W_EHA_RSLT_SPECIMEN

SPECIMEN_NUMBER

VARCHAR2(80)

Specimen identifier in linked specimen datasource database

Y

W_EHA_RSLT_SPECIMEN

SPECIMEN_VENDOR_NUMBER

VARCHAR2(80)

Vendor identifier in linked specimen datasource database

Y

W_EHA_CHROMOSOME

CHROMOSOME

VARCHAR2(50)

Chromosome number or identifier

Y


Table Descriptions

The following table contains descriptions of all tables in ODB:

Table 5-4 Table Descriptions

Table Type Description Loaders Populate Included in the Release 1.0.1

W_EHA_HUGO_INFO

Reference

Stores the Hugo Names for all genes with cross reference information. This table matches the data from the Human Gene Nomenclature Committee.

Master populate script

N

W_EHA_DNA_SOURCE

Reference

Stores the DNA sequence source buffer where multiple genes and other features may reference. The identifciation and locus of the sequence source is given with the accession field.

Reference loader in Java

N

W_EHA_PROT_INFO

Reference

Stores PROTEIN information for each named version of a protein molecule.

Reference loader in Java

N

W_EHA_PROTEIN

Reference

Stores PROTEIN molecule information for amino acid sequences transcribed from genes.

Reference loader in Java

N

W_EHA_GENE

Reference

Stores the parent gene records as loaded from the reference. The gene table will store the hugo name as well as the the name used in the reference data. Othere tables will have foreign keys to this table such as GENE_SEGMENT and GENE_STRUCTURE.

Reference loader in Java

N

W_EHA_GENE_COMP_SEGMENT

Reference

Maps each GENE_COMPONENT to a specific region in the DNA_SOURCE buffer. Each GENE_COMPONENT can have many segments that are not contiguous.

Reference loader in Java

N

W_EHA_GENE_COMPONENT

Reference

Stores each component of the gene (i.e. mRNA, coding segments, etc...).

Reference loader in Java

N

W_EHA_GENE_SEGMENT

Reference

Maps each GENE to the DNA_SOURCE buffers used as reference. Note that a GENE definition may span multiple source references.

Reference loader in Java

N

W_EHA_GENE_STRUCTURE

Reference

Stores each transcript of the GENE used to create different PROTEIN molecules. Each GENE_STRUCTURE is indicated with a QUALIFIER indicating the transcript_id. All of the GENE_COMPONENT records with the same transcript_id will be grouped with the same pareint GENE_STRUCTURE record.

Reference loader in Java

N

W_EHA_PROT_COMPONENT

Reference

Stores the components that are imported from SwissProt files for all Protein definitions.

Reference loader in Java

N

W_EHA_SPECIES

Reference

Stores SPECIES information for each genome stored in database.

Reference loader in Java

N

W_EHA_VARIANT

Reference

Stores novel and known variants for the genomes loaded. The actual variation is stored in the REPLACE_TAG field.

Reference loader in Java, Sequencing loaders: VCF, MAF, CGI

N

W_EHA_PATHWAY

Reference

Stores PATHWAY names for protein network.

Reference loader in Java

N

W_EHA_PATHWAY_PROTEIN

Reference

Stores genes or proteins that belong to a given pathway based on import from GSEA files.

Reference loader in Java

N

W_EHA_SOURCE_LIT_REF

Reference

Stores literature references for each DNA_SOURCE definition.

Reference loader in Java

N

W_EHA_PROT_LIT_REF

Reference

Stores literature references for each PROTEIN definition.

Reference loader in Java

N

W_EHA_VARIANT_QLFR

Reference

Stores qualifier attributes for each VARIANT

Reference loader in Java

N

W_EHA_SOURCE_QLFR

Reference

Stores all of the qualifier attributes for each DNA_SOURCE definition.

Reference loader in Java

N

W_EHA_GENE_QLFR

Reference

Stores all of the qualifier attributes for each gene.

Reference loader in Java

N

W_EHA_GENE_COMP_QLFR

Reference

Stores all of the qualifier attributes for each component of the gene (i.e. mRNA, coding segments, etc...).

Reference loader in Java

N

W_EHA_PROT_COMP_QLFR

Reference

Stores all of the qualifier attributes for each component of the protein.

Reference loader in Java

N

W_EHA_VARIANT_XREF

Reference

Cross reference information for VARIANT records

Reference loader in Java

N

W_EHA_SOURCE_XREF

Reference

Stores cross reference information for DNA_SOURCE buffers.

Reference loader in Java

N

W_EHA_PROT_COMP_XREF

Reference

Stores the PROTEIN COMPONENT cross reference information linked to other databases.

Reference loader in Java

N

W_EHA_GENE_XREF

Reference

Stores the GENE cross reference information linked to other databases.

Reference loader in Java

N

W_EHA_GENE_COMPONENT_XREF

Reference

Stores GENE_COMPONENT cross reference links to other databases.

Reference loader in Java

N

W_EHA_PROT_XREF

Reference

Stores PROTEIN cross reference information

Reference loader in Java

N

W_EHA_PROBE

Result/Reference

Stores PROBE information and link to corresponding gene.

Result loader for probes (expression)

N

W_EHA_PROBE_XREF

Reference

Stores cross reference information for PROBE records.

Currently not populated

N

W_EHA_RSLT_GENE_EXP

Result

Stores gene expression intensity value along with p-value, call if available.

Result loader for gene expression

N

W_EHA_DATASOURCE

Result

Stores specimen datasource information and link, if needed.

Pre-seeded

N

W_EHA_RSLT_TYPE

Result

Stores identifiers for types of result, pre-seeded.

Pre-seeded

N

W_EHA_RSLT_FILE

Result

Stores links and other information about the result file.

Sequence loaders: CGI, MAF, VCF, gene expression loader

N

W_EHA_RSLT_FILE_TYPE

Result

Stores type of file information, including version, vendor etc, pre-seeded.

Pre-seeded

N

W_EHA_RSLT_SEQUENCING

Result

Stores sequencing results, namely substitutions, insertions, deletions coming from CGI, MAF, VCF file formats.

Sequence loaders: CGI, MAF, VCF

N

W_EHA_RSLT_COPY_NBR_VAR

Result

Stores copy number variation results from sequencing.

Currently not populated

N

W_EHA_RSLT_NOCALL

Result

Stores no-call results from CGI sequencing.

CGI sequence loader

N

W_EHA_PROBE_ALT_LINK

Result

Used when there is more than one gene that is matched to a given probe.

Currently not populated

N

W_EHA_SEQUENCING_X

Result

Stores varchar data of sequencing results coming from CGI, MAF, VCF file formats.

Sequence loaders: CGI, MAF, VCF

Y

W_EHA_RSLT_COPY_NBR_VAR_X

Result

Stores varchar data for copy number variation results from sequencing.

Currently not populated

Y

W_EHA_RSLT_NOCALL_X

Result

Stores varchar data of no-call results from CGI sequencing.

CGI sequence loader

Y

W_EHA_RSLT_STUDY

Result

Stores study information under which results pertaining to sequencing and expression are loaded.

User defined data

Y

W_EHA_RSLT_SPECIMEN

Result

Stores the Specimen details and links datasource table. All results would have specimen_wid in their tables.

Sequence loaders: CGI, MAF, VCF, gene expression loader

Y

W_EHA_CHROMOSOME

Result

Stores all possible chromosome numbers for which result sequencing data is imported. CHROMOSOME_WID is used to link to result sequencing tables.

Pre-seeded

Y

W_EHA_RSLT_SEQUENCING

REFERENCE_READ_COUNT

NUMBER

Number of reads that support the reference sequence

N

W_EHA_RSLT_SEQUENCING_X

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_SEQUENCING_X

ALLELE

VARCHAR2(1000)

Sequence of allele if less than or equal to 500 char

N

W_EHA_RSLT_SEQUENCING_X

ALLELE_CLOB

CLOB

Sequence of allele if greater than 500 char

N

W_EHA_RSLT_SEQUENCING_X

HAPLINK

VARCHAR2(100)

Integer ID that links the allele to alleles of other loci that are known to be on same haplotype

N

W_EHA_RSLT_SEQUENCING_X

FILTER

VARCHAR2(200)

Indicates PASS if position has passed all specified filters, otherwise lists FAIL, filters and results

N

W_EHA_RSLT_SEQUENCING_X

GENOTYPE_FILTER

VARCHAR2(200)

Indicates PASS if position has passed all genotype filters, otherwise lists FAIL, filters and results

N

W_EHA_RSLT_COPY_NBR_VAR

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_COPY_NBR_VAR

RESULT_SPEC_WID

NUMBER(38)

Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table

Y

W_EHA_RSLT_COPY_NBR_VAR

RESULT_TYPE_WID

NUMBER(38)

Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table

Y

W_EHA_RSLT_COPY_NBR_VAR

RESULT_FILE_WID

NUMBER(38)

Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table

Y

W_EHA_RSLT_COPY_NBR_VAR

CHROMOSOME_WID

NUMBER(38)

Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table

Y

W_EHA_RSLT_COPY_NBR_VAR

START_POSITION

NUMBER

Start position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_COPY_NBR_VAR

END_POSITION

NUMBER

End position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_COPY_NBR_VAR

AVG_NORMALIZED_CVG

NUMBER

Baseline-normalized average coverage over the interval from START to END position

N

W_EHA_RSLT_COPY_NBR_VAR

GC_CORRECTED_CVG

NUMBER

GC-corrected average coverage of a window width based on START and END positions

N

W_EHA_RSLT_COPY_NBR_VAR

FRACTION_UNIQUE

NUMBER

Fraction of coverage due to unique mappings

N

W_EHA_RSLT_COPY_NBR_VAR

RELATIVE_CVG

NUMBER

AVG_NORMALIZED_CVG divided by estimate of diploid median average adjusted coverage, can be null if source is N

N

W_EHA_RSLT_COPY_NBR_VAR

CALLED_PLOIDY

NUMBER

Called ploidy for the segment, usually an integer from 0 to maximum ploidy value, if N in source file, leave blank

N

W_EHA_RSLT_COPY_NBR_VAR

CALLED_CNV_TYPE

VARCHAR2(100)

Classification of CALLED_PLOIDY into one of six categories

N

W_EHA_RSLT_COPY_NBR_VAR

PLOIDY_SCORE

NUMBER

Phred-like confidence that the segment has the CALLED_PLOIDY correct

N

W_EHA_RSLT_COPY_NBR_VAR

CNV_TYPE_SCORE

NUMBER

Phred-like confidence that CALLED_CNV_TYPE is correct

N

W_EHA_RSLT_CNV_X

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_CNV_X

REPEATS

VARCHAR2(2000)

Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters less than or equal to 2000.

N

W_EHA_RSLT_CNV_X

REPEATS_CLOB

CLOB

Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters greater than 2000.

N

W_EHA_RSLT_NOCALL

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_NOCALL

RESULT_SPEC_WID

NUMBER(38)

Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table

Y

W_EHA_RSLT_NOCALL

RESULT_TYPE_WID

NUMBER(38)

Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table

Y

W_EHA_RSLT_NOCALL

RESULT_FILE_WID

NUMBER(38)

Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table

Y

W_EHA_RSLT_NOCALL

CHROMOSOME_WID

NUMBER(38)

Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table

Y

W_EHA_RSLT_NOCALL

START_POSITION

NUMBER

Start position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_NOCALL

END_POSITION

NUMBER

End position of DNA sequence, with the 1st base in chromosome having position 1.

N

W_EHA_RSLT_NOCALL

NOCALL_TYPE

VARCHAR2(100)

Type of no-call

N

W_EHA_RSLT_NOCALL_X

RESULT_STUDY_WID

NUMBER(38)

Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY

Y

W_EHA_RSLT_NOCALL_X

ALLELE

varchar2(500)

Sequence of allele if less than or equal to 500 char

N

W_EHA_RSLT_NOCALL_X

ALLELE_CLOB

CLOB

Sequence of allele if greater than 500 char

N

W_EHA_RSLT_NOCALL_X

HAPLINK

VARCHAR2(100)

Integer ID that links the allele to alleles of other loci that are known to be on same haplotype

N

W_EHA_PROBE_ALT_LINK

HUGO_NAME

VARCHAR2(200)

Hugo name for a gene corresponding to probe

Y

W_EHA_PROBE_ALT_LINK

PROBE_WID

NUMBER(38)

Foreign key to W_EHA_PROBE

Y

W_EHA_RSLT_STUDY

RESULT_STUDY_NAME

VARCHAR2(200)

Name of the study under which results will be loaded

Y

W_EHA_RSLT_STUDY

RESULT_STUDY_DESC

VARCHAR2(2000)

Description of the study under which results will be loaded

N

W_EHA_RSLT_SPECIMEN

SPECIMEN_WID

NUMBER(38)

Foreign key to Specimen table in CDM1.1

Y

W_EHA_RSLT_SPECIMEN

SPEC_DATASRC_WID

NUMBER(38)

Identifier for specimen datasource

Y

W_EHA_RSLT_SPECIMEN

SPECIMEN_NUMBER

VARCHAR2(80)

Specimen identifier in linked specimen datasource database

Y

W_EHA_RSLT_SPECIMEN

SPECIMEN_VENDOR_NUMBER

VARCHAR2(80)

Vendor identifier in linked specimen datasource database

Y

W_EHA_CHROMOSOME

CHROMOSOME

VARCHAR2(50)

Chromosome number or identifier

Y