Oracle® Health Sciences Omics Data Bank Programmer's Guide Release 1.0.1 Part Number E27509-02 |
|
|
View PDF |
This chapter describes all entities in all the tables of Oracle Health Sciences Omics Data Bank. It contains the following topics:
System fields common to all tables are described in the following table:
Table | Column | Data Type | Description | Required (Not null) |
---|---|---|---|---|
Any table |
ROW_WID |
NUMBER |
System. Field will be set by sequence (not mapped) |
Y |
Any table |
W_INSERT_DT |
DATE |
System. This column stores the date on which the record was inserted in the data warehouse table. |
N |
Any table |
W_UPDATE_DT |
DATE |
System. This column stores the date on which the record was last updated in the data warehouse table. |
N |
Any table |
ETL_PROC_WID |
NUMBER |
System. This column is the unique identifier for the specific ETL process used to create or update this data. |
Y |
Any table |
ENTERPRISE_ID |
NUMBER |
System. Unique identifier for an enterprise in a multi-enterprise environment. |
N |
The following table describes entities in the reference data tables:
Table | Column | Data Type | Description | Required (Not null) |
---|---|---|---|---|
W_EHA_HUGO_INFO |
APPROVED_SYMBOL |
VARCHAR2(200) |
Approved Symbol for a gene in HUGO (Human Genome Organization). Also foreign key to W_EHA_GENE as HUGO_NAME. |
Y |
W_EHA_HUGO_INFO |
HGNC_ID |
VARCHAR2(100) |
HUGO Gene Nomenclature Committee Identifier |
N |
W_EHA_HUGO_INFO |
APPROVED_NAME |
VARCHAR2(200) |
Approved Name |
N |
W_EHA_HUGO_INFO |
STATUS |
VARCHAR2(50) |
Status |
N |
W_EHA_HUGO_INFO |
LOCUS_TYPE |
VARCHAR2(50) |
Locus Type e.g. RNA, complex locus constituent, fragile site |
N |
W_EHA_HUGO_INFO |
LOCUS_GROUP |
VARCHAR2(50) |
Grouping locus types into supertypes, e.g protein-coding gene, non-coding RNA etc |
N |
W_EHA_HUGO_INFO |
PREVIOUS_SYMBOLS |
VARCHAR2(200) |
Previous Symbols |
N |
W_EHA_HUGO_INFO |
PREVIOUS_NAMES |
VARCHAR2(1000) |
Previous Names |
N |
W_EHA_HUGO_INFO |
SYNONYMS |
VARCHAR2(200) |
Synonyms |
N |
W_EHA_HUGO_INFO |
NAME_SYNONYMS |
VARCHAR2(500) |
Name Synonyms |
N |
W_EHA_HUGO_INFO |
CHROMOSOME |
VARCHAR2(20) |
Chromosome |
N |
W_EHA_HUGO_INFO |
APPROVED_DT |
DATE |
Date Approved |
N |
W_EHA_HUGO_INFO |
MODIFIED_DT |
DATE |
Date Modified |
N |
W_EHA_HUGO_INFO |
SYMBOL_CHANGED_DT |
DATE |
Date Symbol Changed |
N |
W_EHA_HUGO_INFO |
NAME_CHANGED_DT |
DATE |
Date Name Changed |
N |
W_EHA_HUGO_INFO |
ACCESSION_NUMBERS |
VARCHAR2(200) |
Accession Numbers |
N |
W_EHA_HUGO_INFO |
ENZYME_IDS |
VARCHAR2(200) |
Enzyme IDs |
N |
W_EHA_HUGO_INFO |
ENTREZ_GENE_ID |
VARCHAR2(100) |
Entrez database Gene ID |
N |
W_EHA_HUGO_INFO |
ENSEMBL_GENE_ID |
VARCHAR2(100) |
Ensembl database Gene ID |
N |
W_EHA_HUGO_INFO |
MGI_ID |
VARCHAR2(100) |
Mouse Genome Database Gene ID |
N |
W_EHA_HUGO_INFO |
SPECIALIST_DB_LINKS |
VARCHAR2(1000) |
Specialist Database Links |
N |
W_EHA_HUGO_INFO |
SPECIALIST_DB_IDS |
VARCHAR2(500) |
Specialist Database IDs |
N |
W_EHA_HUGO_INFO |
PUBMED_IDS |
VARCHAR2(200) |
Pubmed database IDs |
N |
W_EHA_HUGO_INFO |
REFSEQ_IDS |
VARCHAR2(200) |
RefSeq database IDs |
N |
W_EHA_HUGO_INFO |
GENE_FAMILY_TAG |
VARCHAR2(100) |
Gene Family Tag |
N |
W_EHA_HUGO_INFO |
GENE_FAMILY_DESC |
VARCHAR2(200) |
Gene family description |
N |
W_EHA_HUGO_INFO |
RECORD_TYPE |
VARCHAR2(50) |
Record Type |
N |
W_EHA_HUGO_INFO |
PRIMARY_IDS |
VARCHAR2(200) |
Primary Indentifiers separated by comma |
N |
W_EHA_HUGO_INFO |
SECONDARY_IDS |
VARCHAR2(200) |
Secondary Identifiers separated by comma |
N |
W_EHA_HUGO_INFO |
CCDS_IDS |
VARCHAR2(400) |
Consensus Coding Sequence (CCDS) Identifers |
N |
W_EHA_HUGO_INFO |
VEGA_IDS |
VARCHAR2(200) |
Vertebrae Genome Annotation database IDs |
N |
W_EHA_HUGO_INFO |
LOCUS_SPECIFIC_DBS |
VARCHAR2(1000) |
Locus Specific Databases |
N |
W_EHA_HUGO_INFO |
MAPPED_GDB_ID |
VARCHAR2(100) |
Genome Database ID (mapped data) |
N |
W_EHA_HUGO_INFO |
MAPPED_ENTREZ_GENE_ID |
VARCHAR2(100) |
Entrez Gene ID (mapped data supplied by NCBI) |
N |
W_EHA_HUGO_INFO |
MAPPED_OMIM_ID |
VARCHAR2(100) |
Online Mendelian Inheritance in Man ID (mapped data supplied by NCBI) |
N |
W_EHA_HUGO_INFO |
MAPPED_REFSEQ_ID |
VARCHAR2(100) |
National Center for Biotechnology Information (NCBI) Reference Sequence database (mapped data supplied by NCBI) |
N |
W_EHA_HUGO_INFO |
MAPPED_UNIPROT_ID |
VARCHAR2(100) |
Universal Protein Resource database ID (mapped data supplied by UniProt) |
N |
W_EHA_HUGO_INFO |
MAPPED_ENSEMBL_ID |
VARCHAR2(100) |
Ensembl genome database ID (mapped data supplied by Ensembl) |
N |
W_EHA_HUGO_INFO |
MAPPED_UCSC_ID |
VARCHAR2(100) |
University of California, Santa Cruz database ID (mapped data supplied by UCSC) |
N |
W_EHA_HUGO_INFO |
MAPPED_MGI_ID |
VARCHAR2(100) |
Mouse Genome Database ID (mapped data supplied by MGI) |
N |
W_EHA_HUGO_INFO |
MAPPED_RGD_ID |
VARCHAR2(100) |
Rat Genome Database ID (mapped data supplied by RGD) |
N |
W_EHA_DNA_SOURCE |
SPECIES_WID |
NUMBER |
Foreign Key to W_EHA_SPECIES table |
Y |
W_EHA_DNA_SOURCE |
CHROMOSOME |
VARCHAR2(50) |
Chromosome identifier e.g. X; parsed out of each line in EMBL source file; 3rd section after 2nd colon. |
N |
W_EHA_DNA_SOURCE |
ACCESSION |
VARCHAR2(200) |
Accession identifier for sequence, parsed out of each line in EMBL source file; the whole AC line contents, except for the 'chromosome:' prefix. |
N |
W_EHA_DNA_SOURCE |
START_POSITION |
NUMBER |
Start location of sequence, parsed out of each line in EMBL source file; 4th section after 3rd colon. |
N |
W_EHA_DNA_SOURCE |
END_POSITION |
NUMBER |
End position of sequence, parsed out of each line in EMBL source file; 5th section after 4th colon. |
N |
W_EHA_DNA_SOURCE |
RELEASED_DT |
DATE |
Released Date which is text that needs to be converted to DATE column. |
N |
W_EHA_DNA_SOURCE |
VERSION |
VARCHAR2(30) |
Version of sequence, parsed out of each line in EMBL source file; 2nd section of AC line after 1st colon. |
N |
W_EHA_DNA_SOURCE |
DNA_SEQUENCE |
CLOB |
DNA sequence corresponding to all bases in the Start-End position (should match difference in length) This should end up with proper DNA base pairs as well as other letters for special meaning. |
N |
W_EHA_DNA_SOURCE |
DNA_SOURCE_COMMENT |
CLOB |
Comment lines concatenated |
N |
W_EHA_PROT_INFO |
PROTEIN_WID |
NUMBER |
Foreign key W_EHA_PROTEIN |
Y |
W_EHA_PROT_INFO |
PROTEIN_NAME |
VARCHAR2(200) |
Name of Protein, set from each line using the keyword portion e.g. INS_HUMAN. |
N |
W_EHA_PROT_INFO |
ACCESSION |
VARCHAR2(200) |
Accession Identifier(s), set from the AC line. For this example the field would be set to P01308; Q5EEX2; |
N |
W_EHA_PROT_INFO |
RELEASED_DT |
DATE |
Released date, set from the DT line of the SwissProt file, while often there are multiple dates for different revisions, only the last date is used for update. |
N |
W_EHA_PROT_INFO |
PROT_INFO_COMMENT |
CLOB |
Comments applicable to individual protein, as specified in the SwissProt reference file |
N |
W_EHA_PROTEIN |
SPECIES_WID |
NUMBER |
Foreign key to W_EHA_SPECIES. Note that there is a SPECIES FK in this table and also in the DNA_SOURCE table. This is because the user can load SwissProt information separately from EMBL information. The SwissProt data would have no link to the DNA_SOURCE directly and the FK to SPECIES is needed for PROTEIN records as well. |
Y |
W_EHA_PROTEIN |
SEQUENCE_CHECKSUM |
VARCHAR2(32) |
Checksum derieved from AMINO ACID SEQUENCE in order to quickly match up protein in SwissProt with Ensembl record |
N |
W_EHA_PROTEIN |
SEQUENCE_LENGTH |
NUMBER |
Length of AMINO ACID SEQUENCE |
N |
W_EHA_PROTEIN |
AMINO_ACID_SEQUENCE |
CLOB |
Sequence of Amino Acids; This field is set from the SQ section of the SwissProt file or the “/translation” qualifier for coding segment (CDS) gene components. |
N |
W_EHA_GENE |
GENE_NAME |
VARCHAR2(200) |
Ensembl recognized gene identifier |
Y |
W_EHA_GENE |
HUGO_NAME |
VARCHAR2(200) |
Recognized HUGO gene name which can be used to look up proper chromosome information in the HUGO_INFO table. |
N |
W_EHA_GENE_COMP_SEGMENT |
GENE_COMPONENT_WID |
NUMBER |
Foreign key to the parent gene component that has this segment definition. |
Y |
W_EHA_GENE_COMP_SEGMENT |
SOURCE_WID |
NUMBER |
Foreign key to the DNA source buffer that this segment refers to. Note that external source references will create DNA_SOURCE placeholder records that do not have the real sequence data loaded. |
Y |
W_EHA_GENE_COMP_SEGMENT |
NUMBER_IN_SEQUENCE |
NUMBER |
Number to keep track of the order of each segment. |
N |
W_EHA_GENE_COMP_SEGMENT |
START_POSITION |
NUMBER |
Stores the starting position of the segment. |
N |
W_EHA_GENE_COMP_SEGMENT |
END_POSITION |
NUMBER |
Stores the end position of the segment. |
N |
W_EHA_GENE_COMP_SEGMENT |
COMPLEMENT |
NUMBER(1) |
This column Is used as a flag to indicate reverse transcription. Either 1 or 0. |
N |
W_EHA_GENE_COMPONENT |
STRUCTURE_WID |
NUMBER |
This is the foreign key to the parent STRUCTURE record which is created for each gene component that has the same TRANSCRIPT_ID. |
Y |
W_EHA_GENE_COMPONENT |
COMPONENT_TYPE |
VARCHAR2(100) |
This field stores the keyword described for the gene component. Possible values are CDS (coding sequence), exon, intron, etc. |
Y |
W_EHA_GENE_COMPONENT |
PROTEIN_WID |
NUMBER |
Foreign key to the protein record. Most gene components are grouped together that are involved with some protein product. For example, it can link to a PROTEN record created for the Ensembl reference. |
N |
W_EHA_GENE_SEGMENT |
GENE_WID |
NUMBER |
Foreign key to the parent GENE record. For example, FK can link to a GENE record with a name of “ENSG00000237037”. |
Y |
W_EHA_GENE_SEGMENT |
SOURCE_WID |
NUMBER |
Foreign key to the W_EHA_DNA_SOURCE for each segment. |
Y |
W_EHA_GENE_SEGMENT |
NUMBER_IN_SEQUENCE |
NUMBER |
This field is used to number each gene segment as reported to construct the correct sequence of DNA bases for the gene definition. |
N |
W_EHA_GENE_SEGMENT |
START_POSITION |
NUMBER |
This field is the starting offset into the parent DNA_SOURCE buffer. |
N |
W_EHA_GENE_SEGMENT |
END_POSITION |
NUMBER |
This field is the ending offset into the parent DNA_SOURCE buffer. |
N |
W_EHA_GENE_SEGMENT |
COMPLEMENT |
NUMBER(1) |
This field is used to indicate reverse transcription. For the example above this field will be set to 0. |
N |
W_EHA_GENE_STRUCTURE |
GENE_WID |
NUMBER |
Foreign key to the parent GENE record. |
Y |
W_EHA_GENE_STRUCTURE |
TRANSCRIPT_ID |
VARCHAR2(1000) |
This field is used to store the value for TRANSCRIPT_ID used as the anchor for all of the gene components. |
N |
W_EHA_PROT_COMPONENT |
PROTEIN_INFO_WID |
NUMBER |
Foreign key to the parent PROTEIN INFO record. |
Y |
W_EHA_PROT_COMPONENT |
COMPONENT_TYPE |
VARCHAR2(100) |
This field is used to store the keyword used for each component of the protein molecule. Component types are detemined by what types exist in SwissProt file. Example types include: SITE, CHAIN, MOD_RES |
Y |
W_EHA_PROT_COMPONENT |
START_POSITION |
VARCHAR2(100) |
This field is used to indicate the start amino acid position in the protein molecule which identifies this component. |
N |
W_EHA_PROT_COMPONENT |
END_POSITION |
VARCHAR2(100) |
This field is used to indicate the end amino acid position in the protein molecule which identifies this component. |
N |
W_EHA_PROT_COMPONENT |
COMPONENT_DESC |
VARCHAR2(2000) |
This field is used to capture any remaining data on the line about the feature. There is often a description text which would simply be stored in this field. |
N |
W_EHA_SPECIES |
SPECIES_NAME |
VARCHAR2(200) |
This field is derived from the OS line in the file. This will not include common names in parenthesis. |
N |
W_EHA_SPECIES |
COMMON_NAME |
VARCHAR2(200) |
This field has the common names of an organism that would be parsed in the parenthesis from the OS line. For example, human. |
N |
W_EHA_SPECIES |
PROMOTER_OFFSET |
NUMBER |
This field will be set on load of the database to allow for a default value to be used to compute promoter regions for each gene of this organism. Note that not every gene has documented promoter features in Ensembl. So each lab usually has some offset to compute an acceptable region before the start of the first coding segment to be used as a promoter region. Usually there is a different offset for Eukaryote and Prokaryote organisms. |
N |
W_EHA_VARIANT |
SOURCE_WID |
NUMBER |
Foreign key to the W_EHA_DNA_SOURCE buffer used for reporting this information. |
Y |
W_EHA_VARIANT |
START_POSITION |
NUMBER |
This field will store the starting position of which DNA base is referenced in the variant. |
Y |
W_EHA_VARIANT |
END_POSITION |
NUMBER |
This field will store the ending position of which DNA base is referenced in the variant. |
Y |
W_EHA_VARIANT |
STATUS |
VARCHAR2(100) |
Status on variant, if downloaded from Ensembl then should be set to KNOWN, if uploaded from result files b/c found that it does not exist in Ensembl then it would be set to NOVEL |
N |
W_EHA_VARIANT |
REPLACE_TAG |
VARCHAR2(1000) |
The sequence of the reference. Used with W_EHA_ALLELE_SEQ to indicate the changes in base pairs, deletions, or insertions. If insertion, it is set to '-'. |
N |
W_EHA_VARIANT |
STRAND |
VARCHAR2(1) |
Strand indicator, '+' for forward strand and '-' for reverse-complement strand |
N |
W_EHA_VARIANT |
CHROMOSOME |
VARCHAR2(50) |
Chromosome identifier e.g. X; parsed out of each line in gvf file for Known variants, for unknown, it is deduced based on position |
N |
W_EHA_VARIANT |
ABSOLUTE_POSITION |
NUMBER |
Absolute position on the chromosome, obtained by adding the DNA buffer source position on the chromosome and START_POSITION |
N |
W_EHA_PATHWAY |
PATHWAY_SOURCE_ID |
VARCHAR2(50) |
This names the source of pathway as specified in pathwaycommons file. Example sources are: REACTOME, NCI_NATURE, CELLMAP |
Y |
W_EHA_PATHWAY |
PATHWAY_NAME |
VARCHAR2(200) |
As specified in pathwaycommons.com file, 1st column, example: Nef Mediated CD4 Down-regulation |
N |
W_EHA_PATHWAY_PROTEIN |
HUGO_SYMBOL |
VARCHAR2(200) |
Foreign key into W_EHA_HUGO_INFO table used to look up gene information. |
Y |
W_EHA_PATHWAY_PROTEIN |
PATHWAY_WID |
NUMBER |
Foreign key into W_EHA_PATHWAY table |
Y |
W_EHA_SOURCE_LIT_REF |
SOURCE_WID |
NUMBER |
Foreign key to parent record. Note that the name is different for each literature reference table. |
Y |
W_EHA_SOURCE_LIT_REF |
REFERENCE_NUMBER |
NUMBER |
Number of the reference source from Ensembl file. |
Y |
W_EHA_SOURCE_LIT_REF |
REFERENCE_POSITION |
VARCHAR2(2000) |
RP line (Reference Position) |
N |
W_EHA_SOURCE_LIT_REF |
REFERENCE_TITLE |
VARCHAR2(2000) |
RT line (Reference Title) |
N |
W_EHA_SOURCE_LIT_REF |
REFERENCE_AUTHOR |
VARCHAR2(2000) |
RA line (Reference Author) |
N |
W_EHA_SOURCE_LIT_REF |
CROSS_REF |
VARCHAR2(2000) |
RX line (Reference Cross Reference) |
N |
W_EHA_SOURCE_LIT_REF |
REFERENCE_GROUP |
VARCHAR2(2000) |
RG line (Reference Group) |
N |
W_EHA_SOURCE_LIT_REF |
REFERENCE_LOCATION |
VARCHAR2(2000) |
RL line (Reference Location). |
N |
W_EHA_PROT_LIT_REF |
PROTEIN_INFO_WID |
NUMBER |
Foreign key to parent record. Note that the name is different for each literature reference table. |
Y |
W_EHA_PROT_LIT_REF |
REFERENCE_NUMBER |
NUMBER |
Number of the reference source from Ensembl file. |
Y |
W_EHA_PROT_LIT_REF |
REFERENCE_POSITION |
VARCHAR2(2000) |
RP line (Reference Position) |
N |
W_EHA_PROT_LIT_REF |
REFERENCE_TITLE |
VARCHAR2(2000) |
RT line (Reference Title) |
N |
W_EHA_PROT_LIT_REF |
REFERENCE_AUTHOR |
VARCHAR2(2000) |
RA line (Reference Author) |
N |
W_EHA_PROT_LIT_REF |
CROSS_REF |
VARCHAR2(2000) |
RX line (Reference Cross Reference) |
N |
W_EHA_PROT_LIT_REF |
REFERENCE_GROUP |
VARCHAR2(2000) |
RG line (Reference Group) |
N |
W_EHA_PROT_LIT_REF |
REFERENCE_LOCATION |
VARCHAR2(2000) |
RL line (Reference Location). |
N |
W_EHA_VARIANT_QLFR |
VARIANT_WID |
NUMBER |
Foreign key to parent record. Note that the name is different for each qualifier table. |
Y |
W_EHA_VARIANT_QLFR |
QUALIFIER_TAG |
VARCHAR2(100) |
Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note” |
Y |
W_EHA_VARIANT_QLFR |
QUALIFIER_VALUE |
VARCHAR2(4000) |
Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value. |
N |
W_EHA_SOURCE_QLFR |
SOURCE_WID |
NUMBER |
Foreign key to parent record. Note that the name is different for each qualifier table. |
Y |
W_EHA_SOURCE_QLFR |
QUALIFIER_TAG |
VARCHAR2(100) |
Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note” |
Y |
W_EHA_SOURCE_QLFR |
QUALIFIER_VALUE |
VARCHAR2(4000) |
Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value. |
N |
W_EHA_GENE_QLFR |
GENE_WID |
NUMBER |
Foreign key to parent record. Note that the name is different for each qualifier table. |
Y |
W_EHA_GENE_QLFR |
QUALIFIER_TAG |
VARCHAR2(100) |
Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note” |
Y |
W_EHA_GENE_QLFR |
QUALIFIER_VALUE |
VARCHAR2(4000) |
Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value. |
N |
W_EHA_GENE_COMP_QLFR |
GENE_COMPONENT_WID |
NUMBER |
Foreign key to parent record. Note that the name is different for each qualifier table. |
Y |
W_EHA_GENE_COMP_QLFR |
QUALIFIER_TAG |
VARCHAR2(100) |
Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note” |
Y |
W_EHA_GENE_COMP_QLFR |
QUALIFIER_VALUE |
VARCHAR2(4000) |
Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value. |
N |
W_EHA_PROT_COMP_QLFR |
PROT_COMPONENT_WID |
NUMBER |
Foreign key to parent record. Note that the name is different for each qualifier table. |
Y |
W_EHA_PROT_COMP_QLFR |
QUALIFIER_TAG |
VARCHAR2(100) |
Text after the “/” tag in Ensembl file and up until the equal sign. For example, “/note” |
Y |
W_EHA_PROT_COMP_QLFR |
QUALIFIER_VALUE |
VARCHAR2(4000) |
Any text after the equal sign in Ensembl reference file removing the double quotes. Note that some qualifiers can extend more than one line. The double quotes will indicate the end of the qualifier value. |
N |
W_EHA_VARIANT_XREF |
VARIANT_WID |
NUMBER |
Foreign key to parent record. Note that this is different for each cross reference table. |
Y |
W_EHA_VARIANT_XREF |
REFERENCE_ID |
VARCHAR2(200) |
This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file. |
Y |
W_EHA_VARIANT_XREF |
DATABASE |
VARCHAR2(100) |
This data is the first part of the /DB_XREF and the DR line. For example, EMBL. |
Y |
W_EHA_VARIANT_XREF |
REFERENCE_SUFFIX |
VARCHAR2(200) |
Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database. |
Y |
W_EHA_SOURCE_XREF |
SOURCE_WID |
NUMBER |
Foreign key to parent record. Note that this is different for each cross reference table. |
Y |
W_EHA_SOURCE_XREF |
REFERENCE_ID |
VARCHAR2(200) |
This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file. |
Y |
W_EHA_SOURCE_XREF |
DATABASE |
VARCHAR2(100) |
This data is the first part of the /DB_XREF and the DR line. For example, EMBL. |
Y |
W_EHA_SOURCE_XREF |
REFERENCE_SUFFIX |
VARCHAR2(200) |
Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database. |
Y |
W_EHA_PROT_COMP_XREF |
PROT_COMPONENT_WID |
NUMBER |
Foreign key to parent record. Note that this is different for each cross reference table. |
Y |
W_EHA_PROT_COMP_XREF |
REFERENCE_ID |
VARCHAR2(200) |
This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file. |
Y |
W_EHA_PROT_COMP_XREF |
DATABASE |
VARCHAR2(100) |
This data is the first part of the /DB_XREF and the DR line. For example, EMBL. |
Y |
W_EHA_PROT_COMP_XREF |
REFERENCE_SUFFIX |
VARCHAR2(200) |
Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database. |
Y |
W_EHA_GENE_XREF |
GENE_WID |
NUMBER |
Foreign key to parent record. Note that this is different for each cross reference table. |
Y |
W_EHA_GENE_XREF |
REFERENCE_ID |
VARCHAR2(200) |
This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file. |
Y |
W_EHA_GENE_XREF |
DATABASE |
VARCHAR2(100) |
This data is the first part of the /DB_XREF and the DR line. For example, EMBL. |
Y |
W_EHA_GENE_XREF |
REFERENCE_SUFFIX |
VARCHAR2(200) |
Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database. |
Y |
W_EHA_GENE_COMPONENT_XREF |
GENE_COMPONENT_WID |
NUMBER |
Foreign key to parent record. Note that this is different for each cross reference table. |
Y |
W_EHA_GENE_COMPONENT_XREF |
REFERENCE_ID |
VARCHAR2(200) |
This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file. |
Y |
W_EHA_GENE_COMPONENT_XREF |
DATABASE |
VARCHAR2(100) |
This data is the first part of the /DB_XREF and the DR line. For example, EMBL. |
Y |
W_EHA_GENE_COMPONENT_XREF |
REFERENCE_SUFFIX |
VARCHAR2(200) |
Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database. |
Y |
W_EHA_PROT_XREF |
PROTEIN_WID |
NUMBER |
Foreign key to parent record. Note that this is different for each cross reference table. |
Y |
W_EHA_PROT_XREF |
REFERENCE_ID |
VARCHAR2(200) |
This data is the 2nd part of the DB_XREF qualifier (after the first colon) and also the 2nd part of the DR line in Ensembl file. |
Y |
W_EHA_PROT_XREF |
DATABASE |
VARCHAR2(100) |
This data is the first part of the /DB_XREF and the DR line. For example, EMBL. |
Y |
W_EHA_PROT_XREF |
REFERENCE_SUFFIX |
VARCHAR2(200) |
Some references will have a suffix portion after another colon. This data is stored here as it may be involved in filter criteria on the corresponding database. |
Y |
The following table describes entities from the result data tables:
Table 5-3 Result Entity Descriptions
Table | Column | Data Type | Description | Required (Not Null) |
---|---|---|---|---|
W_EHA_PROBE |
PRIMARY_HUGO_NAME |
VARCHAR2(200) |
Hugo name for a gene corresponding to probe |
N |
W_EHA_PROBE |
PROBE_NAME |
VARCHAR2(200) |
Name of gene probe, single probe should match to one gene only |
N |
W_EHA_PROBE |
START_POSITION |
NUMBER |
Start position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_PROBE |
SEQUENCE |
VARCHAR2(200) |
Sequence of bases for the particular probe |
N |
W_EHA_PROBE |
ACCESSION |
VARCHAR2(200) |
List of accession Ids associated with the probe |
N |
W_EHA_PROBE |
PROBE_DESC |
VARCHAR2(2000) |
Description of the gene for probe |
N |
W_EHA_PROBE_XREF |
PROBE_WID |
NUMBER(38) |
Foreign key to W_EHA_PROBE |
Y |
W_EHA_PROBE_XREF |
REFERENCE_ID |
VARCHAR2(200) |
Gene ID associated to probe |
Y |
W_EHA_PROBE_XREF |
DATABASE |
VARCHAR2(100) |
Database for Gene ID |
Y |
W_EHA_PROBE_XREF |
REFERENCE_SUFFIX |
VARCHAR2(200) |
Suffix applied to Gene IDs in a particular database, For example, _HUMAN |
N |
W_EHA_RSLT_GENE_EXP |
SPECIMEN_WID |
NUMBER(38) |
Foreign key to Specimen table in CDM1.1 |
Y |
W_EHA_RSLT_GENE_EXP |
SPEC_DATASRC_WID |
NUMBER(38) |
Foreign key to W_EHA_DATASOURCE table |
Y |
W_EHA_RSLT_GENE_EXP |
PROBE_WID |
NUMBER(38) |
Foreign key to W_EHA_PROBE |
Y |
W_EHA_RSLT_GENE_EXP |
HYBRIDIZATION_NAME |
VARCHAR2(200) |
Hybridization experiment identifier in gene expression file |
N |
W_EHA_RSLT_GENE_EXP |
INTENSITY |
NUMBER |
Intensity value associated with particular probe at given hybridization |
N |
W_EHA_RSLT_GENE_EXP |
P_VALUE |
NUMBER |
P-value associated with particular probe at a given hybridization |
N |
W_EHA_RSLT_GENE_EXP |
CALL |
VARCHAR2(200) |
Call value associated with particular probe at a given hybridization, P for present, A for absent |
N |
W_EHA_DATASOURCE |
DATASOURCE_CD |
VARCHAR2(80) |
Data source for Specimen |
Y |
W_EHA_DATASOURCE |
DATASOURCE_NM |
VARCHAR2(80) |
Name of datasource for Specimen |
N |
W_EHA_DATASOURCE |
DATASOURCE_DESC |
VARCHAR2(2000) |
Description of datasource for specimen |
N |
W_EHA_DATASOURCE |
SCHEMA_NAME |
VARCHAR2(30) |
Schema name for datasource for specimen |
N |
W_EHA_DATASOURCE |
DB_LINK_NAME |
VARCHAR2(30) |
Link to schema for specimen datasource |
N |
W_EHA_RSLT_TYPE |
RESULT_TYPE_NAME |
VARCHAR2(200) |
Type of result For example, sequencing, no-call, gene expression, cnv |
Y |
W_EHA_RSLT_TYPE |
RESULT_TYPE_DESC |
VARCHAR2(2000) |
Result type description |
N |
W_EHA_RSLT_FILE |
FILE_TYPE_WID |
NUMBER(38) |
Foreign key to W_EHA_FILE_TYPE |
Y |
W_EHA_RSLT_FILE |
FILE_STORAGE_FLG |
VARCHAR2(1) |
Type of file storage, S - Secure Files, E - External |
N |
W_EHA_RSLT_FILE |
FILE_PATH |
VARCHAR2(2000) |
Path to file including filename |
N |
W_EHA_RSLT_FILE |
VENDOR_NAME |
VARCHAR2(200) |
Name of vendor which supplied results in file, For example, Complete Genomics |
N |
W_EHA_RSLT_FILE |
FILE_CONTENT_ID |
RAW(128) |
File identifier used in Secure File system |
N |
W_EHA_RSLT_FILE_TYPE |
FILE_TYPE_CODE |
VARCHAR2(80) |
Code for type of file |
Y |
W_EHA_RSLT_FILE_TYPE |
FILE_TYPE_NAME |
VARCHAR2(200) |
Full name of type of file |
N |
W_EHA_RSLT_FILE_TYPE |
FILE_TYPE_DESC |
VARCHAR2(2000) |
Description of file |
N |
W_EHA_RSLT_FILE_TYPE |
FILE_TYPE_VERSION |
VARCHAR2(30) |
Version of file format |
Y |
W_EHA_RSLT_SEQUENCING |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_SEQUENCING |
RESULT_SPEC_WID |
NUMBER(38) |
Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table |
Y |
W_EHA_RSLT_SEQUENCING |
RESULT_TYPE_WID |
NUMBER(38) |
Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table |
Y |
W_EHA_RSLT_SEQUENCING |
RESULT_FILE_WID |
NUMBER(38) |
Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table |
Y |
W_EHA_RSLT_SEQUENCING |
VARIANT_WID |
NUMBER(38) |
Foreign key to W_EHA_VARIANT table |
Y |
W_EHA_RSLT_SEQUENCING |
CHROMOSOME_WID |
NUMBER(38) |
Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table |
Y |
W_EHA_RSLT_SEQUENCING |
START_POSITION |
NUMBER |
Start position of variant, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_SEQUENCING |
VARIANT_TYPE |
VARCHAR2(100) |
Type of variant including snp, insertion, or deletion |
N |
W_EHA_RSLT_SEQUENCING |
ZYGOSITY |
VARCHAR2(100) |
Similarity of alleles in an organism to reference. If both same - hom etc. |
N |
W_EHA_RSLT_SEQUENCING |
SCORE_VAF |
NUMBER |
Positive integer representing confidence in the call from CGI masterVar file |
N |
W_EHA_RSLT_SEQUENCING |
SCORE_EAF |
NUMBER |
Positive or negative integer representing confidence in the call from CGI masterVar file |
N |
W_EHA_RSLT_SEQUENCING |
ALLELE_READ_COUNT |
NUMBER |
Number of reads that support the given allele |
N |
W_EHA_RSLT_SEQUENCING |
REFERENCE_READ_COUNT |
NUMBER |
Number of reads that support the reference sequence |
N |
W_EHA_RSLT_SEQUENCING_X |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_SEQUENCING_X |
ALLELE |
VARCHAR2(1000) |
Sequence of allele if less than or equal to 500 char |
N |
W_EHA_RSLT_SEQUENCING_X |
ALLELE_CLOB |
CLOB |
Sequence of allele if greater than 500 char |
N |
W_EHA_RSLT_SEQUENCING_X |
HAPLINK |
VARCHAR2(100) |
Integer ID that links the allele to alleles of other loci that are known to be on same haplotype |
N |
W_EHA_RSLT_SEQUENCING_X |
FILTER |
VARCHAR2(200) |
Indicates PASS if position has passed all specified filters, otherwise lists FAIL, filters and results |
N |
W_EHA_RSLT_SEQUENCING_X |
GENOTYPE_FILTER |
VARCHAR2(200) |
Indicates PASS if position has passed all genotype filters, otherwise lists FAIL, filters and results |
N |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_SPEC_WID |
NUMBER(38) |
Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_TYPE_WID |
NUMBER(38) |
Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_FILE_WID |
NUMBER(38) |
Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
CHROMOSOME_WID |
NUMBER(38) |
Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
START_POSITION |
NUMBER |
Start position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_COPY_NBR_VAR |
END_POSITION |
NUMBER |
End position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_COPY_NBR_VAR |
AVG_NORMALIZED_CVG |
NUMBER |
Baseline-normalized average coverage over the interval from START to END position |
N |
W_EHA_RSLT_COPY_NBR_VAR |
GC_CORRECTED_CVG |
NUMBER |
GC-corrected average coverage of a window width based on START and END positions |
N |
W_EHA_RSLT_COPY_NBR_VAR |
FRACTION_UNIQUE |
NUMBER |
Fraction of coverage due to unique mappings |
N |
W_EHA_RSLT_COPY_NBR_VAR |
RELATIVE_CVG |
NUMBER |
AVG_NORMALIZED_CVG divided by estimate of diploid median average adjusted coverage, can be null if source is N |
N |
W_EHA_RSLT_COPY_NBR_VAR |
CALLED_PLOIDY |
NUMBER |
Called ploidy for the segment, usually an integer from 0 to maximum ploidy value, if N in source file, leave blank |
N |
W_EHA_RSLT_COPY_NBR_VAR |
CALLED_CNV_TYPE |
VARCHAR2(100) |
Classification of CALLED_PLOIDY into one of six categories |
N |
W_EHA_RSLT_COPY_NBR_VAR |
PLOIDY_SCORE |
NUMBER |
Phred-like confidence that the segment has the CALLED_PLOIDY correct |
N |
W_EHA_RSLT_COPY_NBR_VAR |
CNV_TYPE_SCORE |
NUMBER |
Phred-like confidence that CALLED_CNV_TYPE is correct |
N |
W_EHA_RSLT_CNV_X |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_CNV_X |
REPEATS |
VARCHAR2(2000) |
Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters less than or equal to 2000. |
N |
W_EHA_RSLT_CNV_X |
REPEATS_CLOB |
CLOB |
Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters greater than 2000. |
N |
W_EHA_RSLT_NOCALL |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_NOCALL |
RESULT_SPEC_WID |
NUMBER(38) |
Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table |
Y |
W_EHA_RSLT_NOCALL |
RESULT_TYPE_WID |
NUMBER(38) |
Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table |
Y |
W_EHA_RSLT_NOCALL |
RESULT_FILE_WID |
NUMBER(38) |
Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table |
Y |
W_EHA_RSLT_NOCALL |
CHROMOSOME_WID |
NUMBER(38) |
Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table |
Y |
W_EHA_RSLT_NOCALL |
START_POSITION |
NUMBER |
Start position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_NOCALL |
END_POSITION |
NUMBER |
End position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_NOCALL |
NOCALL_TYPE |
VARCHAR2(100) |
Type of no-call |
N |
W_EHA_RSLT_NOCALL_X |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_NOCALL_X |
ALLELE |
varchar2(500) |
Sequence of allele if less than or equal to 500 char |
N |
W_EHA_RSLT_NOCALL_X |
ALLELE_CLOB |
CLOB |
Sequence of allele if greater than 500 char |
N |
W_EHA_RSLT_NOCALL_X |
HAPLINK |
VARCHAR2(100) |
Integer ID that links the allele to alleles of other loci that are known to be on same haplotype |
N |
W_EHA_PROBE_ALT_LINK |
HUGO_NAME |
VARCHAR2(200) |
Hugo name for a gene corresponding to probe |
Y |
W_EHA_PROBE_ALT_LINK |
PROBE_WID |
NUMBER(38) |
Foreign key to W_EHA_PROBE |
Y |
W_EHA_RSLT_STUDY |
RESULT_STUDY_NAME |
VARCHAR2(200) |
Name of the study under which results will be loaded |
Y |
W_EHA_RSLT_STUDY |
RESULT_STUDY_DESC |
VARCHAR2(2000) |
Description of the study under which results will be loaded |
N |
W_EHA_RSLT_SPECIMEN |
SPECIMEN_WID |
NUMBER(38) |
Foreign key to Specimen table in CDM1.1 |
Y |
W_EHA_RSLT_SPECIMEN |
SPEC_DATASRC_WID |
NUMBER(38) |
Identifier for specimen datasource |
Y |
W_EHA_RSLT_SPECIMEN |
SPECIMEN_NUMBER |
VARCHAR2(80) |
Specimen identifier in linked specimen datasource database |
Y |
W_EHA_RSLT_SPECIMEN |
SPECIMEN_VENDOR_NUMBER |
VARCHAR2(80) |
Vendor identifier in linked specimen datasource database |
Y |
W_EHA_CHROMOSOME |
CHROMOSOME |
VARCHAR2(50) |
Chromosome number or identifier |
Y |
The following table contains descriptions of all tables in ODB:
Table | Type | Description | Loaders Populate | Included in the Release 1.0.1 |
---|---|---|---|---|
W_EHA_HUGO_INFO |
Reference |
Stores the Hugo Names for all genes with cross reference information. This table matches the data from the Human Gene Nomenclature Committee. |
Master populate script |
N |
W_EHA_DNA_SOURCE |
Reference |
Stores the DNA sequence source buffer where multiple genes and other features may reference. The identifciation and locus of the sequence source is given with the accession field. |
Reference loader in Java |
N |
W_EHA_PROT_INFO |
Reference |
Stores PROTEIN information for each named version of a protein molecule. |
Reference loader in Java |
N |
W_EHA_PROTEIN |
Reference |
Stores PROTEIN molecule information for amino acid sequences transcribed from genes. |
Reference loader in Java |
N |
W_EHA_GENE |
Reference |
Stores the parent gene records as loaded from the reference. The gene table will store the hugo name as well as the the name used in the reference data. Othere tables will have foreign keys to this table such as GENE_SEGMENT and GENE_STRUCTURE. |
Reference loader in Java |
N |
W_EHA_GENE_COMP_SEGMENT |
Reference |
Maps each GENE_COMPONENT to a specific region in the DNA_SOURCE buffer. Each GENE_COMPONENT can have many segments that are not contiguous. |
Reference loader in Java |
N |
W_EHA_GENE_COMPONENT |
Reference |
Stores each component of the gene (i.e. mRNA, coding segments, etc...). |
Reference loader in Java |
N |
W_EHA_GENE_SEGMENT |
Reference |
Maps each GENE to the DNA_SOURCE buffers used as reference. Note that a GENE definition may span multiple source references. |
Reference loader in Java |
N |
W_EHA_GENE_STRUCTURE |
Reference |
Stores each transcript of the GENE used to create different PROTEIN molecules. Each GENE_STRUCTURE is indicated with a QUALIFIER indicating the transcript_id. All of the GENE_COMPONENT records with the same transcript_id will be grouped with the same pareint GENE_STRUCTURE record. |
Reference loader in Java |
N |
W_EHA_PROT_COMPONENT |
Reference |
Stores the components that are imported from SwissProt files for all Protein definitions. |
Reference loader in Java |
N |
W_EHA_SPECIES |
Reference |
Stores SPECIES information for each genome stored in database. |
Reference loader in Java |
N |
W_EHA_VARIANT |
Reference |
Stores novel and known variants for the genomes loaded. The actual variation is stored in the REPLACE_TAG field. |
Reference loader in Java, Sequencing loaders: VCF, MAF, CGI |
N |
W_EHA_PATHWAY |
Reference |
Stores PATHWAY names for protein network. |
Reference loader in Java |
N |
W_EHA_PATHWAY_PROTEIN |
Reference |
Stores genes or proteins that belong to a given pathway based on import from GSEA files. |
Reference loader in Java |
N |
W_EHA_SOURCE_LIT_REF |
Reference |
Stores literature references for each DNA_SOURCE definition. |
Reference loader in Java |
N |
W_EHA_PROT_LIT_REF |
Reference |
Stores literature references for each PROTEIN definition. |
Reference loader in Java |
N |
W_EHA_VARIANT_QLFR |
Reference |
Stores qualifier attributes for each VARIANT |
Reference loader in Java |
N |
W_EHA_SOURCE_QLFR |
Reference |
Stores all of the qualifier attributes for each DNA_SOURCE definition. |
Reference loader in Java |
N |
W_EHA_GENE_QLFR |
Reference |
Stores all of the qualifier attributes for each gene. |
Reference loader in Java |
N |
W_EHA_GENE_COMP_QLFR |
Reference |
Stores all of the qualifier attributes for each component of the gene (i.e. mRNA, coding segments, etc...). |
Reference loader in Java |
N |
W_EHA_PROT_COMP_QLFR |
Reference |
Stores all of the qualifier attributes for each component of the protein. |
Reference loader in Java |
N |
W_EHA_VARIANT_XREF |
Reference |
Cross reference information for VARIANT records |
Reference loader in Java |
N |
W_EHA_SOURCE_XREF |
Reference |
Stores cross reference information for DNA_SOURCE buffers. |
Reference loader in Java |
N |
W_EHA_PROT_COMP_XREF |
Reference |
Stores the PROTEIN COMPONENT cross reference information linked to other databases. |
Reference loader in Java |
N |
W_EHA_GENE_XREF |
Reference |
Stores the GENE cross reference information linked to other databases. |
Reference loader in Java |
N |
W_EHA_GENE_COMPONENT_XREF |
Reference |
Stores GENE_COMPONENT cross reference links to other databases. |
Reference loader in Java |
N |
W_EHA_PROT_XREF |
Reference |
Stores PROTEIN cross reference information |
Reference loader in Java |
N |
W_EHA_PROBE |
Result/Reference |
Stores PROBE information and link to corresponding gene. |
Result loader for probes (expression) |
N |
W_EHA_PROBE_XREF |
Reference |
Stores cross reference information for PROBE records. |
Currently not populated |
N |
W_EHA_RSLT_GENE_EXP |
Result |
Stores gene expression intensity value along with p-value, call if available. |
Result loader for gene expression |
N |
W_EHA_DATASOURCE |
Result |
Stores specimen datasource information and link, if needed. |
Pre-seeded |
N |
W_EHA_RSLT_TYPE |
Result |
Stores identifiers for types of result, pre-seeded. |
Pre-seeded |
N |
W_EHA_RSLT_FILE |
Result |
Stores links and other information about the result file. |
Sequence loaders: CGI, MAF, VCF, gene expression loader |
N |
W_EHA_RSLT_FILE_TYPE |
Result |
Stores type of file information, including version, vendor etc, pre-seeded. |
Pre-seeded |
N |
W_EHA_RSLT_SEQUENCING |
Result |
Stores sequencing results, namely substitutions, insertions, deletions coming from CGI, MAF, VCF file formats. |
Sequence loaders: CGI, MAF, VCF |
N |
W_EHA_RSLT_COPY_NBR_VAR |
Result |
Stores copy number variation results from sequencing. |
Currently not populated |
N |
W_EHA_RSLT_NOCALL |
Result |
Stores no-call results from CGI sequencing. |
CGI sequence loader |
N |
W_EHA_PROBE_ALT_LINK |
Result |
Used when there is more than one gene that is matched to a given probe. |
Currently not populated |
N |
W_EHA_SEQUENCING_X |
Result |
Stores varchar data of sequencing results coming from CGI, MAF, VCF file formats. |
Sequence loaders: CGI, MAF, VCF |
Y |
W_EHA_RSLT_COPY_NBR_VAR_X |
Result |
Stores varchar data for copy number variation results from sequencing. |
Currently not populated |
Y |
W_EHA_RSLT_NOCALL_X |
Result |
Stores varchar data of no-call results from CGI sequencing. |
CGI sequence loader |
Y |
W_EHA_RSLT_STUDY |
Result |
Stores study information under which results pertaining to sequencing and expression are loaded. |
User defined data |
Y |
W_EHA_RSLT_SPECIMEN |
Result |
Stores the Specimen details and links datasource table. All results would have specimen_wid in their tables. |
Sequence loaders: CGI, MAF, VCF, gene expression loader |
Y |
W_EHA_CHROMOSOME |
Result |
Stores all possible chromosome numbers for which result sequencing data is imported. CHROMOSOME_WID is used to link to result sequencing tables. |
Pre-seeded |
Y |
W_EHA_RSLT_SEQUENCING |
REFERENCE_READ_COUNT |
NUMBER |
Number of reads that support the reference sequence |
N |
W_EHA_RSLT_SEQUENCING_X |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_SEQUENCING_X |
ALLELE |
VARCHAR2(1000) |
Sequence of allele if less than or equal to 500 char |
N |
W_EHA_RSLT_SEQUENCING_X |
ALLELE_CLOB |
CLOB |
Sequence of allele if greater than 500 char |
N |
W_EHA_RSLT_SEQUENCING_X |
HAPLINK |
VARCHAR2(100) |
Integer ID that links the allele to alleles of other loci that are known to be on same haplotype |
N |
W_EHA_RSLT_SEQUENCING_X |
FILTER |
VARCHAR2(200) |
Indicates PASS if position has passed all specified filters, otherwise lists FAIL, filters and results |
N |
W_EHA_RSLT_SEQUENCING_X |
GENOTYPE_FILTER |
VARCHAR2(200) |
Indicates PASS if position has passed all genotype filters, otherwise lists FAIL, filters and results |
N |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_SPEC_WID |
NUMBER(38) |
Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_TYPE_WID |
NUMBER(38) |
Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
RESULT_FILE_WID |
NUMBER(38) |
Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
CHROMOSOME_WID |
NUMBER(38) |
Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table |
Y |
W_EHA_RSLT_COPY_NBR_VAR |
START_POSITION |
NUMBER |
Start position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_COPY_NBR_VAR |
END_POSITION |
NUMBER |
End position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_COPY_NBR_VAR |
AVG_NORMALIZED_CVG |
NUMBER |
Baseline-normalized average coverage over the interval from START to END position |
N |
W_EHA_RSLT_COPY_NBR_VAR |
GC_CORRECTED_CVG |
NUMBER |
GC-corrected average coverage of a window width based on START and END positions |
N |
W_EHA_RSLT_COPY_NBR_VAR |
FRACTION_UNIQUE |
NUMBER |
Fraction of coverage due to unique mappings |
N |
W_EHA_RSLT_COPY_NBR_VAR |
RELATIVE_CVG |
NUMBER |
AVG_NORMALIZED_CVG divided by estimate of diploid median average adjusted coverage, can be null if source is N |
N |
W_EHA_RSLT_COPY_NBR_VAR |
CALLED_PLOIDY |
NUMBER |
Called ploidy for the segment, usually an integer from 0 to maximum ploidy value, if N in source file, leave blank |
N |
W_EHA_RSLT_COPY_NBR_VAR |
CALLED_CNV_TYPE |
VARCHAR2(100) |
Classification of CALLED_PLOIDY into one of six categories |
N |
W_EHA_RSLT_COPY_NBR_VAR |
PLOIDY_SCORE |
NUMBER |
Phred-like confidence that the segment has the CALLED_PLOIDY correct |
N |
W_EHA_RSLT_COPY_NBR_VAR |
CNV_TYPE_SCORE |
NUMBER |
Phred-like confidence that CALLED_CNV_TYPE is correct |
N |
W_EHA_RSLT_CNV_X |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_CNV_X |
REPEATS |
VARCHAR2(2000) |
Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters less than or equal to 2000. |
N |
W_EHA_RSLT_CNV_X |
REPEATS_CLOB |
CLOB |
Percent of called CNV segments that overlaps with each category of genomic repeats. Format is: Repeat category:XX. Stored if number of characters greater than 2000. |
N |
W_EHA_RSLT_NOCALL |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_NOCALL |
RESULT_SPEC_WID |
NUMBER(38) |
Foreign key to RESULT_SPEC_WID in W_EHA_RSLT_SPECIMEN table |
Y |
W_EHA_RSLT_NOCALL |
RESULT_TYPE_WID |
NUMBER(38) |
Foreign key to RESULT_TYPE_WID in W_EHA_RSLT_TYPE table |
Y |
W_EHA_RSLT_NOCALL |
RESULT_FILE_WID |
NUMBER(38) |
Foreign key to RESULT_FILE_WID in W_EHA_RSLT_FILE table |
Y |
W_EHA_RSLT_NOCALL |
CHROMOSOME_WID |
NUMBER(38) |
Foreign key to CHROMOSOME_WID in W_EHA_CHROMOSOME table |
Y |
W_EHA_RSLT_NOCALL |
START_POSITION |
NUMBER |
Start position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_NOCALL |
END_POSITION |
NUMBER |
End position of DNA sequence, with the 1st base in chromosome having position 1. |
N |
W_EHA_RSLT_NOCALL |
NOCALL_TYPE |
VARCHAR2(100) |
Type of no-call |
N |
W_EHA_RSLT_NOCALL_X |
RESULT_STUDY_WID |
NUMBER(38) |
Foreign key to RESULT_STUDY_WID in W_EHA_RSLT_STUDY |
Y |
W_EHA_RSLT_NOCALL_X |
ALLELE |
varchar2(500) |
Sequence of allele if less than or equal to 500 char |
N |
W_EHA_RSLT_NOCALL_X |
ALLELE_CLOB |
CLOB |
Sequence of allele if greater than 500 char |
N |
W_EHA_RSLT_NOCALL_X |
HAPLINK |
VARCHAR2(100) |
Integer ID that links the allele to alleles of other loci that are known to be on same haplotype |
N |
W_EHA_PROBE_ALT_LINK |
HUGO_NAME |
VARCHAR2(200) |
Hugo name for a gene corresponding to probe |
Y |
W_EHA_PROBE_ALT_LINK |
PROBE_WID |
NUMBER(38) |
Foreign key to W_EHA_PROBE |
Y |
W_EHA_RSLT_STUDY |
RESULT_STUDY_NAME |
VARCHAR2(200) |
Name of the study under which results will be loaded |
Y |
W_EHA_RSLT_STUDY |
RESULT_STUDY_DESC |
VARCHAR2(2000) |
Description of the study under which results will be loaded |
N |
W_EHA_RSLT_SPECIMEN |
SPECIMEN_WID |
NUMBER(38) |
Foreign key to Specimen table in CDM1.1 |
Y |
W_EHA_RSLT_SPECIMEN |
SPEC_DATASRC_WID |
NUMBER(38) |
Identifier for specimen datasource |
Y |
W_EHA_RSLT_SPECIMEN |
SPECIMEN_NUMBER |
VARCHAR2(80) |
Specimen identifier in linked specimen datasource database |
Y |
W_EHA_RSLT_SPECIMEN |
SPECIMEN_VENDOR_NUMBER |
VARCHAR2(80) |
Vendor identifier in linked specimen datasource database |
Y |
W_EHA_CHROMOSOME |
CHROMOSOME |
VARCHAR2(50) |
Chromosome number or identifier |
Y |