Oracle® Health Sciences Omics Data Bank Programmer's Guide Release 2.5 E35680-04
This table stores the type of each result stored, which depends on the data being inserted into ODB (one row is inserted per loader per file). Users can seed additional types.

RESULT_TYPE_NAME | RESULT_TYPE_DESC | Which loader inserts |
---|---|---|
GENE_EXPRESSION | Gene expression results | Gene expression loader |
NOCALL | Nocall result for sequencing given allele | CGI masterVar loader |
SEQUENCING | Sequencing results including simple variants such as snp, insertions, deletions | VCF, MAF, CGI masterVar [Footnote 1] loaders |
COPY_NUMBER_VARIATION | Copy Number Variation results | CNV loader |
TCGA_RNA_SEQ_EXON | TCGA RNA Seq results for exon information | TCGA RNA seq loader [Footnote 1] |
2-CHNL_GENE_EXPRESSION | Gene expression results from 2-channel gene expression analysis | Dual channel loader |

Footnote 1: The CGI masterVar loader is temporarily removed from ODB 2.5 and will be available in the next release.

This table is pre-seeded with the file types currently handled by the loaders. It should contain six rows; the pre-seeded values are listed in the following table:

FILE_TYPE_CODE | FILE_TYPE_NAME | FILE_TYPE_DESC | FILE_TYPE_VERSION |
---|---|---|---|
VCF | Variant Call Format | File containing variant information including snps, inserts and deletions. | 4.1 |
MAF | Mutation Annotation Format | Mutation Annotation Format containing snps, inserts, and deletions. | 2.0 |
MAF | Mutation Annotation Format | Mutation Annotation Format containing snps, inserts, and deletions. | 2.1 |
MAF | Mutation Annotation Format | Mutation Annotation Format containing snps, inserts, and deletions. | 2.2 |
Tab-delim Expression | Tab delimited Expression file | Gene Expression tab delimited file format containing probe hybridization results, 3 values per hybridization: Intensity, Call, P-value. | A |
CGI masterVar | Complete Genomics MasterVar | Master Variation file from Complete Genomics containing snps, inserts, deletions, and no-call information. | 2.0 |
TCGA RNA SEQ EXON | TCGA RNA SEQ EXON | TCGA RNA Seq tab delimited file format for exon information. | 3.1.4.0 |
Genome_Wide_SNP_6 | Affymetrix Genome-Wide Human SNP Array 6.0 | This file represents the TCGA data format for Affymetrix Genome-Wide Human SNP Array 6.0. | 6.0 |
2-Channel Expression | Agilent TCGA 2-channel Expression analysis file | TCGA's Agilent platform Gene Expression analysis file format containing gene level results; Log2-transformed sample or control intensity ratios. | A |
CGI cnv | Complete Genomics cnv | Copy Number Variation file from Complete Genomics containing cvg, ploidy and score information. | 2.0 |
2-Channel ADF | Agilent TCGA Array Description File | TCGA's Agilent platform G4502A_07_01 ADF file containing the array probe information and corresponding genomic or gene annotation. | A |
SIFT | Sorting Tolerant From Intolerant | SIFT predicts whether an amino acid substitution affects protein function. | 4.0 |
PolyPhen | Polymorphism Phenotyping | PolyPhen predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. | 5.0 |
BAM | Binary Alignmentor Map Format | Binary sequencing alignment file for sequencing runs. | 1.4 |
SAM | Sequence Alignmentor Map Format | Binary sequencing alignment file for sequencing runs. | 1.4 |
gVCF | genome variant call format | A VCF file following VCF 4.1 specifications combines information on variant calls (SNVs and small-indels) with genotype and read depth information for all non-variant positions in the reference. | 20120906a |

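
The MAF code appears once per supported version in the values above, so the code alone does not pick out a single row; the version is needed as well. The following is a minimal Python sketch of a lookup keyed on both values, using a few rows copied from the table. The dictionary and the function names `find_file_type` and `versions_for` are hypothetical illustrations and are not part of ODB.

```python
# A few (FILE_TYPE_CODE, FILE_TYPE_VERSION) -> FILE_TYPE_NAME pairs copied
# from the pre-seeded values listed above. Illustration only: this in-memory
# dictionary is an assumption, not how ODB stores or resolves file types.
SEEDED_FILE_TYPES = {
    ("VCF", "4.1"): "Variant Call Format",
    ("MAF", "2.0"): "Mutation Annotation Format",
    ("MAF", "2.1"): "Mutation Annotation Format",
    ("MAF", "2.2"): "Mutation Annotation Format",
    ("gVCF", "20120906a"): "genome variant call format",
}


def find_file_type(code, version):
    """Return the seeded name for (code, version), or None if not seeded."""
    return SEEDED_FILE_TYPES.get((code, version))


def versions_for(code):
    """Return the seeded versions for one file type code, sorted as text."""
    return sorted(v for (c, v) in SEEDED_FILE_TYPES if c == code)
```

For example, `find_file_type("MAF", "2.1")` returns the MAF name, while an unseeded version such as `"3.0"` returns `None`.
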
This table is pre-seeded with all the possible chromosome names in a result. The user needs to insert any non-standard chromosome names contained in the results file.

Table Name | Column Name | Description | Values Pre-seeded |
---|---|---|---|
W_EHA_CHROMOSOME | CHROMOSOME | Name of the chromosome | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT |

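
Before loading, it can help to find out which chromosome names in a results file are not pre-seeded and therefore need to be inserted first. The following Python sketch shows one way to do that, under stated assumptions: the function name is hypothetical, the chromosome names have already been extracted from the file into a list, and names are compared verbatim.

```python
# Chromosome names pre-seeded in W_EHA_CHROMOSOME, as listed above.
PRESEEDED = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def non_standard_chromosomes(names):
    """Return the names that are not pre-seeded, sorted for stable output.

    Names are compared verbatim (an assumption of this sketch), so a
    differently spelled name is reported as non-standard.
    """
    return sorted(set(names) - PRESEEDED)
```

Calling `non_standard_chromosomes(["1", "X", "GL000191.1", "MT"])` reports only `GL000191.1`, which is used here purely as an example of a name outside the pre-seeded list.
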
This table is pre-seeded with multiple aliases for each chromosome record in W_EHA_CHROMOSOME. For example, CHR1 can also be represented as 1. Similarly, chrM has aliases such as CHRMT, MT, and M.
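
The alias lookup can be pictured as a simple mapping from each alias to the name stored in W_EHA_CHROMOSOME. The Python sketch below contains only the aliases named in the text. The function name, the case-insensitive matching, and the choice of `None` for unknown names are assumptions made for illustration, not documented ODB behavior.

```python
# Hypothetical alias map: alias -> chromosome name in W_EHA_CHROMOSOME.
# Only the aliases mentioned in the text are included.
ALIASES = {
    "CHR1": "1",
    "1": "1",
    "CHRM": "MT",
    "CHRMT": "MT",
    "MT": "MT",
    "M": "MT",
}


def canonical_chromosome(name):
    """Map an alias to its chromosome name, or None if it is unknown.

    Matching ignores case and surrounding whitespace; this is an
    assumption of the sketch.
    """
    return ALIASES.get(name.strip().upper())
```
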
This table is pre-seeded with all the prediction codes needed to support the SIFT and PolyPhen annotation loaders.
This table stores information about specimen source and is intended to be used primarily with CDM (one record in W_EHA_DATASOURCE). However, if needed, other databases with specimen information can be linked.

Table Name | Column Name | Description | If used together with Oracle Health Sciences Cohort Explorer (OHSCE) Cohort Data Model |
---|---|---|---|
W_EHA_DATASOURCE | DATASOURCE_CD | Data source for Specimen | CDM |
W_EHA_DATASOURCE | DATASOURCE_NM | Name of datasource for Specimen | Cohort Data Model |
W_EHA_DATASOURCE | DATASOURCE_DESC | Description of datasource for specimen | Cohort Data Model |
W_EHA_DATASOURCE | SCHEMA_NAME | Name of schema | CDM |
W_EHA_DATASOURCE | DB_LINK_NAME | Link to database if needed | |

This table is populated by reference loaders while loading new versions of reference files. Each reference loader can populate a version record for a particular version type associated with it.

Reference Loader | Version_Type | Description of the File Version | Example VERSION_LABEL values |
---|---|---|---|
EMBL Loader | DNA | Ensembl genome reference version specific to the genome build release. | GRCH37.P8 (for EMBL release 68 files) |
Swiss-Prot loader | PROTEIN | Swiss-Prot releases do not have labels assigned to them, but they do have release timestamps. The timestamp of the UniProt file release is used. | 01012012 |
Pathway loader | PATHWAY | Pathway release files do not have version labels. File release timestamps are used. | 03032013 |
Prediction Loader | POLYPHEN | PolyPhen file version and timestamp. | POLY_VER1_22042012 |
Prediction Loader | SIFT | SIFT file version and timestamp. | SIFT_VER1_22042012 |
Hugo Loader | HUGO | HUGO does not have a file archive. Use the date on which the file was downloaded. | 03032013 |
HGMD Loader | BIOBASE | The timestamp of the HGMD file release is used. | HGMD_04282013 |
Probe Loader | PROBE | This is a user-specific label. Use a label that describes the target microarray platform. | Affy_Hs_U133+_2.0_GPL570 |

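
Timestamp-based labels like those in the examples can be generated rather than typed by hand. The Python sketch below assumes a hypothetical helper named `timestamp_label`. The default day-month-year order is inferred from `POLY_VER1_22042012`, and because `HGMD_04282013` can only be read month-first, the date format is left as a parameter rather than fixed.

```python
from datetime import date


def timestamp_label(release_date, prefix="", fmt="%d%m%Y"):
    """Build a label such as POLY_VER1_22042012 from a release date.

    prefix and fmt are choices of this sketch; the guide does not
    prescribe a single timestamp format for VERSION_LABEL.
    """
    stamp = release_date.strftime(fmt)
    return f"{prefix}_{stamp}" if prefix else stamp
```

For example, `timestamp_label(date(2012, 4, 22), "POLY_VER1")` produces `POLY_VER1_22042012`, and passing `fmt="%m%d%Y"` with the prefix `HGMD` and the date 28 April 2013 produces `HGMD_04282013`.
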
This table is populated by loaders while loading results.

Table Name | Column Name | Description | If Used Together With Regular Files | If Used Together With SecureFiles |
---|---|---|---|---|
W_EHA_RSLT_FILE | FILE_STORAGE_FLG | E for External, S for SecureFiles | E | S |
W_EHA_RSLT_FILE | FILE_PATH | Path to input file | For example, C:/inputfile.txt | |
W_EHA_RSLT_FILE | VENDOR_NAME | Name of vendor providing file | For example, Affymetrix | For example, Affymetrix |
W_EHA_RSLT_FILE | FILE_CONTENT_ID | File identifier utilized by Secure File system | Not used | Generated by SecureFiles |
W_EHA_RSLT_FILE | FILE_TYPE_WID | FK to W_EHA_RSLT_FILE_TYPE | Corresponds to WID in RSLT_FILE_TYPE | Corresponds to WID in RSLT_FILE_TYPE |

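
The two storage modes imply different expectations for the same columns. The Python sketch below checks a row, represented as a plain dictionary keyed by column name, against those expectations. The function name is hypothetical, and treating "Not used" as a missing (`None`) value is an assumption of this sketch, not documented loader behavior.

```python
def check_rslt_file_row(row):
    """Return a list of problems found in a W_EHA_RSLT_FILE-like dict."""
    problems = []
    flag = row.get("FILE_STORAGE_FLG")
    if flag not in ("E", "S"):
        problems.append("FILE_STORAGE_FLG must be E or S")
    elif flag == "E" and row.get("FILE_CONTENT_ID") is not None:
        # Assumption: "Not used" means the column is left empty.
        problems.append("FILE_CONTENT_ID is not used for external files")
    elif flag == "S" and row.get("FILE_CONTENT_ID") is None:
        problems.append("FILE_CONTENT_ID is expected for SecureFiles")
    if row.get("FILE_TYPE_WID") is None:
        problems.append("FILE_TYPE_WID must reference a file type row")
    return problems
```

An external-file row with `FILE_STORAGE_FLG` set to `E`, a `FILE_PATH`, and a `FILE_TYPE_WID` passes with an empty list of problems.
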
This table stores foreign keys to W_EHA_RSLT_FILE and W_EHA_RSLT_SPECIMEN. It links each file loaded into ODB through the loaders to the specimen it belongs to.
The user should populate study details in this table before loading the results. All imported results fall under the study name specified in the command-line argument.
The following table indicates result and reference tables or columns that are currently not being populated.
Table A-9 Unpopulated Tables or Columns

Table Name | Column Name | Description |
---|---|---|
W_EHA_PROBE | SEQUENCE | |
W_EHA_PROBE | START_POSITION | |
W_EHA_RSLT_CNV_X | | No loader for this table |
W_EHA_PROBE_ALT_LINK | | No loader for this table |
W_EHA_ADF_COMPOSITE_XREF | | Additional reference table |
W_EHA_ADF_REPORTER_XREF | | Additional reference table |
W_EHA_ANATOMICAL_SITE | | No loader for this table |
W_EHA_HISTOLOGY | | No loader for this table |
W_EHA_DISEASE_G_VAR_QLFR | | Additional qualifier table |
W_EHA_GENE_XREF | | Additional reference table |
W_EHA_QLFR_CATEGORY | | One of QC metadata tables |
W_EHA_QUALIFIER | | One of QC metadata tables |
W_EHA_QLFR_TABLE | | One of QC metadata tables |
W_EHA_QLFR_TRANSLATION | | One of QC metadata tables |
W_EHA_RSLT_DXP_ANLYS | | No loader for this table |
W_EHA_RSLT_DXP_ANLYS_MD | | No loader for this table |
W_EHA_RSLT_DXP_GRP | | No loader for this table |
W_EHA_RSLT_DXP_GRP_SPEC | | No loader for this table |
W_EHA_RSLT_FILE_SPEC_QLFR | | Additional qualifier table |
W_EHA_RSLT_SPEC_QLFR | | Additional qualifier table |
W_EHA_SOMATIC_VAR_INFO | | No loader for this table |
W_EHA_SOMATIC_VAR_QLFR | | Additional qualifier table |
W_EHA_SOMATIC_VAR_XREF | | Additional reference table |
W_EHA_SOURCE_LIT_REF | | No loader for this table |
W_EHA_UNIT_OF_MEASURE | | One of QC metadata tables |
W_EHA_VARIANT_QLFR | | Additional variant qualifier table |