From: Clinical decision support systems for improving diagnostic accuracy and achieving precision medicine
Database integration system | Purpose | License | Update method | # of databases | Databases integrated | Company |
---|---|---|---|---|---|---|
Atlas [ 90 ] | “A biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies”. | Open Source | Manual | 13 | GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene | University of British Columbia - Vancouver, BC |
BioWarehouse [ 91 ] | “An open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. Integrates multiple public bioinformatics databases into a single relational database system within a common bioinformatics schema”. | Open Source | Dependent on the individual databases | 12 | ENZYME, KEGG, BioPAX, Eco2dbase, MetaCyc, MAGE-ML, BioCyc, UniProt, GenBank, NCBI Taxonomy, CMR databases, and Gene Ontology | Stanford Research Institute – Menlo Park, CA |
Columba [ 92 ] | “Facilitates the creation of protein structure data sets for many structure-based studies. It allows combining queries on a number of structure-related databases not covered by other projects at present”. | Free-Use | Dependent on the individual databases | 12 | PDB, SCOP, CATH, DSSP, ENZYME, Boehringer, KEGG, Swiss-Prot, GO, GOA, Taxonomy, PISCES | Humboldt-Universität zu Berlin – Berlin, Germany |
Systomonas [ 93 ] | “To provide an integrated bioinformatics platform for a systems biology approach to the biology of pseudomonads in infection and biotechnology”. | Free-Use | Unknown | 4 | KEGG, Pseudomonas Genome Database v2, PRODORIC, and BRENDA | Technische Universität Braunschweig - Braunschweig, Germany |
Oncomine [ 94 ] | “A cancer microarray database and web-based data-mining platform aimed at facilitating discovery from genome-wide expression analyses”. | Free-Use, Subscription-based for expanded functionality | Annually for Free Version, Regular data updates for subscription | - | 65 Gene expression datasets, from 4700 microarray experiments. | Life Technologies Corporation |
BioMart [ 95 ] | “BioMart enables scientists to perform advanced querying of biological data sources through a single web interface. The power of the system comes from integrated querying of data sources regardless of their geographical locations”. | Open Source | Unknown | 25 (as of 2009), 46 (as of 5/2014) | Ensembl Genes, Ensembl Homology, Ensembl Variation, Ensembl Genomic Features, Vega, HTGT, Gramene, Reactome, Wormbase, Dictybase, RGD, PRIDE, EURATMart, MSD, UniProt, Pancreatic Expression Database, PepSeeker, ArrayExpress, GermOnLine, DroSpeGe, HapMap, VectorBase, Paramecium, Eurexpress, Europhenome | Collaboration between many institutes and universities |
Ondex [ 96 ] | “The Ondex data integration platform enables data from diverse biological datasets to be linked, integrated and visualised through graph analysis techniques. Ondex can be used in a number of important application areas such as transcription analysis, protein interaction analysis, data mining and text mining”. | Open Source | Unknown | 28 | AraCyc, AtRegNet, BioCyc, BioGRID, BRENDA, Cytoscape, EcoCyc, GOA, Gramene, Grassius, KEGG, Medline, MetaCyc, O-GlycBase, OMIM, PDB, Pfam, Prolog (limited functionality), SGD, TAIR, TIGR, Transfac, Transpath, UniProt, WordNet, ChEBI, ChEMBL, GFF3 | Rothamsted Research - Harpenden, UK |
InterMine [ 97 ] | “InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types”. | Open Source | Unknown | 23 | GO Annotation, GO OBO, Treefam, HomoloGene, OrthoDB, Panther, Ensembl Compara, BioGRID, IntAct, PSI-MI Ontology, KEGG, Reactome, UniProt, Protein Data Bank, InterPro, PubMed, Ensembl SNP, Chado, Ensembl Core, FASTA, GFF3, OMIM, Uberon | University of Cambridge - Cambridge, United Kingdom |
SCan-MarK [ 65 ] | “An integrated, growing biomarker repository of over 2,000 breast, ovarian, colorectal, non-Hodgkin’s lymphoma and melanoma biomarkers mined and manually curated by PhD. scientist from full-text papers. Annotations include 33 critical data elements (CDEs) organized in computable Sophic Cancer Biomarker Objects (SCBOs). SCan-MarK allows researchers to mine, explore and expose complex biomarker, disease, treatment, outcome relationships graphically displayed as knowledge networks”. | Free Trial | Manual | 30 | Examples: TCGA, dbSNP, Cancer Gene Index, DrugBank, PDB, Sophic’s non-redundant Sanger COSMIC, Medline, Ensembl, ENZYME, GO, InterPro, Pfam, PubChem, UniGene, Taxonomy, UniProt, RefSeq, Entrez Gene, Reactome Pathway | Sophic Alliance |