Detecting inconsistency in biological molecular databases using ontologies

Chen, Qingfeng; Chen, Yi-Ping Phoebe; Zhang, Chengqi

doi:10.1007/s10618-007-0071-0

Published: 11 July 2007

Data Mining and Knowledge Discovery, Volume 15, pages 275–296 (2007)

Qingfeng Chen¹,
Yi-Ping Phoebe Chen¹,² &
Chengqi Zhang³

Abstract

The rapid growth of life science databases demands the fusion of knowledge from heterogeneous databases to answer complex biological questions. The discrepancies in nomenclature, various schemas and incompatible formats of biological databases, however, result in a significant lack of interoperability among databases. Therefore, data preparation is a key prerequisite for biological database mining. Integrating diverse biological molecular databases is an essential action to cope with the heterogeneity of biological databases and guarantee efficient data mining. However, the inconsistency in biological databases is a key issue for data integration. This paper proposes a framework to detect the inconsistency in biological databases using ontologies. A numeric estimate is provided to measure the inconsistency and identify those biological databases that are appropriate for further mining applications. This aids in enhancing the quality of databases and guaranteeing accurate and efficient mining of biological databases.

Author information

Authors and Affiliations

School of Information Technology, Deakin University, Melbourne, VIC, 3125, Australia
Qingfeng Chen & Yi-Ping Phoebe Chen
ARC Centre in Bioinformatics, Melbourne, Australia
Yi-Ping Phoebe Chen
Faculty of Information Technology, University of Technology, P.O. Box 123, Broadway, Sydney, NSW, 2007, Australia
Chengqi Zhang

Corresponding author

Correspondence to Qingfeng Chen.

Additional information

Responsible editors: Shichao Zhang and M. J. Zaki.

About this article

Cite this article

Chen, Q., Chen, YP.P. & Zhang, C. Detecting inconsistency in biological molecular databases using ontologies. Data Min Knowl Disc 15, 275–296 (2007). https://doi.org/10.1007/s10618-007-0071-0

Received: 31 July 2006
Accepted: 19 March 2007
Published: 11 July 2007
Issue Date: October 2007
DOI: https://doi.org/10.1007/s10618-007-0071-0
