DOI: 10.1145/1497308.1497392
research-article

BioRegistry: automatic extraction of metadata for biological database retrieval and discovery

Published: 24 November 2008

Abstract

Biological databases are blooming today at an increasing rate to deal with the huge amount of data produced by genomic and post-genomic research. The need for a well-maintained searchable directory is therefore an important issue for a good exploitation of these databases. The BioRegistry repository is automatically generated from a publicly available list of biological databases (The Molecular Biology Database Collection published in Nucleic Acids Research) and aims at associating content metadata with each database in view of database retrieval and/or discovery. Such content metadata are either simple keywords or terms belonging to a medical thesaurus. Querying modalities including a search by semantic similarity are described. The use of conceptual clustering methods is proposed to build a semantic classification of biological databases enabling browsing through the BioRegistry repository and discovering previously unknown databases.
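
The abstract does not say how the search by semantic similarity is computed. As a rough illustration of retrieval over content metadata, the sketch below ranks registry entries by Jaccard overlap between a query's keyword set and each database's keyword set. Jaccard overlap is an assumption standing in for whatever measure the paper uses; it captures only shared terms, not thesaurus-based semantic relatedness. The database names, keywords, and function names are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_similarity(
    query: set[str], registry: dict[str, set[str]]
) -> list[tuple[str, float]]:
    """Return (database, score) pairs, best match first, ties broken by name."""
    scored = [(name, jaccard(query, keywords)) for name, keywords in registry.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical registry entries; names and keywords are illustrative only
# and are not taken from the BioRegistry repository.
REGISTRY = {
    "db_alpha": {"protein", "structure", "enzyme"},
    "db_beta": {"gene", "expression", "microarray"},
    "db_gamma": {"protein", "interaction", "pathway"},
}

ranking = rank_by_similarity({"protein", "enzyme"}, REGISTRY)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A thesaurus-aware variant would replace `jaccard` with a measure that also credits terms that are close in the thesaurus hierarchy, which is closer to what "semantic similarity" suggests.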

Cited By

  • (2011) Using Papers Citations for Selecting the Best Genomic Databases. Proceedings of the 2011 30th International Conference of the Chilean Computer Science Society, pp. 33-42. DOI: 10.1109/SCCC.2011.6. Online publication date: 9-Nov-2011
  • (2010) Using Domain Knowledge to Guide Lattice-based Complex Data Exploration. Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence, pp. 847-852. DOI: 10.5555/1860967.1861132. Online publication date: 4-Aug-2010

Published In

iiWAS '08: Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
November 2008
703 pages
ISBN: 9781605583495
DOI: 10.1145/1497308

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. biological databases
  2. conceptual clustering
  3. metadata
  4. resource discovery

Conference

iiWAS08