DOI: 10.1145/1066677.1066708
Article

A complex biological database querying method

Published: 13 March 2005

Abstract

Many biological information systems rely on relational database management systems (RDBMS) to manage high-throughput biological data. While keeping these data well archived, organized, and integrated in a common repository is still a challenging task, performing complex queries, i.e., explorative and abstract ad hoc user questions in biology, is an even more formidable task, and one that is often sidestepped by writing complicated software programs. In this work, we propose a "complex query modeling" method to address the challenge of complex querying in biological domains. Query modeling consists of four distinct but interdependent phases of activity: representing high-level problems, transforming problems into connected query interfaces, designing database query structures, and translating query plans into high-performing SQL statements. At each stage, we use different notations and query modeling practices. Using a gene indexing project as a case study, we show that query modeling enables prototypical development of high-quality SQL solutions to an inherently abstract and vague user question, one that requires GeneChip designers to sift through millions of database records, process data in dozens of steps, and make myriad intermediate decisions. We believe our "complex query modeling" method is applicable to other bioinformatics domains that need complex database queries.
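
The abstract does not show the notations used in each phase, so the following is only a rough sketch of the idea behind the final phase: a multi-step query plan written down as named steps and then chained into one SQL statement. Everything in it is an assumption made for illustration, including the `sequence` table, its columns, the three plan steps, the length threshold, and the use of SQLite common table expressions; none of it is taken from the paper.

```python
import sqlite3

# Hypothetical query plan: an ordered list of (step_name, SELECT text).
# Each step may read from base tables or from any earlier step.
PLAN = [
    ("long_enough",
     "SELECT seq_id, gene_id, length FROM sequence WHERE length >= 500"),
    ("best_per_gene",
     "SELECT gene_id, MAX(length) AS length FROM long_enough GROUP BY gene_id"),
    ("chosen",
     "SELECT s.seq_id, s.gene_id FROM long_enough AS s "
     "JOIN best_per_gene AS b "
     "ON s.gene_id = b.gene_id AND s.length = b.length"),
]


def compile_plan(plan):
    """Chain the steps as common table expressions and select the last one."""
    ctes = ",\n".join(f"{name} AS (\n  {sql}\n)" for name, sql in plan)
    return f"WITH {ctes}\nSELECT * FROM {plan[-1][0]} ORDER BY gene_id"


def run_demo():
    """Run the compiled plan against a tiny, made-up in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequence (seq_id TEXT, gene_id TEXT, length INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sequence VALUES (?, ?, ?)",
        [
            ("s1", "g1", 400),
            ("s2", "g1", 900),
            ("s3", "g2", 650),
            ("s4", "g2", 620),
            ("s5", "g3", 300),
        ],
    )
    rows = conn.execute(compile_plan(PLAN)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(compile_plan(PLAN))
    print(run_demo())
```

Keeping each intermediate decision as a separately named step means a step can be inspected or replaced on its own, while the database still receives a single statement to optimize.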

Cited By

  • (2014) A Policy-Based Cleansing and Integration Framework for Labour and Healthcare Data. Interactive Knowledge Discovery and Data Mining in Biomedical Informatics, pp. 141-168. DOI: 10.1007/978-3-662-43968-5_8. Online publication date: 2014
  • (2009) Fusion of Multimedia Document Intra-Modality Relevancies using Linear Combination Model. Advanced Techniques in Computing Sciences and Software Engineering, pp. 575-580. DOI: 10.1007/978-90-481-3660-5_98. Online publication date: 15-Dec-2009
  • (2007) Middleware Support to the Specification and Execution of Active Rules for Biological Database Constraint Management. 2007 IEEE International Conference on Information Reuse and Integration, pp. 406-411. DOI: 10.1109/IRI.2007.4296654. Online publication date: Aug-2007

Published In

SAC '05: Proceedings of the 2005 ACM symposium on Applied computing
March 2005
1814 pages
ISBN:1581139640
DOI:10.1145/1066677
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. complex queries
  2. database management system (DBMS)
  3. query modeling

Qualifiers

  • Article

Conference

SAC05: The 2005 ACM Symposium on Applied Computing
March 13 - 17, 2005
Santa Fe, New Mexico

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
