Article

A complex biological database querying method

Authors:

Jake Yue Chen,

John V. Carlis,

Ning Gao

SAC '05: Proceedings of the 2005 ACM symposium on Applied computing

Pages 110 - 114

https://doi.org/10.1145/1066677.1066708

Published: 13 March 2005


Abstract

Many biological information systems rely on relational database management systems (RDBMS) to manage high-throughput biological data. While keeping these data well archived, organized, and integrated in a common repository is still a challenging task, performing complex queries, i.e., explorative and abstract ad hoc user questions in biology, is an even more formidable task, one that is often sidestepped by writing complicated software programs. In this work, we propose a "complex query modeling" method to address the challenge of complex querying in biological domains. Query modeling consists of four distinct but interdependent phases of activities: representing high-level problems, transforming problems into connected query interfaces, designing database query structures, and translating query plans into high-performing SQL statements. At each stage, we use different notations and query modeling practices. Using a gene indexing project as a case study, we show that query modeling enables prototypical development of high-quality SQL solutions to an inherently abstract and vague user query question, one that requires GeneChip designers to sift through millions of database records, process data in dozens of steps, and make a myriad of intermediate decisions. We believe our "complex query modeling" method is applicable to other bioinformatics domains that need complex database queries.
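As a rough illustration of the last phase described in the abstract (translating a query plan into SQL), the sketch below chains named intermediate steps into a single statement using common table expressions. It is not the authors' notation or implementation: the `candidate_seq` table, its columns, the sample rows, the quality threshold, and the helper names `build_demo_db`, `plan_to_sql`, and `run_plan` are all hypothetical and were invented for this example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical candidate-sequence table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE candidate_seq ("
        "seq_id INTEGER PRIMARY KEY, gene_id TEXT, length INTEGER, quality REAL)"
    )
    # Invented sample rows, not data from the paper.
    conn.executemany(
        "INSERT INTO candidate_seq VALUES (?, ?, ?, ?)",
        [
            (1, "geneA", 500, 0.90),
            (2, "geneA", 650, 0.95),
            (3, "geneB", 300, 0.40),
            (4, "geneB", 700, 0.80),
            (5, "geneC", 200, 0.30),
        ],
    )
    return conn


# A hypothetical query plan: an ordered list of (step name, SQL body).
# Each step may read from the base table or from any earlier step.
QUERY_PLAN = [
    ("good_quality",
     "SELECT * FROM candidate_seq WHERE quality >= 0.5"),
    ("best_per_gene",
     "SELECT gene_id, MAX(quality) AS best_quality "
     "FROM good_quality GROUP BY gene_id"),
    ("selected",
     "SELECT g.seq_id, g.gene_id FROM good_quality g "
     "JOIN best_per_gene b "
     "ON g.gene_id = b.gene_id AND g.quality = b.best_quality"),
]


def plan_to_sql(plan):
    """Translate the ordered plan into one SQL statement built from CTEs."""
    ctes = ",\n".join(f"{name} AS (\n  {body}\n)" for name, body in plan)
    final_step = plan[-1][0]
    return f"WITH {ctes}\nSELECT * FROM {final_step} ORDER BY gene_id"


def run_plan(conn, plan):
    """Execute the translated statement and return all result rows."""
    return conn.execute(plan_to_sql(plan)).fetchall()


if __name__ == "__main__":
    print(plan_to_sql(QUERY_PLAN))
    print(run_plan(build_demo_db(), QUERY_PLAN))
```

Keeping each step as a named entry means an intermediate decision, such as the quality threshold, can be changed in one place, and any prefix of the plan can be run on its own to inspect intermediate results.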


Cited By

Boselli R, Cesarini M, Mercorio F, Mezzanzanica M (2014). A Policy-Based Cleansing and Integration Framework for Labour and Healthcare Data. Interactive Knowledge Discovery and Data Mining in Biomedical Informatics, pp. 141-168. https://doi.org/10.1007/978-3-662-43968-5_8

Rashid U, Niaz I, Bhatti M (2009). Fusion of Multimedia Document Intra-Modality Relevancies using Linear Combination Model. Advanced Techniques in Computing Sciences and Software Engineering, pp. 575-580. https://doi.org/10.1007/978-90-481-3660-5_98

Jin Y, Tescher M, Xu H (2007). Middleware Support to the Specification and Execution of Active Rules for Biological Database Constraint Management. 2007 IEEE International Conference on Information Reuse and Integration, pp. 406-411. https://doi.org/10.1109/IRI.2007.4296654



Published In

SAC '05: Proceedings of the 2005 ACM symposium on Applied computing

March 2005

1814 pages

ISBN:1581139640

DOI:10.1145/1066677

Conference Chair: Hisham M. Haddad, Kennesaw State University

Editor: Lorie M. Liebrock, New Mexico Institute of Mining and Technology, Socorro, NM

Program Chairs: Andrea Omicini, Alma Mater Studiorum, Universita di Bologna, Italy; Roger L. Wainwright, University of Tulsa, OK

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Article

Conference

SAC05

Sponsor:

SIGAPP

SAC05: The 2005 ACM Symposium on Applied Computing

March 13 - 17, 2005

Santa Fe, New Mexico

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%



