
OrthoSearch: a scientific workflow approach to detect distant homologies on protozoans

Published: 16 March 2008

Abstract

Managing bioinformatics experiments is challenging because tools with their own semantics must be orchestrated and made to interoperate. Workflow management systems (WfMS) are an effective approach to managing such experiments. We present several WfMS features that support genome homology workflows and discuss issues relevant to typical genomic experiments. In our evaluation we used OrthoSearch, a real genomic pipeline originally defined as a Perl script. We modeled it as a scientific workflow and implemented it on the Kepler WfMS. We show a case study detecting distant homologies in trypanosomatid metabolic pathways. Our results reinforce the benefits of WfMS over scripting languages and point out challenges for WfMS in distributed environments.


Published In

SAC '08: Proceedings of the 2008 ACM symposium on Applied computing
March 2008
2586 pages
ISBN: 9781595937537
DOI: 10.1145/1363686

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bioinformatics
  2. provenance
  3. scientific workflows

Qualifiers

  • Research-article

Conference

SAC '08
SAC '08: The 2008 ACM Symposium on Applied Computing
March 16 - 20, 2008
Fortaleza, Ceara, Brazil

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Cited By

  • (2017) Managing workflows on top of a cloud computing orchestrator for using heterogeneous environments on e-Science. International Journal of Web and Grid Services 13:4 (375-402). DOI: 10.1504/IJWGS.2017.087326. Online publication date: 1-Jan-2017
  • (2014) A performance/cost model for a CUDA drug discovery application on physical and public cloud infrastructures. Concurrency and Computation: Practice & Experience 26:10 (1787-1798). DOI: 10.1002/cpe.3117. Online publication date: 1-Jul-2014
  • (2012) A framework for readapting and running bioinformatics applications in the cloud. Proceedings of the 2012 ACM Research in Applied Computation Symposium (86-91). DOI: 10.1145/2401603.2401624. Online publication date: 23-Oct-2012
  • (2012) A Provenance-based Adaptive Scheduling Heuristic for Parallel Scientific Workflows in Clouds. Journal of Grid Computing 10:3 (521-552). DOI: 10.1007/s10723-012-9227-2. Online publication date: 1-Sep-2012
  • (2011) Many task computing for orthologous genes identification in protozoan genomes using Hydra. Concurrency and Computation: Practice & Experience 23:17 (2326-2337). DOI: 10.1002/cpe.1786. Online publication date: 1-Dec-2011
  • (2010) Detecting distant homologies on protozoans metabolic pathways using scientific workflows. International Journal of Data Mining and Bioinformatics 4:3 (256-280). DOI: 10.1504/IJDMB.2010.033520. Online publication date: 1-Jun-2010
  • (2010) Data parallelism in bioinformatics workflows using Hydra. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (507-515). DOI: 10.1145/1851476.1851550. Online publication date: 21-Jun-2010
