Generating Data Converters to Help Compose Services in Bioinformatics Workflows

  • Conference paper
Database and Expert Systems Applications (DEXA 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8644)

Abstract

Heterogeneity of data and data formats in bioinformatics often entails a mismatch between the inputs and outputs of different services, making it difficult to compose them into workflows. To reduce these mismatches, bioinformatics platforms offer ad hoc converters written by hand. This article proposes to systematically detect convertibility from output types to input types. Convertibility detection relies on abstract types, close to XML Schema, which make it possible to abstract data while precisely accounting for its composite structure. Detection is accompanied by the automatic generation of converters between input and output XML data. Our experiment on bioinformatics services and datatypes, performed with an implementation of our approach, shows that the detected convertibilities and produced converters are relevant from a biological point of view. Furthermore, they automatically produce a graph of potentially compatible services with higher connectivity than the ad hoc approaches achieve.
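To make the idea of detecting convertibility and generating a converter in one pass more concrete, the following is a minimal Python sketch. It is not the authors' algorithm or implementation: the type constructors (`Leaf`, `Record`, `ListOf`), the function `make_converter`, the three conversion rules (matching labels, projecting a field out of a record, wrapping a value into a list), and the use of plain dictionaries in place of XML data are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass(frozen=True)
class Leaf:
    label: str  # hypothetical semantic label, e.g. "sequence"


@dataclass(frozen=True)
class Record:
    fields: tuple  # tuple of (field name, type) pairs


@dataclass(frozen=True)
class ListOf:
    elem: "Type"


Type = Union[Leaf, Record, ListOf]


def make_converter(src: Type, dst: Type) -> Optional[Callable]:
    """Return a function turning src-shaped values into dst-shaped values,
    or None when no conversion is found under the assumed rules."""
    if isinstance(dst, Leaf):
        if isinstance(src, Leaf):
            # Rule 1 (assumed): leaves are convertible when labels match.
            return (lambda v: v) if src.label == dst.label else None
        if isinstance(src, Record):
            # Rule 2 (assumed): project the first field that yields the leaf.
            for name, field_type in src.fields:
                conv = make_converter(field_type, dst)
                if conv is not None:
                    return lambda v, n=name, c=conv: c(v[n])
        return None
    if isinstance(dst, ListOf):
        if isinstance(src, ListOf):
            conv = make_converter(src.elem, dst.elem)
            return None if conv is None else (lambda v, c=conv: [c(x) for x in v])
        # Rule 3 (assumed): a single value can be wrapped into a one-item list.
        conv = make_converter(src, dst.elem)
        return None if conv is None else (lambda v, c=conv: [c(v)])
    if isinstance(dst, Record):
        # Every target field must be obtainable from the source.
        parts = []
        for name, field_type in dst.fields:
            conv = make_converter(src, field_type)
            if conv is None:
                return None
            parts.append((name, conv))
        return lambda v, ps=tuple(parts): {n: c(v) for n, c in ps}
    return None


# Hypothetical output type of one service and input type of another.
entry = Record((("id", Leaf("accession")),
                ("seq", Leaf("sequence")),
                ("organism", Leaf("species"))))
target = ListOf(Record((("sequence", Leaf("sequence")),)))

convert = make_converter(entry, target)
result = convert({"id": "P12345", "seq": "MKT", "organism": "Homo sapiens"})
```

In this sketch a `None` result means the two types are not convertible, so running `make_converter` over every output/input pair of a service catalogue would yield the edges of a compatibility graph of the kind the abstract mentions.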




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ba, M., Ferré, S., Ducassé, M. (2014). Generating Data Converters to Help Compose Services in Bioinformatics Workflows. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds) Database and Expert Systems Applications. DEXA 2014. Lecture Notes in Computer Science, vol 8644. Springer, Cham. https://doi.org/10.1007/978-3-319-10073-9_23

  • DOI: https://doi.org/10.1007/978-3-319-10073-9_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10072-2

  • Online ISBN: 978-3-319-10073-9

  • eBook Packages: Computer Science (R0)
