Abstract
The heterogeneity of data and data formats in bioinformatics often entails a mismatch between the inputs and outputs of different services, making it difficult to compose them into workflows. To reduce those mismatches, bioinformatics platforms offer ad hoc converters written by hand. This article proposes to systematically detect convertibility from output types to input types. Convertibility detection relies on abstract types, close to XML Schema, which abstract the data while precisely accounting for its composite structure. Detection is accompanied by the automatic generation of converters between input and output XML data. Our experiment on bioinformatics services and datatypes, performed with an implementation of our approach, shows that the detected convertibilities and the produced converters are relevant from a biological point of view. Furthermore, they automatically produce a graph of potentially compatible services with higher connectivity than the ad hoc approaches.
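To make the idea of convertibility detection with converter generation more concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a toy model of abstract types (atoms, records and lists), uses plain Python dictionaries and lists in place of XML data, and applies a deliberately simple rule: a target type is reachable from a source type when every atom it requires can be projected out of the source. All names (Atom, Record, ListOf, find_converter, the "entry" and "fasta" types) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass(frozen=True)
class Atom:
    name: str  # hypothetical atomic type, e.g. "sequence"


@dataclass(frozen=True)
class Record:
    fields: tuple  # tuple of (label, type) pairs


@dataclass(frozen=True)
class ListOf:
    item: "Type"


Type = Union[Atom, Record, ListOf]
Converter = Callable[[object], object]


def find_converter(src: Type, dst: Type) -> Optional[Converter]:
    """Return a function mapping src-typed data to dst-typed data,
    or None when this toy rule set finds no conversion."""
    if isinstance(dst, Atom):
        if src == dst:
            return lambda v: v
        if isinstance(src, Record):
            # Project onto the first field that can yield the atom.
            for label, ftype in src.fields:
                sub = find_converter(ftype, dst)
                if sub is not None:
                    return lambda v, label=label, sub=sub: sub(v[label])
        return None
    if isinstance(dst, Record):
        # Every target field must be obtainable from the source.
        parts = []
        for label, ftype in dst.fields:
            sub = find_converter(src, ftype)
            if sub is None:
                return None
            parts.append((label, sub))
        return lambda v, parts=tuple(parts): {l: s(v) for l, s in parts}
    if isinstance(dst, ListOf):
        if isinstance(src, ListOf):
            # Convert element by element.
            sub = find_converter(src.item, dst.item)
            if sub is None:
                return None
            return lambda v, sub=sub: [sub(x) for x in v]
        # Wrap a single converted value into a one-element list.
        sub = find_converter(src, dst.item)
        if sub is None:
            return None
        return lambda v, sub=sub: [sub(v)]
    return None


# Hypothetical service types: a database entry and a FASTA-like record.
SEQ, ID, ORG = Atom("sequence"), Atom("identifier"), Atom("organism")
entry = Record((("id", ID), ("organism", ORG), ("seq", SEQ)))
fasta = Record((("header", ID), ("residues", SEQ)))

conv = find_converter(ListOf(entry), ListOf(fasta))
data = [{"id": "P12345", "organism": "Homo sapiens", "seq": "MKT"}]
print(conv(data))  # [{'header': 'P12345', 'residues': 'MKT'}]

# The reverse direction fails: no organism can be derived from fasta.
print(find_converter(ListOf(fasta), ListOf(entry)))  # None
```

In this sketch the detection step and the converter are produced together: a non-None result both certifies convertibility and is the converter itself. Running find_converter over all output/input type pairs of a set of services would give the edges of a compatibility graph of the kind the abstract mentions.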
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ba, M., Ferré, S., Ducassé, M. (2014). Generating Data Converters to Help Compose Services in Bioinformatics Workflows. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds) Database and Expert Systems Applications. DEXA 2014. Lecture Notes in Computer Science, vol 8644. Springer, Cham. https://doi.org/10.1007/978-3-319-10073-9_23
DOI: https://doi.org/10.1007/978-3-319-10073-9_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10072-2
Online ISBN: 978-3-319-10073-9
eBook Packages: Computer Science, Computer Science (R0)