Automatic Methods for Integrating Biomedical Data Sources in a Mediator-Based System

  • Conference paper
Data Integration in the Life Sciences (DILS 2008)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5109)

Abstract

The information needed by biologists and physicians for research purposes is distributed over many heterogeneous sources. Integration systems provide a single, centralized and homogeneous interface for users to query multiple information sources simultaneously. The major limitation of integration systems, including mediator-based systems, is that the tasks involved in their creation and maintenance remain mainly manual. To address this limitation, we developed automated methods for facilitating the creation of a mediator-based system. We first implemented an automatic method for acquiring the local schemas of the sources to be integrated. We derived the global schema from the UMLS. Finally, we proposed schema- and instance-based approaches to mapping data elements from the local schemas to the global schema. To illustrate the applicability of our methods, we created a mediator-based system integrating eleven biomedical sources. This prototype is operational, available on the Internet (http://www.med.univ-rennes1.fr/cgi-bin/mougin/These/system.pl) and its evolution is managed semi-automatically.
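
The two mapping strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy global schema, the synonym and instance sets, the 0.5 overlap threshold, and all function names below are assumptions chosen for illustration only. The schema-based step compares a normalized element name with concept names and synonyms; the instance-based step compares sample values of the element with known instances of each concept.

```python
import re

# Hypothetical global schema: concept -> synonyms and known example instances.
# In the paper the global schema is derived from the UMLS; this toy dictionary
# is an assumption made for illustration only.
GLOBAL_SCHEMA = {
    "Gene": {
        "synonyms": {"gene", "gene symbol", "locus"},
        "instances": {"HFE", "TP53", "BRCA1"},
    },
    "Disease": {
        "synonyms": {"disease", "disorder", "phenotype"},
        "instances": {"hemochromatosis", "breast cancer"},
    },
}


def normalize(label):
    """Lowercase a label and collapse underscores/punctuation into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


def schema_based_match(element_name):
    """Return concepts whose name or synonyms equal the normalized element name."""
    name = normalize(element_name)
    return [
        concept
        for concept, info in GLOBAL_SCHEMA.items()
        if name == normalize(concept) or name in info["synonyms"]
    ]


def instance_based_match(values, threshold=0.5):
    """Return concepts that contain at least `threshold` of the sample values."""
    sample = set(values)
    if not sample:
        return []
    matches = []
    for concept, info in GLOBAL_SCHEMA.items():
        overlap = len(sample & info["instances"]) / len(sample)
        if overlap >= threshold:
            matches.append(concept)
    return matches


def map_element(element_name, values):
    """Try the schema-based match first; fall back to the instance-based one."""
    return schema_based_match(element_name) or instance_based_match(values)
```

For example, `map_element("Gene_Symbol", [])` is resolved by name alone, whereas an uninformative label such as `"col_3"` with the values `["HFE", "TP53", "XYZ"]` is resolved through its instances. A real system would likely need lexical normalization and synonym handling far beyond this exact-match comparison.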

Editor information

Amos Bairoch, Sarah Cohen-Boulakia, Christine Froidevaux

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mougin, F., Burgun, A., Bodenreider, O., Chabalier, J., Loréal, O., Le Beux, P. (2008). Automatic Methods for Integrating Biomedical Data Sources in a Mediator-Based System. In: Bairoch, A., Cohen-Boulakia, S., Froidevaux, C. (eds) Data Integration in the Life Sciences. DILS 2008. Lecture Notes in Computer Science, vol 5109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69828-9_7

  • DOI: https://doi.org/10.1007/978-3-540-69828-9_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69827-2

  • Online ISBN: 978-3-540-69828-9

  • eBook Packages: Computer Science, Computer Science (R0)
