Semantic Correspondence in Federated Life Science Data Integration Systems

  • Conference paper
Data Integration in the Life Sciences (DILS 2005)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3615)

Abstract

To execute complex biological queries, data integration systems often draw on several intermediate data sources, because the domain coverage of any individual source is limited. The quality of intermediate sources differs greatly depending on the curation method, the frequency of updates, and the breadth of domain coverage, and this in turn affects the quality of the results. Integration systems should therefore provide data provenance, i.e., information about the path used to obtain every record in the result. Furthermore, since the query capabilities of web-accessible sources are limited, integration systems need to support finer-grained refinement queries issued over the integrated data. However, unlike the individual sources, integration systems have to handle the absence of data and conflicts in the integrated data caused by inconsistencies among the sources. This paper describes the solution proposed by BACIIS, the Biological and Chemical Information Integration System, for providing data provenance and for supporting refinement queries over integrated data. Semantic correspondence between records from different sources is defined based on the links connecting these data sources, including cross-references. Two characteristics of semantic correspondence are identified: degree, based on the closeness of the links that exist between data records, and cardinality, based on the mappings between the domains of data records. An algorithm based on semantic correspondence is presented to handle the absence of data and conflicts in the integrated data.
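
The ideas of provenance, degree, and cardinality can be illustrated with a small sketch. This is not the BACIIS algorithm, which the abstract does not specify; it is a minimal illustration under stated assumptions. Here, degree is assumed to be the number of links on the shortest path between two records, cardinality is assumed to be classified from direct cross-references between two sources, and the source names, record identifiers, and function names are all hypothetical.

```python
from collections import deque

# Hypothetical cross-reference links between records, keyed as "source:id".
LINKS = {
    "GeneDB:g1": ["ProteinDB:p1", "ProteinDB:p2"],
    "ProteinDB:p1": ["PathwayDB:w1"],
    "ProteinDB:p2": ["PathwayDB:w1"],
    "PathwayDB:w1": [],
}

def provenance_path(links, start, goal):
    """Return the shortest chain of records linking start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no link path: the integrated result has no data here

def degree(links, start, goal):
    """Assumed closeness measure: number of links on the shortest path."""
    path = provenance_path(links, start, goal)
    return None if path is None else len(path) - 1

def cardinality(links, source_a, source_b):
    """Classify the direct links from source_a records to source_b records."""
    forward = {}   # record in source_a -> linked records in source_b
    backward = {}  # record in source_b -> linking records in source_a
    for a, targets in links.items():
        if not a.startswith(source_a + ":"):
            continue
        for b in targets:
            if b.startswith(source_b + ":"):
                forward.setdefault(a, set()).add(b)
                backward.setdefault(b, set()).add(a)
    if not forward:
        return None
    left = "many" if any(len(v) > 1 for v in backward.values()) else "one"
    right = "many" if any(len(v) > 1 for v in forward.values()) else "one"
    return f"{left}-to-{right}"
```

In this sketch the path returned by `provenance_path` serves as the provenance of a result record, and a `None` result marks the absence of data that an integration system would have to handle explicitly.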

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mahoui, M., Kulkarni, H., Li, N., Ben-Miled, Z., Börner, K. (2005). Semantic Correspondence in Federated Life Science Data Integration Systems. In: Ludäscher, B., Raschid, L. (eds) Data Integration in the Life Sciences. DILS 2005. Lecture Notes in Computer Science, vol 3615. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11530084_12

  • DOI: https://doi.org/10.1007/11530084_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27967-9

  • Online ISBN: 978-3-540-31879-8

  • eBook Packages: Computer Science, Computer Science (R0)
