Semantic and structural similarities between XML Schemas for integration of ubiquitous healthcare data

  • Original Article
Personal and Ubiquitous Computing

Abstract

Many current electronic health records are stored as XML documents. To integrate these heterogeneous XML medical documents efficiently, researchers have studied how to measure structural and semantic similarity between XML Schemas. The main problem is how to identify the most appropriate correspondences so that two schemas can be combined into a global XML Schema for reuse and reference. In this paper, we propose a novel resemblance measure that considers the structural and semantic information of two healthcare XML Schemas at the same time. Specifically, we introduce new metrics for datatype similarity and cardinality constraint similarity, which improve the quality of the semantic assessment. Building on the similarity of each element pair, we present an algorithm that calculates the similarity between XML Schema trees. Experimental results indicate that our method yields more accurate semantic and structural similarity values than the other approaches it was compared with.
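The abstract does not give the paper's formulas, so the following Python sketch is only an illustration of the general idea: score each element pair on name, datatype, and cardinality, then roll those scores up over two schema trees. Every name, weight, and scoring rule here (`Element`, `TYPE_FAMILY`, the 0.6/0.2/0.2 weights, the `alpha` blend, and the string-ratio stand-in for a WordNet-based name measure) is an assumption made for this example and is not the authors' method.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import List, Optional

# Hypothetical grouping of XML Schema built-in types into families.
TYPE_FAMILY = {
    "xs:string": "text", "xs:token": "text", "xs:normalizedString": "text",
    "xs:int": "number", "xs:integer": "number", "xs:decimal": "number",
    "xs:date": "time", "xs:dateTime": "time", "xs:time": "time",
}


@dataclass
class Element:
    """A simplified XML Schema element node (illustrative only)."""
    name: str
    datatype: str = "xs:string"
    min_occurs: int = 1
    max_occurs: Optional[int] = 1  # None stands for maxOccurs="unbounded"
    children: List["Element"] = field(default_factory=list)


def name_similarity(a: str, b: str) -> float:
    # Stand-in for a semantic (e.g. WordNet-based) measure: plain string ratio.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def datatype_similarity(a: str, b: str) -> float:
    # Assumed scores: identical type 1.0, same family 0.8, otherwise 0.2.
    if a == b:
        return 1.0
    fam_a, fam_b = TYPE_FAMILY.get(a), TYPE_FAMILY.get(b)
    if fam_a is not None and fam_a == fam_b:
        return 0.8
    return 0.2


def _bound_similarity(a: Optional[int], b: Optional[int]) -> float:
    # Compare one occurrence bound; None means unbounded.
    if a == b:
        return 1.0
    if a is None or b is None:
        return 0.5
    return min(a, b) / max(a, b)


def cardinality_similarity(a: Element, b: Element) -> float:
    # Average of the minOccurs and maxOccurs comparisons.
    lower = _bound_similarity(a.min_occurs, b.min_occurs)
    upper = _bound_similarity(a.max_occurs, b.max_occurs)
    return (lower + upper) / 2


def element_similarity(a: Element, b: Element,
                       weights=(0.6, 0.2, 0.2)) -> float:
    # Weighted sum of name, datatype, and cardinality scores (assumed weights).
    w_name, w_type, w_card = weights
    return (w_name * name_similarity(a.name, b.name)
            + w_type * datatype_similarity(a.datatype, b.datatype)
            + w_card * cardinality_similarity(a, b))


def _best_match_average(xs: List[Element], ys: List[Element],
                        alpha: float) -> float:
    # For each child in xs, take its best-scoring counterpart in ys.
    return sum(max(tree_similarity(x, y, alpha) for y in ys) for x in xs) / len(xs)


def tree_similarity(a: Element, b: Element, alpha: float = 0.5) -> float:
    # Blend the roots' own similarity with the similarity of their subtrees.
    own = element_similarity(a, b)
    if not a.children and not b.children:
        return own
    if not a.children or not b.children:
        return alpha * own  # one side has no structure to match
    forward = _best_match_average(a.children, b.children, alpha)
    backward = _best_match_average(b.children, a.children, alpha)
    return alpha * own + (1 - alpha) * (forward + backward) / 2


if __name__ == "__main__":
    s1 = Element("Patient", children=[
        Element("name"), Element("birthDate", "xs:date")])
    s2 = Element("PatientRecord", children=[
        Element("fullName"), Element("dateOfBirth", "xs:date", min_occurs=0)])
    print(round(tree_similarity(s1, s2), 3))
```

The greedy best-match averaging keeps the sketch short; a real matcher would more likely enforce one-to-one child assignments and use a lexical resource for names rather than character overlap.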

Acknowledgments

This work was supported by a grant from the Kyung Hee University in 2010 (KHU-20101372).

Author information

Corresponding author

Correspondence to Young-Koo Lee.

About this article

Cite this article

Thuy, P.T.T., Lee, YK. & Lee, S. Semantic and structural similarities between XML Schemas for integration of ubiquitous healthcare data. Pers Ubiquit Comput 17, 1331–1339 (2013). https://doi.org/10.1007/s00779-012-0567-5

  • DOI: https://doi.org/10.1007/s00779-012-0567-5
