A Query Interface Matching Approach Based on Extended Evidence Theory for Deep Web

Journal of Computer Science and Technology

Abstract

Matching query interfaces is a crucial step in data integration across multiple Web databases. Different types of information about query interface schemas have been used to match attributes between schemas. Relying on a single type of information is not sufficient, and the matching results of individual matchers are often inaccurate and uncertain. Evidence theory is a state-of-the-art approach for combining multiple sources of uncertain information. However, when applied to query interface matching, traditional evidence theory has the limitation that it treats every individual matcher equally regardless of the matching task, which reduces matching performance. This paper proposes a novel query interface matching approach for the Deep Web based on extended evidence theory. Our approach first introduces a procedure that dynamically predicts the credibility of each matcher. It then extends traditional evidence theory with these credibilities and uses exponentially weighted evidence theory to combine the results of multiple matchers. Finally, it makes the matching decision using a set of heuristics to obtain the final matches. Our approach overcomes this shortcoming of the traditional method and can adapt to different matching tasks. Experimental results demonstrate the feasibility and effectiveness of the proposed approach.
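The combination step can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so the form used here is an assumption: each matcher's mass is raised to the power of its credibility before the usual Dempster product and renormalisation. The function name `combine_weighted`, the attribute names, and all numeric values are hypothetical and chosen only for illustration. With every credibility set to 1, the sketch reduces to the classical Dempster rule of combination.

```python
from itertools import product


def combine_weighted(masses, weights):
    """Combine mass functions with an exponentially weighted Dempster rule.

    masses  -- list of dicts mapping frozenset (focal element) -> mass
    weights -- list of credibilities, one per matcher
               (assumed form: each mass is raised to its matcher's weight)
    """
    combined = {}
    # Enumerate one focal element per matcher, in every possible combination.
    for choice in product(*(m.items() for m in masses)):
        inter = frozenset.intersection(*(focal for focal, _ in choice))
        if not inter:
            continue  # conflicting combination; handled by renormalisation
        term = 1.0
        for (_, mass), w in zip(choice, weights):
            term *= mass ** w
        combined[inter] = combined.get(inter, 0.0) + term
    total = sum(combined.values())
    if total == 0.0:
        raise ValueError("total conflict: matchers share no focal element")
    return {focal: value / total for focal, value in combined.items()}


# Hypothetical example: which target attribute matches a source attribute?
THETA = frozenset({"title", "author"})          # frame of discernment
label_matcher = {frozenset({"title"}): 0.6, THETA: 0.4}
type_matcher = {frozenset({"title"}): 0.5,
                frozenset({"author"}): 0.3,
                THETA: 0.2}

# Credibilities are made-up values, not predictions from the paper.
beliefs = combine_weighted([label_matcher, type_matcher], [1.0, 0.5])
for focal, mass in sorted(beliefs.items(), key=lambda kv: -kv[1]):
    print(sorted(focal), round(mass, 3))
```

In this sketch the mass left on the whole frame `THETA` represents a matcher's uncertainty, and a decision step could then pick the singleton with the highest combined mass. The heuristics the paper actually uses for that decision are not described in the abstract.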

Author information

Corresponding author

Correspondence to Qing-Zhong Li.

Additional information

Supported by the National Natural Science Foundation of China under Grant No. 90818001 and the Natural Science Foundation of Shandong Province of China under Grant No. Y2007G24.

About this article

Cite this article

Dong, YQ., Li, QZ., Ding, YH. et al. A Query Interface Matching Approach Based on Extended Evidence Theory for Deep Web. J. Comput. Sci. Technol. 25, 537–547 (2010). https://doi.org/10.1007/s11390-010-9343-z

  • DOI: https://doi.org/10.1007/s11390-010-9343-z
