Associating Labels and Elements of Deep Web Query Interface Based on DOM

  • Conference paper
Web Information Systems and Mining (WISM 2012)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7529))

Abstract

Query interface schema extraction is an important problem in Deep Web data acquisition and integration. To obtain the query interface schema, the elements and labels of a Deep Web query interface must first be associated correctly. Because a query interface on an HTML page can be parsed into a well-structured DOM, we propose an effective algorithm, based on the hierarchical DOM, for associating the elements and labels of a Deep Web query interface. The algorithm mainly uses the nearest-neighbor distance, together with two other heuristic rules, to find the label most closely related to a given control element. Experimental results on real query interfaces show that the proposed algorithm is highly effective.
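The abstract does not spell out how the nearest-neighbor distance is computed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that the distance between a control and a candidate label is the number of edges on the path between them in the DOM tree, and that ties are broken in favor of text that precedes the control in document order. The names `Node`, `TreeBuilder`, `tree_distance`, `associate_labels`, and `SAMPLE_FORM` are hypothetical, and the paper's two additional heuristic rules are not reproduced.

```python
from html.parser import HTMLParser

CONTROLS = {"input", "select", "textarea"}
VOID = {"input", "br", "img", "hr", "meta", "link"}  # elements with no end tag


class Node:
    def __init__(self, tag, parent=None, attrs=None, text=""):
        self.tag, self.parent, self.text = tag, parent, text
        self.attrs = dict(attrs or [])
        self.children = []
        if parent is not None:
            parent.children.append(self)


class TreeBuilder(HTMLParser):
    """Builds a small DOM-like tree from HTML (illustrative, not a full parser)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.cur, attrs)
        if tag not in VOID:
            self.cur = node

    def handle_endtag(self, tag):
        n = self.cur  # climb to the nearest open element with this tag
        while n is not None and n.tag != tag:
            n = n.parent
        if n is not None:
            self.cur = n.parent

    def handle_data(self, data):
        if data.strip():
            Node("#text", self.cur, text=data.strip())


def path_to_root(node):
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def tree_distance(a, b):
    """Number of edges between a and b via their lowest common ancestor."""
    depth_b = {id(n): i for i, n in enumerate(path_to_root(b))}
    for i, n in enumerate(path_to_root(a)):
        if id(n) in depth_b:
            return i + depth_b[id(n)]
    raise ValueError("nodes are not in the same tree")


def walk(node):
    yield node
    for child in node.children:
        yield from walk(child)


def associate_labels(html):
    """Maps each control's name attribute to its nearest text node, or None."""
    builder = TreeBuilder()
    builder.feed(html)
    nodes = list(walk(builder.root))
    order = {id(n): i for i, n in enumerate(nodes)}
    # Candidate labels: text nodes not nested inside a control (e.g. <option>).
    texts = [n for n in nodes if n.tag == "#text"
             and not any(p.tag in CONTROLS for p in path_to_root(n))]
    result = {}
    for c in nodes:
        if c.tag in CONTROLS and c.attrs.get("type") not in ("hidden", "submit"):
            best = min(
                texts,
                # Assumed tie-break: prefer text that precedes the control.
                key=lambda t: (tree_distance(c, t), order[id(t)] > order[id(c)]),
                default=None,
            )
            result[c.attrs.get("name")] = best.text if best else None
    return result


SAMPLE_FORM = """
<form><table>
<tr><td>Title</td><td><input type="text" name="title"></td></tr>
<tr><td>Author</td><td><input type="text" name="author"></td></tr>
<tr><td>Format</td><td><select name="fmt"><option>Any</option></select></td></tr>
</table></form>
"""

if __name__ == "__main__":
    print(associate_labels(SAMPLE_FORM))
```

On the sample form, each control is paired with the text in its own table row, because that text is four edges away while text in neighboring rows is six edges away. A layout in which labels and controls sit in separate rows would need further rules, which is presumably where the paper's additional heuristics come in.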

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qiang, B., Shi, L., Wu, C., He, Q., Shen, C. (2012). Associating Labels and Elements of Deep Web Query Interface Based on DOM. In: Wang, F.L., Lei, J., Gong, Z., Luo, X. (eds) Web Information Systems and Mining. WISM 2012. Lecture Notes in Computer Science, vol 7529. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33469-6_81

  • DOI: https://doi.org/10.1007/978-3-642-33469-6_81

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33468-9

  • Online ISBN: 978-3-642-33469-6

  • eBook Packages: Computer Science, Computer Science (R0)
