Accessing the Deep Web with Keywords: A Foundational Approach

  • Conference paper
Semantic Keyword-Based Search on Structured Data Sources (IKC 2017)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10546))

Abstract

The Deep Web consists of data that are generated dynamically as the result of interactions with Web pages. Accessing Deep Web data presents many challenges: it has been shown that answering even simple queries on such data requires the execution of recursive query plans. There is a gap between the theoretical understanding of this problem and the practical approaches to it. The main reason is that the problem must be studied with the database considered as part of the input, while queries can be processed only by accessing data according to limitations, expressed as so-called access patterns. In this paper we embark on the task of closing this gap by giving a precise definition that reflects the practical nature of accessing Deep Web data sources. In particular, we define the problem of querying Deep Web sources with keywords. We describe two scenarios: in the first, called unrestricted, the query answering algorithm has full access to the data; in the second, called restricted, the algorithm can access the data only according to the access patterns. We formalise the decision problem associated with query answering in the Deep Web, explaining its relevance in both scenarios. We then present some complexity results.
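To make the role of access patterns concrete, the following is a minimal sketch, not the paper's formalisation. It assumes a hypothetical pair of binary relations `r1` and `r2`, each with one access pattern written as a string of `i` (input, must be bound) and `o` (output) per argument, and it ignores attribute domains by letting any known constant fill any input position. The names `PATTERNS`, `access` and `extract` are illustrative only. Starting from a set of keywords, the loop repeatedly feeds known constants into input positions until no new tuples appear, which is the recursive behaviour the abstract refers to.

```python
from itertools import product

# Hypothetical access patterns: 'i' = input (must be bound), 'o' = output.
PATTERNS = {"r1": "io", "r2": "io"}

# Hidden source data; in the restricted scenario it is visible only via access().
_DATA = {
    "r1": {("a", "b"), ("c", "d")},
    "r2": {("b", "c"), ("x", "y")},
}


def access(relation, binding):
    """Simulate one form submission: tuples whose input positions match binding."""
    pattern = PATTERNS[relation]
    inputs = [k for k, mode in enumerate(pattern) if mode == "i"]
    return {
        t for t in _DATA[relation]
        if all(t[k] == v for k, v in zip(inputs, binding))
    }


def extract(keywords):
    """Collect every tuple obtainable from the keywords under the access patterns."""
    known = set(keywords)   # constants available for input positions
    facts = set()           # (relation, tuple) pairs retrieved so far
    changed = True
    while changed:          # iterate to a fixpoint: new constants enable new accesses
        changed = False
        for relation, pattern in PATTERNS.items():
            n_inputs = pattern.count("i")
            for binding in product(sorted(known), repeat=n_inputs):
                for t in access(relation, binding):
                    if (relation, t) not in facts:
                        facts.add((relation, t))
                        known.update(t)
                        changed = True
    return facts


if __name__ == "__main__":
    for fact in sorted(extract({"a"})):
        print(fact)
```

In this toy instance, starting from the keyword `a` retrieves `r1(a, b)`, then `r2(b, c)`, then `r1(c, d)`. The tuple `r2(x, y)` is never obtained because `x` is never produced as an output, whereas an algorithm in the unrestricted scenario would see all four tuples directly.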

Notes

  1. In general, there could be more than one annotation for each predicate, that is, more than one way of accessing the corresponding relation. However, in this paper we assume there is exactly one access limitation (or pattern) per predicate. Our results can be extended to the general case.

Acknowledgments

This work was supported by the EU COST Action IC1302 KEYSTONE. Andrea Calì acknowledges partial support by the EPSRC project “Logic-based Integration and Querying of Unindexed Data” (EP/E010865/1).

Author information

Corresponding author

Correspondence to Andrea Calì.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Calì, A., Ugarte, M. (2018). Accessing the Deep Web with Keywords: A Foundational Approach. In: Szymański, J., Velegrakis, Y. (eds) Semantic Keyword-Based Search on Structured Data Sources. IKC 2017. Lecture Notes in Computer Science, vol 10546. Springer, Cham. https://doi.org/10.1007/978-3-319-74497-1_18

  • DOI: https://doi.org/10.1007/978-3-319-74497-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-74496-4

  • Online ISBN: 978-3-319-74497-1

  • eBook Packages: Computer Science, Computer Science (R0)
