Mashups over the Deep Web

  • Conference paper

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 18)

Abstract

Combining information from different Web sources is often a tedious and repetitive process: even a simple information request might require iterating over the result list of one Web query and using each result as input for a subsequent query. One approach to such chained queries is the data-centric mashup, which lets users visually model the data flow as a graph, where the nodes represent the data sources and the edges represent the data flow.

In this paper we combine the benefits of such an intuitive graphical modeling framework for chained queries with the large class of Web data sources that are accessible only by filling out forms. These so-called Deep Web sites offer a wealth of structured, high-quality data but also pose several challenges. We identify and address the main challenges and propose an integrated framework for answering chained queries.
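
The chained-query pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the framework proposed in the paper: the two sources, `search_authors` and `search_papers`, are hypothetical in-memory stand-ins for form-based Web sources, and `chain` is an assumed helper that models only the simplest data-flow graph, a linear path in which each node feeds the next.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Source = Callable[[Record], List[Record]]


# Hypothetical stand-ins for two form-based Web sources.
# Real Deep Web sources would be queried by submitting forms;
# here the data lives in memory so the sketch runs offline.
def search_authors(query: Record) -> List[Record]:
    """First node: return the authors known for a topic."""
    data = {"deep web": [{"author": "A"}, {"author": "B"}]}
    return data.get(query["topic"], [])


def search_papers(query: Record) -> List[Record]:
    """Second node: return the papers written by one author."""
    data = {
        "A": [{"title": "P1"}],
        "B": [{"title": "P2"}, {"title": "P3"}],
    }
    papers = data.get(query["author"], [])
    return [{"author": query["author"], **paper} for paper in papers]


def chain(sources: List[Source], seed: Record) -> List[Record]:
    """Use every result of one source as input for the next source."""
    current = [seed]
    for source in sources:
        current = [result for record in current for result in source(record)]
    return current


results = chain([search_authors, search_papers], {"topic": "deep web"})
for row in results:
    print(row["author"], row["title"])
```

The nested iteration in `chain` is the repetitive step a user would otherwise perform by hand: one query per entry in the previous result list. A general mashup graph would also allow branching and merging of data flows, which this linear sketch leaves out.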

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hornung, T., Simon, K., Lausen, G. (2009). Mashups over the Deep Web. In: Cordeiro, J., Hammoudi, S., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2008. Lecture Notes in Business Information Processing, vol 18. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01344-7_17

  • DOI: https://doi.org/10.1007/978-3-642-01344-7_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01343-0

  • Online ISBN: 978-3-642-01344-7

  • eBook Packages: Computer Science, Computer Science (R0)
