Abstract
This paper introduces an approach to the problem of accessing conventional and geographic data from the Deep Web. The approach relies on describing the relevant data through well-structured sentences, and on publishing the sentences as Web pages, following W3C and Google recommendations. For conventional data, the sentences are generated with the help of database views. For vector data, the topological relationships between the represented objects are first generated, and then sentences are synthesized to describe the objects and their topological relationships. Lastly, for raster data, the geographic objects overlapping the bounding box of the data are first identified with the help of a gazetteer, and then sentences describing such objects are synthesized. The Web pages thus generated are easily indexed by traditional search engines, and they also facilitate the task of more sophisticated engines that support semantic search based on natural language features.
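The raster case described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `BBox` and `GazetteerEntry`, the function `describe_raster`, the sentence template, and the sample entries with their coordinates are all hypothetical, and a plain in-memory list stands in for a real gazetteer service.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box (hypothetical helper, not from the paper)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def overlaps(self, other: "BBox") -> bool:
        # Two boxes overlap when their extents intersect on both axes.
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


@dataclass(frozen=True)
class GazetteerEntry:
    """One named geographic object, standing in for a gazetteer record."""
    name: str
    feature_type: str
    bbox: BBox


def describe_raster(title: str, raster_bbox: BBox,
                    gazetteer: List[GazetteerEntry]) -> List[str]:
    """Return one sentence per gazetteer object overlapping the raster."""
    sentences = []
    for entry in gazetteer:
        if raster_bbox.overlaps(entry.bbox):
            # Assumed sentence template; the paper's templates may differ.
            sentences.append(
                f"The image {title} covers part of the "
                f"{entry.feature_type} {entry.name}."
            )
    return sentences


# Illustrative data only: names and coordinates are made up.
SAMPLE_GAZETTEER = [
    GazetteerEntry("Lake A", "lake", BBox(0.0, 0.0, 2.0, 2.0)),
    GazetteerEntry("City B", "city", BBox(5.0, 5.0, 6.0, 6.0)),
]

if __name__ == "__main__":
    for sentence in describe_raster("Scene 1", BBox(1.0, 1.0, 3.0, 3.0),
                                    SAMPLE_GAZETTEER):
        print(sentence)
```

The resulting sentences would then be placed in a Web page for indexing. A production version would query a real gazetteer and would likely use exact geometries rather than bounding boxes alone.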
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Piccinini, H., Lemos, M., Casanova, M.A., Furtado, A.L. (2010). W-Ray: A Strategy to Publish Deep Web Geographic Data. In: Trujillo, J., et al. Advances in Conceptual Modeling – Applications and Challenges. ER 2010. Lecture Notes in Computer Science, vol 6413. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16385-2_2
DOI: https://doi.org/10.1007/978-3-642-16385-2_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16384-5
Online ISBN: 978-3-642-16385-2
eBook Packages: Computer Science (R0)