ABSTRACT
In this era of "big data", a key challenge facing the database community is to help average users tap into the huge amounts of structured data on the Web. To address this challenge, we propose a novel proactive template-based engine for searching structured data on the Web using natural language. Departing from conventional search engines, the proposed engine organizes questions it can answer using templates and figures out ahead of time which sources can answer which templates and how. Then, at query time, the engine can simply match queries with the templates and retrieve answers using the pre-compiled evaluation plans. While attractive, building such an engine requires innovations in template creation, query evaluation, and system evolution. In this paper, we propose novel techniques to address these challenges.
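The match-then-retrieve flow described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Template` class, the `<slot>` syntax, the `answer` function, and the toy `AWARDS` rows are all hypothetical names and data invented for this example, and a simple regular-expression match stands in for whatever template matching the engine actually uses.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Made-up toy rows standing in for a structured Web source (assumption).
AWARDS = [
    {"pi": "Alice Example", "title": "Toy Project A", "year": "2012"},
    {"pi": "Bob Example", "title": "Toy Project B", "year": "2013"},
]


@dataclass
class Template:
    """A question pattern plus an evaluation plan prepared ahead of time."""
    pattern: str                                  # e.g. "awards won by <pi>"
    plan: Callable[[Dict[str, str]], List[str]]   # maps slot values to answers

    def __post_init__(self) -> None:
        # Compile "<slot>" placeholders into named regex groups once, offline.
        parts = re.split(r"<(\w+)>", self.pattern)
        body = "".join(
            f"(?P<{part}>.+?)" if i % 2 else re.escape(part)
            for i, part in enumerate(parts)
        )
        self.regex = re.compile(rf"^{body}$", re.IGNORECASE)


def awards_by_pi(slots: Dict[str, str]) -> List[str]:
    # Hypothetical plan: filter the toy source by investigator name.
    return [a["title"] for a in AWARDS if a["pi"].lower() == slots["pi"].lower()]


def awards_in_year(slots: Dict[str, str]) -> List[str]:
    # Hypothetical plan: filter the toy source by year.
    return [a["title"] for a in AWARDS if a["year"] == slots["year"]]


TEMPLATES = [
    Template("awards won by <pi>", awards_by_pi),
    Template("awards in <year>", awards_in_year),
]


def answer(query: str, templates: List[Template]) -> Optional[List[str]]:
    """At query time, match the query to a template and run its stored plan."""
    for template in templates:
        match = template.regex.match(query.strip())
        if match:
            return template.plan(match.groupdict())
    return None  # no template covers this query


if __name__ == "__main__":
    print(answer("awards won by bob example", TEMPLATES))
    print(answer("Awards in 2012", TEMPLATES))
    print(answer("weather today", TEMPLATES))
```

The point of the sketch is the division of labor: pattern compilation and plan selection happen once, when a template is registered, so the query-time path is only a match and a plan call.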
Index Terms
Proactive natural language search engine: tapping into structured data on the web