Abstract
Keyword-based search systems are becoming increasingly popular and are considered a key feature of many information management systems. Their significant advantage is that users need not know how data is organized or stored. Typical approaches model the dataset as a graph, where answers to queries are sub-graphs ranked according to some criteria. Exploring the graph, building answers, and ranking them by quality pose a number of challenges. In this paper, we discuss Yaanii, an approach for effective keyword search over graph-modeled Web data. Yaanii takes a novel approach to keyword search: it extracts the best results from the first set of answers and combines a solution-building algorithm with a ranking technique. In addition to the algorithms and processes for building result sets, we provide a detailed study of the computational and ranking complexity of Yaanii and compare it with other approaches. We show, both experimentally and theoretically, that Yaanii is superior in efficiency and in the quality of the returned results.
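To make the general setting concrete, the following is a minimal sketch of keyword search over a graph in which candidate answers are ranked by a simple score. It is not Yaanii's algorithm, which the abstract does not specify. It assumes an undirected graph stored as an adjacency dictionary, case-insensitive substring matching of keywords against node labels, and a score equal to the total hop distance from a candidate root node to the nearest match for each keyword (lower is better). The names `bfs_distances`, `keyword_search`, and the toy data are hypothetical.

```python
from collections import deque


def bfs_distances(graph, sources):
    """Hop distance from the nearest node in `sources` to each reachable node."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def keyword_search(graph, labels, keywords, k=3):
    """Return up to k (score, root) pairs, best (lowest score) first.

    Assumption: the score of a root is the sum, over all keywords, of its
    hop distance to the nearest node whose label contains that keyword.
    """
    per_keyword = []
    for kw in keywords:
        matches = [n for n, text in labels.items() if kw.lower() in text.lower()]
        if not matches:
            return []  # a keyword with no matching node yields no answers
        per_keyword.append(bfs_distances(graph, matches))

    # Keep only roots that can reach a match for every keyword.
    roots = set(per_keyword[0])
    for dist in per_keyword[1:]:
        roots &= set(dist)

    scored = sorted((sum(d[r] for d in per_keyword), r) for r in roots)
    return scored[:k]


# Hypothetical toy data: two papers linked to their authors.
graph = {
    "p1": ["a1", "a2"],
    "p2": ["a1", "a3"],
    "a1": ["p1", "p2"],
    "a2": ["p1"],
    "a3": ["p2"],
}
labels = {
    "p1": "keyword search paper",
    "p2": "graph clustering paper",
    "a1": "Alice",
    "a2": "Bob",
    "a3": "Carol",
}

results = keyword_search(graph, labels, ["alice", "bob"])
```

In this toy query the best-scoring roots all lie on the path connecting the two matching author nodes through their shared paper. A realistic system would need a far more refined ranking function and would return the connecting sub-graph, not only its root.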
Cite this article
Cappellari, P., De Virgilio, R. & Roantree, M. Path-oriented keyword search over graph-modeled Web data. World Wide Web 15, 631–661 (2012). https://doi.org/10.1007/s11280-011-0153-1
DOI: https://doi.org/10.1007/s11280-011-0153-1