Path-oriented keyword search over graph-modeled Web data

World Wide Web

Abstract

Keyword-based search systems are becoming increasingly popular and are considered a key feature of many information management systems. Their significant advantage is that users do not need to know how the data is organized or stored. Typical approaches model the dataset as a graph, where answers to queries are sub-graphs ranked according to some criteria. Exploring the graph, building answers, and ranking them by quality pose a number of challenges. In this paper, we discuss Yaanii, an approach for effective keyword search over graph-modeled Web data. Yaanii introduces a novel approach to keyword search that extracts the best results from the first set of answers and then combines a solution-building algorithm with a ranking technique. In addition to the algorithms and the processes for building result sets, we provide a detailed study of the computational and ranking complexity of Yaanii and compare it with other approaches. We show, both experimentally and theoretically, that Yaanii is superior in efficiency and in the quality of the returned results.
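To make the general setting concrete, the following is a minimal sketch of keyword search over a graph. It is not Yaanii's algorithm, which the abstract does not detail. It only illustrates the typical approach described above, assuming a toy undirected graph, substring matching of keywords against node labels, and a ranking by total hop distance from a candidate root to the nearest match for each keyword. All node names, labels, and function names are hypothetical.

```python
from collections import deque

# Toy data graph (hypothetical): node id -> text label, plus undirected edges.
LABELS = {
    "p1": "keyword search paper",
    "a1": "author Sudarshan",
    "c1": "conference ICDE",
    "p2": "graph ranking paper",
}
EDGES = [("p1", "a1"), ("p1", "c1"), ("p2", "c1")]

# Adjacency sets built once from the edge list.
ADJ = {node: set() for node in LABELS}
for u, v in EDGES:
    ADJ[u].add(v)
    ADJ[v].add(u)


def bfs_distances(start):
    """Hop distance from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in ADJ[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def keyword_search(keywords):
    """Return (score, root) pairs sorted best first (lowest score).

    A root qualifies as an answer only if it can reach at least one
    matching node for every keyword. Its score is the sum of the hop
    distances to the nearest match for each keyword (an assumed,
    deliberately simple ranking criterion).
    """
    matches = {
        k: [n for n, text in LABELS.items() if k.lower() in text.lower()]
        for k in keywords
    }
    answers = []
    for root in LABELS:
        dist = bfs_distances(root)
        score = 0
        for k in keywords:
            reachable = [dist[n] for n in matches[k] if n in dist]
            if not reachable:
                break  # this root cannot cover keyword k
            score += min(reachable)
        else:
            answers.append((score, root))
    return sorted(answers)


if __name__ == "__main__":
    for score, root in keyword_search(["keyword", "icde"]):
        print(score, root)
```

Each answer here is identified only by its root. A fuller sketch would also return the connecting paths, which is where path-oriented methods differ in how they build and rank sub-graphs.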

Author information

Corresponding author

Correspondence to Roberto De Virgilio.

About this article

Cite this article

Cappellari, P., De Virgilio, R. & Roantree, M. Path-oriented keyword search over graph-modeled Web data. World Wide Web 15, 631–661 (2012). https://doi.org/10.1007/s11280-011-0153-1

  • DOI: https://doi.org/10.1007/s11280-011-0153-1
