Abstract
Keyword search is integrated into many applications because it lets users express their query intent conveniently. Most existing work on keyword search over graphs models query results as individual minimal connected trees or connected graphs that contain the keywords. We observe that these results may overlap significantly, which reduces result diversity. In addition, most solutions require the graph data and pre-built indexes to reside in memory, which makes them unsuitable for large datasets. In this paper, we define the smallest k-compact tree set as the keyword query result, in which no two compact trees share a graph node. We then develop a progressive, A*-based, scalable solution using MapReduce to compute the smallest k-compact tree set; the computation can stop as soon as the compact trees generated so far suffice to determine the query result. We conduct experiments to demonstrate the efficiency of the proposed algorithm.
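The node-disjointness constraint and the early-stopping idea described in the abstract can be illustrated with a small single-machine sketch. This is not the paper's A*-based MapReduce algorithm: it assumes that candidate compact trees are already available as (cost, node set) pairs, that "smallest" refers to lowest total cost, and it uses a plain greedy scan, which does not guarantee a minimum-cost set in general. The names `Tree`, `pairwise_disjoint`, and `greedy_k_disjoint` are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical representation of a candidate compact tree: (cost, nodes).
Tree = Tuple[float, FrozenSet[str]]


def pairwise_disjoint(trees: Iterable[Tree]) -> bool:
    """Return True if no graph node appears in more than one tree."""
    seen = set()
    for _, nodes in trees:
        if seen & nodes:
            return False
        seen |= nodes
    return True


def greedy_k_disjoint(candidates: Iterable[Tree], k: int) -> List[Tree]:
    """Scan candidates in non-decreasing cost and keep each tree that shares
    no node with the trees kept so far; stop as soon as k trees are kept.

    Illustrative greedy heuristic only (an assumption, not the paper's method).
    """
    chosen: List[Tree] = []
    used = set()
    for cost, nodes in sorted(candidates, key=lambda t: t[0]):
        if used.isdisjoint(nodes):
            chosen.append((cost, nodes))
            used |= nodes
            if len(chosen) == k:
                break  # early stop: enough disjoint trees collected
    return chosen


# Toy candidates: the first two overlap on node "b".
candidates = [
    (2.0, frozenset({"a", "b"})),
    (3.0, frozenset({"b", "c"})),
    (4.0, frozenset({"d", "e"})),
    (5.0, frozenset({"f"})),
]
result = greedy_k_disjoint(candidates, k=2)
```

In this toy input the tree with cost 3.0 is skipped because it shares node "b" with the cheaper tree, and the scan stops before the tree with cost 5.0 is examined.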
Acknowledgments
This work is supported by ARC DP120102627, ARC DP140103499 and NSFC 61170007.
Cite this article
Liu, C., Yao, L., Li, J. et al. Finding smallest k-Compact tree set for keyword queries on graphs using mapreduce. World Wide Web 19, 499–518 (2016). https://doi.org/10.1007/s11280-015-0337-1
DOI: https://doi.org/10.1007/s11280-015-0337-1