Abstract
Given a large weighted directed graph in which nodes are associated with attributes, we study a new problem called preferential nearest neighbor (NN) browsing. In such browsing, a user provides one or more source nodes and some keywords, and retrieves the nearest neighbors of those source nodes that contain the given keywords. For example, a tourist who plans to visit several places (source nodes) may want to search for hotels with some preferred features (e.g., Internet access and swimming pools). It is highly desirable to recommend a list of nearby hotels with those features, ordered by road network distance to the places (source nodes) the tourist wants to visit. The existing approach of traversing the graph at query time requires long query processing time, and the approach of maintaining pre-computed all-pairs shortest distances requires huge storage space on disk. In this paper, we propose new approaches to support on-line preferential NN browsing. The data graphs we deal with are weighted directed graphs where nodes are associated with attributes, and the distances to be found between nodes are the exact distances in the graph. We focus on two-step approaches. In the first step, we identify a number of reference nodes (also called centers) that exist along some shortest paths between a source node and a preferential NN node that contains the user-given keywords. In the second step, we find the preferential NN nodes within a certain distance of the source nodes via the relevant reference nodes, using an index that supports both textual attributes and distance. Our approach tightly integrates NN search with preference search, and is confirmed to be efficient and effective in finding any preferential NN nodes.
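To make the query semantics concrete, the following is a minimal Python sketch of the graph-traversal baseline that the abstract contrasts against, not the proposed two-step method. It assumes a single source node, an adjacency-list graph with non-negative edge weights, and a per-node attribute set; the names `preferential_nn`, `graph`, and `attrs`, as well as the toy data, are hypothetical and chosen only for illustration. A Dijkstra-style traversal yields matching nodes in increasing order of exact distance, so a caller can browse as many neighbors as needed.

```python
import heapq
from typing import Dict, Hashable, Iterable, Iterator, List, Set, Tuple

# Hypothetical representation: node -> list of (successor, edge weight).
Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def preferential_nn(
    graph: Graph,
    attrs: Dict[Hashable, Set[str]],
    source: Hashable,
    keywords: Iterable[str],
) -> Iterator[Tuple[Hashable, float]]:
    """Yield (node, exact distance) for nodes containing all keywords,
    in non-decreasing order of distance from `source`.

    Baseline only: traverses the graph at query time (Dijkstra's algorithm),
    assuming non-negative edge weights.
    """
    wanted = set(keywords)
    best = {source: 0.0}          # best known distance per node
    heap = [(0.0, 0, source)]     # (distance, tie-breaker, node)
    counter = 1                   # tie-breaker so nodes are never compared
    settled = set()

    while heap:
        d, _, u = heapq.heappop(heap)
        if u in settled:
            continue              # stale heap entry
        settled.add(u)
        # A settled node's distance is final, so it can be reported now.
        if u != source and wanted <= attrs.get(u, set()):
            yield u, d
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, counter, v))
                counter += 1


# Toy road network (made-up data for illustration).
graph = {
    "museum": [("hotel_a", 2.0), ("cafe", 1.0)],
    "cafe": [("hotel_b", 0.5), ("hotel_c", 3.0)],
}
attrs = {
    "cafe": {"internet"},
    "hotel_a": {"internet", "pool"},
    "hotel_b": {"internet"},
    "hotel_c": {"internet", "pool"},
}
result = list(preferential_nn(graph, attrs, "museum", ["internet", "pool"]))
print(result)  # [('hotel_a', 2.0), ('hotel_c', 4.0)]
```

Because the function is a generator, the traversal stops as soon as the caller stops asking for more neighbors. Its cost still grows with the portion of the graph explored before enough matching nodes are found, which is the query-time overhead the abstract attributes to this kind of approach.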
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cheng, J., Yu, J.X., Cheng, R.C.K. (2010). On-Line Preferential Nearest Neighbor Browsing in Large Attributed Graphs. In: Yoshikawa, M., Meng, X., Yumoto, T., Ma, Q., Sun, L., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 6193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14589-6_2
DOI: https://doi.org/10.1007/978-3-642-14589-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14588-9
Online ISBN: 978-3-642-14589-6
eBook Packages: Computer Science, Computer Science (R0)