Abstract
In many applications, structural relationships between objects are modeled as graphs. In this paper, we study the problem of identifying relevant subgraphs in large networks. A relevant subgraph contains network elements that are maintained by network administrators. We formalize the problem and propose a framework consisting of two major phases: the relevance scores of all vertex pairs are computed in the offline phase, and relevant subgraphs are identified in the online phase. We analyze the relevance score measure carefully and design an efficient algorithm that identifies relevant subgraphs by repeatedly expanding candidate subgraphs and merging overlapping ones. Experiments on real data sets show that the relevant subgraphs we identify are of high quality and can be found efficiently, which makes them useful to network administrators during network operation and maintenance.
Notes
1. Information Technology Infrastructure Library, http://www.axelos.com/itil.
2. ITU Telecommunication Standardization Sector, http://www.itu.int/en/ITU-T.
3. TM Forum, https://www.tmforum.org.
Acknowledgments
This work was supported in part by Nanjing University of Posts and Telecommunications under Grants No. NY215045 and NY214135, and Ministry of Education/China Mobile joint research grant under Project No. 5–10.
Appendix
A Relationship Between Expected f-Distance and Random Walk with Restart
The vertex relevance matrix computed with the expected f-distance differs little from the one computed with random walk with restart; the proof is given below. Based on the iterative form of the definition of random walk with restart, the vertex relevance score matrix \({\varPi '^l}\) of graph \(G_i\) can be expressed as follows.
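A standard iterative form that is consistent with the rest of this appendix can be written as below. The starting point \(\varPi'^{0} = I\) is an assumption made here, chosen so that the expansion yields the three items discussed later; the subscript i is dropped for readability.

\[
\varPi'^{l} = (1-c)\,P\,\varPi'^{l-1} + cI, \qquad \varPi'^{0} = I
\]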
where c is the restart probability, P is the transition matrix of G and I is the identity matrix. Then we have
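One way to carry out this step is sketched here. It assumes the recursion \(\varPi'^{l} = (1-c)P\varPi'^{l-1} + cI\) with \(\varPi'^{0} = I\); the initial condition is an assumption, chosen to match the three items named in the text. Unrolling the recursion l times and regrouping the sum gives:

\[
\begin{aligned}
\varPi'^{l} &= (1-c)^{l}P^{l} + \sum_{\gamma=0}^{l-1} c(1-c)^{\gamma}P^{\gamma} \\
&= \sum_{\gamma=1}^{l} c(1-c)^{\gamma}P^{\gamma} + (1-c)^{l}P^{l} - c(1-c)^{l}P^{l} + cI \\
&= \sum_{\gamma=1}^{l} c(1-c)^{\gamma}P^{\gamma} + (1-c)^{l+1}P^{l} + cI
\end{aligned}
\]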
The last line of Eq. (7) contains three items. The first item is the vertex relevance matrix \({\varPi ^l_i}\) using the expected f-distance. The third item, cI, affects only the diagonal entries of the vertex relevance matrix and is ignored, since we do not consider vertex self-relevance. Hence, the difference between random walk with restart and the expected f-distance lies entirely in the second item, \((1-c)^{l+1}P^l_i\). When l goes to infinity, this item vanishes, because every entry of \(P^l_i\) is at most 1 and \((1-c)^{l+1}\) tends to 0 for \(0 < c \le 1\); the vertex relevance matrices using the expected f-distance and random walk with restart are then the same except for the diagonal entries. Even when l is small, the corresponding entries of the two matrices do not differ much, since \((1-c)^{l+1}P^l\) is very small compared with \(\varPi ^l = \sum _{\gamma =1}^{l}c(1-c)^\gamma P^\gamma \).
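The three-item decomposition can also be checked numerically. The sketch below is illustrative only. It assumes the recursion \(\varPi'^{l} = (1-c)P\varPi'^{l-1} + cI\) started from \(\varPi'^{0} = I\), uses a hypothetical four-vertex toy graph, and picks c = 0.15 arbitrarily. All function names are introduced here and do not come from the paper.

```python
# Illustrative check of: RWR_l = F_l + (1-c)^(l+1) * P^l + c*I
# Assumption (not stated in the text): the RWR recursion starts from the identity.
# The toy graph and the value of C are hypothetical.


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def transition_matrix(adj):
    """Row-normalise an adjacency matrix (every vertex needs at least one edge)."""
    return [[x / sum(row) for x in row] for row in adj]


def matrix_power(p, l):
    result = identity(len(p))
    for _ in range(l):
        result = matmul(result, p)
    return result


def rwr(p, c, l):
    """Iterate Pi' <- (1-c) * P * Pi' + c * I for l steps, starting from I."""
    pi = identity(len(p))
    for _ in range(l):
        pi = add(scale(1 - c, matmul(p, pi)), scale(c, identity(len(p))))
    return pi


def expected_f_distance(p, c, l):
    """First item: sum over gamma = 1..l of c * (1-c)^gamma * P^gamma."""
    total = scale(0.0, p)
    for gamma in range(1, l + 1):
        total = add(total, scale(c * (1 - c) ** gamma, matrix_power(p, gamma)))
    return total


def second_item(p, c, l):
    """Second item: (1-c)^(l+1) * P^l."""
    return scale((1 - c) ** (l + 1), matrix_power(p, l))


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


ADJ = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]]
P = transition_matrix(ADJ)
C = 0.15


def decomposition_gap(l):
    """Largest entrywise gap between RWR and the sum of the three items."""
    rhs = add(add(expected_f_distance(P, C, l), second_item(P, C, l)),
              scale(C, identity(len(P))))
    return max_abs_diff(rwr(P, C, l), rhs)


if __name__ == "__main__":
    for l in (1, 2, 5, 10):
        largest = max(max(row) for row in second_item(P, C, l))
        print(l, decomposition_gap(l), largest)
```

Running the script prints, for each l, the gap between the two sides of the decomposition and the largest entry of the second item, so the decay of that item with growing l can be inspected directly.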
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Liu, Z., Guo, S., Li, T., Chen, W. (2016). Identifying Relevant Subgraphs in Large Networks. In: Morishima, A., et al. Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9865. Springer, Cham. https://doi.org/10.1007/978-3-319-45835-9_13
DOI: https://doi.org/10.1007/978-3-319-45835-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45834-2
Online ISBN: 978-3-319-45835-9
eBook Packages: Computer Science, Computer Science (R0)