Abstract
The big data era places extensive demands on data management and data analysis. Heterogeneous information networks (HINs) are widely used as data models because of their rich semantics for expressing complex data correlations. Many data mining, data analysis, and machine learning algorithms require similarity measures rather than exact matches. Graph edit distance (GED) is one feasible method for measuring HIN similarity. In this paper, we first extend the concept of GED from homogeneous graphs to heterogeneous information networks by introducing newly defined edit operations. We then propose a metapath-based approximation method to improve the performance of full-database similarity search, in which an upper bound and a lower bound, both computable in polynomial time, are used as filters. Finally, comprehensive experimental results show that the proposed method outperforms the existing method in terms of computational efficiency, bound tightness, and similarity filtering capability.
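The role of the two bounds as filters can be illustrated with a generic filter-and-verify loop. The sketch below is not the paper's algorithm: the function name `similarity_search`, the threshold `tau`, and the three distance callables are hypothetical, and the toy demonstration uses integers with an absolute-difference distance as a stand-in for HINs and GED. It only shows how a cheap lower bound can discard candidates and a cheap upper bound can accept them, so that the expensive exact distance is computed only for the undecided ones.

```python
from typing import Callable, Iterable, List, TypeVar

G = TypeVar("G")  # stand-in for a graph/HIN type (assumption)


def similarity_search(
    query: G,
    database: Iterable[G],
    tau: float,
    lower_bound: Callable[[G, G], float],
    upper_bound: Callable[[G, G], float],
    exact_distance: Callable[[G, G], float],
) -> List[G]:
    """Return all items whose exact distance to `query` is at most `tau`.

    Assumes lower_bound(a, b) <= exact_distance(a, b) <= upper_bound(a, b).
    """
    results = []
    for candidate in database:
        # Lower bound already exceeds tau: the exact distance must too.
        if lower_bound(query, candidate) > tau:
            continue
        # Upper bound within tau: the exact distance must be as well.
        if upper_bound(query, candidate) <= tau:
            results.append(candidate)
            continue
        # Undecided by the bounds: fall back to the expensive computation.
        if exact_distance(query, candidate) <= tau:
            results.append(candidate)
    return results


# Toy stand-in (not graphs): integers with absolute difference as distance.
exact_calls = []


def toy_exact(a: int, b: int) -> int:
    exact_calls.append((a, b))  # record how often the exact distance runs
    return abs(a - b)


def toy_lower(a: int, b: int) -> int:
    return abs(a - b) // 2  # never exceeds the exact distance


def toy_upper(a: int, b: int) -> int:
    return abs(a - b) * 2  # never falls below the exact distance


matches = similarity_search(4, [0, 3, 5, 9, 20], 2, toy_lower, toy_upper, toy_exact)
print(matches, len(exact_calls))
```

In this toy run, only the candidates that neither bound can decide reach the exact computation; the tighter the bounds, the fewer such candidates remain.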
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Lu, J., Lu, N., Ma, S., Zhang, B. (2018). Edit Distance Based Similarity Search of Heterogeneous Information Networks. In: Hartmann, S., Ma, H., Hameurlain, A., Pernul, G., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2018. Lecture Notes in Computer Science, vol 11030. Springer, Cham. https://doi.org/10.1007/978-3-319-98812-2_16
DOI: https://doi.org/10.1007/978-3-319-98812-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98811-5
Online ISBN: 978-3-319-98812-2
eBook Packages: Computer Science (R0)