Abstract
In the real world, a heterogeneous information network (HIN) is often dynamic, because the features of its nodes vary over time, and uncertain, because of missing values and noise. In this paper, we investigate the problem of reducing the uncertainty of a dynamic HIN, which is an important task for HIN analysis. The challenges are threefold: the heterogeneity of features, the heterogeneity of constraints, and the dynamic uncertainty. We propose a novel approach, called fusing reconstruction (FRec), which reconstructs the uncertain snapshots of a dynamic HIN in a homogeneous feature space by combining two fusions: the fusion of heterogeneous features and the fusion of heterogeneous constraints. To address the heterogeneity of features, we propose an invertible fusing transformation (IFT) as the first part of FRec. IFT is a bidirectional transformation that learns unified latent homogeneous feature representations for heterogeneous nodes and, because it is invertible, can transform them back to the raw heterogeneous feature space. To address the heterogeneity of constraints and the dynamic uncertainty, we propose a heterogeneous constraints fusion based tensor reconstruction model (HCF-TRM) as the second part of FRec. HCF-TRM denoises the uncertain snapshots of a dynamic HIN and recovers the missing values by fusing the spatial smoothness constraint and the temporal smoothness constraint into the tensor reconstruction. Finally, extensive experiments conducted on real and synthetic datasets verify the effectiveness and scalability of FRec.
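To make the idea of fusing spatial and temporal smoothness into a reconstruction more concrete, the following is a minimal sketch and not the HCF-TRM model itself. It assumes the snapshots are already in a shared feature space (so it skips the IFT step), omits any low-rank tensor structure, and uses plain gradient descent on a quadratic objective. The function names `objective` and `reconstruct`, the weights `alpha` and `beta`, and the use of a graph Laplacian `L` for spatial smoothness are all illustrative assumptions.

```python
import numpy as np

def objective(X, Y, mask, L, alpha, beta):
    """Assumed objective: masked data fit + temporal + spatial smoothness."""
    Y0 = np.where(mask, Y, 0.0)  # ignore whatever is stored at missing entries
    fit = 0.5 * np.sum((mask * (X - Y0)) ** 2)
    temporal = 0.5 * alpha * np.sum((X[1:] - X[:-1]) ** 2)
    spatial = 0.5 * beta * np.einsum('tid,ij,tjd->', X, L, X)
    return fit + temporal + spatial

def reconstruct(Y, mask, L, alpha=1.0, beta=1.0, steps=500):
    """Y: (T, N, D) noisy snapshots; mask: True where observed;
    L: (N, N) symmetric graph Laplacian over the N nodes."""
    Y0 = np.where(mask, Y, 0.0)
    X = Y0.copy()  # start from the zero-filled observations
    # Step size from an upper bound on the gradient's Lipschitz constant:
    # 1 (data term) + 4*alpha (temporal chain) + beta * lambda_max(L).
    lr = 1.0 / (1.0 + 4.0 * alpha + beta * np.linalg.eigvalsh(L)[-1])
    for _ in range(steps):
        grad = mask * (X - Y0)           # data-fit gradient on observed entries
        diff = X[1:] - X[:-1]            # differences between adjacent snapshots
        grad[:-1] -= alpha * diff        # temporal smoothness gradient
        grad[1:] += alpha * diff
        grad += beta * np.einsum('ij,tjd->tid', L, X)  # spatial smoothness gradient
        X -= lr * grad
    return X
```

In this sketch, missing entries are filled in only through the two smoothness terms, while observed entries are pulled toward their noisy values; raising `alpha` or `beta` trades fidelity to the observations for smoother reconstructions.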








Acknowledgements
This work is supported by the National Science Foundation of China through grants 61173099, 61672313, and U1333113, and in part by the NSF through grants IIS-1526499 and CNS-1626432.
Additional information
Responsible editor: Charu Aggarwal.
Cite this article
Yang, N., He, L., Li, Z. et al. Reducing uncertainty of dynamic heterogeneous information networks: a fusing reconstructing approach. Data Min Knowl Disc 31, 879–906 (2017). https://doi.org/10.1007/s10618-017-0492-3
DOI: https://doi.org/10.1007/s10618-017-0492-3