Abstract
Most unsupervised network feature selection models are guided jointly by first-order proximity and a reconstruction loss. However, first-order proximity is very sparse and, in most cases, insufficient. Moreover, redundant features, which can significantly hamper the performance of many machine learning algorithms, have seldom been taken into account. To address these issues, we propose an unsupervised network feature selection model called Multiple order proximity and feature Diversity guiding network Feature Selection model (MDFS), which uses multiple-order proximity and feature diversity to guide the selection process. We use multi-order proximities based on the random walk model to capture linkage information between nodes, and an auto-encoder to capture the content information of nodes. Finally, we design a redundancy loss to discourage the selection of highly overlapping features. Experimental results on two real-world network datasets show that our model selects high-quality features and is competitive with state-of-the-art models.
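The abstract does not state the formulas behind the multi-order proximity or the redundancy loss, so the following is only a minimal sketch of the two ideas under stated assumptions, not the MDFS formulation itself. It assumes that multi-order proximity can be approximated by averaging powers of the random-walk transition matrix, and that redundancy can be scored by the absolute cosine similarity between selected feature columns. The function names `multi_order_proximity` and `redundancy_penalty`, the `max_order` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def multi_order_proximity(adj, max_order=3):
    """Average the first `max_order` powers of the random-walk transition matrix.

    Assumption: this averaging scheme is illustrative, not the paper's definition.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # isolated nodes keep an all-zero row
    trans = adj / deg  # row-normalised one-step transition probabilities
    power = np.eye(adj.shape[0])
    total = np.zeros_like(adj)
    for _ in range(max_order):
        power = power @ trans  # k-step transition probabilities
        total += power
    return total / max_order


def redundancy_penalty(features, weights):
    """Weighted sum of absolute cosine similarities between distinct feature columns.

    Assumption: cosine similarity stands in for whatever overlap measure the
    paper's redundancy loss actually uses.
    """
    x = np.asarray(features, dtype=float)
    w = np.asarray(weights, dtype=float)
    norms = np.linalg.norm(x, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for empty columns
    xn = x / norms
    sim = np.abs(xn.T @ xn)
    np.fill_diagonal(sim, 0.0)  # a feature is not redundant with itself
    return float(w @ sim @ w)


# Toy path graph 0 - 1 - 2: nodes 0 and 2 share no edge.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
prox = multi_order_proximity(adj, max_order=2)

# Toy feature matrix: columns 0 and 1 are identical, column 2 differs.
feats = np.array([[1, 1, 0],
                  [2, 2, 1],
                  [3, 3, 0]])
overlapping = redundancy_penalty(feats, [1, 1, 0])
diverse = redundancy_penalty(feats, [1, 0, 1])
```

In this toy example, nodes 0 and 2 have zero first-order proximity, yet `prox[0, 2]` is positive because a two-step walk connects them, which illustrates why higher-order proximity is less sparse. Likewise, selecting the two identical feature columns yields a larger penalty than selecting two distinct ones.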
Acknowledgements
This work was partly supported by the National Natural Science Foundation of China under Grants No. 61572002, No. 61170300, No. 61690201, and No. 61732001.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, H., Li, Y., Zhao, C., Mu, K. (2019). An Approach with Low Redundancy to Network Feature Selection Based on Multiple Order Proximity. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_14
DOI: https://doi.org/10.1007/978-3-030-29911-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8
eBook Packages: Computer Science, Computer Science (R0)