
Unsupervised Hierarchical Feature Selection on Networked Data

  • Conference paper

In: Database Systems for Advanced Applications (DASFAA 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12114)

Abstract

Networked data is commonly observed in many high-impact domains, ranging from social networks and collaboration platforms to biological systems. In such systems, the nodes are often associated with high-dimensional features while remaining connected to each other through pairwise interactions. Recently, various unsupervised feature selection methods have been developed to distill actionable insights from such data by finding a subset of relevant features that are highly correlated with the observed node connections. Although practically useful, those methods predominantly assume that the nodes of the network are organized in a flat structure, which is rarely the case in reality. In fact, the nodes in most, if not all, networks can be organized into a hierarchical structure. For example, in a collaboration network, researchers can be clustered into different research areas at the coarsest level and further divided into sub-areas at a finer level. Recent studies have shown that such hierarchical structure can help advance various learning problems, including clustering and matrix completion. Motivated by these successes, in this paper we propose HNFS, a novel unsupervised feature selection framework for networked data. HNFS simultaneously learns the implicit hierarchical structure among the nodes and embeds this structure into the feature selection process. Empirical evaluations on various real-world datasets validate the superiority of our proposed framework.
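The abstract does not reproduce the paper's formulation, but the general idea of coupling an implicit node hierarchy with row-sparse feature selection can be illustrated with a small sketch. The snippet below is an assumption-laden toy pipeline, not the authors' HNFS algorithm: it learns a two-level hierarchy by factorizing the adjacency matrix twice with off-the-shelf NMF, then scores features by regressing the resulting cluster indicators on the node features under an l2,1 (row-sparsity) penalty solved with a standard iteratively reweighted least-squares scheme in the spirit of Nie et al. (NIPS 2010). All names and parameters (hnfs_sketch, k_fine, k_coarse, alpha) are hypothetical.

```python
import numpy as np
from sklearn.decomposition import NMF


def hnfs_sketch(X, A, k_fine=20, k_coarse=5, alpha=1.0, n_irls=30, eps=1e-8):
    """Toy sketch (not the paper's joint optimization).
    X : (n, d) node-feature matrix
    A : (n, n) non-negative adjacency matrix
    Returns feature indices ranked by importance, plus the scores."""
    # 1. Fine-level node memberships: factorize the adjacency matrix.
    U1 = NMF(n_components=k_fine, init="nndsvda", max_iter=500).fit_transform(A)

    # 2. Coarse-level memberships: factorize U1 again, so each coarse
    #    cluster aggregates several fine clusters (a two-level hierarchy).
    G = NMF(n_components=k_coarse, init="nndsvda", max_iter=500).fit_transform(U1)

    # 3. Regress both hierarchy levels on the features with a row-sparse
    #    (l2,1) penalty on W, solved by iteratively reweighted least squares.
    T = np.hstack([U1, G])                 # (n, k_fine + k_coarse) pseudo-labels
    d = X.shape[1]
    W = np.zeros((d, T.shape[1]))
    D = np.eye(d)                          # IRLS reweighting matrix
    for _ in range(n_irls):
        W = np.linalg.solve(X.T @ X + alpha * D, X.T @ T)
        row_norms = np.linalg.norm(W, axis=1) + eps
        D = np.diag(1.0 / (2.0 * row_norms))

    scores = np.linalg.norm(W, axis=1)     # large rows of W mark useful features
    return np.argsort(-scores), scores
```

For example, one could call `ranked, scores = hnfs_sketch(X, A, k_fine=10, k_coarse=3)` on a small attributed graph and keep the top-ranked features. In a joint formulation such as the one the abstract describes, the hierarchy and the feature weights would be optimized together rather than in separate stages; the staged version here only illustrates how a learned hierarchy can inform feature scoring.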


Notes

  1. https://movie.douban.com/.

  2. https://github.com/thunlp/OpenNE/tree/master/data/wiki.

  3. http://dmml.asu.edu/users/xufei/datasets.html.

  4. https://www.aminer.cn/citation.


Acknowledgement

This work was supported by the National Natural Science Foundation of China (No. 61872287, No. 61532015, and No. 61872446), the Innovative Research Group of the National Natural Science Foundation of China (No. 61721002), the Innovation Research Team of the Ministry of Education (IRT_17R86), and the Project of China Knowledge Center for Engineering Science and Technology. This research was also funded by the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2018AAA0102900).

Author information


Corresponding author

Correspondence to Minnan Luo.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, Y., Chen, C., Luo, M., Li, J., Yan, C., Zheng, Q. (2020). Unsupervised Hierarchical Feature Selection on Networked Data. In: Nah, Y., Cui, B., Lee, S.W., Yu, J.X., Moon, Y.S., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_9


  • DOI: https://doi.org/10.1007/978-3-030-59419-0_9


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59418-3

  • Online ISBN: 978-3-030-59419-0

  • eBook Packages: Computer Science, Computer Science (R0)
