
Unsupervised Hierarchical Feature Selection on Networked Data

  • Conference paper

In: Database Systems for Advanced Applications (DASFAA 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12114)

Abstract

Networked data is commonly observed in many high-impact domains, ranging from social networks and collaboration platforms to biological systems. In such systems, the nodes are often associated with high-dimensional features while remaining connected to each other through pairwise interactions. Recently, various unsupervised feature selection methods have been developed to distill actionable insights from such data by finding a subset of relevant features that are highly correlated with the observed node connections. Although practically useful, those methods predominantly assume that the nodes of the network are organized in a flat structure, which is rarely the case in reality. In fact, the nodes in most, if not all, networks can be organized into a hierarchical structure. For example, in a collaboration network, researchers can be clustered into different research areas at the coarsest level and further divided into sub-areas at a finer level. Recent studies have shown that such hierarchical structure can help advance various learning problems, including clustering and matrix completion. Motivated by these successes, in this paper we propose HNFS, a novel unsupervised feature selection framework for networked data. HNFS simultaneously learns the implicit hierarchical structure among the nodes and embeds this structure into the feature selection process. Empirical evaluations on various real-world datasets validate the superiority of our proposed framework.
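The abstract does not reproduce the paper's formulation, but the general idea of coupling an implicit node hierarchy with row-sparse feature selection can be illustrated with a small sketch. The snippet below is an assumption-laden toy pipeline, not the authors' HNFS algorithm: it learns a two-level hierarchy by factorizing the adjacency matrix twice with off-the-shelf NMF, then scores features by regressing the resulting cluster indicators on the node features under an l2,1 (row-sparsity) penalty solved with a standard iteratively reweighted least-squares scheme in the spirit of Nie et al. (NIPS 2010). All names and parameters (hnfs_sketch, k_fine, k_coarse, alpha) are hypothetical.

```python
import numpy as np
from sklearn.decomposition import NMF


def hnfs_sketch(X, A, k_fine=20, k_coarse=5, alpha=1.0, n_irls=30, eps=1e-8):
    """Toy sketch (not the paper's joint optimization).
    X : (n, d) node-feature matrix
    A : (n, n) non-negative adjacency matrix
    Returns feature indices ranked by importance, plus the scores."""
    # 1. Fine-level node memberships: factorize the adjacency matrix.
    U1 = NMF(n_components=k_fine, init="nndsvda", max_iter=500).fit_transform(A)

    # 2. Coarse-level memberships: factorize U1 again, so each coarse
    #    cluster aggregates several fine clusters (a two-level hierarchy).
    G = NMF(n_components=k_coarse, init="nndsvda", max_iter=500).fit_transform(U1)

    # 3. Regress both hierarchy levels on the features with a row-sparse
    #    (l2,1) penalty on W, solved by iteratively reweighted least squares.
    T = np.hstack([U1, G])                 # (n, k_fine + k_coarse) pseudo-labels
    d = X.shape[1]
    W = np.zeros((d, T.shape[1]))
    D = np.eye(d)                          # IRLS reweighting matrix
    for _ in range(n_irls):
        W = np.linalg.solve(X.T @ X + alpha * D, X.T @ T)
        row_norms = np.linalg.norm(W, axis=1) + eps
        D = np.diag(1.0 / (2.0 * row_norms))

    scores = np.linalg.norm(W, axis=1)     # large rows of W mark useful features
    return np.argsort(-scores), scores
```

For example, one could call `ranked, scores = hnfs_sketch(X, A, k_fine=10, k_coarse=3)` on a small attributed graph and keep the top-ranked features. In a joint formulation such as the one the abstract describes, the hierarchy and the feature weights would be optimized together rather than in separate stages; the staged version here only illustrates how a learned hierarchy can inform feature scoring.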


Notes

  1. https://movie.douban.com/.

  2. https://github.com/thunlp/OpenNE/tree/master/data/wiki.

  3. http://dmml.asu.edu/users/xufei/datasets.html.

  4. https://www.aminer.cn/citation.


Acknowledgement

This work was supported by the National Natural Science Foundation of China (No. 61872287, No. 61532015, and No. 61872446), the Innovative Research Group of the National Natural Science Foundation of China (No. 61721002), the Innovation Research Team of the Ministry of Education (IRT_17R86), and the Project of China Knowledge Center for Engineering Science and Technology. This research was also funded by the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2018AAA0102900).

Author information


Corresponding author

Correspondence to Minnan Luo.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, Y., Chen, C., Luo, M., Li, J., Yan, C., Zheng, Q. (2020). Unsupervised Hierarchical Feature Selection on Networked Data. In: Nah, Y., Cui, B., Lee, S.W., Yu, J.X., Moon, Y.S., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_9


  • DOI: https://doi.org/10.1007/978-3-030-59419-0_9


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59418-3

  • Online ISBN: 978-3-030-59419-0

  • eBook Packages: Computer Science, Computer Science (R0)
