Low-Rank Feature Reduction and Sample Selection for Multi-output Regression

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10086)

Abstract

High-dimensional observations usually contain various inherent relational structures, and exploiting them is crucial for multi-output regression. This paper therefore proposes a new multi-output regression method that simultaneously accounts for three kinds of relational structure: between outputs, between features and outputs, and between samples. Specifically, the method captures the correlation among output variables with a low-rank constraint, identifies the correlation between features and outputs by imposing an \(\ell _{2,1}\)-norm regularization on the coefficient matrix to conduct feature selection, and discovers the correlation among samples by applying the \(\ell _{2,1}\)-norm to the loss function to conduct sample selection. Furthermore, an effective iterative optimization algorithm is proposed to solve the resulting convex but non-smooth objective function. Experimental results on several real datasets show that the proposed method outperforms all comparison algorithms in terms of aCC and aRMSE.
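The three structures described in the abstract can be illustrated with a minimal numerical sketch. This is not the paper's actual formulation; it is a hypothetical objective assuming the common convention that the \(\ell_{2,1}\)-norm is the sum of row-wise Euclidean norms, and using the nuclear norm as the usual convex surrogate for the low-rank constraint. The function names, `lam1`, and `lam2` are illustrative only.

```python
import numpy as np

def l21_norm(M):
    """l2,1-norm: sum of the Euclidean norms of the rows of M."""
    return np.sqrt((M ** 2).sum(axis=1)).sum()

def objective(X, Y, W, lam1=1.0, lam2=1.0):
    """Hypothetical objective combining the three structures from the
    abstract (a sketch, not the authors' exact formulation):
      - l2,1 loss on the residual -> row sparsity -> sample selection
      - l2,1 regularizer on W     -> row sparsity -> feature selection
      - nuclear norm of W         -> convex surrogate for low rank,
                                     encoding output correlations
    """
    loss = l21_norm(X @ W - Y)
    reg_feat = l21_norm(W)
    reg_rank = np.linalg.norm(W, "nuc")  # sum of singular values
    return loss + lam1 * reg_feat + lam2 * reg_rank
```

Because all three terms are convex but the two \(\ell_{2,1}\) terms are non-smooth at rows of zeros, a plain gradient method does not apply directly, which is why the paper resorts to an iterative reweighted-style optimization scheme.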



Acknowledgement

This work was supported in part by the China “1000-Plan” National Distinguished Professorship; the National Natural Science Foundation of China (Grants No: 61263035, 61573270, and 61672177); the China 973 Program (Grant No: 2013CB329404); the China Key Research Program (Grant No: 2016YFB1000905); the Guangxi Natural Science Foundation (Grant No: 2015GXNSFCB139011); the Innovation Project of Guangxi Graduate Education (Grants No: YCSZ2016046 and YCSZ2016045); the Guangxi Higher Institutions Program of Introducing 100 High-Level Overseas Talents; the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing; and the Guangxi Bagui Scholar Teams for Innovation and Research Project.

Author information

Corresponding author

Correspondence to Shichao Zhang.


Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Zhang, S., Yang, L., Li, Y., Luo, Y., Zhu, X. (2016). Low-Rank Feature Reduction and Sample Selection for Multi-output Regression. In: Li, J., Li, X., Wang, S., Li, J., Sheng, Q. (eds) Advanced Data Mining and Applications. ADMA 2016. Lecture Notes in Computer Science, vol 10086. Springer, Cham. https://doi.org/10.1007/978-3-319-49586-6_9

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-49586-6_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49585-9

  • Online ISBN: 978-3-319-49586-6

  • eBook Packages: Computer Science (R0)
