Improving recommendation quality through outlier removal

Xu, Yuan-Yuan; Gu, Shen-Ming; Min, Fan

doi:10.1007/s13042-021-01490-7

Original Article
Published: 19 January 2022

Volume 13, pages 1819–1832, (2022)

International Journal of Machine Learning and Cybernetics

Abstract

Rating data collected by recommendation systems contain noise caused by human uncertainty and malicious attacks. Existing outlier removal approaches usually aim at detecting noise inserted into ground-truth ratings. However, in real applications, the ground truth of the training data is unavailable, or even unimportant for the prediction task. In this paper, we propose an efficient and effective outlier removal algorithm to improve the quality of the training data. The noise is modeled by a mixture of Gaussian distributions, which can approximate any continuous distribution. First, we employ the expectation-maximization algorithm to calculate the low-rank matrices, whose product forms the recovered ratings. Second, we compare the original and recovered ratings to identify suspected outliers. This process is repeated a number of times, and ratings that are suspected enough times are treated as outliers. To validate the effectiveness of our algorithm, we compared the prediction quality of four popular recommendation algorithms with and without outlier removal. Results showed that several measures on the algorithms were improved with the new training data.
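
The pipeline described in the abstract (fit low-rank factors under mixture-of-Gaussians noise, compare original and recovered ratings, repeat, and keep the ratings suspected often enough) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not give the update rules, so the factor update below uses plain weighted gradient steps in place of the paper's EM derivation, and every name and hyperparameter (`recover_ratings`, `suspects`, `vote`, `detect_outliers`, `rank`, `lr`, `var_floor`, `threshold`, `min_votes`, and the use of a different random initialization in each repetition) is an assumption made for illustration.

```python
import math
import random
from collections import Counter


def recover_ratings(ratings, n_users, n_items, rank=1, n_components=2,
                    n_iters=50, lr=0.05, var_floor=0.25, seed=0):
    """Fit low-rank factors U, V under zero-mean mixture-of-Gaussians noise.

    `ratings` maps (user, item) -> rating and must be non-empty. Returns the
    recovered rating for every observed pair. All defaults are illustrative
    assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0.5, 1.5) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(0.5, 1.5) for _ in range(rank)] for _ in range(n_items)]
    mix = [1.0 / n_components] * n_components        # mixing proportions
    var = [float(k + 1) for k in range(n_components)]  # component variances

    def predict(u, i):
        return sum(a * b for a, b in zip(U[u], V[i]))

    for _ in range(n_iters):
        # E-step: responsibility of each noise component for each residual.
        resid = {key: r - predict(*key) for key, r in ratings.items()}
        resp = {}
        for key, e in resid.items():
            dens = [m * math.exp(-e * e / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for m, v in zip(mix, var)]
            total = sum(dens)
            resp[key] = ([d / total for d in dens] if total > 0
                         else [1.0 / n_components] * n_components)
        # M-step (noise): update mixing proportions and variances.
        for k in range(n_components):
            nk = sum(g[k] for g in resp.values())
            mix[k] = nk / len(ratings)
            if nk > 0:
                sq = sum(resp[key][k] * resid[key] * resid[key]
                         for key in ratings)
                var[k] = max(sq / nk, var_floor)
        # M-step (factors), simplified: one pass of weighted gradient steps.
        # Ratings explained by a wide component receive a small weight.
        weight = {key: sum(g / v for g, v in zip(resp[key], var))
                  for key in ratings}
        w_max = max(weight.values())
        for (u, i), r in ratings.items():
            step = lr * (weight[(u, i)] / w_max) * (r - predict(u, i))
            for f in range(rank):
                u_f, v_f = U[u][f], V[i][f]
                U[u][f] += step * v_f
                V[i][f] += step * u_f
    return {key: predict(*key) for key in ratings}


def suspects(ratings, recovered, threshold):
    """Ratings whose recovered value is more than `threshold` away."""
    return {key for key, r in ratings.items()
            if abs(r - recovered[key]) > threshold}


def vote(suspect_sets, min_votes):
    """Keys that were suspected in at least `min_votes` repetitions."""
    counts = Counter(key for s in suspect_sets for key in s)
    return {key for key, c in counts.items() if c >= min_votes}


def detect_outliers(ratings, n_users, n_items, n_rounds=5,
                    threshold=1.0, min_votes=3):
    """Repeat recovery with different seeds and vote on the suspects."""
    rounds = [suspects(ratings,
                       recover_ratings(ratings, n_users, n_items, seed=s),
                       threshold)
              for s in range(n_rounds)]
    return vote(rounds, min_votes)
```

A caller would pass a dictionary of observed ratings, for example `detect_outliers({(0, 0): 4.0, (0, 1): 4.0, (1, 0): 4.0, (1, 1): 1.0}, n_users=2, n_items=2)`, and then drop the returned keys from the training data before fitting a recommender. Keeping `suspects` and `vote` separate from the fitting routine means a different recovery model could be substituted without touching the voting logic.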

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China (61976194, 41631179), the Open Project of the Key Laboratory of Oceanographic Big Data Mining and Application of Zhejiang Province (OBMA202005), the Zhejiang Provincial Natural Science Foundation of China (LY18F030017), and the Natural Science Foundation of Sichuan Province (2019YJ0314).

Author information

Authors and Affiliations

School of Computer Science, Southwest Petroleum University, Chengdu, 610500, People’s Republic of China
Yuan-Yuan Xu & Fan Min
Key Laboratory of Oceanographic Big Data Mining & Application of Zhejiang Province, Zhejiang Ocean University, Zhoushan, 316022, People’s Republic of China
Shen-Ming Gu

Corresponding author

Correspondence to Fan Min.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, YY., Gu, SM. & Min, F. Improving recommendation quality through outlier removal. Int. J. Mach. Learn. & Cyber. 13, 1819–1832 (2022). https://doi.org/10.1007/s13042-021-01490-7

Received: 27 May 2021
Accepted: 06 December 2021
Published: 19 January 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s13042-021-01490-7
