Real-time recommendation with locality sensitive hashing

Journal of Intelligent Information Systems

Abstract

Neighborhood-based collaborative filtering (CF) methods are widely used in recommender systems because they are easy to implement and highly effective. A significant challenge for these methods is scaling with the growing amount of data, since finding nearest neighbors requires a search over all of the data. Approximate nearest neighbor (ANN) methods avoid this exhaustive search by examining only the data points that are likely to be similar. Locality sensitive hashing (LSH) is a well-known technique for ANN search in high-dimensional spaces, and it is also effective in addressing the scalability problem of neighborhood-based CF. In this study, we propose novel improvements to current LSH-based recommender algorithms and systematically evaluate LSH in neighborhood-based CF. We also conduct extensive experiments on real-life datasets to investigate various LSH parameters and their effects on multiple metrics used to evaluate recommender systems. Our proposed algorithms run faster than standard LSH-based applications while keeping prediction accuracy within reasonable limits. The proposed algorithms also have a large positive impact on aggregate diversity, which has recently become an important evaluation measure for recommender algorithms.
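
To make the idea of replacing an exhaustive neighbor search with hashing concrete, the following is a minimal sketch of random hyperplane LSH for cosine similarity, one common LSH family. It is not the algorithm proposed in this article. The class name `HyperplaneLSH`, the parameters `num_tables` and `num_bits`, and the toy ratings are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (dict: item -> rating)."""
    dot = sum(r * v[i] for i, r in u.items() if i in v)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class HyperplaneLSH:
    """Hypothetical helper: several hash tables, each keyed by a short bit signature."""

    def __init__(self, items, num_tables=4, num_bits=3, seed=0):
        rng = random.Random(seed)
        # planes[t][b] maps item -> Gaussian coefficient of hyperplane b in table t.
        self.planes = [
            [{i: rng.gauss(0.0, 1.0) for i in items} for _ in range(num_bits)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(set) for _ in range(num_tables)]

    def _signature(self, table, vector):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(
            sum(r * plane[i] for i, r in vector.items()) >= 0.0
            for plane in self.planes[table]
        )

    def index(self, user, vector):
        for t, buckets in enumerate(self.tables):
            buckets[self._signature(t, vector)].add(user)

    def candidates(self, vector):
        # Union of the buckets the vector hashes to; no scan over all users.
        found = set()
        for t, buckets in enumerate(self.tables):
            found |= buckets.get(self._signature(t, vector), set())
        return found


# Toy data, invented for this example.
ratings = {
    "ann": {"a": 5, "b": 4, "c": 1},
    "bob": {"a": 5, "b": 5, "c": 1},
    "cat": {"c": 5, "d": 4},
    "dev": {"a": 1, "c": 4, "d": 5},
}
items = sorted({i for vec in ratings.values() for i in vec})
lsh = HyperplaneLSH(items)
for user, vec in ratings.items():
    lsh.index(user, vec)


def neighbors(user, k=2):
    """Rank only the LSH candidates by exact cosine similarity."""
    pool = lsh.candidates(ratings[user]) - {user}
    return sorted(
        pool, key=lambda other: cosine(ratings[user], ratings[other]), reverse=True
    )[:k]


print(neighbors("ann"))
```

In this sketch, more bits per signature make buckets smaller, so fewer candidates are compared, while more tables raise the chance that a truly similar user shares at least one bucket. This is the kind of parameter trade-off the abstract refers to.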

Notes

  1. sparsity of a dataset = #ratings / (#items × #users)

  2. http://research.yahoo.com/Academic_Relations

  3. http://jmcauley.ucsd.edu/data/amazon/
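
The ratio in note 1 can be computed directly. The sketch below uses invented counts, and the function name `rating_ratio` is hypothetical.

```python
def rating_ratio(num_ratings, num_items, num_users):
    """Ratio from note 1: #ratings divided by the size of the user-item matrix."""
    return num_ratings / (num_items * num_users)


# Invented counts for illustration: 1,000 ratings, 50 items, 100 users.
ratio = rating_ratio(num_ratings=1000, num_items=50, num_users=100)
print(ratio)
```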

Acknowledgements

This research was supported by the Central Securities Depository Institution (MKK) of Turkish Capital Markets.

Author information

Corresponding author

Correspondence to Ahmet Maruf Aytekin.

About this article

Cite this article

Aytekin, A.M., Aytekin, T. Real-time recommendation with locality sensitive hashing. J Intell Inf Syst 53, 1–26 (2019). https://doi.org/10.1007/s10844-019-00552-1

  • DOI: https://doi.org/10.1007/s10844-019-00552-1
