DOI: 10.1145/3459637.3482414

BNN: Boosting Neural Network Framework Utilizing Limited Amount of Data

Published: 30 October 2021

ABSTRACT

Deep learning (DL) algorithms have played a major role in achieving state-of-the-art (SOTA) performance in various learning applications, including computer vision, natural language processing, and recommendation systems (RSs). However, these methods rely on vast amounts of data and do not perform as well when only a limited amount of data is available. Moreover, some of these applications (e.g., RSs) suffer from other issues such as data sparsity and the cold-start problem. While recent research on RSs has used DL models based on side information (SI) (e.g., product reviews, film plots, etc.) to tackle these challenges, we propose boosting neural network (BNN), a new DL framework for capturing complex patterns that requires only a limited amount of data. Unlike conventional boosting, BNN does not sum the predictions generated by its components. Instead, it uses these predictions as new SI features, which enhances accuracy. Our framework can be utilized for many problems, including classification, regression, and ranking. In this paper, we demonstrate BNN's use for addressing a classification task. Comprehensive experiments conducted to illustrate BNN's effectiveness on three real-world datasets demonstrated its ability to outperform existing SOTA models for classification tasks (e.g., click-through rate prediction).
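To make the abstract's central idea concrete, the following is a minimal, hypothetical sketch of a boosting-style chain in which each component's predictions are appended to the feature matrix as new SI features for the next component, rather than being summed. It is not the authors' implementation; the function names, the use of scikit-learn MLPs, and all hyperparameters are assumptions made purely for illustration.

```python
# Minimal, hypothetical sketch of the idea stated in the abstract: each
# boosting component's predictions are fed forward as new side-information
# (SI) features for the next component instead of being summed.
# All names, the choice of scikit-learn MLPs, and the hyperparameters below
# are assumptions for illustration, not the authors' implementation.
import numpy as np
from sklearn.neural_network import MLPClassifier

def fit_bnn_style_chain(X, y, n_components=3, seed=0):
    """Train a chain of small neural nets; component i sees the original
    features plus the predicted class probabilities of components 0..i-1."""
    X_aug = np.asarray(X, dtype=float)
    components = []
    for i in range(n_components):
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                            random_state=seed + i)
        clf.fit(X_aug, y)
        components.append(clf)
        # Append this component's predictions as new SI features (no summing).
        X_aug = np.hstack([X_aug, clf.predict_proba(X_aug)])
    return components

def predict_bnn_style_chain(components, X):
    """Score new samples by rebuilding the augmented feature matrix and
    returning the final component's positive-class probability (CTR-style)."""
    X_aug = np.asarray(X, dtype=float)
    for clf in components[:-1]:
        X_aug = np.hstack([X_aug, clf.predict_proba(X_aug)])
    return components[-1].predict_proba(X_aug)[:, 1]
```

Under these assumptions, the chain would be trained with fit_bnn_style_chain(X_train, y_train) and scored with predict_bnn_style_chain(components, X_test); the only point of the sketch is the feature-stacking step, which replaces the additive combination used in conventional boosting.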


Supplemental Material

CIKM21-rgfp1291.mp4 (mp4, 36.4 MB)



Published in

          CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
          October 2021
          4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 30 October 2021


          Qualifiers

          • research-article

          Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

