DOI: 10.1145/3459637.3482414

BNN: Boosting Neural Network Framework Utilizing Limited Amount of Data

Published: 30 October 2021

ABSTRACT

Deep learning (DL) algorithms have played a major role in achieving state-of-the-art (SOTA) performance in various learning applications, including computer vision, natural language processing, and recommendation systems (RSs). However, these methods rely on vast amounts of data and do not perform as well when only a limited amount of data is available. Moreover, some of these applications (e.g., RSs) suffer from other issues such as data sparsity and the cold-start problem. While recent research on RSs has used DL models based on side information (SI) (e.g., product reviews, film plots, etc.) to tackle these challenges, we propose boosting neural network (BNN), a new DL framework for capturing complex patterns that requires only a limited amount of data. Unlike conventional boosting, BNN does not sum the predictions generated by its components. Instead, it uses these predictions as new SI features, which enhances accuracy. Our framework can be utilized for many problems, including classification, regression, and ranking. In this paper, we demonstrate BNN's use for addressing a classification task. Comprehensive experiments conducted to illustrate BNN's effectiveness on three real-world datasets demonstrated its ability to outperform existing SOTA models for classification tasks (e.g., click-through rate prediction).
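To make the abstract's central idea concrete, the following is a minimal, hypothetical sketch of a boosting-style chain in which each component's predictions are appended to the feature matrix as new SI features for the next component, rather than being summed. It is not the authors' implementation; the function names, the use of scikit-learn MLPs, and all hyperparameters are assumptions made purely for illustration.

```python
# Minimal, hypothetical sketch of the idea stated in the abstract: each
# boosting component's predictions are fed forward as new side-information
# (SI) features for the next component instead of being summed.
# All names, the choice of scikit-learn MLPs, and the hyperparameters below
# are assumptions for illustration, not the authors' implementation.
import numpy as np
from sklearn.neural_network import MLPClassifier

def fit_bnn_style_chain(X, y, n_components=3, seed=0):
    """Train a chain of small neural nets; component i sees the original
    features plus the predicted class probabilities of components 0..i-1."""
    X_aug = np.asarray(X, dtype=float)
    components = []
    for i in range(n_components):
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300,
                            random_state=seed + i)
        clf.fit(X_aug, y)
        components.append(clf)
        # Append this component's predictions as new SI features (no summing).
        X_aug = np.hstack([X_aug, clf.predict_proba(X_aug)])
    return components

def predict_bnn_style_chain(components, X):
    """Score new samples by rebuilding the augmented feature matrix and
    returning the final component's positive-class probability (CTR-style)."""
    X_aug = np.asarray(X, dtype=float)
    for clf in components[:-1]:
        X_aug = np.hstack([X_aug, clf.predict_proba(X_aug)])
    return components[-1].predict_proba(X_aug)[:, 1]
```

Under these assumptions, the chain would be trained with fit_bnn_style_chain(X_train, y_train) and scored with predict_bnn_style_chain(components, X_test); the only point of the sketch is the feature-stacking step, which replaces the additive combination used in conventional boosting.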


Supplemental Material

CIKM21-rgfp1291.mp4 (mp4, 36.4 MB)



Published in

          CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
          October 2021
          4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 30 October 2021


          Qualifiers

          • research-article

          Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

