DOI: 10.1145/3543507.3583444
research-article

Automatic Feature Selection By One-Shot Neural Architecture Search In Recommendation Systems

Published: 30 April 2023

ABSTRACT

Feature selection is crucial in large-scale recommendation systems: it can not only reduce the computational cost but also improve recommendation efficiency. Most existing works rank the features and then select the top-k as the final feature subset. However, they assess feature importance individually and ignore the interrelationships between features. Consequently, multiple highly related features may be selected simultaneously, yielding a sub-optimal result. In this work, we solve this problem by proposing an AutoML-based feature selection framework that automatically searches for the optimal feature subset. Specifically, we first embed the search space into a weight-sharing Supernet. Then, a two-stage neural architecture search method is employed to evaluate feature quality. In the first stage, a well-designed sampling method that considers feature convergence fairness is applied to train the Supernet. In the second stage, a reinforcement learning method is used to search for the optimal feature subset efficiently. Experimental results on two real datasets demonstrate the superior performance of the new framework over other solutions. Our proposed method obtains a significant improvement with a 20% reduction in the number of features on the Criteo dataset. Further validation experiments demonstrate the capability and robustness of the framework.
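
The second-stage search can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not name the reinforcement learning algorithm, so a REINFORCE-style policy gradient over independent per-feature Bernoulli selection probabilities is assumed here, and `toy_reward` is a hypothetical stand-in for the score a trained Supernet would assign to a feature subset. All function names, utilities, and penalties below are invented for illustration.

```python
import math
import random


def toy_reward(mask, utilities, redundant_pairs, cost=0.05):
    """Hypothetical stand-in for a Supernet validation score of a feature subset."""
    score = sum(u for m, u in zip(mask, utilities) if m)
    # Penalise selecting two strongly related features together.
    for i, j, penalty in redundant_pairs:
        if mask[i] and mask[j]:
            score -= penalty
    # Small per-feature cost so that compact subsets are preferred.
    return score - cost * sum(mask)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reinforce_search(reward_fn, n_features, steps=3000, lr=0.1, seed=0):
    """Assumed REINFORCE-style search over binary feature masks."""
    rng = random.Random(seed)
    logits = [0.0] * n_features  # one selection logit per feature
    baseline = 0.0               # moving-average reward baseline
    for _ in range(steps):
        probs = [sigmoid(l) for l in logits]
        mask = [1 if rng.random() < p else 0 for p in probs]
        reward = reward_fn(mask)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # Gradient of log Bernoulli(mask_k | sigmoid(logit_k)) is mask_k - prob_k.
        for k in range(n_features):
            logits[k] += lr * advantage * (mask[k] - probs[k])
    # Keep the features whose selection probability ended above 0.5.
    return [1 if l > 0 else 0 for l in logits]


# Toy setup (all values are assumptions): features 0 and 1 are redundant,
# feature 2 is independently useful, feature 3 carries no signal.
utilities = [0.6, 0.5, 0.4, 0.0]
redundant_pairs = [(0, 1, 0.5)]
selected = reinforce_search(
    lambda mask: toy_reward(mask, utilities, redundant_pairs, cost=0.1),
    n_features=len(utilities),
)
print("selected mask:", selected)
```

Because the reward is evaluated on whole subsets rather than on single features, the redundancy penalty between features 0 and 1 enters the search directly. That is the kind of interaction a per-feature top-k ranking would not see.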

Supplemental Material

3583444_AutoFSS.mp4 (mp4, 14.1 MB)

Video summary of the WWW'23 paper: Automatic Feature Selection By One-Shot Neural Architecture Search In Recommendation Systems.

    • Published in

      WWW '23: Proceedings of the ACM Web Conference 2023
      April 2023
      4293 pages
      ISBN: 9781450394161
      DOI: 10.1145/3543507

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
