W2FM: The Doubly-Warped Factorization Machine

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12713)


Abstract

Factorization Machines (FMs) enhance an underlying linear regression or classification model by capturing feature interactions. Intuitively, FMs warp the feature space to help capture the underlying non-linear structure of the machine learning task. In this paper, we propose novel Doubly-Warped Factorization Machines (or \(\mathtt{W2FM}\)s) that leverage multiple complementary space-warping strategies to improve the representational ability of FMs. Our approach abstracts the feature interaction in FMs as additional affine transformations (thus warping the space), which can be learned efficiently without introducing large numbers of model parameters. We also explore alternative W2FM-based approaches and conduct extensive experiments on real-world data sets. These experiments show that \(\mathtt{W2FM}\) achieves better performance on the collaborative filtering task not only relative to vanilla FMs, but also against other state-of-the-art competitors, such as Attention FM (AFM), Holographic FM (HFM), and Neural FM (NFM).
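For context, the pairwise FM that this work builds on (Rendle [16]) scores an instance \(x \in \mathbb{R}^m\) as \(\hat{y}(x) = w_0 + \sum_i w_i x_i + \sum_{i<j} \langle v_i, v_j\rangle x_i x_j\), and the interaction term can be evaluated in \(O(km)\) time via a standard rearrangement. A minimal NumPy sketch of vanilla pairwise FM prediction (this illustrates the baseline FM only, not the W2FM warping; function and variable names are illustrative):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Pairwise FM prediction (Rendle 2010), using the O(k*m) identity
    sum_{i<j} <v_i, v_j> x_i x_j
      = 0.5 * sum_f [(sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2]."""
    linear = w0 + w @ x
    s = V.T @ x               # per-factor weighted sums, shape (k,)
    s2 = (V ** 2).T @ (x ** 2)  # per-factor sums of squares, shape (k,)
    pairwise = 0.5 * np.sum(s ** 2 - s2)
    return linear + pairwise

# Sanity check against the naive O(m^2) double loop.
rng = np.random.default_rng(0)
m, k = 6, 3
x = rng.normal(size=m)
w0, w, V = 0.1, rng.normal(size=m), rng.normal(size=(m, k))
naive = w0 + w @ x + sum(V[i] @ V[j] * x[i] * x[j]
                         for i in range(m) for j in range(i + 1, m))
assert np.isclose(fm_predict(x, w0, w, V), naive)
```

The rearranged form is what makes FMs practical on the sparse, high-dimensional inputs typical of collaborative filtering: it never materializes the \(m \times m\) interaction matrix.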

This work is supported by NSF (#1610282, #1633381, #1909555, #2026860, #1827757, #1629888), and EU H2020 Marie Skłodowska-Curie grant agreement #690817. Results were obtained using the ChameleonCloud resources supported by the NSF.


Notes

  1. Note that FMs can be generalized to higher degrees of feature interactions. In this paper, without loss of generality, we focus on pairwise FMs, which have been shown to be generally effective and are thus the most commonly used form of FMs; details can be found in [16].

  2. For other machine learning tasks, e.g., classification, log loss may be used.

  3. Note that for the very last column of \(V_1\), the indices run from \(a_k = (k-1)\lfloor \frac{m}{k}\rfloor + 1\) to \(b_k = m\).

  4. Here we use \({\mathtt{W2FM}}_{TR}\) for clarity; other \({\mathtt{W2FM}}\) variants can be integrated seamlessly.

  5. Support Vector Machine (SVM) with linear kernel [10]; Factorization Machine (FM, single-warping baseline) [16]; Attention FM (attention factor: 256, activation function: ReLU, dropout rate: 0.5, valid dimension: 2 (user id and item id)) [24]; Neural FM (dropout rate for the bi-interaction layer: 0.5, one hidden layer with 64 neurons and dropout rate 0.8, activation function: ReLU) [11]; and Holographic FM [23].

  6. Ciao (# of instances: 284K, # of features: 107K, density: 0.0003) [21], Epinions (# of instances: 922K, # of features: 141K, density: 0.0003) [22], MovieLens-100K (# of instances: 100K, # of features: 2273, density: 0.041) [9], and MovieLens-1M (# of instances: 1M, # of features: 9746, density: 0.059) [9].
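Footnote 3's indexing describes an even partition of the \(m\) columns of \(V_1\) into \(k\) blocks of width \(\lfloor \frac{m}{k}\rfloor\), with the last block absorbing the remainder so that \(b_k = m\). A minimal sketch of that indexing scheme (the helper name is illustrative, not from the paper):

```python
def column_blocks(m, k):
    """Split columns 1..m into k blocks of width floor(m/k);
    the last block absorbs the remainder, so b_k = m."""
    w = m // k
    blocks = [((j - 1) * w + 1, j * w) for j in range(1, k + 1)]
    a_k, _ = blocks[-1]
    blocks[-1] = (a_k, m)  # footnote 3: a_k = (k-1)*floor(m/k) + 1, b_k = m
    return blocks

print(column_blocks(10, 3))  # → [(1, 3), (4, 6), (7, 10)]
```

When \(k\) divides \(m\), all blocks have equal width; otherwise only the last block is wider, matching the footnote's special case.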

References

  1. Binois, M., Ginsbourger, D., Roustant, O.: A warped kernel improving robustness in Bayesian optimization via random embeddings. In: International Conference on Learning and Intelligent Optimization (2015)

  2. Blondel, M., Fujino, A., Ueda, N., Ishihata, M.: Higher-order factorization machines. In: NIPS 2016 (2016)

  3. Blondel, M., Ishihata, M., Fujino, A., Ueda, N.: Polynomial networks and factorization machines: new insights and efficient training algorithms. In: PMLR (2016)

  4. Chen, T., Yin, H., Nguyen, Q.V.H., Peng, W., Li, X., Zhou, X.: Sequence-aware factorization machines for temporal predictive analytics. In: ICDE 2020 (2020)

  5. Chen, X., Zheng, Y., Wang, J., Ma, W., Huang, J.: RaFM: rank-aware factorization machines. In: PMLR (2019)

  6. Cheng, H.T., et al.: Wide & deep learning for recommender systems. In: DLRS 2016 (2016)

  7. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)

  8. Grčar, M., Mladenič, D., Fortuna, B., Grobelnik, M.: Data sparsity issues in the collaborative filtering framework. In: Advances in Web Mining and Web Usage Analysis (2006)

  9. Harper, F.M., Konstan, J.A.: The MovieLens datasets: history and context. ACM Trans. Interact. Intell. Syst. 5(4), 1–19 (2015)

  10. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, New York (2009)

  11. He, X., Chua, T.S.: Neural factorization machines for sparse predictive analytics. In: SIGIR 2017 (2017)

  12. Juan, Y., Zhuang, Y., Chin, W.S., Lin, C.J.: Field-aware factorization machines for CTR prediction. In: RecSys 2016 (2016)

  13. Koren, Y.: Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: KDD 2008 (2008)

  14. Livni, R., Shalev-Shwartz, S., Shamir, O.: On the computational efficiency of training neural networks. In: NIPS 2014 (2014)

  15. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press, Cambridge (2005)

  16. Rendle, S.: Factorization machines. In: ICDM 2010. IEEE Computer Society (2010)

  17. Salakhutdinov, R., Mnih, A.: Probabilistic matrix factorization. In: NIPS 2007 (2007)

  18. Seal, H.L.: Studies in the history of probability and statistics. XV: the historical development of the Gauss linear model. Biometrika 54(1–2), 1–24 (1967)

  19. Shan, Y., Hoens, T.R., Jiao, J., Wang, H., Yu, D., Mao, J.: Deep crossing: web-scale modeling without manually crafted combinatorial features. In: KDD 2016 (2016)

  20. Snoek, J., Swersky, K., Zemel, R., Adams, R.P.: Input warping for Bayesian optimization of non-stationary functions. In: ICML 2014 (2014)

  21. Tang, J., Gao, H., Liu, H., Sarma, A.D.: eTrust: understanding trust evolution in an online world. In: KDD 2012 (2012)

  22. Tang, J., Hu, X., Gao, H., Liu, H.: Exploiting local and global social context for recommendation. In: IJCAI 2013 (2013)

  23. Tay, Y., Zhang, S., Luu, A.T., Hui, S.C., Yao, L., Vinh, T.D.Q.: Holographic factorization machines for recommendation. In: AAAI 2019 (2019)

  24. Xiao, J., Ye, H., He, X., Zhang, H., Wu, F., Chua, T.S.: Attentional factorization machines: learning the weight of feature interactions via attention networks. In: IJCAI 2017 (2017)


Author information

Corresponding author

Correspondence to Mao-Lin Li.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Li, M.-L., Candan, K.S. (2021). W2FM: The Doubly-Warped Factorization Machine. In: Karlapalem, K., et al. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol. 12713. Springer, Cham. https://doi.org/10.1007/978-3-030-75765-6_39

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-75765-6_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75764-9

  • Online ISBN: 978-3-030-75765-6

  • eBook Packages: Computer Science (R0)
