research-article

Low Rank Field-Weighted Factorization Machines for Low Latency Item Recommendation

Authors:

Alex Shtoff,

Michael Viderman,

Naama Haramaty-Krasne,

Oren Somekh,

Ariel Raviv,

Tularam BanAuthors Info & Claims

RecSys '24: Proceedings of the 18th ACM Conference on Recommender Systems

Pages 238 - 246

https://doi.org/10.1145/3640457.3688097

Published: 08 October 2024 Publication History

Get Access

Abstract

Factorization machine (FM) variants are widely used in recommendation systems that operate under strict throughput and latency requirements, such as online advertising systems. FMs have two prominent strengths. First, is their ability to model pairwise feature interactions while being resilient to data sparsity by learning factorized representations. Second, their computational graphs facilitate fast inference and training. Moreover, when items are ranked as a part of a query for each incoming user, these graphs facilitate computing the portion stemming from the user and context fields only once per query. Thus, the computational cost for each ranked item is proportional only to the number of fields that vary among the ranked items. Consequently, in terms of inference cost, the number of user or context fields is practically unlimited.

More advanced variants of FMs, such as field-aware and field-weighted FMs, provide better accuracy by learning a representation of field-wise interactions, but require computing all pairwise interaction terms explicitly. In particular, the computational cost during inference is proportional to the square of the number of fields, including user, context, and item. When the number of fields is large, this is prohibitive in systems with strict latency constraints, and imposes a limit on the number of user and context fields for a given computational budget. To mitigate this caveat, heuristic pruning of low intensity field interactions is commonly used to accelerate inference.

In this work we propose an alternative to the pruning heuristic in field-weighted FMs using a diagonal plus symmetric low-rank decomposition. Our technique reduces the computational cost of inference, by allowing it to be proportional to the number of item fields only. Using a set of experiments on real-world datasets, we show that aggressive rank reduction outperforms similarly aggressive pruning in both accuracy and item recommendation speed. Beyond computational complexity analysis, we corroborate our claim of faster inference experimentally, both via a synthetic test, and by having deployed our solution to a major online advertising system, where we observed significant ranking latency improvements. We have made the code to reproduce the results on public datasets and synthetic tests available at https://github.com/michaelviderman/pytorch-fm.

Supplemental Material

PDF File

Appendix

Download
491.23 KB

References

[1]

2014. 3 Idiots’ Approach for Display Advertising Challenge. https://github.com/ycjuan/kaggle-2014-criteo.git. Accessed: 2024-04-01.

Abstract

Supplemental Material

References

Index Terms

Recommendations

Attributes coupling based matrix factorization for item recommendation

Recommendation based on weighted social trusts and item relationships

Item recommendation in collaborative tagging systems via heuristic data fusion

Comments

Information

Published In

Sponsors

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Login options

Full Access

View options

PDF

eReader

HTML Format

Share

Share this Publication link

Share on social media

Affiliations