Multimodal Representation Learning-Based Product Matching

Feng, Changkai; Chen, Wei; Chen, Chao; Xu, Tong; Chen, Enhong

doi:10.1007/978-981-19-8300-9_20

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1711)

Included in the following conference series:

China Conference on Knowledge Graph and Semantic Computing

Abstract

This paper describes our methodology for the identical product mining task organized by the China Conference on Knowledge Graph and Semantic Computing (CCKS) 2022. The task poses two main challenges: (1) how to represent text so as to refine the product representation, and (2) how to combine text and image representations more effectively. For the first challenge, we propose the K-Gram Exponential Decay scheme in the text representation module to aggregate information from surrounding words. For the second, we apply conventional multimodal representation learning to combine the text and image representations into an item representation. We treat identical product mining as binary classification over product pairs and adopt sample pair-based contrastive learning for it. Extensive experiments demonstrate the effectiveness of our method. Using model ensembling and post-processing, we won first place in the competition.
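
The abstract names the K-Gram Exponential Decay scheme but does not give its formula. The sketch below is only one plausible reading, not the authors' implementation: each token vector is replaced by a normalised weighted average of its neighbours within k positions, with weights that shrink exponentially with distance. The function name, the parameters k and decay, and the normalisation step are all assumptions introduced for illustration.

```python
from typing import List

Vector = List[float]


def kgram_exponential_decay(
    tokens: List[Vector], k: int = 2, decay: float = 0.5
) -> List[Vector]:
    """Blend each token vector with neighbours up to k positions away.

    Hypothetical reading of the scheme: a neighbour at distance d gets
    weight decay**d (the token itself gets weight 1), and the weights are
    normalised to sum to 1 at every position.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")

    n = len(tokens)
    mixed_tokens: List[Vector] = []
    for i in range(n):
        # Clip the window at the sequence boundaries.
        lo, hi = max(0, i - k), min(n, i + k + 1)
        weights = [decay ** abs(i - j) for j in range(lo, hi)]
        total = sum(weights)
        dim = len(tokens[i])
        mixed = [
            sum(w * tokens[j][d] for w, j in zip(weights, range(lo, hi))) / total
            for d in range(dim)
        ]
        mixed_tokens.append(mixed)
    return mixed_tokens
```

Under these assumptions, setting k to 0 returns the input unchanged, while a larger k or a decay closer to 1 lets more distant words contribute to each position.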

Notes

1. https://kexue.fm/archives/8847

Acknowledgement

This work was supported by grants from the National Natural Science Foundation of China (No. 62072423) and the USTC Research Funds of the Double First-Class Initiative (No. YD2150002009).

Author information

Authors and Affiliations

School of Data Science, University of Science and Technology of China, Hefei, China
Changkai Feng, Wei Chen, Chao Chen, Tong Xu & Enhong Chen

Corresponding author

Correspondence to Tong Xu.

Editor information

Editors and Affiliations

Zhejiang University, Hangzhou, China
Ningyu Zhang
Southeast University, Nanjing, China
Meng Wang
Southeast University, Nanjing, China
Tianxing Wu
Nanjing University, Nanjing, China
Wei Hu
National University of Singapore, Singapore, Singapore
Shumin Deng

About this paper

Cite this paper

Feng, C., Chen, W., Chen, C., Xu, T., Chen, E. (2022). Multimodal Representation Learning-Based Product Matching. In: Zhang, N., Wang, M., Wu, T., Hu, W., Deng, S. (eds) CCKS 2022 - Evaluation Track. CCKS 2022. Communications in Computer and Information Science, vol 1711. Springer, Singapore. https://doi.org/10.1007/978-981-19-8300-9_20

DOI: https://doi.org/10.1007/978-981-19-8300-9_20
Published: 02 December 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8299-6
Online ISBN: 978-981-19-8300-9
eBook Packages: Computer Science; Computer Science (R0)
