Skip to main content

Multimodal Representation Learning-Based Product Matching

  • Conference paper
  • First Online:
CCKS 2022 - Evaluation Track (CCKS 2022)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1711))

Included in the following conference series:

  • 604 Accesses

Abstract

This paper describes our methodology for the identical product mining task organized by the China Conference on Knowledge Graph and Semantic Computing (CCKS) 2022. This identical product mining task has two main challenges: 1) How to perform text representation to refine product representation. 2) How to more effectively combine text representation and image representation. For the first challenge, we propose the K-Gram Exponential Decay scheme in the text representation module to aggregate the information of surrounding words. For the second challenge, we apply conventional multimodal representation learning to combine text representation and image representation to generate the item representation. We view the identical product mining task as a binary classification task for product pairs, for which we adopt sample pair-based contrastive learning. Extensive experiments have demonstrated the effectiveness of our method. We won first place in the competition by utilizing model ensemble and post-processing.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 64.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 84.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://kexue.fm/archives/8847.

References

  1. Zhang, N., et al.: AliCG: fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery Data Mining, pp. 3895–3905 (2021). ACM, Virtual Event Singapore. https://doi.org/10.1145/3447548.3467057

  2. Fang, Y., Wang, J., Jia, L., Kin, F.W.: Shopee price match guarantee algorithm based on multimodal learning. In: 2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE), pp. 84–87 IEEE, SC, USA (2021). https://doi.org/10.1109/CSAIEE54046.2021.9543217

  3. Sun, Y., et al.: Circle loss: a unified perspective of pair similarity optimization. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6397–6406 IEEE, Seattle, WA, USA (2020). https://doi.org/10.1109/CVPR42600.2020.00643

  4. Huang, Y., et al.: CurricularFace: adaptive curriculum learning loss for deep face recognition. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5900–5909. IEEE, Seattle, WA, USA (2020). https://doi.org/10.1109/CVPR42600.2020.00594

  5. Tracz, J., Wójcik, P.I., Jasinska-Kobus, K., Belluzzo, R., Mroczkowski, R., Gawlik, I.: BERT-based similarity learning for product matching, pp. 66–75 (2020)

    Google Scholar 

  6. Li, J., Dou, Z., Zhu, Y., Zuo, X., Wen, J.-R.: Deep cross-platform product matching in e-commerce. Inf. Retrieval J. 23(2), 136–158 (2019). https://doi.org/10.1007/s10791-019-09360-1

    Article  Google Scholar 

  7. Li, H., et al.: Path-based deep network for candidate item matching in recommenders. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1493–1502 ACM, Virtual Event Canada (2021). https://doi.org/10.1145/3404835.3462878

  8. Peeters, R., Bizer, C.: Supervised contrastive learning for product matching (2022). https://doi.org/10.1145/3487553.3524254

  9. Wu, C., Wu, F., Huang, Y., Xie, X.: User-as-Graph: user modeling with heterogeneous graph pooling for news recommendation. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, pp. 1624–1630. International Joint Conferences on Artificial Intelligence Organization, Montreal, Canada (2021). https://doi.org/10.24963/ijcai.2021/224

  10. Yu, J., Jiang, J., Yang, L., Xia, R.: Improving multimodal named entity recognition via entity span detection with unified multimodal transformer. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3342–3352. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.306

  11. Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. http://arxiv.org/abs/1907.11692 (2019)

  12. Yao, H., Liu, H., Zhang, P.: A novel sentence similarity model with word embedding based on convolutional neural network: sentence similarity model with word embedding based on convolutional neural network. Concurrency Computat. Pract. Exper. 30, e4415 (2018). https://doi.org/10.1002/cpe.4415

  13. Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9992–10002. IEEE, Montreal, QC, Canada (2021). https://doi.org/10.1109/ICCV48922.2021.00986

Download references

Acknowledgement

This work was supported by the grants from National Natural Science Foundation of China (No. 62072423), and the USTC Research Funds of the Double First-Class Initiative (No. YD2150002009).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tong Xu .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Feng, C., Chen, W., Chen, C., Xu, T., Chen, E. (2022). Multimodal Representation Learning-Based Product Matching. In: Zhang, N., Wang, M., Wu, T., Hu, W., Deng, S. (eds) CCKS 2022 - Evaluation Track. CCKS 2022. Communications in Computer and Information Science, vol 1711. Springer, Singapore. https://doi.org/10.1007/978-981-19-8300-9_20

Download citation

  • DOI: https://doi.org/10.1007/978-981-19-8300-9_20

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8299-6

  • Online ISBN: 978-981-19-8300-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics