
An E-Commerce Dataset in French for Multi-modal Product Categorization and Cross-Modal Retrieval

  • Conference paper
  • Published in: Advances in Information Retrieval (ECIR 2021)
  • Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12656)


Abstract

A multi-modal dataset of ninety-nine thousand product listings is made available from the production catalog of Rakuten France, a major e-commerce platform. Each product in the catalog contains a textual title, a (possibly empty) textual description, and an associated image. The dataset has been released as part of a data challenge hosted by the SIGIR ECom’20 Workshop. Two tasks are proposed: a principal large-scale multi-modal classification task and a subsidiary cross-modal retrieval task. From this real-world dataset, around 85K products and their corresponding product type categories are released as training data, while around 9.5K and 4.5K products are released as held-out test sets for the multi-modal classification and cross-modal retrieval tasks, respectively. The evaluation is run in two phases to measure system performance, first on 10% of the test data and then on the remaining 90%. Systems are evaluated using the macro-F1 score for the multi-modal classification task and recall@1 for the cross-modal retrieval task. Additionally, a robust baseline system is proposed for the multi-modal classification task. The top performance obtained at the end of the second phase is \(91.44\%\) macro-F1 and \(34.28\%\) recall@1 for the two tasks, respectively.
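
The two evaluation metrics can be sketched in a few lines of Python. This is only an illustration with hypothetical labels and rankings, not the official challenge evaluation script: macro-F1 is computed with scikit-learn's f1_score, and recall@1 with a small helper function.

    import numpy as np
    from sklearn.metrics import f1_score

    # Multi-modal classification: macro-F1 over product type categories.
    # Hypothetical gold labels and system predictions.
    y_true = np.array([10, 10, 20, 30, 30])
    y_pred = np.array([10, 20, 20, 30, 30])
    print(f"macro-F1: {f1_score(y_true, y_pred, average='macro'):.4f}")

    # Cross-modal retrieval: recall@1 is the fraction of queries whose
    # top-ranked candidate is the correct item.
    def recall_at_1(ranked_lists, gold_items):
        hits = sum(1 for ranking, gold in zip(ranked_lists, gold_items) if ranking[0] == gold)
        return hits / len(gold_items)

    # Hypothetical rankings of candidate images for three text queries.
    ranked = [["img_3", "img_1"], ["img_2", "img_4"], ["img_7", "img_5"]]
    gold = ["img_3", "img_4", "img_7"]
    print(f"recall@1: {recall_at_1(ranked, gold):.4f}")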

H. Amoualian—Most of the work was performed while at RIT-Paris.


Notes

  1. Rakuten France Multimodal Dataset at https://rit.rakuten.co.jp/data_release/.

  2. Gross Merchandise Volume (GMV) is the total monetary value of merchandise sold through a particular marketplace over a certain period of time.

  3. https://huggingface.co/transformers/summary.html.

  4. https://huggingface.co/distilbert-base-multilingual-cased (a minimal loading sketch follows these notes).
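
Notes 3 and 4 point to the Hugging Face Transformers library and the multilingual DistilBERT checkpoint. As a rough, non-authoritative sketch of how that checkpoint can be loaded for product-title classification (not a reproduction of the paper's baseline system), one could proceed as follows; the number of categories and the example title are placeholders, and in practice the model would be fine-tuned on the released training data.

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    NUM_CATEGORIES = 27  # placeholder: set to the number of product type categories in the training data

    # Load the checkpoint from note 4 and attach a freshly initialised classification head.
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-multilingual-cased", num_labels=NUM_CATEGORIES
    )

    # Hypothetical French product title; title and description can be concatenated as one input.
    text = "Coffret LEGO Star Wars - vaisseau spatial avec figurines"
    inputs = tokenizer(text, truncation=True, max_length=128, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, NUM_CATEGORIES)
    predicted_category = logits.argmax(dim=-1).item()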


Author information

Correspondence to Parantapa Goswami.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Amoualian, H., Goswami, P., Das, P., Montalvo, P., Ach, L., Dean, N.R. (2021). An E-Commerce Dataset in French for Multi-modal Product Categorization and Cross-Modal Retrieval. In: Hiemstra, D., Moens, M.F., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol. 12656. Springer, Cham. https://doi.org/10.1007/978-3-030-72113-8_2


  • DOI: https://doi.org/10.1007/978-3-030-72113-8_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72112-1

  • Online ISBN: 978-3-030-72113-8

  • eBook Packages: Computer Science, Computer Science (R0)
