Assessing the Performance Gain on Retail Article Categorization at the Expense of Explainability and Resource Efficiency

  • Conference paper
  • First Online:
KI 2022: Advances in Artificial Intelligence (KI 2022)

Abstract

Current state-of-the-art methods for text classification rely on large deep neural networks. For use cases such as product cataloging, their computational demands and lack of explainability can be problematic: not every online shop can afford a vast IT infrastructure to guarantee low latency in its web applications, and some shops require that sensitive categories not be confused. This motivates alternative methods that perform close to the state of the art while remaining explainable and less resource-demanding. In this work, we evaluate an explainable framework consisting of a representation learning model for article descriptions and a similarity-based classifier. We contrast its results with those of DistilBERT, a solid low-resource baseline among deep learning-based models, on two retail article categorization datasets. Finally, we discuss the suitability of the presented models for deployment, considering not only their classification performance but also their resource costs and explainability.
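The combination described above can be illustrated with a minimal sketch: a similarity-based classifier over vector representations of article descriptions, where each prediction is explainable by pointing at the most similar catalogued articles. This is not the paper's exact pipeline; TF-IDF stands in for the representation learning model, and the texts, labels, and parameters are illustrative assumptions.

```python
# Hedged sketch: nearest-neighbour categorization of retail articles.
# TF-IDF is a stand-in representation; the paper's framework learns its own.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Toy catalogue of article descriptions with categories (illustrative data).
train_texts = [
    "organic whole milk 1l carton",
    "semi-skimmed milk 500ml bottle",
    "red wine merlot 0.75l",
    "sparkling white wine brut",
]
train_labels = ["dairy", "dairy", "alcohol", "alcohol"]

# Embed the descriptions as sparse TF-IDF vectors.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(train_texts)

# Classify by cosine similarity to the nearest catalogued article; the
# neighbour itself serves as the explanation for the predicted category.
clf = KNeighborsClassifier(n_neighbors=1, metric="cosine")
clf.fit(X, train_labels)

query = vectorizer.transform(["oat milk barista 1l"])
print(clf.predict(query)[0])  # the query shares terms only with dairy items
```

In contrast, a DistilBERT baseline would fine-tune a transformer on the same labelled descriptions, typically gaining accuracy at the cost of GPU resources and of decisions that are harder to attribute to individual training examples.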



Acknowledgements

This work was funded by the German Federal Ministry of Education and Research within the ML2R project (grant no. 01S18038B).

Corresponding author

Correspondence to Eduardo Brito.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Brito, E., Gupta, V., Hahn, E., Giesselbach, S. (2022). Assessing the Performance Gain on Retail Article Categorization at the Expense of Explainability and Resource Efficiency. In: Bergmann, R., Malburg, L., Rodermund, S.C., Timm, I.J. (eds) KI 2022: Advances in Artificial Intelligence. KI 2022. Lecture Notes in Computer Science, vol 13404. Springer, Cham. https://doi.org/10.1007/978-3-031-15791-2_5

  • DOI: https://doi.org/10.1007/978-3-031-15791-2_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15790-5

  • Online ISBN: 978-3-031-15791-2
