Abstract
Identifying product attributes (product type, brand, color, gender, etc.) from a query is critically important for e-commerce search systems, and identifying brand intent is especially important. Recently, Named Entity Recognition (NER) methods have been used to address this problem. However, NER can only identify brand intent that is expressed by terms in the query, and it does not work well when brand terms are not stated explicitly. To overcome this limitation, we propose a novel Extreme Multi-label based hierarchical Multi-tAsk (EMMA) framework, in which we treat brand identification as an extreme multi-label classification problem and develop a deep learning model that jointly learns a query’s product intent and brand intent in a coarse-to-fine manner. Results from both an online A/B test and an offline experiment on a real industrial dataset demonstrate the effectiveness of the proposed framework. The framework could also potentially be extended from e-commerce to other search scenarios.
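The coarse-to-fine, joint-learning idea in the abstract can be illustrated with a toy sketch. This is not the EMMA architecture, which the abstract does not specify. It assumes a softmax over product types as the coarse task, independent sigmoids over brands as the fine multi-label task, and a hypothetical per-product bias table (`brand_bias_by_product`) as one simple way of letting the coarse prediction inform the fine one. All function names, parameters, and numbers are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coarse_to_fine(product_logits, brand_logits, brand_bias_by_product):
    """Coarse task: distribution over product types.
    Fine task: one independent probability per brand (multi-label),
    where each brand logit is shifted by the expected bias under the
    predicted product-type distribution (an assumed coupling)."""
    p_product = softmax(product_logits)
    p_brand = []
    for b, logit in enumerate(brand_logits):
        shift = sum(p * brand_bias_by_product[t][b]
                    for t, p in enumerate(p_product))
        p_brand.append(sigmoid(logit + shift))
    return p_product, p_brand


def joint_loss(p_product, p_brand, true_product, true_brands, fine_weight=1.0):
    """Cross-entropy on the coarse task plus mean binary cross-entropy
    on the fine task, combined with a hypothetical weight."""
    eps = 1e-12
    coarse = -math.log(p_product[true_product] + eps)
    fine = 0.0
    for b, p in enumerate(p_brand):
        y = 1.0 if b in true_brands else 0.0
        fine -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return coarse + fine_weight * fine / len(p_brand)


# Toy query with no explicit brand term: all brand logits are neutral,
# so the brand scores come entirely from the predicted product type.
product_logits = [2.0, 0.0]            # two hypothetical product types
brand_logits = [0.0, 0.0, 0.0]         # three hypothetical brands
brand_bias_by_product = [
    [3.0, -3.0, -3.0],                 # product type 0 favors brand 0
    [-3.0, 3.0, 0.0],                  # product type 1 favors brand 1
]
p_product, p_brand = coarse_to_fine(
    product_logits, brand_logits, brand_bias_by_product)
```

In this toy setting the brand probabilities are driven by the product-type prediction alone, which mirrors the motivation stated above: a brand can be scored even when the query contains no brand term. In a real extreme multi-label setting the brand space would be far larger than three labels.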
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J. et al. (2022). Extreme Multi-label Classification with Hierarchical Multi-task for Product Attribute Identification. In: Gama, J., Li, T., Yu, Y., Chen, E., Zheng, Y., Teng, F. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2022. Lecture Notes in Computer Science, vol 13282. Springer, Cham. https://doi.org/10.1007/978-3-031-05981-0_20
DOI: https://doi.org/10.1007/978-3-031-05981-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05980-3
Online ISBN: 978-3-031-05981-0
eBook Packages: Computer Science, Computer Science (R0)