Commodity Classification Based on Multi-modal Jointly Using Image and Text Information

Xu, Yan; Tang, Yufang; Suen, Ching Y.

doi:10.1007/978-3-030-59830-3_6

Conference paper
First Online: 09 October 2020

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12068)

Abstract

Almost every commodity web page contains both image and text information. Although these two kinds of information belong to different modalities, both describe the same commodity, so there must be a certain relationship between them. We name this relationship “symbiosis and complementary” and propose a multi-modal commodity classification algorithm based on image and text information (MMIT). First, we use the \(\ell_{2,0}\) mixed norm to optimize a sparse representation method for image classification, and employ the Bayesian posterior probability to optimize a k-nearest neighbor method for text classification. Second, we fuse the classification results of the two modalities and build the MMIT mathematical model. Finally, we train the MMIT model on a dataset and use the trained MMIT classifier to classify different commodities. Experimental results show that our method achieves better classification performance than other state-of-the-art methods that exploit only image information.
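
The three-step pipeline described in the abstract (an image classifier, a text classifier, and a fusion of their results) can be illustrated with a minimal sketch. This is not the MMIT model itself, whose formulation is not given on this page. Every name below (`src_scores`, `knn_posterior`, `fuse`, the weight `w`, the toy data) is hypothetical. Ridge-regularized coding is swapped in for the \(\ell_{2,0}\) sparse solver, Laplace-smoothed neighbor counts stand in for the Bayesian posterior step, and the log-linear fusion rule is an assumption.

```python
import numpy as np


def src_scores(D, labels, x, n_classes, lam=0.1):
    """Image branch: class scores from class-wise reconstruction residuals.

    Assumption: ridge (l2) coding replaces the paper's l_{2,0} mixed-norm solver.
    D has one training sample per column; labels[i] is the class of column i.
    """
    G = D.T @ D + lam * np.eye(D.shape[1])
    a = np.linalg.solve(G, D.T @ x)  # coding coefficients of x over D
    res = np.array([
        np.linalg.norm(x - D[:, labels == c] @ a[labels == c])
        for c in range(n_classes)
    ])
    s = np.exp(-res)  # smaller residual gives a larger score
    return s / s.sum()


def knn_posterior(X, labels, q, n_classes, k=3, alpha=1.0):
    """Text branch: class posterior from the k nearest neighbors of q.

    Assumption: Laplace smoothing (alpha) plays the role of the prior.
    X has one training sample per row.
    """
    dist = np.linalg.norm(X - q, axis=1)
    nearest = labels[np.argsort(dist)[:k]]
    counts = np.bincount(nearest, minlength=n_classes).astype(float)
    return (counts + alpha) / (k + alpha * n_classes)


def fuse(p_img, p_txt, w=0.5):
    """Late fusion: weighted log-linear combination (assumed rule), renormalized."""
    logp = w * np.log(p_img) + (1.0 - w) * np.log(p_txt)
    p = np.exp(logp - logp.max())
    return p / p.sum()


# Toy data: two classes, two training samples per class in each modality.
D_img = np.array([[1.0, 0.9, 0.0, 0.0],
                  [0.0, 0.1, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.9],
                  [0.0, 0.0, 0.0, 0.1]])
y_img = np.array([0, 0, 1, 1])
X_txt = np.array([[2.0, 0.0, 1.0],
                  [1.0, 0.0, 2.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 3.0, 1.0]])
y_txt = np.array([0, 0, 1, 1])

# One query commodity, described by an image feature and a text feature.
x_img = np.array([1.0, 0.05, 0.0, 0.0])
q_txt = np.array([1.0, 0.0, 1.0])

p_img = src_scores(D_img, y_img, x_img, n_classes=2)
p_txt = knn_posterior(X_txt, y_txt, q_txt, n_classes=2)
fused = fuse(p_img, p_txt)
pred = int(np.argmax(fused))
print("image:", p_img, "text:", p_txt, "fused:", fused, "predicted class:", pred)
```

Fusing at the decision level keeps the two classifiers independent, so either branch could be replaced (for example by a true \(\ell_{2,0}\) solver) without touching the other.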

Notes

1. http://www.sccs.swarthmore.edu/users/09/btomasi1/images.zip

Acknowledgements

This work is made possible by support from the 4th Shandong-Quebec International Cooperative Project of “Commodity Recommendation System Based on multi-modal Information” and the 5th Shandong-Quebec International Cooperative Project of “Research and Realization of Commodity Recommendation System Based on Deep Learning”.

Author information

Authors and Affiliations

Shandong Normal University, Jinan, 250014, Shandong, China
Yan Xu & Yufang Tang
Concordia University, Montreal, QC, H3G 1M8, Canada
Yan Xu, Yufang Tang & Ching Y. Suen

Corresponding author

Correspondence to Yufang Tang.

Editor information

Editors and Affiliations

East China Normal University, Shanghai, China
Yue Lu
Paris Descartes University, Paris, France
Nicole Vincent
Hong Kong Baptist University, Kowloon, Hong Kong
Pong Chi Yuen
Sun Yat-sen University, Guangzhou, China
Wei-Shi Zheng
Polytechnique Montréal, Montreal, QC, Canada
Farida Cheriet
Concordia University, Montreal, QC, Canada
Ching Y. Suen

About this paper

Cite this paper

Xu, Y., Tang, Y., Suen, C.Y. (2020). Commodity Classification Based on Multi-modal Jointly Using Image and Text Information. In: Lu, Y., Vincent, N., Yuen, P.C., Zheng, WS., Cheriet, F., Suen, C.Y. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2020. Lecture Notes in Computer Science, vol 12068. Springer, Cham. https://doi.org/10.1007/978-3-030-59830-3_6

DOI: https://doi.org/10.1007/978-3-030-59830-3_6
Published: 09 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59829-7
Online ISBN: 978-3-030-59830-3
eBook Packages: Computer Science, Computer Science (R0)
