DOI: 10.1145/3474085.3475704
research-article

Cross-View Representation Learning for Multi-View Logo Classification with Information Bottleneck

Published: 17 October 2021

ABSTRACT

Multi-view logo classification is challenging because logo images are misaligned across viewpoints and because logo appearance shows large intra-class variation and small inter-class variation. Cross-view data represent objects from different views and thus provide complementary information for data analysis. However, most existing multi-view algorithms maximize the correlation between different views to enforce consistency. Those methods ignore the interaction among views and may introduce semantic bias while learning common features. In this paper, we apply the information bottleneck (IB) to multi-view learning to extract the features of a category that are common across views, and we name the resulting representation the Dual-View Information Bottleneck (Dual-view IB). To the best of our knowledge, this is the first cross-view learning method for logo classification. Specifically, we maximize the mutual information between the representations of the two views to preserve the features that are key to the classification task, while eliminating the redundant information that is not shared between the two views. In addition, to cope with imbalanced samples and limited computing resources, we introduce a novel Pair Batch Data Augmentation (PB) algorithm for the Dual-view IB model, which applies augmentations from a learned policy based on replicated instances of two samples within the same batch. Comprehensive experiments on three existing benchmark datasets demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art methods. The proposed method is expected to further the development of cross-view representation learning.
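The abstract does not give the exact loss, so the following is only a minimal sketch of what a two-view information bottleneck objective of this kind might look like, not the authors' implementation. It assumes an InfoNCE lower bound as the mutual information estimator and a symmetrized KL divergence between diagonal Gaussian posteriors as a proxy for the information that is not shared between views. All function names and the `beta` weight are hypothetical.

```python
import math


def gaussian_kl(mu_p, logvar_p, mu_q, logvar_q):
    """KL(N(mu_p, diag(exp(logvar_p))) || N(mu_q, diag(exp(logvar_q))))."""
    total = 0.0
    for mp, lp, mq, lq in zip(mu_p, logvar_p, mu_q, logvar_q):
        total += 0.5 * (lq - lp + (math.exp(lp) + (mp - mq) ** 2) / math.exp(lq) - 1.0)
    return total


def symmetric_kl(mu1, logvar1, mu2, logvar2):
    """Average of the two KL directions between the view posteriors."""
    return 0.5 * (gaussian_kl(mu1, logvar1, mu2, logvar2)
                  + gaussian_kl(mu2, logvar2, mu1, logvar1))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce_bound(z1, z2):
    """InfoNCE lower bound on I(z1; z2) for a batch of paired codes.

    z1[i] and z2[i] encode two views of the same logo; every other
    row in the batch serves as a negative example.
    """
    n = len(z1)
    total = 0.0
    for i in range(n):
        scores = [dot(z1[i], z2[j]) for j in range(n)]
        m = max(scores)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
        total += scores[i] - log_sum
    return math.log(n) + total / n


def dual_view_ib_loss(mu1, logvar1, mu2, logvar2, z1, z2, beta=1.0):
    """Assumed objective: reward shared information, penalize view-specific information."""
    shared = info_nce_bound(z1, z2)
    unshared = sum(
        symmetric_kl(a, b, c, d) for a, b, c, d in zip(mu1, logvar1, mu2, logvar2)
    ) / len(mu1)
    return -shared + beta * unshared
```

In this sketch, minimizing `dual_view_ib_loss` pushes the two view representations to agree on what identifies a logo while the `beta`-weighted term discourages each encoder from keeping detail that the other view cannot predict. A classification loss would be added on top in practice.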

      • Published in

        MM '21: Proceedings of the 29th ACM International Conference on Multimedia
        October 2021
        5796 pages
        ISBN:9781450386517
        DOI:10.1145/3474085

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 995 of 4,171 submissions, 24%
