ABSTRACT
Multi-view logo classification is challenging because logo images are misaligned across viewpoints and because logo appearance shows large intra-class and small inter-class variation. Cross-view data represent objects from different views and thus provide complementary information for data analysis. However, most existing multi-view algorithms maximize the correlation between views to enforce consistency. These methods ignore the interaction among views and may introduce semantic bias while learning common features. In this paper, we apply the information bottleneck (IB) principle to multi-view learning to extract the features of a category that are common across views, and name the resulting representation Dual-View Information Bottleneck (Dual-view IB). To the best of our knowledge, this is the first cross-view learning method for logo classification. Specifically, we maximize the mutual information between the representations of the two views to preserve the features that are key to the classification task, while eliminating the redundant information that is not shared between the two views. In addition, because of sample imbalance and limited computing resources, we introduce a novel Pair Batch Data Augmentation (PB) algorithm for the Dual-view IB model, which applies augmentations from a learned policy to replicated instances of two samples within the same batch. Comprehensive experiments on three existing benchmark datasets demonstrate that the proposed method outperforms state-of-the-art methods. We expect the proposed method to further the development of cross-view representation learning.
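As a rough illustration of the objective described in the abstract, the sketch below combines two terms: an InfoNCE lower bound on the mutual information between the two view representations (to be maximized) and a symmetrized KL divergence between the two views' Gaussian posteriors (to penalize view-specific information). This is an assumption-laden sketch in the spirit of the multi-view information bottleneck literature, not necessarily the loss used in the paper. The function names, the diagonal-Gaussian encoders, the InfoNCE estimator, the temperature, and the weight `beta` are all hypothetical choices made for the example.

```python
import numpy as np

def gaussian_kl(mu_p, logvar_p, mu_q, logvar_q):
    """Per-sample KL(N(mu_p, var_p) || N(mu_q, var_q)) for diagonal Gaussians."""
    var_p, var_q = np.exp(logvar_p), np.exp(logvar_q)
    return 0.5 * np.sum(
        logvar_q - logvar_p + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0,
        axis=1,
    )

def infonce_lower_bound(z1, z2, temperature=0.1):
    """InfoNCE lower bound on I(z1; z2); row i of z1 and z2 is a positive pair."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return np.mean(np.diag(log_softmax)) + np.log(len(z1))

def dual_view_ib_loss(mu1, logvar1, mu2, logvar2, beta=1.0, rng=None):
    """Hypothetical dual-view IB loss: -MI bound + beta * symmetrized KL."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Reparameterized samples from each view's (assumed Gaussian) posterior.
    z1 = mu1 + np.exp(0.5 * logvar1) * rng.standard_normal(mu1.shape)
    z2 = mu2 + np.exp(0.5 * logvar2) * rng.standard_normal(mu2.shape)
    # Symmetrized KL penalizes information that the two views do not share.
    skl = 0.5 * (
        gaussian_kl(mu1, logvar1, mu2, logvar2)
        + gaussian_kl(mu2, logvar2, mu1, logvar1)
    )
    return -infonce_lower_bound(z1, z2) + beta * np.mean(skl)
```

In this sketch, `mu1, logvar1` and `mu2, logvar2` would be the outputs of two view-specific encoders evaluated on paired images of the same logo; a classification loss on the shared representation would be added in a supervised setting.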
Index Terms
Cross-View Representation Learning for Multi-View Logo Classification with Information Bottleneck