DOI: 10.1145/3239576.3239587
research-article

Transferring CNN Intermediate Layers via Weakly-Supervised Learning and Latent Semantic Analysis

Published: 16 June 2018

Abstract

Neuroscience research on neural coding has found that each item is encoded by strong activations of a relatively small set of neurons. The goal of this research is to identify the sets of neurons that are the major contributors to recognizing specific object categories. To avoid biases introduced by human experience, our approach uses neither datasets with part labels or attribute captions nor any semantic model assumption; it uses only object-level labels in both pre-training and latent semantic learning. Given a convolutional neural network (CNN) pre-trained for object classification, this paper proposes a framework for trimming and transferring the pre-trained CNN to any given task from a variety of tasks. The challenge of learning latent semantics in intermediate layers with weak supervision is addressed by mining the correlation between neuron rapid-firing statistics and image stimuli from different object categories. In addition, our work learns the latent semantics associated with the major contributor neurons by visualizing the patches that trigger rapid firing, which provides a better understanding of what CNNs have learned from deep training.
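
The abstract does not give the exact statistic used to relate neuron firing to object categories. As a rough illustration only, the sketch below assumes one simple reading: a neuron "fires rapidly" on an image when its activation exceeds a fixed threshold, and a neuron counts as a major contributor for a category when it fires on that category's images much more often than on images of other categories. The function names, the threshold, and the scoring rule are all hypothetical and are not taken from the paper.

```python
import numpy as np

def firing_rates(activations, labels, num_classes, threshold):
    """Per-category firing rate of each neuron.

    activations: (num_images, num_neurons) array, e.g. pooled responses of
                 one intermediate CNN layer (an assumed input format).
    labels:      (num_images,) integer object-level labels.
    Returns a (num_classes, num_neurons) array whose entry [c, n] is the
    fraction of category-c images on which neuron n exceeds the threshold.
    """
    fired = activations > threshold
    rates = np.zeros((num_classes, activations.shape[1]))
    for c in range(num_classes):
        rates[c] = fired[labels == c].mean(axis=0)
    return rates

def major_contributors(rates, category, k):
    """Indices of the k neurons whose firing rate for `category` most
    exceeds their mean firing rate over the other categories
    (a hypothetical score, chosen for simplicity)."""
    others = np.delete(rates, category, axis=0).mean(axis=0)
    score = rates[category] - others
    return np.argsort(score)[::-1][:k]

# Toy data: 100 images, 6 neurons, 2 categories. Background activations lie
# in [0, 0.5); neuron 2 is boosted for category 0 and neuron 4 for category 1.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
acts = rng.random((100, 6)) * 0.5
acts[labels == 0, 2] += 1.0
acts[labels == 1, 4] += 1.0

rates = firing_rates(acts, labels, num_classes=2, threshold=0.8)
top0 = major_contributors(rates, category=0, k=1)
top1 = major_contributors(rates, category=1, k=1)
print(top0, top1)
```

Under this reading, trimming a pre-trained network for a given task would amount to keeping only the neurons selected for the categories that task needs. Whether the paper selects neurons this way cannot be determined from the abstract alone.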

Cited By

  • (2020) Enforcement Key Feature Mining for Task Specific Salient Region Detection. 2020 IEEE 6th International Conference on Computer and Communications (ICCC), pp. 1203-1207. DOI: 10.1109/ICCC51575.2020.9345254. Online publication date: 11-Dec-2020.
  • (2019) Latent Markov Chain Encoding for Abnormal Landing Event Detection. 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE), pp. 1858-1862. DOI: 10.1109/EITCE47263.2019.9094793. Online publication date: Oct-2019.

    Published In

    ICAIP '18: Proceedings of the 2nd International Conference on Advances in Image Processing
    June 2018
    261 pages
    ISBN:9781450364607
    DOI:10.1145/3239576
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    In-Cooperation

    • University of Electronic Science and Technology of China
    • Southwest Jiaotong University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CNN
    2. Latent Semantics
    3. Sparse Coding
    4. Transfer Learning
    5. Weakly-Supervised Learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICAIP '18
