MIDCN: A Multiple Instance Deep Convolutional Network for Image Classification

He, Kelei; Huo, Jing; Shi, Yinghuan; Gao, Yang; Shen, Dinggang

doi:10.1007/978-3-030-29908-8_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11670)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

In image classification, an image collected in the wild usually contains multiple objects rather than a single dominant one. Moreover, the image label is not explicitly associated with an object region, i.e., the image is weakly annotated. In this paper, we propose a novel deep convolutional network for image classification under weak supervision. The proposed method, MIDCN, formulates the problem as Multiple Instance Learning (MIL), where each image is a bag containing multiple instances (objects). Unlike previous deep MIL methods, which predict the label of each bag (i.e., image) by simply applying a pooling/voting strategy over its instance (i.e., region) predictions, MIDCN directly predicts the label of a bag from bag features learned by measuring the similarities between instance features and a set of learned informative prototypes. Specifically, the prototypes are obtained by a newly proposed Global Contrast Pooling (GCP) layer, which leverages instances not only from the current bag but also from other bags. The learned bag features therefore also contain global information from all the training bags, which makes them more robust to noise. We conducted extensive experiments on two real-world image datasets, a natural image dataset (PASCAL VOC 07) and a pathological lung cancer image dataset, and the results show that the proposed MIDCN consistently outperforms state-of-the-art methods.
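The idea of describing a bag by how similar its instances are to a set of prototypes can be illustrated with a small sketch. This is not the MIDCN implementation or the GCP layer, whose details the abstract does not give. It assumes cosine similarity as the similarity measure and a maximum over instances as the aggregation rule, and it uses fixed, hand-written prototypes where the paper learns them. The names `cosine`, `bag_feature`, `prototypes`, and `bag` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def bag_feature(instances, prototypes):
    """Map a bag (list of instance vectors) to one value per prototype.

    Entry k is the highest similarity between prototype k and any instance
    in the bag. Max aggregation is an assumption of this sketch, not a rule
    stated in the abstract.
    """
    return [max(cosine(x, p) for x in instances) for p in prototypes]


# Hypothetical toy data: two 2-D prototypes and a bag of three instances.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
bag = [[2.0, 0.0], [1.0, 1.0], [0.0, 0.0]]

feature = bag_feature(bag, prototypes)
print(feature)  # one similarity score per prototype
```

The resulting vector has a fixed length equal to the number of prototypes, regardless of how many instances the bag holds, so a single bag-level classifier can consume it directly instead of pooling per-instance predictions.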

Acknowledgment

This work was supported in part by the National Key Research and Development Program of China (2017YFB0702601), the National Natural Science Foundation of China (Grant Nos. 61673203, 61806092), Jiangsu Natural Science Foundation (BK20180326), and the Fundamental Research Funds for the Central Universities (14380056).

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, People’s Republic of China
Kelei He, Jing Huo, Yinghuan Shi & Yang Gao
Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC, USA
Dinggang Shen

Corresponding author

Correspondence to Yang Gao.

Editor information

Editors and Affiliations

Department of Computing, Macquarie University, Sydney, NSW, Australia
Abhaya C. Nayak
RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Alok Sharma

About this paper

Cite this paper

He, K., Huo, J., Shi, Y., Gao, Y., Shen, D. (2019). MIDCN: A Multiple Instance Deep Convolutional Network for Image Classification. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11670. Springer, Cham. https://doi.org/10.1007/978-3-030-29908-8_19

DOI: https://doi.org/10.1007/978-3-030-29908-8_19
Published: 23 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29907-1
Online ISBN: 978-3-030-29908-8
eBook Packages: Computer Science, Computer Science (R0)
