Abstract
Object localization is one of the key problems in computer vision applications. Multiple-instance learning (MIL) is a machine learning framework in which the learner receives labeled bags of instances rather than individually labeled instances, and it has been shown to be effective for object localization in images. In this paper, we propose a novel method to handle the classical MIL problem. We preprocess images with superpixel techniques to speed up model training, and we treat the positiveness of each instance as a continuous variable. A softmax model bridges instances and bags, so that bag labels and instance labels are jointly optimized in a unified framework. Finally, the model is trained with an iterative weakly supervised training method. Extensive experiments demonstrate that our method achieves superior performance on various MIL benchmarks. State-of-the-art results for object discovery on the Pascal VOC datasets further confirm the advantages of the proposed method.
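The abstract does not state the exact form of the softmax model, so the following is only a minimal sketch of one common way a softmax can link continuous instance scores to a bag score in MIL. The function name `softmax_bag_score` and the sharpness parameter `alpha` are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax_bag_score(instance_scores, alpha=5.0):
    """Smooth approximation of the max over instance scores in one bag.

    instance_scores: 1-D sequence of per-instance positiveness values in [0, 1].
    alpha: hypothetical sharpness parameter (an assumption, not from the paper);
           alpha = 0 gives the mean, large alpha approaches the hard max.
    """
    p = np.asarray(instance_scores, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    w = np.exp(alpha * (p - p.max()))
    w /= w.sum()
    # Weighted average of instance scores; weights favor high-scoring instances.
    return float(np.dot(w, p))


if __name__ == "__main__":
    bag = [0.1, 0.2, 0.9]  # illustrative instance scores for a single bag
    print(softmax_bag_score(bag))
```

Because the aggregate is differentiable in every instance score, a loss on the bag label can pass gradients to all instances, which is one way bag and instance labels can be optimized together.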
Acknowledgments
This work was supported by grants from the National Science Foundation of China (Nos. 61520106006, 31571364, U1611265, 61532008, 61672203, 61402334, 61472282, 61472280, 61472173, 61572447, 61373098, and 61672382) and by the China Postdoctoral Science Foundation (Grant No. 2016M601646).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Li, D., Li, Z., Zhang, Y. (2017). Smooth Multi-instance Learning for Object Detection. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2017. Lecture Notes in Computer Science, vol 10361. Springer, Cham. https://doi.org/10.1007/978-3-319-63309-1_67
DOI: https://doi.org/10.1007/978-3-319-63309-1_67
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63308-4
Online ISBN: 978-3-319-63309-1
eBook Packages: Computer Science, Computer Science (R0)