Gradual recovery based occluded digit images recognition

Wang, Yasi; Yao, Hongxun; Yu, Wei; Wang, Dong; Zhou, Shangchen; Sun, Xiaoshuai

doi:10.1007/s11042-018-6048-8

Published: 11 July 2018

Volume 78, pages 2571–2586, (2019)

Multimedia Tools and Applications

Yasi Wang¹,
Hongxun Yao¹,
Wei Yu¹,
Dong Wang²,
Shangchen Zhou¹ &
…
Xiaoshuai Sun¹

Abstract

Recent research shows that auto-encoders are well suited to modeling variations that change smoothly. In this paper, we use auto-encoders to recognize partially occluded digit images through gradual recovery. We propose a new variant of the auto-encoder, the “generalized auto-encoder”, and construct stacked generalized auto-encoders (SGAE) for recovering and recognizing occluded digit images. Rather than recovering the occluded region directly, we treat the degree of occlusion as a continuous variable and the recovery task as a gradual process. We divide the whole task into multiple intermediate recovery procedures and assign each procedure to one generalized auto-encoder, so that the occlusion is removed step by step. Based on the encouraging recovery results, the occluded digit images can be recognized well. The results demonstrate that gradual recovery outperforms direct recovery of the occluded region. Although the main application in this paper is occluded digit image recognition, the proposed framework can be readily generalized to other problems. Extensive experiments are designed to verify our settings and to show the effectiveness, extensibility and generalizability of the method.
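
The gradual-recovery idea in the abstract can be illustrated with a minimal sketch of how per-stage training pairs and chained inference might be organized. Everything specific here is an assumption made for illustration and is not taken from the paper: the occlusion model (zeroing the bottom rows of the image), the evenly spaced occlusion schedule, and the function names `occlude`, `occlusion_schedule`, `stage_training_pairs` and `gradual_recover`. The abstract does not define the generalized auto-encoder itself, so each stage is represented by an arbitrary callable rather than a trained network.

```python
import numpy as np


def occlude(image, degree):
    """Zero out the bottom `degree` fraction of rows.

    Hypothetical occlusion model, chosen only for illustration.
    """
    out = image.copy()
    rows = image.shape[0]
    n_hidden = int(round(degree * rows))
    if n_hidden > 0:
        out[rows - n_hidden:, :] = 0.0
    return out


def occlusion_schedule(start_degree, n_stages):
    """Evenly spaced occlusion degrees from `start_degree` down to 0 (assumed spacing)."""
    return np.linspace(start_degree, 0.0, n_stages + 1)


def stage_training_pairs(images, schedule):
    """Build one (input, target) batch per stage.

    Stage k maps images occluded at schedule[k] to the same images
    occluded at the next, smaller degree schedule[k + 1].
    """
    pairs = []
    for k in range(len(schedule) - 1):
        inputs = np.stack([occlude(im, schedule[k]) for im in images])
        targets = np.stack([occlude(im, schedule[k + 1]) for im in images])
        pairs.append((inputs, targets))
    return pairs


def gradual_recover(x, stages):
    """Apply the stage models in order; each one removes part of the occlusion."""
    for stage in stages:
        x = stage(x)
    return x
```

In this sketch, a direct-recovery baseline would correspond to a schedule with a single stage, whose only target is the clean image. Each `(inputs, targets)` batch would be used to train one stage model, and the trained models would then be passed to `gradual_recover` in the same order.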

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Yasi Wang, Hongxun Yao, Wei Yu, Shangchen Zhou & Xiaoshuai Sun
State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, China
Dong Wang

Corresponding author

Correspondence to Hongxun Yao.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Y., Yao, H., Yu, W. et al. Gradual recovery based occluded digit images recognition. Multimed Tools Appl 78, 2571–2586 (2019). https://doi.org/10.1007/s11042-018-6048-8

Received: 28 June 2017
Revised: 06 March 2018
Accepted: 23 April 2018
Published: 11 July 2018
Issue Date: January 2019
DOI: https://doi.org/10.1007/s11042-018-6048-8
