High-dimensional multimedia classification using deep CNN and extended residual units

Shamsolmoali, Pourya; Kumar Jain, Deepak; Zareapoor, Masoumeh; Yang, Jie; Afshar Alam, M.

doi:10.1007/s11042-018-6146-7

Published: 20 June 2018

Volume 78, pages 23867–23882, (2019)

Multimedia Tools and Applications

Pourya Shamsolmoali¹ (ORCID: orcid.org/0000-0002-0263-1661),
Deepak Kumar Jain²,
Masoumeh Zareapoor¹,
Jie Yang¹ &
M. Afshar Alam³

Abstract

Processing multimedia data has emerged as a key area for the application of machine learning methods. Building a robust classification model for high-dimensional spaces requires combining a deep feature extractor with a powerful classifier. We present a new classification pipeline for multimedia data analysis, based on a convolutional neural network (CNN) and a modified residual network that can be integrated with other feedforward network styles in an end-to-end training fashion. The proposed residual network produces attention-aware features. We propose a unified deep CNN model that exploits the advantages of the residual network to achieve promising performance in classifying high-dimensional multimedia data. In every residual module, an up-down and vice-versa feedforward structure is implemented to unfold the feedforward and backward processes into a single process. The proposed hybrid model is evaluated on four datasets and shows promising results that outperform the previous best results. Last but not least, the proposed model achieves detection speeds that are much faster than those of other approaches.
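The abstract does not give the layer-level design, so the following is only a minimal sketch of what a residual unit with an attention mask could look like. It assumes a single-channel feature map, a mask computed by average-pooling the input (down) and upsampling it again (up), and a toy trunk branch. The names `down`, `up`, `trunk`, and `attention_residual_unit`, the weight `w`, and the combination rule `x + mask * trunk(x)` are all hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def down(x):
    """2x2 average pooling over an (H, W) map; H and W must be even."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def up(x):
    """Nearest-neighbour upsampling by a factor of 2 along both axes."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def trunk(x, w):
    """Hypothetical trunk branch: per-pixel scaling followed by ReLU."""
    return np.maximum(w * x, 0.0)


def attention_residual_unit(x, w):
    """Identity shortcut plus mask-weighted trunk features (assumed form)."""
    # Down-then-up pass produces a soft attention mask with the shape of x.
    mask = sigmoid(up(down(x)))
    # The mask rescales the trunk features before the residual addition.
    return x + mask * trunk(x, w)


feature_map = np.arange(16.0).reshape(4, 4)
output = attention_residual_unit(feature_map, w=0.5)
print(output.shape)
```

Because the mask lies strictly between 0 and 1, the unit can only attenuate the trunk features, while the identity shortcut passes the input through unchanged. A real model would replace the toy trunk with learned convolutions.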

Acknowledgements

This research is partly supported by NSFC, China (No: 61572315) and Committee of Science and Technology, Shanghai, China (No: 17JC1403000).

Author information

Authors and Affiliations

Institute of Image Processing & Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
Pourya Shamsolmoali, Masoumeh Zareapoor & Jie Yang
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Deepak Kumar Jain
Department of Computer Science & Engineering, Jamia Hamdard University, New Delhi, India
M. Afshar Alam

Corresponding author

Correspondence to Pourya Shamsolmoali.

About this article

Cite this article

Shamsolmoali, P., Kumar Jain, D., Zareapoor, M. et al. High-dimensional multimedia classification using deep CNN and extended residual units. Multimed Tools Appl 78, 23867–23882 (2019). https://doi.org/10.1007/s11042-018-6146-7

Received: 05 January 2018
Revised: 10 April 2018
Accepted: 15 May 2018
Published: 20 June 2018
Issue Date: 15 September 2019
DOI: https://doi.org/10.1007/s11042-018-6146-7
