
Deep Learning Analytics


Part of the book series: Intelligent Systems Reference Library (ISRL, volume 149)

Abstract

The recent breakthroughs in Deep Learning have provided powerful data analytics tools for a wide range of domains, ranging from advertising and analyzing users’ behavior to load and financial forecasting. Depending on the nature of the available data and the task at hand, Deep Learning Analytics techniques can be divided into two broad categories: (a) unsupervised learning techniques and (b) supervised learning techniques. In this chapter we provide an extensive overview of both categories. Unsupervised learning methods, such as Autoencoders, are able to discover and extract information from the data without using any ground truth information or supervision from domain experts. Thus, unsupervised techniques can be especially useful for data exploration tasks, particularly when combined with advanced visualization techniques. On the other hand, supervised learning techniques are used when ground truth information is available and we want to build classification and/or forecasting models. Several deep learning models are examined, ranging from simple Multilayer Perceptrons (MLPs) to Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). However, training deep learning models is not always straightforward, requiring both a solid theoretical background and practical intuition and experience. To this end, we also present recent techniques that allow for efficiently training deep learning models, such as batch normalization, residual connections, advanced optimization techniques and activation functions, as well as a number of useful practical suggestions. Finally, we present an overview of the available open-source deep learning frameworks that can be used to implement deep learning analytics techniques and accelerate the training process using Graphics Processing Units (GPUs).
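To make the unsupervised category concrete, the following is a minimal sketch (not taken from the chapter) of a linear autoencoder trained by plain gradient descent on hypothetical synthetic data; the dimensions, learning rate, and variable names are illustrative assumptions, not the chapter's own setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2-D latent structure embedded in a 10-D space.
latent = rng.normal(size=(256, 2))
mix = rng.normal(size=(2, 10))
X = latent @ mix  # 256 samples, 10 features

# Linear autoencoder 10 -> 2 -> 10: encode to a bottleneck, then reconstruct.
W_enc = rng.normal(scale=0.1, size=(10, 2))
W_dec = rng.normal(scale=0.1, size=(2, 10))

def reconstruction_error(X, W_enc, W_dec):
    Z = X @ W_enc        # encoder: project to the low-dimensional code
    X_hat = Z @ W_dec    # decoder: reconstruct the input from the code
    return np.mean((X - X_hat) ** 2)

lr = 0.01
initial = reconstruction_error(X, W_enc, W_dec)
for _ in range(200):
    Z = X @ W_enc
    err = Z @ W_dec - X                      # reconstruction residual
    # Gradients of the mean squared error w.r.t. each weight matrix.
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
final = reconstruction_error(X, W_enc, W_dec)
```

No labels are used anywhere: the network learns a 2-D code purely by minimizing reconstruction error, which is the sense in which autoencoders "discover and extract information from the data" without ground truth.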




Corresponding author

Correspondence to Nikolaos Passalis.


Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter


Cite this chapter

Passalis, N., Tefas, A. (2019). Deep Learning Analytics. In: Tsihrintzis, G., Sotiropoulos, D., Jain, L. (eds) Machine Learning Paradigms. Intelligent Systems Reference Library, vol 149. Springer, Cham. https://doi.org/10.1007/978-3-319-94030-4_13
