Abstract
Big Data Analytics and Deep Learning are two high-focus areas of data science. Big Data has become important as many organizations, both public and private, have been collecting massive amounts of domain-specific information, which can contain useful information about problems such as national intelligence, cyber security, fraud detection, marketing, and medical informatics. Companies such as Google and Microsoft are analyzing large volumes of data for business analysis and decisions, impacting existing and future technology. Deep Learning algorithms extract high-level, complex abstractions as data representations through a hierarchical learning process. Complex abstractions are learned at a given level based on relatively simpler abstractions formulated in the preceding level of the hierarchy. A key benefit of Deep Learning is the analysis and learning of massive amounts of unsupervised data, making it a valuable tool for Big Data Analytics, where raw data is largely unlabeled and uncategorized. In the present study, we explore how Deep Learning can be utilized to address some important problems in Big Data Analytics, including extracting complex patterns from massive volumes of data, semantic indexing, data tagging, fast information retrieval, and simplifying discriminative tasks. We also investigate some aspects of Deep Learning research that need further exploration to incorporate specific challenges introduced by Big Data Analytics, including streaming data, high-dimensional data, scalability of models, and distributed computing. We conclude by presenting insights into relevant future work by posing some questions, including defining data sampling criteria, domain adaptation modeling, defining criteria for obtaining useful data abstractions, improving semantic indexing, semi-supervised learning, and active learning.
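The semantic indexing and fast retrieval ideas mentioned above hinge on mapping documents to compact codes so that similar documents receive nearby codes. The following minimal sketch (not from the chapter; the documents, vocabulary, and function names are hypothetical, and random hyperplanes stand in for the learned deep codes the chapter discusses) shows the basic mechanics of such binary coding:

```python
import random
from collections import Counter

def binary_code(tokens, vocab, hyperplanes):
    """Map a bag-of-words vector to a short binary code: one bit per
    random hyperplane, given by the sign of the projection onto it."""
    counts = Counter(tokens)
    vec = [counts[w] for w in vocab]
    return tuple(
        int(sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0)
        for h in hyperplanes
    )

def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(x != y for x, y in zip(a, b))

random.seed(0)
vocab = ["data", "deep", "learning", "cat", "dog", "pet"]
# Eight random hyperplanes -> 8-bit codes.
hyperplanes = [[random.gauss(0, 1) for _ in vocab] for _ in range(8)]

doc_a = "deep learning learns from data data".split()
doc_b = "deep learning and big data".split()
doc_c = "my pet dog chased the cat".split()

code_a = binary_code(doc_a, vocab, hyperplanes)
code_b = binary_code(doc_b, vocab, hyperplanes)
code_c = binary_code(doc_c, vocab, hyperplanes)

# Hamming distance between codes serves as a cheap proxy for
# document similarity; retrieval then reduces to bit comparisons.
print(hamming(code_a, code_b), hamming(code_a, code_c))
```

With learned (rather than random) projections, as in deep semantic hashing, topically related documents are driven toward nearby codes, which is what makes retrieval over massive collections fast: comparing short bit strings instead of full high-dimensional representations.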
This chapter has been adapted from the Journal of Big Data, Borko Furht and Taghi Khoshgoftaar, Editors-in-Chief.
© 2016 Springer International Publishing Switzerland
Najafabadi, M.M., Villanustre, F., Khoshgoftaar, T.M., Seliya, N., Wald, R., Muharemagic, E. (2016). Deep Learning Techniques in Big Data Analytics. In: Big Data Technologies and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-44550-2_5
Print ISBN: 978-3-319-44548-9
Online ISBN: 978-3-319-44550-2