DOI: 10.1145/3011077.3011090
research-article

MapReduce based for speech classification

Published: 08 December 2016

Abstract

Speech classification is one of the most important problems in speech processing and spoken word recognition. Although there have been many studies on the classification of speech signals, the results are still limited in both accuracy and vocabulary size. As the vocabulary grows very large, speech classification becomes increasingly difficult. Today, several frameworks support working with big data. One of these is a data mining utility that performs supervised classification on very large amounts of data, usually called big data, on a distributed infrastructure by using the MapReduce framework on Hadoop clusters. This tool implements four classification approaches: Random Forest, Naïve Bayes, Decision Trees, and Support Vector Machines (SVM). All of these approaches require input data of the same size, so the input data must be quantized before use, which decreases accuracy in the classification stage. In this paper, we propose an implementation of Local Naïve Bayes Nearest Neighbor based on the Hadoop framework, which accepts input data of different sizes and works well with very large training data.
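
To illustrate why a Local Naïve Bayes Nearest Neighbor classifier can accept inputs of different sizes, here is a minimal single-machine sketch in Python. It is not the paper's implementation: the split into a map step and a reduce step, the brute-force neighbor search, and all names (`map_descriptor`, `reduce_totals`, `classify`, the parameter `k`) are assumptions chosen for illustration. Each query is a variable-length list of local descriptors; the map step handles one descriptor at a time, and the reduce step sums the per-class distance updates.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Descriptor = Sequence[float]
Labeled = Tuple[Descriptor, str]  # (training descriptor, class label)


def sq_dist(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_descriptor(
    query: Descriptor, train: List[Labeled], k: int
) -> Iterator[Tuple[str, float]]:
    """Map step (hypothetical): one query descriptor -> (class, update) pairs."""
    # Brute-force search over the merged training set of all classes.
    # Assumes len(train) > k so that a (k+1)-th neighbor exists.
    ranked = sorted((sq_dist(query, d), label) for d, label in train)
    neighbors = ranked[: k + 1]
    background = neighbors[-1][0]  # distance to the (k+1)-th neighbor
    closest: Dict[str, float] = {}
    for dist, label in neighbors[:k]:
        closest[label] = min(dist, closest.get(label, dist))
    # Only classes seen among the k nearest neighbors receive an update.
    for label, dist in closest.items():
        yield label, dist - background


def reduce_totals(pairs: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Reduce step (hypothetical): sum the updates for each class."""
    totals: Dict[str, float] = defaultdict(float)
    for label, delta in pairs:
        totals[label] += delta
    return dict(totals)


def classify(
    query_descriptors: Iterable[Descriptor], train: List[Labeled], k: int = 2
) -> str:
    """Return the class with the smallest accumulated distance total."""
    pairs = (p for q in query_descriptors for p in map_descriptor(q, train, k))
    totals = reduce_totals(pairs)
    return min(totals, key=totals.get)
```

Because the totals are accumulated per descriptor, a query with three descriptors and a query with one descriptor pass through the same code path without any quantization to a fixed-size vector. On a real cluster the sorted brute-force search would likely be replaced by an approximate nearest-neighbor index, which this sketch does not attempt.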

Published In

SoICT '16: Proceedings of the 7th Symposium on Information and Communication Technology
December 2016
442 pages
ISBN:9781450348157
DOI:10.1145/3011077
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. LNBNN
  2. big data speech classification
  3. mapreduce
  4. speech classification

Qualifiers

  • Research-article

Conference

SoICT '16

Acceptance Rates

SoICT '16 Paper Acceptance Rate 58 of 132 submissions, 44%;
Overall Acceptance Rate 147 of 318 submissions, 46%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 80
  • Downloads (Last 12 months): 0
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 07 Mar 2025
