SVM for English semantic classification in parallel environment

International Journal of Speech Technology

Abstract

Semantic analysis has long been important to many research efforts and applications. The support vector machine (SVM) is a well-known algorithm used in research and applications across many fields. In this study, we propose a new model that uses an SVM algorithm with Hadoop Map (M)/Reduce (R) for English document-level emotional classification in the Cloudera parallel network environment; Cloudera is also a distributed system. Our English testing data set has 25,000 documents, comprising 12,500 positive reviews and 12,500 negative reviews. Our English training data set has 90,000 sentences, comprising 45,000 positive sentences and 45,000 negative sentences. We tested the new model on the testing data set and achieved 63.7% accuracy of sentiment classification.
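The abstract does not say how sentence-level training is turned into a document-level decision, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a linear SVM without a bias term trained by hinge-loss subgradient descent on bag-of-words features, a map step that emits one polarity vote per sentence, and a reduce step that sums the votes per document. All function names and the toy data are hypothetical, and the map and reduce steps run in one Python process rather than on Hadoop.

```python
from collections import defaultdict


def featurize(sentence):
    """Bag-of-words counts over lowercase, whitespace-split tokens."""
    counts = defaultdict(float)
    for token in sentence.lower().split():
        counts[token] += 1.0
    return counts


def train_linear_svm(sentences, labels, epochs=20, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss.

    Labels are +1 (positive) or -1 (negative). No bias term, for brevity.
    """
    weights = {}
    for _ in range(epochs):
        for sentence, y in zip(sentences, labels):
            x = featurize(sentence)
            margin = y * sum(weights.get(t, 0.0) * v for t, v in x.items())
            # Regularization step: shrink every weight slightly.
            for t in weights:
                weights[t] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1.0:
                for t, v in x.items():
                    weights[t] = weights.get(t, 0.0) + lr * y * v
    return weights


def map_phase(weights, documents):
    """Map (assumed design): emit (doc_id, vote) for every sentence."""
    for doc_id, sentences in documents.items():
        for sentence in sentences:
            x = featurize(sentence)
            score = sum(weights.get(t, 0.0) * v for t, v in x.items())
            # A score of exactly 0 is counted as positive (arbitrary choice).
            yield doc_id, 1 if score >= 0 else -1


def reduce_phase(pairs):
    """Reduce (assumed design): sum the votes per document, ties go positive."""
    totals = defaultdict(int)
    for doc_id, vote in pairs:
        totals[doc_id] += vote
    return {d: "positive" if t >= 0 else "negative" for d, t in totals.items()}


# Toy data, invented for illustration only.
train_sentences = ["great movie", "wonderful acting", "terrible plot", "boring film"]
train_labels = [1, 1, -1, -1]
documents = {
    "doc1": ["great acting", "wonderful movie"],
    "doc2": ["boring plot", "terrible film"],
}

model = train_linear_svm(train_sentences, train_labels)
print(reduce_phase(map_phase(model, documents)))
```

On a real cluster the map step would run independently on each input split, which is why it keeps no state across documents and only emits key/value pairs.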

Corresponding author

Correspondence to Vo Ngoc Phu.

About this article

Cite this article

Phu, V.N., Chau, V.T.N. & Tran, V.T.N. SVM for English semantic classification in parallel environment. Int J Speech Technol 20, 487–508 (2017). https://doi.org/10.1007/s10772-017-9421-5

  • DOI: https://doi.org/10.1007/s10772-017-9421-5
