Abstract
Machine Learning (ML) on massive-scale datasets, commonly called Big Data, has become a challenge for traditional computing and storage technologies; hence, massive-scale ML is an emerging research domain. The Least Squares Twin Support Vector Machine (LSTSVM) is a faster variant of the Support Vector Machine (SVM). However, it suffers from scalability issues and exhibits computational and/or storage bottlenecks on massive datasets. The proposed work designs a scalable solution to LSTSVM, called Distributed LSTSVM (DLSTSVM), built on distributed parallel computing over a cluster of machines. After horizontally partitioning a massive dataset, DLSTSVM trains on it in a distributed parallel fashion and finds two non-parallel hyperplanes as decision boundaries for the two classes. The MapReduce paradigm is used to execute the parallel computation on the partitioned data in a way that averts memory constraints. The proposed technique achieves computational and storage scalability without loss of prediction accuracy.
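The training scheme summarized above can be sketched as a small single-machine simulation of the MapReduce pattern: a map step computes per-partition Gram-matrix contributions, a reduce step sums them, and the two LSTSVM hyperplanes are then obtained by solving two small (d+1)×(d+1) linear systems, following the standard LSTSVM closed-form solution. This is a minimal illustration under stated assumptions, not the authors' implementation; the function names (`partial_grams`, `fit_dlstsvm`, `predict`) and the tiny ridge term added for numerical stability are assumptions introduced here.

```python
import numpy as np

def partial_grams(part_X, part_y):
    """Map step (hypothetical helper): Gram contributions of one horizontal partition."""
    A = part_X[part_y == 1]          # samples of class +1 in this partition
    B = part_X[part_y == -1]         # samples of class -1 in this partition
    E = np.hstack([A, np.ones((len(A), 1))])   # augmented matrix [A  e]
    F = np.hstack([B, np.ones((len(B), 1))])   # augmented matrix [B  e]
    return E.T @ E, F.T @ F, E.T @ np.ones(len(A)), F.T @ np.ones(len(B))

def fit_dlstsvm(partitions, c1=1.0, c2=1.0):
    """Reduce step: sum partial Grams, then solve two small linear systems."""
    EtE = FtF = Ete = Fte = 0
    for X, y in partitions:
        ee, ff, et, ft = partial_grams(X, y)
        EtE = EtE + ee
        FtF = FtF + ff
        Ete = Ete + et
        Fte = Fte + ft
    reg = 1e-8 * np.eye(EtE.shape[0])          # small ridge term for stability
    # LSTSVM closed forms: z1 = -[(1/c1)E'E + F'F]^{-1} F'e,
    #                      z2 =  [(1/c2)F'F + E'E]^{-1} E'e
    z1 = -np.linalg.solve(EtE / c1 + FtF + reg, Fte)
    z2 = np.linalg.solve(FtF / c2 + EtE + reg, Ete)
    return z1, z2                               # each is [w; b]

def predict(X, z1, z2):
    """Assign each sample to the class of its nearer hyperplane."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    d1 = np.abs(Xa @ z1) / np.linalg.norm(z1[:-1])
    d2 = np.abs(Xa @ z2) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)
```

Because E^T E, F^T F, E^T e, and F^T e are all sums over rows, summing per-partition contributions reproduces the centralized solution exactly; this is what makes horizontal partitioning lossless in prediction accuracy while each worker only ever holds its own slice of the data.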
© 2019 Springer Nature Switzerland AG
Cite this paper
Prasad, B.R., Agarwal, S. (2019). Scalable Least Square Twin Support Vector Machine Learning. In: Ordonez, C., Song, I.Y., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds.) Big Data Analytics and Knowledge Discovery. DaWaK 2019. Lecture Notes in Computer Science, vol. 11708. Springer, Cham. https://doi.org/10.1007/978-3-030-27520-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27519-8
Online ISBN: 978-3-030-27520-4