
Scalable Least Square Twin Support Vector Machine Learning

  • Conference paper

Big Data Analytics and Knowledge Discovery (DaWaK 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11708)

Abstract

Machine Learning (ML) on massive datasets, commonly called Big Data, challenges traditional computing and storage technologies, making massive-scale ML an emerging research domain. Least Square Twin Support Vector Machine (LSTSVM) is a faster variant of the Support Vector Machine (SVM); however, it suffers from scalability issues and exhibits computational and/or storage bottlenecks on massive datasets. This work proposes a scalable solution to LSTSVM, called Distributed LSTSVM (DLSTSVM), built on distributed parallel computing over a cluster of machines. After horizontally partitioning a massive dataset, DLSTSVM trains on the partitions in distributed parallel fashion and finds two non-parallel hyperplanes as decision boundaries for the two classes. The MapReduce paradigm is used to execute the parallel computation on the partitioned data in a way that averts memory constraints. The proposed technique achieves computational and storage scalability without losing prediction accuracy.
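The partition-and-aggregate idea in the abstract can be sketched on a single machine. In linear LSTSVM the two hyperplanes are obtained from two small linear systems built solely from Gram-style statistics (EᵀE, FᵀF, Eᵀe, Fᵀe), and these statistics are elementwise sums over horizontal partitions of the data, so each mapper can emit local statistics and a reducer can simply add them; only (d+1)×(d+1) matrices ever need to be held centrally. The sketch below is an illustration under these assumptions, with a Python list of partitions standing in for a MapReduce cluster; the function names are hypothetical, not from the paper.

```python
import numpy as np

def map_partition(X_part, y_part):
    # Mapper: compute local sufficient statistics for one horizontal
    # partition. A bias column is appended so [w; b] is solved jointly.
    A = X_part[y_part == 1]            # class +1 samples in this partition
    B = X_part[y_part == -1]           # class -1 samples in this partition
    E = np.hstack([A, np.ones((len(A), 1))])
    F = np.hstack([B, np.ones((len(B), 1))])
    return (E.T @ E, F.T @ F,
            E.T @ np.ones(len(A)), F.T @ np.ones(len(B)))

def reduce_stats(stats):
    # Reducer: Gram matrices of a horizontally partitioned matrix are
    # additive, so an elementwise sum reconstructs the global statistics.
    return [sum(s) for s in zip(*stats)]

def dlstsvm_fit(partitions, c1=1.0, c2=1.0, eps=1e-8):
    # Driver: aggregate mapper outputs, then solve the two standard
    # LSTSVM linear systems; only small (d+1)x(d+1) matrices appear here.
    EtE, FtF, Ete, Fte = reduce_stats(
        [map_partition(X, y) for X, y in partitions])
    d = EtE.shape[0]
    z1 = -np.linalg.solve(FtF + (1.0 / c1) * EtE + eps * np.eye(d), Fte)
    z2 = np.linalg.solve(EtE + (1.0 / c2) * FtF + eps * np.eye(d), Ete)
    return z1, z2                      # [w1; b1] and [w2; b2]

def predict(X, z1, z2):
    # Assign each point to the class whose hyperplane it lies nearer to.
    Xa = np.hstack([X, np.ones((len(X), 1))])
    d1 = np.abs(Xa @ z1) / np.linalg.norm(z1[:-1])
    d2 = np.abs(Xa @ z2) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)
```

Because the mappers exchange only aggregated statistics, adding partitions changes neither the reducer's memory footprint nor the final model: the distributed statistics are exactly equal to those computed from the full dataset, which is why accuracy is preserved.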



Author information

Correspondence to Sonali Agarwal.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Prasad, B.R., Agarwal, S. (2019). Scalable Least Square Twin Support Vector Machine Learning. In: Ordonez, C., Song, I.Y., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds.) Big Data Analytics and Knowledge Discovery. DaWaK 2019. Lecture Notes in Computer Science, vol. 11708. Springer, Cham. https://doi.org/10.1007/978-3-030-27520-4_17


  • DOI: https://doi.org/10.1007/978-3-030-27520-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27519-8

  • Online ISBN: 978-3-030-27520-4

  • eBook Packages: Computer Science (R0)
