DOI: 10.1145/3029580.3029587

Dataflow Acceleration of scikit-learn Gaussian Process Regression

Published: 25 January 2017

Abstract

The big data revolution has sparked the widespread use of predictive data analytics based on sophisticated machine learning tasks. Fast data analysis has become very important, which pushes software developers and computer architects to deliver more efficient design solutions that meet the increased performance requirements. Dataflow computing engines from Maxeler have recently emerged as a promising way of performing high-performance computation using FPGA devices. In this paper, we focus on exploiting Maxeler's dataflow computing to accelerate Gaussian Process Regression from the scikit-learn Python library, one of the most computationally intensive machine learning algorithms and one with poor scaling characteristics. Through extensive analysis over diverse datasets, we identify which NumPy and SciPy functions form the major performance bottlenecks and should therefore be implemented in a dataflow acceleration engine, and we then discuss the mapping decisions that enable the generation of parameterized dataflow engines. Finally, we show that the proposed acceleration solution delivers significant speedups for the examined datasets and scales well with increasing dataset size.
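The abstract does not name the specific NumPy and SciPy functions it identifies as bottlenecks. As a rough illustration of why Gaussian Process Regression scales poorly, the sketch below implements the standard Cholesky-based formulation directly with NumPy and SciPy. It is not the paper's code or scikit-learn's implementation: the helper names `rbf_kernel` and `gpr_fit_predict`, the squared-exponential kernel, and the parameter values are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.linalg import cholesky, cho_solve
from scipy.spatial.distance import cdist


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel (an assumed kernel choice, for illustration)."""
    sq_dists = cdist(A, B, metric="sqeuclidean")  # pairwise squared distances
    return np.exp(-0.5 * sq_dists / length_scale**2)


def gpr_fit_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-6):
    """Return the GP posterior mean and variance at X_test (hypothetical helper)."""
    n = X_train.shape[0]
    # n x n kernel matrix with a small diagonal term for numerical stability.
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(n)
    # Cholesky factorization costs O(n^3) and dominates for large n.
    L = cholesky(K, lower=True)
    # Solve K @ alpha = y with two triangular solves, O(n^2).
    alpha = cho_solve((L, True), y_train)
    K_star = rbf_kernel(X_test, X_train, length_scale)
    mean = K_star @ alpha
    # Posterior variance; the prior variance k(x, x) is 1 for this kernel.
    v = cho_solve((L, True), K_star.T)
    var = 1.0 - np.sum(K_star * v.T, axis=1)
    return mean, var


# Toy data: noiseless samples of sin(x) on [-3, 3].
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(50, 1))
y = np.sin(X).ravel()
X_new = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
mean, var = gpr_fit_predict(X, y, X_new)
```

In this formulation the pairwise-distance computation and the kernel matrix grow quadratically with the number of training samples, while the factorization grows cubically, which is the kind of scaling behaviour that motivates offloading such kernels to an accelerator.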

Cited By

  • (2024) Deep Gaussian Process for Channel Estimation in LIS-Assisted mm-Wave Massive MIMO Systems. IEEE Transactions on Vehicular Technology 73:12 (19797-19802). DOI: 10.1109/TVT.2024.3403910. Online publication date: Dec-2024
  • (2020) An FPGA Implementation of a Gaussian Process Based Predictor for Sequential Time Series Data. 2020 Eighth International Symposium on Computing and Networking Workshops (CANDARW) (445-449). DOI: 10.1109/CANDARW51189.2020.00091. Online publication date: Nov-2020
  • (2017) AEGLE's Cloud Infrastructure for Resource Monitoring and Containerized Accelerated Analytics. 2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (362-367). DOI: 10.1109/ISVLSI.2017.70. Online publication date: Jul-2017

Published In

PARMA-DITAM '17: Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms
January 2017
43 pages
ISBN:9781450348775
DOI:10.1145/3029580
© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PARMA-DITAM '17

Acceptance Rates

PARMA-DITAM '17 Paper Acceptance Rate 6 of 15 submissions, 40%;
Overall Acceptance Rate 11 of 24 submissions, 46%
