Dataflow Acceleration of scikit-learn Gaussian Process Regression

Authors:

PARMA-DITAM '17: Proceedings of the 8th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 6th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms

Pages 1 - 6

https://doi.org/10.1145/3029580.3029587

Published: 25 January 2017

Abstract

The big data revolution has sparked the widespread use of predictive data analytics based on sophisticated machine learning tasks. Fast data analysis has become very important, which pushes software developers and computer architects to deliver more efficient design solutions that can meet the increased performance requirements. Dataflow computing engines from Maxeler have recently emerged as a promising way of performing high-performance computation on FPGA devices. In this paper, we focus on exploiting Maxeler's dataflow computing to accelerate Gaussian Process Regression from the scikit-learn Python library, one of the most computationally intensive machine learning algorithms, with poor scaling characteristics. Through extensive analysis over diverse datasets, we identify which NumPy and SciPy functions form the major performance bottlenecks and should be implemented in a dataflow acceleration engine, and we then discuss the mapping decisions that enable the generation of parameterized dataflow engines. Finally, we show that the proposed acceleration solution delivers significant speedups for the examined datasets, while also exhibiting good scalability with respect to increased dataset sizes.
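To make the scaling remark concrete, the following is a minimal sketch of exact Gaussian Process Regression in its textbook Cholesky-based form. It is an illustration under stated assumptions, not the authors' implementation and not scikit-learn's code: the abstract does not name the bottleneck functions, and all function names here (rbf_kernel, cholesky, gp_fit, gp_predict) are hypothetical. It is written in plain Python, without NumPy or SciPy, so that the cubic-cost factorization step is visible as an explicit triple loop.

```python
import math


def rbf_kernel(xs, ys, length_scale=1.0):
    """Squared-exponential kernel matrix between two lists of scalar inputs."""
    return [[math.exp(-0.5 * ((x - y) / length_scale) ** 2) for y in ys]
            for x in xs]


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite).

    The three nested loops (i, j, and the sum over k) give the O(n^3) cost
    that dominates exact GP training as the number of samples n grows.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_lower(low, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        s = sum(low[i][k] * z[k] for k in range(i))
        z.append((b[i] - s) / low[i][i])
    return z


def solve_lower_transposed(low, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / low[i][i]
    return x


def gp_fit(x_train, y_train, noise=1e-6, length_scale=1.0):
    """Compute alpha = (K + noise * I)^{-1} y via a Cholesky factorization."""
    k = rbf_kernel(x_train, x_train, length_scale)
    for i in range(len(k)):
        k[i][i] += noise  # assumed small noise term, keeps K well conditioned
    low = cholesky(k)
    return solve_lower_transposed(low, solve_lower(low, y_train))


def gp_predict(x_train, alpha, x_test, length_scale=1.0):
    """Posterior mean at x_test: K(x_test, x_train) @ alpha."""
    k_star = rbf_kernel(x_test, x_train, length_scale)
    return [sum(w * a for w, a in zip(row, alpha)) for row in k_star]
```

For example, fitting on xs = [0.0, 1.0, 2.0, 3.0, 4.0] with targets math.sin(x) and calling gp_predict(xs, gp_fit(xs, ys), xs) should reproduce the training targets closely, since the assumed noise term is tiny. In a NumPy/SciPy-based stack the same steps are delegated to library routines; which of them dominate in practice is what the paper's profiling is said to determine.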

Published In

January 2017

43 pages

ISBN:9781450348775

DOI:10.1145/3029580

© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

PARMA-DITAM '17

PARMA-DITAM '17: 8th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 6th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms

January 25, 2017

Stockholm, Sweden

Acceptance Rates

PARMA-DITAM '17 Paper Acceptance Rate 6 of 15 submissions, 40%;

Overall Acceptance Rate 11 of 24 submissions, 46%
