Abstract
A variety of applications employ ensemble learning models, built from collections of decision trees, to quickly and accurately classify an input based on its vector of features. In this paper, we discuss the implementation of one such method, Random Forests, as the first machine learning algorithm to be executed on the Automata Processor (AP). The AP is an upcoming reconfigurable co-processor accelerator that supports the execution of numerous automata in parallel against a single input data-flow. Owing to this execution model, our approach is fundamentally different from prior work: rather than the memory-bound tree-traversal algorithms used on CPUs, we translate Random Forest models into pipelined designs that use multiple automata to check all of the required thresholds independently and in parallel. We also describe techniques to handle floating-point feature values, which are not supported natively in hardware, to pipeline the execution stages, and to compress automata for the fastest execution times. The net result is a solution that, when evaluated on two applications, handwritten digit recognition and sentiment analysis, produces up to 63x and 93x speed-ups respectively over single-core state-of-the-art CPU-based solutions. We foresee these algorithmic techniques being useful not only for accelerating other applications that employ Random Forests, but also for implementing other machine learning methods on this novel architecture.
T. Tracy II and Y. Fu—Both authors contributed equally to this work.
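The core idea described in the abstract, replacing sequential tree traversal with independent, parallel checks of every threshold, can be illustrated in software. The sketch below is a hypothetical illustration, not the authors' implementation: it assumes float features are quantized onto the AP's 8-bit symbol alphabet (255 usable symbols, with one reserved for the input delimiter, per the paper's note), and that each root-to-leaf path is flattened into a set of per-feature interval constraints that can all be tested independently. The function names (`quantize`, `path_matches`, `classify`) are illustrative.

```python
# Hypothetical sketch of Random Forest inference recast as per-path
# interval checks, mirroring the paper's idea of evaluating all
# thresholds in parallel rather than traversing each tree.
import numpy as np

def quantize(features, lo, hi, levels=255):
    """Map float features onto 0..levels-1 symbol values.

    One symbol of the AP's 256-wide STE alphabet is assumed to be
    reserved for the input delimiter, leaving 255 usable levels."""
    scaled = (np.asarray(features, dtype=float) - lo) / (hi - lo)
    return np.clip((scaled * (levels - 1)).round().astype(int), 0, levels - 1)

def path_matches(symbols, path):
    """A path 'fires' iff every feature symbol falls in its interval --
    the automaton analogue of one complete root-to-leaf traversal."""
    return all(low <= symbols[f] <= high for f, low, high in path)

def classify(symbols, forest):
    """forest: list of trees; each tree is a list of (path, label) pairs.

    Exactly one path per tree matches any input; the forest's output
    is a majority vote over the matching leaves."""
    votes = [label for tree in forest
             for path, label in tree if path_matches(symbols, path)]
    return max(set(votes), key=votes.count)

# Toy forest: two depth-1 trees over a single feature, each split
# expressed as two disjoint intervals covering the symbol range.
tree_a = [([(0, 0, 127)], "neg"), ([(0, 128, 254)], "pos")]
tree_b = [([(0, 0, 99)], "neg"), ([(0, 100, 254)], "pos")]
symbols = quantize([0.9], lo=0.0, hi=1.0)
print(classify(symbols, [tree_a, tree_b]))  # -> pos
```

Because every interval check depends only on the input symbols, all paths across all trees can evaluate concurrently, which is what makes this formulation a natural fit for the AP's parallel-automata execution model.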
Notes
1. The symbol space of an STE minus one symbol reserved for the delimiter.
© 2016 Springer International Publishing Switzerland
Cite this paper
Tracy, T., Fu, Y., Roy, I., Jonas, E., Glendenning, P. (2016). Towards Machine Learning on the Automata Processor. In: Kunkel, J., Balaji, P., Dongarra, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science(), vol 9697. Springer, Cham. https://doi.org/10.1007/978-3-319-41321-1_11
Print ISBN: 978-3-319-41320-4
Online ISBN: 978-3-319-41321-1