Abstract
This paper investigates low-energy, low-power hardware models and processor architectures for real-time object recognition in power-constrained autonomous systems and robots. Recent developments show that deep convolutional neural networks currently define the state of the art in classification accuracy. In this article we propose the use of a different type of deep neural network, stacked autoencoders, and show that, with a number of layers and nodes limited to fit low-power accelerators such as mobile GPUs and FPGAs, we can still achieve both classification accuracy not far from the state of the art and a high number of processed frames per second. We present experiments on the color CIFAR-10 dataset, which enables adapting the architecture to a live camera feed. A further novelty, proposed here for the first time, is that the training phase can also be performed on these low-power devices, instead of the usual approach of training on a desktop CPU or GPU and only later running the trained network on the FPGA. This makes it possible to incorporate new functionalities, for example a robot that learns at runtime.
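The core idea summarized above, a small stacked autoencoder built by greedy layer-wise training (each layer learns to reconstruct the output of the previous one), can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the layer sizes, learning rate, and toy input are assumptions chosen only to show the stacking mechanism at a scale that could plausibly fit a constrained accelerator.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, hidden, epochs=200, lr=0.5):
    """Train one tied-weight autoencoder layer by plain gradient
    descent on the squared reconstruction error."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_in, hidden))  # shared encode/decode weights
    b = np.zeros(hidden)                      # encoder bias
    c = np.zeros(n_in)                        # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b)        # encode
        R = sigmoid(H @ W.T + c)      # decode (tied weights)
        err = R - X                   # reconstruction error
        dR = err * R * (1.0 - R)      # gradient at decoder pre-activation
        dH = (dR @ W) * H * (1.0 - H) # backpropagated to encoder pre-activation
        gW = X.T @ dH + dR.T @ H      # W appears in both encode and decode paths
        W -= lr * gW / len(X)
        b -= lr * dH.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    return W, b

# Toy data standing in for flattened image features (64 samples, 16 dims).
X = rng.random((64, 16))

# Greedy layer-wise stacking: layer 2 trains on layer 1's encoding.
W1, b1 = train_autoencoder(X, 8)
H1 = sigmoid(X @ W1 + b1)
W2, b2 = train_autoencoder(H1, 4)
H2 = sigmoid(H1 @ W2 + b2)
print(H2.shape)  # (64, 4)
```

Because each layer is trained independently on the previous layer's output, the working set at any moment is one small weight matrix, which is what makes this scheme a plausible fit for on-device training on memory-limited accelerators.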
Acknowledgments
This work has been supported by Instituto de Telecomunicações and Fundação para a Ciência e a Tecnologia (FCT) under grant UID/EEA/50008/2013.
Cite this article
Maria, J., Amaro, J., Falcao, G. et al. Stacked Autoencoders Using Low-Power Accelerated Architectures for Object Recognition in Autonomous Systems. Neural Process Lett 43, 445–458 (2016). https://doi.org/10.1007/s11063-015-9430-9