DOI: 10.1145/2485922.2485923
research-article

Continuous real-world inputs can open up alternative accelerator designs

Published: 23 June 2013

Abstract

Motivated by energy constraints, future heterogeneous multi-cores may contain a variety of accelerators, each targeting a subset of the application spectrum. Beyond energy, the growing number of faults steers accelerator research towards fault-tolerant accelerators.
In this article, we investigate a fault-tolerant and energy-efficient accelerator for signal processing applications. We depart from traditional designs by introducing an accelerator that relies on unary coding, a concept well suited to the continuous real-world inputs of signal processing applications. Unary coding enables a number of atypical micro-architecture choices that reduce area cost and energy; moreover, unary coding provides graceful output degradation as the number of transient faults increases.
We introduce a configurable hybrid digital/analog micro-architecture capable of implementing a broad set of signal processing applications based on these concepts, together with a back-end optimizer which takes advantage of the special nature of these applications. For a set of five signal applications, we explore the different design tradeoffs and obtain an accelerator with an area cost of 1.63 mm². On average, this accelerator requires only 2.3% of the energy of an Atom-like core to implement similar tasks. We then evaluate the accelerator's resilience to transient faults, and its ability to trade accuracy for energy savings.
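
The graceful-degradation property the abstract attributes to unary coding can be illustrated with a small software sketch. It is not the paper's micro-architecture: it assumes a simple thermometer-style unary code (a value is the count of 1s in a bit vector) and compares it with a positional binary code under a single bit flip. All function names here are hypothetical and chosen for illustration only.

```python
def unary_encode(value, width):
    """Thermometer-style unary code (an assumption): `value` ones, then zeros."""
    return [1] * value + [0] * (width - value)


def unary_decode(bits):
    """Decode by counting ones, so every bit carries the same weight."""
    return sum(bits)


def binary_encode(value, width):
    """Positional binary code, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def binary_decode(bits):
    """Decode a binary code in which bit i carries weight 2**i."""
    return sum(b << i for i, b in enumerate(bits))


def flip(bits, position):
    """Model a transient fault as a single flipped bit."""
    return [b ^ 1 if i == position else b for i, b in enumerate(bits)]


def worst_case_error(encode, decode, value, width):
    """Largest output error caused by any single bit flip."""
    code = encode(value, width)
    return max(abs(decode(flip(code, i)) - value) for i in range(width))


if __name__ == "__main__":
    # Both codes cover the range 0..255; unary needs 255 bits, binary 8.
    print(worst_case_error(unary_encode, unary_decode, 100, 255))   # 1
    print(worst_case_error(binary_encode, binary_decode, 100, 8))   # 128
```

In this sketch a single fault changes the unary value by at most 1, whereas a flip of the most significant binary bit changes the value by 128. The price is a much wider code, which is why the area and energy tradeoffs explored in the paper matter.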



Published In

ISCA '13: Proceedings of the 40th Annual International Symposium on Computer Architecture
June 2013
686 pages
ISBN:9781450320795
DOI:10.1145/2485922
  • ACM SIGARCH Computer Architecture News, Volume 41, Issue 3
    ISCA '13
    June 2013
    666 pages
    ISSN:0163-5964
    DOI:10.1145/2508148
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Conference

ISCA '13

Acceptance Rates

ISCA '13 Paper Acceptance Rate 56 of 288 submissions, 19%;
Overall Acceptance Rate 543 of 3,203 submissions, 17%
