Neural Network Guided Spatial Fault Resilience in Array Processors

Journal of Electronic Testing

Abstract

Computing with large die-size graphical processors, which contain huge arrays of identical structures, faces many challenges in the late CMOS era because of spatial non-idealities arising from chip-to-chip and within-chip variation of MOSFET threshold voltage. In this paper, we propose a software framework that uses machine learning for in-situ prediction and correction of computation corrupted by threshold voltage variation of transistors. A fully connected cascade feed-forward (FCCFF) neural network (NN) is trained in a semi-supervised manner. The FCCFF-NN then creates an accurate spatial map of faulty processing elements (PEs), which are avoided in computing. Besides spatial faults, transient errors (such as single-event upsets) are also tracked and corrected if the number of affected PEs is large enough to cause noticeable computing errors. For experimental validation, we consider a 256 × 256 PE array. Each PE comprises an add-accumulate-multiply (AAM) block with three 8-bit registers (two for inputs and a third for storing the computed result). One thousand instances of this processor array are created, and the PEs in each instance are randomly perturbed with threshold voltage variation. Common image processing operations such as low-pass filtering and edge enhancement are performed on each of these 1,000 instances. A fraction of the resulting images (about 10 %) is used to train the NN for spatial non-idealities. Based on this training, the NN accurately predicts the spatial extremities in 95 % of the remaining 90 % of cases. The proposed NN-based error tolerance produces superior-quality processed images whose degradation is no longer visually perceptible.
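The idea of avoiding mapped faulty PEs can be illustrated with a minimal sketch. It is not the authors' implementation: the 16 × 16 array size, the 5 % fault rate, the bit-flip fault model, the 3 × 3 mean filter, and the names `run_array`, `mean_filter_pixel`, and `count_errors` are all assumptions made for illustration, and the fault map is simply given here rather than predicted by an FCCFF-NN.

```python
import random

def mean_filter_pixel(img, r, c):
    """3x3 low-pass (mean) filter at (r, c), ignoring neighbours outside the image."""
    n = len(img)
    total, count = 0, 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n:
                total += img[rr][cc]
                count += 1
    return total // count

def run_array(img, fault_map, avoid_faulty):
    """Assign one pixel to each PE; a faulty PE corrupts its 8-bit result.

    With avoid_faulty=True the work of a mapped faulty PE is assumed to be
    rerouted to a healthy PE, so the result stays correct.
    """
    n = len(img)
    out = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            value = mean_filter_pixel(img, r, c)
            if fault_map[r][c] and not avoid_faulty:
                value ^= 0x80  # hypothetical fault model: result MSB flips
            out[r][c] = value
    return out

def count_errors(a, b):
    """Number of pixels that differ between two equally sized images."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

rng = random.Random(0)
N = 16  # illustrative size; the paper's array is 256 x 256
image = [[rng.randrange(256) for _ in range(N)] for _ in range(N)]
# Stand-in for the NN-predicted spatial map of faulty PEs (assumed given).
fault_map = [[rng.random() < 0.05 for _ in range(N)] for _ in range(N)]
no_faults = [[False] * N for _ in range(N)]

golden = run_array(image, no_faults, avoid_faulty=False)
uncorrected = run_array(image, fault_map, avoid_faulty=False)
corrected = run_array(image, fault_map, avoid_faulty=True)
n_faulty = sum(map(sum, fault_map))

print("faulty PEs:", n_faulty)
print("pixel errors without avoidance:", count_errors(uncorrected, golden))
print("pixel errors with avoidance:", count_errors(corrected, golden))
```

In this toy model every faulty PE corrupts exactly one output pixel, so the benefit of an accurate fault map is directly visible as the error count dropping to zero once faulty PEs are avoided. How closely this matches the paper's fault behaviour and remapping scheme is not stated in the abstract.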

Acknowledgments

This research was supported in part by the National Science Foundation Grant CCF-1116213. The authors thank Professor Bogdan Wilamowski of the ECE Department at Auburn University for many useful discussions, for providing the neural network training software, and for his continued interest in this work.

Author information

Correspondence to Suraj Sindia.

Additional information

Responsible Editor: M. Violante

About this article

Cite this article

Sindia, S., Agrawal, V.D. Neural Network Guided Spatial Fault Resilience in Array Processors. J Electron Test 29, 473–483 (2013). https://doi.org/10.1007/s10836-013-5394-8

  • DOI: https://doi.org/10.1007/s10836-013-5394-8
