Abstract
Computing with large die-size graphical processors, which contain huge arrays of identical structures, is fraught with challenges in the late CMOS era because of spatial non-idealities arising from chip-to-chip and within-chip variation of MOSFET threshold voltage. In this paper, we propose a software framework that uses machine learning for in-situ prediction and correction of computations corrupted by threshold voltage variation of transistors. Semi-supervised training is imparted to a fully connected cascade feed-forward (FCCFF) neural network (NN). The FCCFF-NN then creates an accurate spatial map of faulty processing elements (PEs), which are avoided in computing. Besides correcting spatial faults, transient errors (such as single-event upsets) are also tracked and corrected if the number of affected PEs is large enough to cause noticeable computing errors. For experimental validation, we consider a 256 × 256 PE array. Each PE comprises an add-accumulate-multiply (AAM) block with three 8-bit registers (two for inputs and a third for storing the computed result). One thousand instances of this processor array are created, and the PEs in each instance are randomly perturbed with threshold voltage variation. Common image processing operations such as low-pass filtering and edge enhancement are performed on each of these 1,000 instances. A fraction of these images (about 10 %) is used to train the NN for spatial non-idealities. Based on this training, the NN accurately predicts the spatial extremities in 95 % of the remaining 90 % of cases. The proposed NN-based error tolerance produces superior-quality processed images whose degradation is no longer visually perceptible.
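The flow described in the abstract (perturb PEs, run an image operation, map the faulty PEs, then avoid them) can be illustrated with a small sketch. This is not the authors' method: the FCCFF neural network is replaced here by a plain comparison against a fault-free reference, and the array size, the threshold-voltage spread, the fault limit, and the fault behaviour (a flipped most significant bit) are all assumptions chosen only for illustration. All function names are hypothetical.

```python
import random

N = 16            # demo array side (assumption); the paper uses 256 x 256
SIGMA_VT = 0.03   # assumed std. dev. of threshold-voltage shift, in volts
VT_LIMIT = 0.06   # assumed shift beyond which a PE miscomputes

def make_array(rng):
    """Return an N x N grid of booleans: True where the PE is faulty."""
    return [[abs(rng.gauss(0.0, SIGMA_VT)) > VT_LIMIT for _ in range(N)]
            for _ in range(N)]

def box_filter_pixel(img, r, c):
    """3x3 mean (low-pass) at (r, c) with clamped edges, 8-bit result."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr = min(max(r + dr, 0), N - 1)
            cc = min(max(c + dc, 0), N - 1)
            total += img[rr][cc]
    return total // 9

def run_on_array(img, faulty):
    """PE (r, c) computes output pixel (r, c); a faulty PE flips its MSB."""
    return [[box_filter_pixel(img, r, c) ^ (0x80 if faulty[r][c] else 0)
             for c in range(N)] for r in range(N)]

def estimate_fault_map(raw, golden):
    """Stand-in for the NN: mark PEs whose output differs from the reference."""
    return [[raw[r][c] != golden[r][c] for c in range(N)] for r in range(N)]

def correct(img, raw, fault_map):
    """Avoid mapped PEs by recomputing their pixels on a healthy PE."""
    return [[box_filter_pixel(img, r, c) if fault_map[r][c] else raw[r][c]
             for c in range(N)] for r in range(N)]

rng = random.Random(0)
faulty = make_array(rng)                                  # hidden ground truth
img = [[rng.randrange(256) for _ in range(N)] for _ in range(N)]
no_faults = [[False] * N for _ in range(N)]
golden = run_on_array(img, no_faults)                     # fault-free reference
raw = run_on_array(img, faulty)                           # corrupted result
fault_map = estimate_fault_map(raw, golden)
fixed = correct(img, raw, fault_map)
```

Because the assumed fault always changes the output, a single test image recovers the full map in this sketch. The paper's setting is harder, since a fault-free reference is not available for every chip instance, which is the gap the trained NN is meant to fill.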
Acknowledgments
This research was supported in part by the National Science Foundation Grant CCF-1116213. The authors thank Professor Bogdan Wilamowski of the ECE Department at Auburn University for many useful discussions, for providing the neural network training software, and for his continued interest in this work.
Additional information
Responsible Editor: M. Violante
About this article
Cite this article
Sindia, S., Agrawal, V.D. Neural Network Guided Spatial Fault Resilience in Array Processors. J Electron Test 29, 473–483 (2013). https://doi.org/10.1007/s10836-013-5394-8
DOI: https://doi.org/10.1007/s10836-013-5394-8