Abstract
Efficient utilization of on-chip Static Random Access Memory (SRAM) space is important in processor core design for modern Field Programmable Gate Array (FPGA) based Digital Signal Processing (DSP) applications. In the proposed High-performance Approximate Single Port (HASP) SRAM architecture, a significant amount of data is stored to achieve high performance. The constraints associated with high performance are balanced to provide high accuracy, high speed, low power and area efficiency. In the proposed High-performance Approximate Sub-Bank Dual Port (HASBDP1 and HASBDP2) memory architectures, HASP is employed and modified to work as a True Dual Port (TDP) SRAM with energy and area efficiency. The performance of the proposed memories is investigated by comparing their speed, area and power with those of existing approaches. The proposed HASP SRAM consumes 14.99% less power and uses thirteen fewer logic elements than the conventional SP SRAM. On these design metrics, the proposed HASBDP SRAMs outperform the conventional TDP and sub-bank DP SRAM approaches. The proposed HASBDP2 exhibits 29.09% and 22.37% higher peak signal-to-noise ratio (PSNR), and 32.94% and 28.48% higher structural similarity index (SSIM), than the truncated least significant bit and static segment on-chip approximate memories, respectively.
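To make the PSNR comparison concrete, the following is a minimal sketch of how image quality could be measured for a truncated least significant bit memory, one of the baselines named above. It is an illustration under stated assumptions, not the authors' architecture or evaluation setup: the helper names `truncate_lsbs` and `psnr`, the 8-bit pixel width, and the synthetic pixel data are all hypothetical, and SSIM is not computed here.

```python
import math

def truncate_lsbs(pixel: int, k: int) -> int:
    """Zero the k least significant bits of an 8-bit pixel.

    Assumption: this models a memory that simply does not store the k LSBs.
    """
    mask = (0xFF >> k) << k
    return pixel & mask

def psnr(original, approximated, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(approximated):
        raise ValueError("sequences must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, approximated)) / len(original)
    if mse == 0:
        return math.inf  # identical data: no noise
    return 10 * math.log10(peak ** 2 / mse)

# Synthetic test data (hypothetical): every 8-bit value exactly once.
pixels = [(37 * i + 11) % 256 for i in range(256)]

# PSNR after dropping k = 0..4 least significant bits from every pixel.
psnr_by_k = {
    k: psnr(pixels, [truncate_lsbs(p, k) for p in pixels])
    for k in range(5)
}

for k, value in psnr_by_k.items():
    print(f"k={k}: PSNR = {value:.2f} dB")
```

Dropping more bits saves storage but lowers PSNR, which is the trade-off that approximate memories of this kind aim to manage.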
Additional information
Responsible Editor: S. T. Chakradhar
About this article
Cite this article
Jothin, R., Mohamed, M.P. High Performance Approximate Memories for Image Processing Applications. J Electron Test 36, 419–428 (2020). https://doi.org/10.1007/s10836-020-05879-0
DOI: https://doi.org/10.1007/s10836-020-05879-0