research-article
DOI: 10.1145/2593069.2593082

An Approximate Computing Technique for Reducing the Complexity of a Direct-Solver for Sparse Linear Systems in Real-Time Video Processing

Published: 01 June 2014

Abstract

Many video processing algorithms are formulated as least-squares problems that result in large, sparse linear systems. Solving such systems in real time is computationally demanding. This paper focuses on reducing the computational complexity of a direct solver based on Cholesky decomposition. Our approximation scheme builds on the observation that, in well-conditioned problems, many elements of the decomposition nearly vanish. Such elements may be pruned from the dependency graph with only mild accuracy degradation. Using an example from image-domain warping, we show that pruning reduces the number of operations per solve by over 75%, resulting in significant savings in computing time, area, or energy.
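
The pruning idea can be illustrated with a minimal Python sketch. It is not the paper's method: it thresholds an already computed dense Cholesky factor, whereas the abstract describes pruning elements from the dependency graph so that the corresponding operations are never executed. The test matrix (a 2D grid Laplacian plus a diagonal regularization term), the function names `grid_system` and `pruned_cholesky_solve`, and the threshold `tol` are all assumptions made for illustration.

```python
import numpy as np


def grid_system(n, reg=1.0):
    """Dense SPD matrix for an n x n pixel grid (hypothetical test problem).

    Each pixel is coupled to its 4 neighbours (2D Laplacian stencil) and
    `reg` is added to the diagonal to keep the system well conditioned.
    """
    size = n * n
    A = np.zeros((size, size))
    for row in range(n):
        for col in range(n):
            k = row * n + col
            A[k, k] = 4.0 + reg
            if row > 0:  # neighbour above
                A[k, k - n] = A[k - n, k] = -1.0
            if col > 0:  # neighbour to the left
                A[k, k - 1] = A[k - 1, k] = -1.0
    return A


def pruned_cholesky_solve(A, b, tol):
    """Solve A x = b with a Cholesky factor whose tiny entries are zeroed."""
    L = np.linalg.cholesky(A)  # A = L @ L.T, L lower triangular
    # Assumed pruning rule: drop every entry with magnitude below tol.
    L_pruned = np.where(np.abs(L) < tol, 0.0, L)
    # Forward and backward substitution; a general dense solver is used
    # here for brevity instead of a dedicated triangular solver.
    y = np.linalg.solve(L_pruned, b)
    x = np.linalg.solve(L_pruned.T, y)
    return x, np.count_nonzero(L), np.count_nonzero(L_pruned)


if __name__ == "__main__":
    A = grid_system(12)
    b = np.ones(A.shape[0])
    x_exact = np.linalg.solve(A, b)
    x_approx, nnz_full, nnz_pruned = pruned_cholesky_solve(A, b, tol=1e-4)
    rel_err = np.linalg.norm(x_approx - x_exact) / np.linalg.norm(x_exact)
    print(f"nonzeros in factor: {nnz_full} -> {nnz_pruned}")
    print(f"relative error of pruned solve: {rel_err:.2e}")
```

The nonzero count is only a rough proxy for the number of operations per solve. A dense sketch like this one still computes the full factor first, so it shows the accuracy trade-off but not the savings in time, area, or energy.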


Published In

DAC '14: Proceedings of the 51st Annual Design Automation Conference
June 2014
1249 pages
ISBN: 9781450327305
DOI: 10.1145/2593069
Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Approximate Computing
  2. Cholesky Decomposition
  3. Hardware Accelerator
  4. Video Processing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DAC '14

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


Cited By

  • (2023) Approximation Opportunities in Edge Computing Hardware: A Systematic Literature Review. ACM Computing Surveys 55:12 (1-49). DOI: 10.1145/3572772. Online publication date: 3-Mar-2023
  • (2023) Energy-Efficient Hardware Implementation of Fully Connected Artificial Neural Networks Using Approximate Arithmetic Blocks. Circuits, Systems, and Signal Processing 42:9 (5428-5452). DOI: 10.1007/s00034-023-02363-w. Online publication date: 24-Apr-2023
  • (2020) Efficient Hardware Implementation of Artificial Neural Networks Using Approximate Multiply-Accumulate Blocks. 2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (96-101). DOI: 10.1109/ISVLSI49217.2020.00027. Online publication date: Jul-2020
  • (2018) Do Iterative Solvers Benefit from Approximate Computing? An Evaluation Study Considering Orthogonal Approximation Methods. Architecture of Computing Systems – ARCS 2018 (297-310). DOI: 10.1007/978-3-319-77610-1_22. Online publication date: 8-Mar-2018
  • (2017) Energy-efficient and error-resilient iterative solvers for approximate computing. 2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS) (237-239). DOI: 10.1109/IOLTS.2017.8046244. Online publication date: Jul-2017
  • (2016) Applying efficient fault tolerance to enable the preconditioned conjugate gradient solver on approximate computing hardware. 2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT) (21-26). DOI: 10.1109/DFT.2016.7684063. Online publication date: Sep-2016
  • (2015) ApproxEigen. Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (824-830). DOI: 10.5555/2840819.2840934. Online publication date: 2-Nov-2015
  • (2015) DRAM or no-DRAM? Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition (707-712). DOI: 10.5555/2755753.2755915. Online publication date: 9-Mar-2015
  • (2015) ApproxEigen: An approximate computing technique for large-scale eigen-decomposition. 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (824-830). DOI: 10.1109/ICCAD.2015.7372656. Online publication date: Nov-2015
  • (2015) A Novel Method for the Approximation of Multiplierless Constant Matrix Vector Multiplication. Proceedings of the 2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing (EUC) (98-105). DOI: 10.1109/EUC.2015.27. Online publication date: 21-Oct-2015
