An Approximate Computing Technique for Reducing the Complexity of a Direct-Solver for Sparse Linear Systems in Real-Time Video Processing

Authors:

Michael Schaffner,

Frank K. Gürkaynak,

Hubert Kaeslin,

Luca Benini

DAC '14: Proceedings of the 51st Annual Design Automation Conference

Pages 1 - 6

https://doi.org/10.1145/2593069.2593082

Published: 01 June 2014

Abstract

Many video processing algorithms are formulated as least-squares problems that result in large, sparse linear systems. Solving such systems in real time is very demanding. This paper focuses on reducing the computational complexity of a direct Cholesky-decomposition-based solver. Our approximation scheme builds on the observation that, in well-conditioned problems, many elements in the decomposition nearly vanish. Such elements may be pruned from the dependency graph with mild accuracy degradation. Using an example from image-domain warping, we show that pruning reduces the number of operations per solve by over 75%, resulting in significant savings in computing time, area, or energy.
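
The pruning idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it makes several simplifying assumptions: the test problem is a small synthetic grid system (a data term plus a smoothness term), the factorization is a dense textbook Cholesky, and small entries are dropped after the factorization instead of being removed from the dependency graph beforehand. All function names and the threshold `tau` are hypothetical and chosen only for illustration.

```python
import math

def grid_system(n, data_weight=1.0):
    """Dense SPD matrix for an n-by-n grid (hypothetical test problem):
    a data term on the diagonal plus smoothness coupling to neighbours."""
    size = n * n
    A = [[0.0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            i = r * n + c
            A[i][i] += data_weight
            for dr, dc in ((0, 1), (1, 0)):  # right and lower neighbour
                rr, cc = r + dr, c + dc
                if rr < n and cc < n:
                    j = rr * n + cc
                    A[i][i] += 1.0
                    A[j][j] += 1.0
                    A[i][j] -= 1.0
                    A[j][i] -= 1.0
    return A

def cholesky(A):
    """Textbook dense Cholesky: lower-triangular L with A = L L^T."""
    size = len(A)
    L = [[0.0] * size for _ in range(size)]
    for j in range(size):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, size):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def prune(L, tau):
    """Zero every off-diagonal entry of L whose magnitude is below tau."""
    return [[v if (i == j or abs(v) >= tau) else 0.0
             for j, v in enumerate(row)] for i, row in enumerate(L)]

def solve(L, b):
    """Solve L L^T x = b by forward and backward substitution."""
    size = len(b)
    y = [0.0] * size
    for i in range(size):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * size
    for i in reversed(range(size)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, size))
        x[i] = (y[i] - s) / L[i][i]
    return x

def nnz(L):
    """Number of nonzero entries, a rough proxy for substitution cost."""
    return sum(1 for row in L for v in row if v != 0.0)

n = 6
A = grid_system(n)
b = [1.0 + (i % 5) for i in range(n * n)]  # arbitrary right-hand side
L_full = cholesky(A)
L_pruned = prune(L_full, tau=0.02)         # tau is an assumed threshold
x_full = solve(L_full, b)
x_pruned = solve(L_pruned, b)
rel_err = (math.sqrt(sum((p - q) ** 2 for p, q in zip(x_pruned, x_full)))
           / math.sqrt(sum(q ** 2 for q in x_full)))
print("nonzeros in L:", nnz(L_full), "after pruning:", nnz(L_pruned))
print("relative error of pruned solve:", rel_err)
```

Counting nonzeros in the factor only approximates the cost of the two substitutions. The savings reported in the abstract refer to the authors' own solver and test case, which this sketch does not attempt to reproduce.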

Published In

DAC '14: Proceedings of the 51st Annual Design Automation Conference

June 2014

1249 pages

ISBN:9781450327305

DOI:10.1145/2593069

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

EDAC: Electronic Design Automation Consortium
SIGBED: ACM Special Interest Group on Embedded Systems
SIGDA: ACM Special Interest Group on Design Automation
IEEE-CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

DAC '14

DAC '14: The 51st Annual Design Automation Conference 2014

June 1 - 5, 2014

San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Bibliometrics

Total Citations: 10
Total Downloads: 277
Downloads (Last 12 months): 5
Downloads (Last 6 weeks): 0

Reflects downloads up to 01 Mar 2025

Cited By

Damsgaard H, Ometov A, Nurmi J (2023). Approximation Opportunities in Edge Computing Hardware: A Systematic Literature Review. ACM Computing Surveys 55:12 (1-49). Online publication date: 3-Mar-2023.
https://dl.acm.org/doi/10.1145/3572772
Esmali Nojehdeh M, Altun M (2023). Energy-Efficient Hardware Implementation of Fully Connected Artificial Neural Networks Using Approximate Arithmetic Blocks. Circuits, Systems, and Signal Processing 42:9 (5428-5452). Online publication date: 24-Apr-2023.
https://doi.org/10.1007/s00034-023-02363-w
Esmali Nojehdeh M, Aksoy L, Altun M (2020). Efficient Hardware Implementation of Artificial Neural Networks Using Approximate Multiply-Accumulate Blocks. 2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (96-101). Online publication date: Jul-2020.
https://doi.org/10.1109/ISVLSI49217.2020.00027
Bromberger M, Hoffmann M, Rehrmann R (2018). Do Iterative Solvers Benefit from Approximate Computing? An Evaluation Study Considering Orthogonal Approximation Methods. Architecture of Computing Systems – ARCS 2018 (297-310). Online publication date: 8-Mar-2018.
https://doi.org/10.1007/978-3-319-77610-1_22
Scholl A, Braun C, Wunderlich H (2017). Energy-efficient and error-resilient iterative solvers for approximate computing. 2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS) (237-239). Online publication date: Jul-2017.
https://doi.org/10.1109/IOLTS.2017.8046244
Scholl A, Braun C, Wunderlich H (2016). Applying efficient fault tolerance to enable the preconditioned conjugate gradient solver on approximate computing hardware. 2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT) (21-26). Online publication date: Sep-2016.
https://doi.org/10.1109/DFT.2016.7684063
Zhang Q, Tian Y, Wang T, Yuan F, Xu Q, Marculescu D, Liu F (2015). ApproxEigen. Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (824-830). Online publication date: 2-Nov-2015.
https://dl.acm.org/doi/10.5555/2840819.2840934
Schaffner M, Gürkaynak F, Smolic A, Benini L, Nebel W, Atienza D (2015). DRAM or no-DRAM? Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition (707-712). Online publication date: 9-Mar-2015.
https://dl.acm.org/doi/10.5555/2755753.2755915
Zhang Q, Tian Y, Wang T, Yuan F, Xu Q (2015). ApproxEigen: An approximate computing technique for large-scale eigen-decomposition. 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (824-830). Online publication date: Nov-2015.
https://doi.org/10.1109/ICCAD.2015.7372656
Aksoy L, Flores P, Monteiro J (2015). A Novel Method for the Approximation of Multiplierless Constant Matrix Vector Multiplication. Proceedings of the 2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing (EUC) (98-105). Online publication date: 21-Oct-2015.
https://dl.acm.org/doi/10.1109/EUC.2015.27
