Neurocomputing

Volume 429, 14 March 2021, Pages 89-100

A fast DC-based dictionary learning algorithm with the SCAD penalty

https://doi.org/10.1016/j.neucom.2020.12.003

Abstract

In recent years, there has been growing interest in dictionary learning with nonconvex sparsity-inducing penalties. However, how to efficiently solve dictionary learning with a nonconvex penalty remains an open problem. In this paper, we present an efficient DC-based algorithm for dictionary learning with the nonconvex smoothly clipped absolute deviation (SCAD) penalty, which provides strong sparsity and accurate estimation. The optimization problem we consider is a minimization of the representation error with the SCAD penalty. Our approach is based on a decomposition scheme that splits the whole problem into a set of subproblems over single-vector factors. To handle the nonconvexity of the representation error in the subproblems, we use an alternating optimization scheme that updates one factor with the other fixed. To tackle the nonconvexity of the SCAD penalty in the subproblems, we apply the Difference of Convex functions (DC) technique to convert each nonconvex subproblem into convex problems and then employ the DC algorithm to solve the resulting optimization; simple closed-form solutions can thus be derived. As verified by numerical experiments on synthetic and real-world data, the proposed algorithm performs better than state-of-the-art algorithms with different sparsity-inducing constraints.

Introduction

Nowadays, sparse representation plays a significant role in the domain of signal processing. It has been widely used in many applications, such as classification [1], [2], denoising [3], [4], recognition [5], [6], and so on. Sparse representation expresses a signal as a linear combination of a few atoms from an overcomplete dictionary. The classical sparse representation model for a given signal $s \in \mathbb{R}^m$ can be formulated as $s \approx Nx = \sum_{i=1}^{r} x_i n_i$, where the coefficient vector $x \in \mathbb{R}^r$ is sparse, i.e., $\|x\|_0 \ll r$. Here $N \in \mathbb{R}^{m \times r}$ denotes an overcomplete dictionary, whose dimension $m$ is far smaller than the number of atoms $r$. The optimization problem of sparse representation consists in minimizing the representation error $\|s - Nx\|_F^2$ combined with a sparsity-inducing regularizer $Sp(x)$, which can be written as
$$\min_x \; \|s - Nx\|_F^2 + \lambda\, Sp(x),$$
where $\lambda$ is the regularization parameter controlling the sparsity.
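To make the model concrete, the following minimal NumPy sketch (our own illustration; the paper's code was written in MATLAB) builds a random overcomplete dictionary, generates a sparse code, and evaluates the objective above, with the $\ell_1$-norm standing in for the generic regularizer $Sp(\cdot)$:

```python
import numpy as np

rng = np.random.default_rng(0)
m, r = 20, 50                                  # signal dimension m << number of atoms r
N = rng.standard_normal((m, r))
N /= np.linalg.norm(N, axis=0)                 # unit-norm atoms (a common convention)

x = np.zeros(r)
support = rng.choice(r, size=5, replace=False)
x[support] = rng.standard_normal(5)            # a 5-sparse coefficient vector
s = N @ x + 0.01 * rng.standard_normal(m)      # noisy observed signal

lam = 0.1
obj = np.sum((s - N @ x) ** 2) + lam * np.sum(np.abs(x))   # ||s - Nx||^2 + lam * Sp(x)
print(f"objective value: {obj:.4f}")
```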

A pivotal issue in sparse representation is selecting a suitable dictionary on which the signal can be represented sparsely. The dictionary can be designed either by choosing one from a predefined set of linear transformations or by learning it from a set of training signals. Predefined dictionaries include wavelets [7], wavelet packet bases, the discrete cosine transform (DCT), and more. Selecting a predefined dictionary is usually fast and simple. Nevertheless, a predefined dictionary cannot fit the intrinsic structure of the signal well. Conversely, a learned dictionary can match the intrinsic structure of the signal and offers potentially better performance. Hence, many works have been devoted to learning dictionaries for sparse representation.

Most existing algorithms [8], [9], [10], [11], [12], [13], [14] for dictionary learning utilize an iteratively alternating optimization scheme, which contains a sparse coding phase (updating the sparse representation with the dictionary fixed) and a dictionary update phase (updating the dictionary with the representation fixed). Many existing dictionary learning methods use the $\ell_0$-norm constraint for strong sparsity [8], [10]. However, the $\ell_0$-norm constrained optimization problem is intractable: exact sparse coding under the $\ell_0$-norm has been proved to be NP-hard and cannot be applied to high-dimensional data [10]. Some works employ the convex relaxation, the $\ell_1$-norm, to approximate the $\ell_0$-norm; it is convex and the corresponding optimization can be solved easily. However, the sparsity of $\ell_1$-norm solutions is weak. Furthermore, the $\ell_1$-norm often overpenalizes large elements of a sparse vector, leading to biased estimation [15]. To overcome these deficiencies, researchers have actively employed nonconvex relaxation approaches to constrain the dictionary learning problem, because nonconvex regularizers can yield stronger sparsity and more accurate solutions [16]. Although nonconvex regularizers are popular in dictionary learning, the corresponding nonconvex optimization problems are challenging [17]. Therefore, choosing a sparsity constraint that accurately yields strongly sparse solutions, and whose corresponding optimization can be solved easily, is imperative for the dictionary learning problem.
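As a point of reference for the $\ell_1$-relaxed sparse coding step discussed above, here is a minimal ISTA-style proximal-gradient sketch (in the spirit of fast shrinkage-thresholding methods such as [13]); this is an illustrative sketch, not the paper's algorithm:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: elementwise soft thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_sparse_coding(N, s, lam, n_iter=200):
    """Minimize ||s - Nx||_2^2 + lam * ||x||_1 by proximal gradient (ISTA)."""
    L = 2.0 * np.linalg.norm(N, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(N.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * N.T @ (N @ x - s)         # gradient of the quadratic data term
        x = soft_threshold(x - grad / L, lam / L)
    return x
```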

In this paper, we employ the semi-continuous and nonconvex smoothly clipped absolute deviation (SCAD) [18] penalty instead of the $\ell_0$-norm to enforce stronger sparsity and to obtain less biased estimates than the $\ell_1$-norm. The dictionary learning problem can be generalized as a minimization of the representation error with the SCAD penalty, which is nonconvex due to both the nonconvex representation error and the nonconvex SCAD penalty. To optimize the problem efficiently, we decompose the overall problem into subproblems over single-vector factors. To address the nonconvexity of the representation error, we employ alternating optimization, updating one factor with the other fixed. To address the nonconvexity of SCAD, we employ the Difference of Convex functions (DC) [19] technique, which decomposes the nonconvex function into the difference of two convex functions, and then optimize the resulting convex problems with the DC algorithm [20]. In particular, closed-form solutions can be obtained explicitly, leading to a fast DC-based dictionary learning algorithm with the SCAD penalty (FDCDL-SCAD). To the best of our knowledge, this is the first study that addresses dictionary learning with the SCAD penalty based on a decomposition scheme and the DC algorithm. Furthermore, the experimental results show that the proposed FDCDL-SCAD performs better in terms of dictionary recovery and recovered sparsity, demonstrating that FDCDL-SCAD obtains sparser and more accurate solutions than $\ell_1$-norm based algorithms. The main contributions can be summarized as follows:

1. We employ the SCAD function as the sparsity penalty in the sparse coding phase. The SCAD penalty yields an estimator that simultaneously satisfies strong sparsity, unbiasedness, and continuity [18], so the resulting estimates are sparser and more accurate than those of $\ell_1$-norm based methods.

2. We propose a decomposition scheme in which the problem over the matrix factors is cast as a number of subproblems over single-vector factors, which improves the convergence speed of the algorithm.

3. To handle the nonconvexity of the subproblems, we first employ an alternating scheme that updates one factor with the other fixed. Then, we adopt the DC technique to decompose the nonconvex SCAD penalty into the difference of two convex functions and employ the DC algorithm to solve the resulting convex subproblems, so that closed-form solutions can be derived explicitly and easily, leading to a fast and efficient dictionary learning algorithm (the SCAD penalty and its DC decomposition are sketched following this list).
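For reference (our summary of the original definition in [18], not a restatement of this paper's Section 3), the SCAD penalty is defined, for $\lambda > 0$ and $a > 2$ (Fan and Li suggest $a = 3.7$), by

$$
p_\lambda(t) =
\begin{cases}
\lambda |t|, & |t| \le \lambda,\\[2pt]
\dfrac{2a\lambda |t| - t^2 - \lambda^2}{2(a-1)}, & \lambda < |t| \le a\lambda,\\[2pt]
\dfrac{(a+1)\lambda^2}{2}, & |t| > a\lambda.
\end{cases}
$$

It behaves like the $\ell_1$ penalty near zero (sparsity) and is constant for large $|t|$ (unbiasedness). A standard DC decomposition writes $p_\lambda(t) = \lambda|t| - h_\lambda(t)$ with the convex function

$$
h_\lambda(t) =
\begin{cases}
0, & |t| \le \lambda,\\[2pt]
\dfrac{(|t| - \lambda)^2}{2(a-1)}, & \lambda < |t| \le a\lambda,\\[2pt]
\lambda|t| - \dfrac{(a+1)\lambda^2}{2}, & |t| > a\lambda,
\end{cases}
$$

so each DC iteration linearizes $h_\lambda$ at the current iterate and leaves a convex, $\ell_1$-penalized problem that admits a closed-form, soft-thresholding-type solution.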

This paper is organized as follows. Section 2 describes the dictionary learning problem, including sparse coding and dictionary update. The FDCDL-SCAD algorithm is elaborated in detail in Section 3. In Section 4, numerical experiments verify the practical benefits of the proposed algorithm: we discuss dictionary recovery results on synthetic data and show one general application (image denoising) based on the FDCDL-SCAD algorithm. Finally, Section 5 concludes the paper.

Section snippets

Dictionary learning problem

This section concisely states the dictionary learning problem, comprising the sparse coding problem and the dictionary update problem, and reviews some state-of-the-art approaches for dictionary learning.
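For orientation (the full section is only previewed here), with training signals stacked as columns of $S \in \mathbb{R}^{m \times n}$ and codes as columns of $X \in \mathbb{R}^{r \times n}$, the dictionary learning problem extending the single-signal model above is commonly written as

$$
\min_{N,\, X} \; \|S - N X\|_F^2 + \lambda \sum_{i=1}^{n} Sp(x_i)
\quad \text{s.t.} \quad \|n_j\|_2 = 1,\; j = 1, \dots, r,
$$

where the unit-norm constraint on the atoms is a common convention to remove the scaling ambiguity between $N$ and $X$. Alternating schemes split this into sparse coding (fix $N$, solve for $X$) and dictionary update (fix $X$, solve for $N$).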

Fast DC-based dictionary learning with the SCAD penalty

In this section, we present the problem formulation and the algorithmic details of FDCDL-SCAD, and analyze the computational complexity of the FDCDL-SCAD algorithm.
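To convey the flavor of the single-vector subproblems, here is a minimal, self-contained sketch of one rank-one update in a K-SVD-style decomposition, with the sparse code row refined by DC iterations. The helper names (`h_grad`, `update_pair`) and the exact update order are our own assumptions for illustration, not the paper's verbatim algorithm:

```python
import numpy as np

def soft(z, t):
    """Elementwise soft thresholding: prox of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def h_grad(x, lam, a=3.7):
    """(Sub)gradient of the convex part h(t) = lam*|t| - SCAD(t), elementwise."""
    g = np.zeros_like(x)
    ax = np.abs(x)
    mid = (ax > lam) & (ax <= a * lam)
    g[mid] = np.sign(x[mid]) * (ax[mid] - lam) / (a - 1.0)
    big = ax > a * lam
    g[big] = lam * np.sign(x[big])
    return g

def update_pair(E, n, x_row, lam, a=3.7, dc_iters=5):
    """One rank-one update: refine the code row for atom n against the
    residual E = S - sum_{i != j} n_i x^i, then refit the atom itself.

    Since ||n||_2 = 1, min_t ||E - n t^T||_F^2 + sum_k SCAD(t_k) separates
    over the entries of t with b = E^T n; each DC step linearizes h and
    minimizes (t_k - b_k)^2 + lam*|t_k| - y_k*t_k in closed form.
    """
    b = E.T @ n
    t = x_row.copy()
    for _ in range(dc_iters):
        y = h_grad(t, lam, a)                  # linearize the concave part at t
        t = soft(b + y / 2.0, lam / 2.0)       # closed-form convex subproblem
    atom = E @ t                               # least-squares atom update ...
    nrm = np.linalg.norm(atom)
    n_new = atom / nrm if nrm > 0 else n       # ... projected back to unit norm
    return n_new, t
```

The closed form follows because $(t - b)^2 + \lambda|t| - yt = (t - (b + y/2))^2 + \lambda|t| + \text{const}$, whose minimizer is the soft threshold of $b + y/2$ at level $\lambda/2$.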

Experimental study

In this section, we conduct experiments to evaluate the performance of our dictionary learning algorithm (FDCDL-SCAD). Experiments on synthetic signals and real signals are described in turn. They were run on a Windows machine with 4 GB of memory and an Intel(R) Core(TM) i5-5200U CPU clocked at 2.2 GHz. All code was implemented in MATLAB.

Conclusion

This paper developed a novel and efficient DC-based dictionary learning algorithm with the nonconvex SCAD penalty for strong sparsity and accurate solutions. The minimization problem, composed of the representation error with the SCAD penalty, is nonconvex and nonsmooth. We employed a decomposition scheme that splits the whole problem into subproblems over single-vector factors. Then, we employed an alternating optimization scheme, updating one factor with the other factor fixed.

CRediT authorship contribution statement

Zhenni Li: Methodology, Writing - review & editing. Chao Wan: Conceptualization, Writing - original draft. Benying Tan: Writing - review & editing. Zuyuan Yang: Supervision, Writing - review & editing. Shengli Xie: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 61803096, 61722304, 61703113, and 61727810, in part by the Science and Technology Plan Project of Guangzhou under Grant 202002030289, in part by the Key Areas of Research and Development Plan Project of Guangdong under Grant 2019B010147001, and in part by the Major Research Project on Industry Technology of Guangzhou under Grant 201902020014.


References (29)

  • Z. Li et al., Manifold-optimization based analysis dictionary learning with an $\ell_{1/2}$-norm regularizer, Neural Networks (2018).
  • Z. Wu et al., Correntropy based scale ICP algorithm for robust point set registration, Pattern Recogn. (2019).
  • Z. Li et al., Incoherent dictionary learning with log-regularizer based on proximal operators, Digital Signal Process. (2017).
  • V. Singhal et al., Row-sparse discriminative deep dictionary learning for hyperspectral image classification, IEEE J. Selected Topics Appl. Earth Observations Remote Sensing (2018).
  • Z. Yang et al., Adaptive method for nonsmooth nonnegative matrix factorization, IEEE Trans. Neural Networks Learn. Syst. (2017).
  • L. Jia et al., Image denoising via sparse representation over grouped dictionaries with adaptive atom size, IEEE Access (2017).
  • C. Bao et al., Dictionary learning for sparse coding: algorithms and convergence analysis, IEEE Trans. Pattern Anal. Mach. Intell. (2016).
  • I.W. Selesnick et al., The dual-tree complex wavelet transform, IEEE Signal Process. Mag. (2005).
  • M. Aharon et al., K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation, IEEE Trans. Signal Process. (2006).
  • M. Yaghoobi et al., Dictionary learning for sparse approximations with the majorization method, IEEE Trans. Signal Process. (2009).
  • C. Bao et al., A convergent incoherent dictionary learning algorithm for sparse coding, in: European Conference on Computer Vision, Zurich (2014).
  • A. Rakotomamonjy, Direct optimization of the dictionary learning problem, IEEE Trans. Signal Process. (2013).
  • A. Beck et al., A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci. (2009).
  • W. Dai et al., Simultaneous codeword optimization for dictionary update and learning, IEEE Trans. Signal Process. (2012).

    Zhenni Li received the B.Sc. degree in 2009 from School of Physical Science and Electronics, Shanxi Datong University, China. She received the M.Sc degree in 2012 from School of Physics and Optoelectronic, Dalian University of Technology, China, and received Ph.D. degree in School of Computer Science and Engineering, University of Aizu, Japan. Now she is an Associate professor in Guangdong Key Laboratory of IoT Information Technology, School of Automation, Guangdong University of Technology, China. Her research interests include machine learning, sparse representation and resource allocation in cloud/edge computing.

    Chao Wan received bachelor’s degree from Nanjing University of Science and Technology ZiJin College, Nanjing, China, in 2018. He is currently a graduate student at Guangdong University of Technology, Guangzhou, China. His research interests mainly focus on dictionary learning, image processing.

    Benying Tan received the B.E. degree from Huazhong University of Science and Technology, China, in 2009, received the M.Sc. degree from University of Aizu, Japan, in 2017, and received Ph.D. degree in School of Computer Science and Engineering, University of Aizu, Japan, in 2020. He is currently an Assistant Professor in the School of Artificial Intelligence, Guilin University of Electronic Technology, China. His main research interests include sparse representation, optimization, and machine learning.

    Zuyuan Yang (M’15) received the B.E. degree from the Hunan University of Science and Technology, Xiangtan, China, in 2003, and the Ph.D. degree from the South China University of Technology, Guangzhou, China, in 2010. He is currently a Professor in Guangdong Key Laboratory of IoT Information Technology, School of Automation, Guangdong University of Technology, China. His current research interests include blind source separation, nonnegative matrix factorization, and image processing. Dr. Yang won the Excellent Ph.D. Thesis Award Nomination of China. He joined the National Program for New Century Excellent Talents in University and received the Guangdong Distinguished Young Scholar Award.

    Shengli Xie, (M’01-F’19) received the M.S. degree in mathematics from Central China Normal University, Wuhan, China, in 1992, and the Ph.D. degree in control theory and applications from the South China University of Technology, Guangzhou, China, in 1997. He is the Director of the Laboratory for Intelligent Information Processing (LIIP) and a Full Professor with the Guangdong University of Technology, Guangzhou. He has authored or co-authored two monographs and more than 100 scientific papers published in journals and conference proceedings. His current research interests include automatic control and signal processing, especially blind signal processing and image processing.
