Associations between MSE and SSIM as cost functions in linear decomposition with application to bit allocation for sparse coding

doi:10.1016/j.neucom.2020.10.018

Neurocomputing

Volume 422, 21 January 2021, Pages 139-149

https://doi.org/10.1016/j.neucom.2020.10.018

Abstract

Traditional image quality assessments, such as the mean squared error (MSE), the signal-to-noise ratio (SNR), and the peak signal-to-noise ratio (PSNR), are all based on the absolute error between images. The structural similarity (SSIM) index is another important image quality assessment, which has been shown to agree better with the human visual system (HVS). Although MSE and SSIM differ in many essential ways, some important associations exist between them. In this paper, the associations between MSE and SSIM as cost functions in linear decomposition are investigated. Based on these associations, a bit-allocation algorithm for sparse coding is proposed that considers both the quality and the contrast of the reconstructed image. In the proposed algorithm, the space occupied by the linear coefficient of a basis in sparse coding is reduced to only 9 to 10 bits: 1 bit stores the sign of the coefficient, 3 bits store the power of 10 in scientific notation, and only 5 to 6 bits store the significant digits. The experimental results show that the proposed bit-allocation algorithm for sparse coding maintains both image quality and image contrast well.

Introduction

Sparse coding was proposed in 1996 to imitate the mammalian visual cortex for image representation [1], [2]. It is now widely used as an unsupervised learning method for sparse representation in various fields [3], [4], [5]. Although much work has been done on sparse coding and its applications, little effort has been devoted to bit allocation for the coded data. Bit allocation is very important for some applications of sparse coding, such as compressed sensing [6] and fractal image coding [7]. In these applications, a signal is encoded by sparse coding, and the encoded data are then saved and used to reconstruct the original signal. The space occupied by the encoded data therefore strongly influences both the compression rate and the quality of the reconstructed signal.

Absolute error-based image assessments, such as the mean squared error (MSE), the signal-to-noise ratio (SNR), and the peak signal-to-noise ratio (PSNR), are the most commonly used measures of image similarity. For two image blocks $x$ and $y$, if the pixels in $x$ are $x_{1}, x_{2}, \dots, x_{p}$ and the pixels in $y$ are $y_{1}, y_{2}, \dots, y_{p}$, then the MSE between $x$ and $y$ is calculated as follows: $MSE (x, y) = \frac{1}{p} \sum_{i = 1}^{p} {(y_{i} - x_{i})}^{2} .$

Therefore, MSE is a pixel error-based measurement. SNR and PSNR are both derived from MSE. These absolute error-based assessments are used to measure not only image quality but also the quality of almost all kinds of signals.
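As an illustration of the definition above, the following minimal sketch computes the MSE between two blocks and the PSNR derived from it. The function names and the default peak value of 255 (8-bit images) are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two image blocks with the same number of pixels."""
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    if x.shape != y.shape:
        raise ValueError("blocks must contain the same number of pixels")
    return float(np.mean((y - x) ** 2))

def psnr(x, y, peak=255.0):
    """PSNR in dB, derived from MSE; peak=255 assumes 8-bit pixel values."""
    err = mse(x, y)
    if err == 0.0:
        return float("inf")  # identical blocks
    return float(10.0 * np.log10(peak ** 2 / err))

x = [[0, 0], [0, 0]]
y = [[1, 2], [3, 4]]
print(mse(x, y))   # (1 + 4 + 9 + 16) / 4 = 7.5
print(psnr(x, y))
```

Both functions flatten their inputs, so they accept either $l \times l$ blocks or vectors of $p$ pixels.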

The structural similarity (SSIM) index, proposed by Wang and Bovik [8], aims to make image quality assessment (IQA) agree better with the human visual system (HVS). In SSIM, the error is split into three parts: the luminance error, the contrast error, and the structure error. For two image blocks $x$ and $y$, let $μ_{x}$ and $μ_{y}$ be the means of the pixels in $x$ and $y$, respectively, $σ_{x}$ and $σ_{y}$ the standard deviations of the pixels in $x$ and $y$, respectively, and $σ_{xy}$ the covariance between $x$ and $y$. Then SSIM takes the form $SSIM (x, y) = \left[\frac{2 μ_{x} μ_{y} + ε_{1}}{μ_{x}^{2} + μ_{y}^{2} + ε_{1}}\right] \left[\frac{2 σ_{xy} + ε_{2}}{σ_{x}^{2} + σ_{y}^{2} + ε_{2}}\right],$ where $ε_{1}, ε_{2} \ll 1$ are two small positive constants. If the variance of a given image block $y$ is zero, then $y$ can be losslessly expressed as a scalar multiple of the all-ones vector $1$. Because this paper focuses on linear decomposition and sparse coding, we only consider image blocks with non-zero variance. In this case, we can set $ε_{1} = ε_{2} = 0$, and SSIM takes the simpler form $SSIM (x, y) = \frac{4 μ_{x} μ_{y} σ_{xy}}{(μ_{x}^{2} + μ_{y}^{2}) (σ_{x}^{2} + σ_{y}^{2})} .$
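The simplified form with $ε_{1} = ε_{2} = 0$ can be evaluated directly, as in the sketch below. The function name `ssim_simplified` is hypothetical, and the choice of population statistics (dividing by $p$) is an assumption; the contrast-structure factor is unchanged if variance and covariance are both divided by $p - 1$ instead.

```python
import numpy as np

def ssim_simplified(x, y):
    """SSIM between two blocks with both stabilizing constants set to zero.

    Only meaningful when the denominators are non-zero, which matches the
    restriction to blocks with non-zero variance.
    """
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()            # population variances
    cov_xy = np.mean((x - mu_x) * (y - mu_y))  # population covariance
    denom = (mu_x ** 2 + mu_y ** 2) * (var_x + var_y)
    if denom == 0.0:
        raise ValueError("undefined: zero means or zero variance in both blocks")
    return float(4.0 * mu_x * mu_y * cov_xy / denom)

y = np.array([1.0, 2.0, 3.0, 4.0])
print(ssim_simplified(y, y))        # identical blocks give 1.0
print(ssim_simplified(2.0 * y, y))  # doubling the block changes mean and contrast
```

Unlike MSE, the value is bounded above by 1 and reaches 1 only when the two blocks are identical.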

SSIM does not have an absolute advantage over PSNR. For example, some studies found that PSNR performs better than SSIM for Gaussian blur [9], [10]. Dosselmann and Yang showed that the SSIM index between two images can be predicted from the PSNR between them [11]. Although SSIM has some limitations, it achieves better performance on many synthetic datasets, and it has been widely accepted as an effective image quality assessment and applied in many fields [12], [13].

As described above, MSE is a pixel-based image assessment, and SSIM is a structure-based image assessment designed for the HVS. Thus, there are many significant and essential differences between them. Here we do not focus on these differences. Instead, we discuss some interesting associations that appear when the two assessments are used as cost functions in linear decomposition. Firstly, for a given target vector, the bases selected from a basis set are the same whether the linear decomposition scheme uses MSE or SSIM as its cost function. Secondly, for a given target vector, the ratio between the linear coefficient of each selected basis in the MSE-based scheme and the corresponding coefficient in the SSIM-based scheme is a constant, which is exactly Pearson’s correlation coefficient between the target vector and its estimated vector.
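The second association can be checked numerically once the bases are fixed. The sketch below is an illustration under stated assumptions, not the paper's procedure: the bases and target are synthetic, the bases are taken as already selected, the MSE-optimal coefficients come from ordinary least squares with an all-ones column for the offset, and the SSIM-optimal candidate is obtained by dividing those coefficients by the Pearson correlation $r$ and matching the means. All variable names are hypothetical.

```python
import numpy as np

def ssim(x, y):
    """Simplified SSIM with both stabilizing constants set to zero."""
    mu_x, mu_y = x.mean(), y.mean()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    return 4 * mu_x * mu_y * cov_xy / ((mu_x**2 + mu_y**2) * (x.var() + y.var()))

rng = np.random.default_rng(0)
p, m = 64, 3                           # 8x8 block, 3 selected bases (illustrative)
X = rng.normal(size=(p, m))            # columns play the role of the selected bases
y = 120.0 + X @ np.array([30.0, -20.0, 10.0]) + rng.normal(scale=5.0, size=p)

# MSE-based scheme: least squares over the bases plus the all-ones vector.
A = np.column_stack([X, np.ones(p)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
s_mse, o_mse = coef[:m], coef[m]
x_mse = A @ coef

# Pearson correlation between the target and its MSE-optimal estimate.
r = np.corrcoef(x_mse, y)[0, 1]

# SSIM-based candidate: same bases, coefficients scaled by 1/r, means matched.
s_ssim = s_mse / r
o_ssim = y.mean() - (X @ s_ssim).mean()
x_ssim = X @ s_ssim + o_ssim
ssim_opt = ssim(x_ssim, y)

# Random perturbations of the candidate should not yield a higher SSIM.
best_perturbed = max(
    ssim(X @ (s_ssim + rng.normal(scale=0.5, size=m)) + o_ssim + rng.normal(scale=0.5), y)
    for _ in range(200)
)
print(r, ssim_opt, ssim(x_mse, y), best_perturbed)
```

In this setup the SSIM reached by the scaled coefficients equals $r$ up to rounding, and because $r \le 1$ the scaled coefficients are never smaller in magnitude than the least-squares ones.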

According to these associations between MSE and SSIM, the absolute value of the linear coefficient of a selected basis in the SSIM-based linear decomposition scheme is never smaller than that of the same basis in the MSE-based scheme. A reconstructed image with larger linear coefficients has higher image contrast. By considering both the quality and the contrast of the reconstructed image, a bit-allocation algorithm for the coded data in sparse coding is proposed here. For the linear coefficient s of a selected basis, if $| s | = a \times 10^{b}$ in scientific notation, we use 1 bit to store the sign of s, 3 bits to store b, and 5 to 6 bits to store a. The experimental results show that the proposed algorithm maintains image quality and image contrast well.
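To make the 1 + 3 + (5 to 6) bit layout concrete, the sketch below encodes and decodes one coefficient. It is not the paper's algorithm: the excerpt does not say how a and b are quantized, so the uniform quantization of a over [1, 10), the midpoint reconstruction, the exponent range starting at the hypothetical `b_min = -4`, and all function names are assumptions made for illustration.

```python
import math

def encode_coefficient(s, k=6, b_min=-4):
    """Return (sign_bit, b_code, a_code) for a non-zero coefficient s.

    |s| = a * 10**b with 1 <= a < 10; a is stored in k bits and b in 3 bits.
    The quantizer and the exponent range are assumptions of this sketch.
    """
    if s == 0:
        raise ValueError("only non-zero coefficients are stored")
    sign_bit = 1 if s < 0 else 0
    b = math.floor(math.log10(abs(s)))
    a = abs(s) / 10.0 ** b
    # Guard against rounding at the edges of [1, 10).
    if a >= 10.0:
        a, b = a / 10.0, b + 1
    elif a < 1.0:
        a, b = a * 10.0, b - 1
    if not b_min <= b <= b_min + 7:
        raise ValueError("exponent does not fit in 3 bits")
    levels = 2 ** k
    a_code = min(int((a - 1.0) / 9.0 * levels), levels - 1)
    return sign_bit, b - b_min, a_code

def decode_coefficient(sign_bit, b_code, a_code, k=6, b_min=-4):
    """Reconstruct the coefficient from the midpoint of its quantization cell."""
    a = 1.0 + (a_code + 0.5) * 9.0 / 2 ** k
    value = a * 10.0 ** (b_code + b_min)
    return -value if sign_bit else value

def pack(sign_bit, b_code, a_code, k=6):
    """Pack the three fields into a single integer of k + 4 bits."""
    return (sign_bit << (3 + k)) | (b_code << k) | a_code

code = encode_coefficient(-3.2)
print(code, decode_coefficient(*code), pack(*code))
```

With `k=6` each coefficient occupies 10 bits, and with `k=5` it occupies 9 bits, which matches the 9 to 10 bits quoted for the proposed algorithm.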

The rest of this paper is structured as follows. Section 2 briefly introduces linear decomposition. The associations between MSE and SSIM as cost functions in linear decomposition are studied in Section 3. The bit-allocation algorithm for sparse coding is then proposed in Section 4. In Section 5, several experiments are conducted to determine the number of bits used to save the parameters of sparse coding in the proposed bit-allocation algorithm. Finally, conclusions are drawn in Section 6.

Section snippets

Linear decomposition

Linear decomposition plays an important role in various fields such as linear approximation, sparse coding, and portfolio [14]. In particular, sparse coding is an important tool in image processing. Suppose we have a vector set $X$ with $n$ vectors $\{x_{1}, x_{2}, \dots, x_{n}\}$, where each vector is an image block of size $l \times l$ and $p = l \times l$. For an image block $y$ of size $l \times l$, we need to find a linear transformation $x = s_{1} x_{1} + s_{2} x_{2} + \dots + s_{n} x_{n} + o 1$ to linearly approximate $y$, where $s_{1}, s_{2}, \dots, s_{n}$ and $o$ are the linear scalar coefficients and

Linear decomposition with different cost functions MSE and SSIM

In linear decomposition, a linear transformation $s_{1} x_{1} + s_{2} x_{2} + \dots + s_{n} x_{n} + o 1$ with only a few non-zero $s_{i}$, $i = 1, 2, \dots, n$, needs to be found to approximate a target vector $y$. Without loss of generality, assume $x_{1}, x_{2}, \dots, x_{m}$ are the selected bases with non-zero values of $s_{i}$, and $x = s_{1} x_{1} + s_{2} x_{2} + \dots + s_{m} x_{m} + o 1$ is the best linear approximation of the target vector $y$. Let the standard deviation of the elements in $x_{i}$ be $σ_{i}$, the standard deviations of the elements in $x$ and $y$ be $σ_{x}$ and $σ_{y}$, respectively, the means of the elements in $x_{i}$

Bit allocation for sparse coding

Although much work has been done on sparse coding and its applications, little has addressed bit allocation for the coded sparse data. Here, based on the associations between the MSE-based linear decomposition scheme and the SSIM-based scheme, we discuss a bit-allocation algorithm for sparse coding.

Suppose we have an overcomplete basis set $X$ with $n$ bases $\{x_{1}, x_{2}, \dots, x_{n}\}$, and each basis is a $p$-dimensional zero-mean and unit-length vector. Here the word “overcomplete”

Experimental analysis

According to the above section, the linear coefficient s in sparse coding can be saved with $k + 4$ bits, the index of $x_{i}$ in the basis set can be saved with $q + 1$ bits, and the offset o can be saved with $l + 8$ bits, in which the value of q can be determined by the number of bases in the basis set. Here we discuss the values of k and l by experiments.
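The bit counts above can be combined into a rough per-block budget. The sketch below is only an illustration: it assumes that a block stores one coefficient and one basis index per selected basis plus a single offset, and that the $q + 1$ index bits are the smallest number able to address every basis. Neither assumption is stated in the excerpt, and the function name, the parameter name `l_bits` (standing for the l above), and the example values are hypothetical.

```python
def bits_per_block(m, n, k, l_bits):
    """Approximate storage for one block coded with m selected bases.

    m      -- number of selected bases
    n      -- number of bases in the basis set
    k      -- bits for the significant digits of each coefficient
    l_bits -- the parameter l of the offset, which is stored in l + 8 bits
    """
    coeff_bits = k + 4                            # sign + exponent + significant digits
    index_bits = max(1, (n - 1).bit_length())     # assumed meaning of q + 1
    offset_bits = l_bits + 8
    return m * (coeff_bits + index_bits) + offset_bits

# Hypothetical setting: 3 bases chosen from 256, k = 6, l = 2.
print(bits_per_block(3, 256, 6, 2))   # 3 * (10 + 8) + 10 = 64
```

Because every selected basis costs a coefficient plus an index, lowering k by one bit saves m bits per block, whereas the offset term is paid only once.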

Conclusion

As two important image quality assessments, the mean squared error (MSE) and the structural similarity (SSIM) index have been widely accepted and used to measure the similarity of images. Although there are many essential differences between them, some interesting associations between them are investigated in this paper. Firstly, when MSE and SSIM are used as cost functions in a linear decomposition scheme with a fixed sparsity, the same bases are selected for a given target vector. Secondly,

CRediT authorship contribution statement

Jianji Wang: Methodology, Writing - original draft. Pei Chen: Validation. Nanning Zheng: Methodology, Supervision. Jose C. Principe: Supervision, Validation. Fei-Yue Wang: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported in part by the National Key Research and Development Program of China under grant 2016YFB1000901, the key project of Trico-Robot plan of NSFC under grant No. 91748208, and the National Natural Science Foundation of China under Grants 91648208.

Jianji Wang is currently an Associate Professor in the Institute of Artificial Intelligence and Robotics (IAIR) at Xi’an Jiaotong University. His research interests include image processing, machine learning, and correlation analysis.

References (24)

B.A. Olshausen et al.
Sparse coding with an overcomplete basis set: A strategy employed by V1?
Vision Res.
(1997)
W. Zheng et al.
A novel approach inspired by optic nerve characteristics for few-shot occluded face recognition
Neurocomputing
(2020)
T. Han et al.
A sparse autoencoder compressed sensing method for acquiring the pressure array information of clothing
Neurocomputing
(2018)
J. Gao et al.
SCAR: Spatial-/channel-wise attention regression networks for crowd counting
Neurocomputing
(2019)
J. Wang et al.
Parameter analysis of fractal image compression and its applications in image sharpening and smoothing
Signal Process. Image Commun.
(2013)
B.A. Olshausen et al.
Emergence of simple-cell receptive field properties by learning a sparse code for natural images
Nature
(1996)
X.Y. Zhang et al.
Time-frequency audio feature extraction based on tensor representation of sparse coding
Electr. Lett.
(2015)
K. Fotiadou et al.
Snapshot High Dynamic Range Imaging via Sparse Representations and Feature Learning
IEEE Trans. Multimedia
(2020)
J. Wang, Y. Liu, P. Wei, Z. Tian, Y. Li, and N. Zheng, “Fractal image coding using SSIM, Proc. IEEE Conf. Image...
Z. Wang et al.
Image quality assessment: from error visibility to structural similarity
IEEE Trans. Image Process.
(2004)

I. Avcibas et al.
Statistical evaluation of image quality measures
J. Electr. Imag.
(2002)
A. Hore et al.
Image quality metrics: PSNR vs. SSIM
Proc. IEEE Int. Conf. Pattern Recognition (ICPR)
(2010)

Pei Chen is currently a PhD Candidate at the Institute of Artificial Intelligence and Robotics (IAIR), Xi’an Jiaotong University. His research interests include image processing and intelligent vehicle system.

Nanning Zheng received a PhD degree from Keio University, Japan, in 1985. He is currently a professor and the director of the Institute of Artificial Intelligence and Robotics at Xi’an Jiaotong University. His research interests include computer vision, pattern recognition, computational intelligence, image processing, and hardware implementation of intelligent systems. Since 2000, he has been the Chinese representative on the Governing Board of the International Association for Pattern Recognition. He became a member of the Chinese Academy of Engineering in 1999. He is a Fellow of IEEE.

Badong Chen is currently a professor at the Institute of Artificial Intelligence and Robotics (IAIR), Xi’an Jiaotong University. His research interests are in signal processing, information theory, machine learning, and their applications in cognitive science and engineering. He is an associate editor of IEEE Transactions on Neural Networks and Learning Systems and Journal of the Franklin Institute, and has been on the editorial board of Entropy.

Jose C. Principe is currently the Distinguished Professor of electrical and biomedical engineering at the University of Florida, Gainesville, FL, USA. He is the BellSouth Professor and the Founder and Director of the University of Florida Computational Neuro-Engineering Laboratory. He is involved in biomedical signal processing, in particular, the electroencephalogram (EEG) and the modeling and applications of adaptive systems. He is the past Editor-in-Chief of the IEEE Transactions on Biomedical Engineering, the past President of the International Neural Network Society, and the former Secretary of the Technical Committee on Neural Networks of the IEEE Signal Processing Society. He is an AIMBE Fellow and received the IEEE Engineering in Medicine and Biology Society Career Service Award. He is also a former member of the Scientific Board of the Food and Drug Administration, and a member of the Advisory Board of the McKnight Brain Institute at the University of Florida.

Fei-Yue Wang is currently the Director of The State Key Laboratory for Management and Control of Complex Systems. Dr. Wang’s current research focuses on methods and applications for parallel systems, social computing, and knowledge automation. He was the Founding Editor-in-Chief of the International Journal of Intelligent Control and Systems (1995–2000), Founding EiC of IEEE ITS Magazine (2006–2007), EiC of IEEE Intelligent Systems (2009–2012), and EiC of IEEE Transactions on ITS (2009–2016). Currently he is EiC of China’s Journal of Command and Control. Since 1997, he has served as General or Program Chair of more than 20 IEEE, INFORMS, ACM, ASME conferences. He was the President of IEEE ITS Society (2005–2007), Chinese Association for Science and Technology (CAST, USA) in 2005, the American Zhu Kezhen Education Foundation (2007–2008), and the Vice President of the ACM China Council (2010– 2011). Since 2008, he is the Vice President and Secretary General of Chinese Association of Automation. Dr. Wang is elected Fellow of IEEE, INCOSE, IFAC, ASME, and AAAS. In 2007, he received the 2nd Class National Prize in Natural Sciences of China and awarded the Outstanding Scientist by ACM for his work in intelligent control and social computing. He received IEEE ITS Outstanding Application and Research Awards in 2009 and 2011, and IEEE SMC Norbert Wiener Award in 2014.
