BE-ACGAN: Photo-realistic residual bit-depth enhancement by advanced conditional GAN

doi:10.1016/j.displa.2021.102040

Displays

Volume 69, September 2021, 102040

https://doi.org/10.1016/j.displa.2021.102040

Highlights

• An advanced conditional generative adversarial network is proposed for the BE task.
• The BE-ACGAN is trained to reconstruct residual images rather than HBD ones.
• The discriminator is improved by adopting ZP images as additional inputs.
• A more reliable loss function is introduced to stabilize the adversarial training.

Abstract

As demand for high-quality visualization has grown in various fields, monitors with higher bit-depth (HBD) have become popular in recent years. However, most digital images are of low bit-depth (LBD) and usually show low visual quality, with annoying false contours, when displayed directly on HBD monitors. To reconstruct visually pleasant HBD images, many bit-depth enhancement (BE) algorithms have been proposed from various aspects, but the recovered HBD images are usually unsatisfactory, with conspicuous false contours or over-blurred textures. Inspired by discriminative learning, we propose a residual BE algorithm based on an advanced conditional generative adversarial network (BE-ACGAN), in which the discriminator adversarially helps assess image quality and train the generator to achieve more photo-realistic recovery performance. Besides, since it is hard to distinguish between reconstructed and real HBD images with similar structures, the discriminator takes residual images as input and further takes LBD images as conditions to achieve more reliable performance. In addition, we present a novel loss function to deal with the difficulty of unstable adversarial training. The proposed algorithm outperforms the state-of-the-art methods on large-scale benchmark datasets. Source code is available at https://github.com/TJUMMG/BE-ACGAN/.

Introduction

Displaying more colors allows finer color gradations, smoother gradients and more details, which can improve the visual quality significantly. Therefore, a number of studies on wide color gamut (WCG) and high dynamic range (HDR) [1] have been conducted for vivid and realistic displays. 10-bit (i.e., 1,024 levels per color channel) and 16-bit (i.e., 65,536 levels per color channel) monitors [2] have been increasingly used for high-quality visualization. However, most mainstream images and digital image acquisition equipment are of 8-bit or lower bit-depth. Besides, some lossy image compression [3] operations also decrease the content bit-depth. These LBD images suffer from false contour artifacts and chroma distortions when linearly de-quantized to HBD ones and displayed on HDR screens. Note that bit-depth enhancement is the inverse problem of quantization, which aims to reconstruct the missing least significant bits. It is quite different from inverse tone mapping [4], which tries to recover the details lost due to non-linear tone mapping [5] operators.

Numerous BE algorithms have been proposed from various aspects, and they have achieved relatively good performance. Pixel-wise algorithms include Zero Padding (ZP) and Bit Replication (BR) [6]. ZP is the most basic BE algorithm: it directly fills the missing least significant bits (LSBs) [7] with zeros. BR improves slightly on ZP and enhances the overall brightness of the image. Though they are highly efficient, the reconstructed HBD images are visually uncomfortable, with annoying false contour artifacts, since the surrounding structural features are ignored. To eliminate these false contour artifacts, many context-aware algorithms have been proposed. Content Adaptive Image Bit-depth Expansion (CA) [8] and Contour Region Reconstruction (CRR) [9] reconstruct HBD pixel values by neighborhood flooding based on the distances from the surrounding false contours. They can greatly reduce false contours, but the details in local minimum/maximum regions are usually over-blurred. In addition, Maximum a Posteriori Estimation of AC Signal (ACDC) [10] and Intensity Potential for Adaptive De-quantization (IPAD) [11] have been proposed to reconstruct HBD images from the perspective of graph signal processing and natural image statistics. Although they outperform other unsupervised algorithms, the false contours are not entirely eliminated. Recently, supervised BE algorithms based on deep learning have been proposed. Bit-Depth Enhancement via Convolutional Neural Network (BE-CNN) [12] is a simple end-to-end network that takes advantage of a gradually expanded receptive field. Bit-depth Enhancement by Concatenating All Level Features of Deep Neural Network (BE-CALF) [13] introduced skip connections between every two layers to preserve structural features. Besides, it was discovered that residual images are easier to reconstruct than HBD images. Additionally, Deep Bit-Depth Expansion Network (BDEN) [14] proposed a two-stream network to reconstruct flat and non-flat areas separately, which further suppressed false contours in flat areas. Although these CNN frameworks are carefully designed to better reconstruct HBD images, the HBD images they recover still lose some details and are far from photo-realistic.
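The two pixel-wise baselines described above can be illustrated with a minimal sketch. The function names and the 4-bit to 8-bit example are illustrative assumptions and are not taken from the paper or its released code; plain Python integers stand in for pixel values.

```python
def zero_padding(value: int, low_bits: int, high_bits: int) -> int:
    """ZP: shift the LBD value left so the missing LSBs are filled with zeros."""
    return value << (high_bits - low_bits)


def bit_replication(value: int, low_bits: int, high_bits: int) -> int:
    """BR: fill the missing LSBs by repeating the LBD bit pattern."""
    result = 0
    remaining = high_bits
    # Append whole copies of the LBD bit pattern while they still fit.
    while remaining >= low_bits:
        result = (result << low_bits) | value
        remaining -= low_bits
    # Fill any leftover positions with the most significant bits of the value.
    if remaining > 0:
        result = (result << remaining) | (value >> (low_bits - remaining))
    return result


# Hypothetical example: expand 4-bit pixel values to 8 bits.
lbd_pixels = [0, 8, 15]
zp_pixels = [zero_padding(v, 4, 8) for v in lbd_pixels]
br_pixels = [bit_replication(v, 4, 8) for v in lbd_pixels]
```

In this example ZP maps the brightest 4-bit value (15) to 240, whereas BR maps it to 255, so BR covers the full 8-bit range. Both functions process each pixel independently of its neighbors, which is why neither can remove false contours.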

Recently, GAN [15] and its variants [16], [17], [18] have been widely adopted in many computer vision tasks to generate photo-realistic images. Many training procedures [19], [20], [21] have also been proposed, since it is challenging to find a Nash equilibrium in adversarial training. Moreover, GANs generally outperform simple generative networks in various image enhancement tasks, such as super-resolution [22], [23], [24], [25], and there is no reason for the BE task to be an exception.

Since BE is an image enhancement task that operates on the bit-depth of pixel values, the expected finer-quantized HBD images can be obtained simply by pixel-wise addition of the coarser-quantized LBD images and the corresponding residual images. Moreover, residual images, whose pixel values are limited to one quantization step, are easier to reconstruct and distinguish than HBD ones. Therefore, in this paper, we propose to reconstruct HBD images indirectly by reconstructing the residual between HBD and LBD images with BE-ACGAN, in which the generator learns to reconstruct residual images, and the discriminator is trained to evaluate the residual image recovery performance and help train the generator adversarially. Besides, to improve the discriminative performance and further enhance the reliability of the generator, LBD images are used as conditional inputs of the discriminator. We also propose a more reliable adversarial loss function for the BE task, which can greatly stabilize the adversarial training. Extensive experiments on large-scale datasets show the superior recovery performance of BE-ACGAN both objectively and subjectively. The main contributions of BE-ACGAN are summarized as follows:

(1) We explore the superior performance of GAN for the BE task and present a novel algorithm known as BE-ACGAN, which involves a residual learning approach.
(2) In order to enhance the discriminative performance, LBD images with useful quantization information are fed into the discriminator as conditions.
(3) We compare the proposed BE-ACGAN with other GAN-based image enhancement algorithms, and find that the GAN-based algorithm is more suitable for the BE task with the proposed residual learning approach.
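The residual formulation introduced before the contribution list can be checked with a small sketch. It assumes that the LBD image is obtained by truncating the least significant bits of the HBD image; the function names and the 16-bit to 4-bit setting are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def decompose(hbd_pixels: List[int], high_bits: int = 16,
              low_bits: int = 4) -> Tuple[List[int], List[int]]:
    """Split HBD pixels into a zero-padded LBD part and a residual part.

    Assumption: quantization truncates the (high_bits - low_bits) LSBs.
    """
    shift = high_bits - low_bits
    # Quantize to LBD, then zero-pad back to the HBD range.
    zero_padded = [(p >> shift) << shift for p in hbd_pixels]
    # The residual holds exactly the bits that quantization discarded.
    residual = [p - z for p, z in zip(hbd_pixels, zero_padded)]
    return zero_padded, residual


def reconstruct(zero_padded: List[int], residual: List[int]) -> List[int]:
    """Recover HBD pixels by pixel-wise addition of the two parts."""
    return [z + r for z, r in zip(zero_padded, residual)]


hbd = [0, 4095, 4096, 40000, 65535]
zp, res = decompose(hbd)
quantization_step = 1 << (16 - 4)  # 4096 for this hypothetical setting
```

Under these assumptions every residual value lies in the range from 0 to one quantization step minus one, while the HBD values span the full 16-bit range. This narrow range is the property the text refers to when it says residual images are easier to reconstruct and distinguish.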

The rest of the paper is organized as follows. We thoroughly describe the proposed algorithm in Section 2. The experimental results and analysis are given in Section 3. Section 4 discusses the differences between BE-ACGAN and other GAN-based image enhancement methods. Finally, the conclusion of the paper is provided in Section 5.

Section snippets

The Proposed BE-ACGAN

The proposed algorithm reconstructs HBD images indirectly by recovering the residual between HBD and LBD images with BE-ACGAN, which consists of a generator and a discriminator network, as illustrated in Fig. 1. The input LBD image $I_{LBD}$ is zero-padded to the HBD version $I_{ZP}$, which is fed into the generator to reconstruct the residual image ${\hat{I}}_{residual}$. Besides, to further improve the generator by adversarial training, residual images are differentiated by the discriminator with zero-padding HBD
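The data flow described in this snippet can be outlined as follows. This is only a sketch under stated assumptions: the generator is replaced by a hypothetical placeholder that predicts half a quantization step for every pixel, the pairing of each residual value with its zero-padded condition is an assumed way of presenting the condition to the discriminator, and none of the names come from the released code.

```python
from typing import Callable, List, Tuple

Image = List[int]  # flat list of pixel values, a stand-in for an image tensor


def zero_pad(lbd: Image, low_bits: int, high_bits: int) -> Image:
    """Compute I_ZP from I_LBD by filling the missing LSBs with zeros."""
    shift = high_bits - low_bits
    return [p << shift for p in lbd]


def enhance(lbd: Image, generator: Callable[[Image], Image],
            low_bits: int, high_bits: int) -> Tuple[Image, Image, Image]:
    """Run I_LBD -> I_ZP -> estimated residual -> estimated HBD image."""
    zp = zero_pad(lbd, low_bits, high_bits)
    residual_hat = generator(zp)
    hbd_hat = [z + r for z, r in zip(zp, residual_hat)]
    return hbd_hat, residual_hat, zp


def discriminator_inputs(residual: Image, zp: Image) -> List[Tuple[int, int]]:
    """Pair each residual pixel with its zero-padded condition (assumed layout)."""
    return list(zip(residual, zp))


def midpoint_generator(zp: Image, shift: int = 12) -> Image:
    """Hypothetical placeholder for the trained generator network."""
    return [1 << (shift - 1) for _ in zp]


lbd_image = [0, 9, 15]  # hypothetical 4-bit pixels
hbd_hat, residual_hat, zp_image = enhance(lbd_image, midpoint_generator, 4, 16)
fake_pairs = discriminator_inputs(residual_hat, zp_image)
```

In a real training loop the discriminator would receive such conditioned pairs for both generated and ground-truth residuals, and the placeholder would be replaced by the trained network.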

Experimental Results

The experiments are performed on three datasets. Sintel [32] is a lossless 16-bit image dataset consisting of more than 20,000 cartoon images. UST-HK [10] is composed of 40 natural 16-bit images, and KODAK [33] contains 24 natural 8-bit images. The proposed network is trained on 1,000 16-bit images randomly selected from Sintel with a batch size of 5, and tested on another 50 16-bit images randomly selected from the rest of Sintel, all 40 16-bit images from UST-HK, and all 24 8-bit images

Discussion

In this work, we explore the superior perceptual performance of GAN for the BE task. It is shown that both subjective and objective evaluations are improved by the proposed BE-ACGAN. This seems contrary to other image enhancement tasks such as super-resolution (SR), where the involvement of adversarial training brings subjective improvement but objective degradation in terms of PSNR and SSIM (e.g., SRGAN [22]). We ascribe the specific characteristics of BE task and the proposed residual
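For reference, the PSNR metric mentioned above can be computed for HBD images with the standard definition, sketched below. The choice of the full 16-bit peak value and the function name are assumptions; the exact evaluation settings used in the paper are not given in this snippet.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], estimate: Sequence[int],
         bit_depth: int = 16) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    peak = (1 << bit_depth) - 1  # assumed peak: largest representable value
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 16-bit pixels: ground truth, zero padding, and a closer estimate.
truth = [3136, 40000, 65535]
zero_padded = [0, 36864, 61440]
closer = [3000, 39900, 65500]
```

With these illustrative values, `psnr(truth, closer)` is higher than `psnr(truth, zero_padded)`, reflecting the smaller pixel-wise error of the closer estimate.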

Conclusion

Most existing BE algorithms have trouble recovering smooth gradient areas or reconstructing detailed structures. To overcome these problems, we attempt to reconstruct photo-realistic HBD images from the perspective of conditional GAN. It is discovered that the structures of residual images are easier for the discriminator to differentiate. Besides, taking zero-padded images as conditional inputs can involve image content information and quantization information, thereby further improving the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (35)

G.H. An et al., Perceptual brightness-based inverse tone mapping for high dynamic range imaging, Displays (2018)
H. Su et al., Adaptive tone mapping for display enhancement under ambient light using constrained optimization, Displays (2019)
W.L. Xu et al., An improved least-significant-bit substitution method using the modulo three strategy, Displays (2016)
J. Liu et al., Recurrent conditional generative adversarial network for image deblurring, Accepted by IEEE Access (2018)
A. Laghrib et al., A multiframe super-resolution technique based on a nonlocal Bregman distance of bilateral total variation term, Displays (2018)
K.J. Kwon, M.B. Kim, C. Heo, S.G. Kim, J.S. Baek, Y.H. Kim, Wide color gamut and high dynamic range displays using rgbw...
A.G. Rempel et al., Ldr2hdr: on-the-fly reverse tone mapping of legacy video and photographs, ACM Transactions on Graphics (2007)
G. Toderici et al., Full resolution image compression with recurrent neural networks
R.A. Ulichney, S. Cheung, Pixel bit-depth increase by bit replication, in: Proceedings of Color Imaging:...
P. Wan et al., From 2D extrapolation to 1D interpolation: Content adaptive image bit-depth expansion
C. Cheng et al., Bit-depth expansion by contour region reconstruction
P. Wan et al., Image bit-depth enhancement via maximum a posteriori estimation of AC signal, IEEE Trans. Image Process. (2016)
J. Liu, G. Zhai, X. Yang, C. Chen, IPAD: Intensity potential for adaptive de-quantization, IEEE Transactions on Image...
J. Liu et al., Bit-depth enhancement via convolutional neural network
J. Liu et al., BE-CALF: Bit-depth enhancement by concatenating all level features of DNN, IEEE Trans. Image Process. (2019)
Y. Zhao et al., Deep reconstruction of least significant bits for bit-depth expansion, IEEE Trans. Image Process. (2019)
I. Goodfellow et al., Generative adversarial nets

☆ This work is supported in part by Tianjin Science Foundation (20JCQNJC01150), in part by Innovation Fund of Tianjin University (2001), and in part by National Science Foundation of China (61701341).
