1 Introduction

Finger-vein recognition technology uses the texture of the finger veins to perform identity verification; it is harmless and difficult to forge. Finger-vein images are relatively easy to acquire, and the recognition process is user-friendly. Therefore, finger-vein recognition can be widely applied to access control systems in fields such as banking, finance, and government.

Finger veins are distributed below the skin in complex shapes. The morphology of the finger veins results from the interaction between human DNA and finger development, and different fingers of the same person have different morphologies. These biological properties guarantee the uniqueness of the finger vein and lay a solid biological foundation for finger-vein biometrics.

Typically, finger-vein images are captured with near-infrared (NIR) light in a transillumination manner [1]. During transmission, the NIR light is absorbed by hemoglobin in the venous blood [2], forming a finger-vein image with light and dark vascular lines. The quality of finger-vein images is often poor due to the attenuation of light in tissue [3]. Therefore, it is often difficult to extract reliable finger-vein features directly from the original images [4].

In some cases, finger-vein images may exhibit irregular incompleteness due to external factors, such as spots or stains on the fingers at capture time, as shown in Fig. 1. Hence, incomplete vascular networks are a common phenomenon in finger-vein images.

Fig. 1. Finger-vein images with spots or stains.

For accurate feature extraction, generating a realistic finger-vein vascular network from the captured image is an important topic. To the best of our knowledge, few works deal with incomplete finger-vein acquisition, which motivates our work.

Recently, Convolutional Neural Networks (CNNs) have been widely applied in computer vision, especially in image classification and image generation [5]. They can also be used for image inpainting and reconstruction, and restoring finger-vein images with spots or stains is an instance of the image inpainting problem, so CNNs can be applied to it as well. In [6], a multi-scale neural patch synthesis method is proposed, which achieves good performance for high-resolution image inpainting on the ImageNet dataset. In general, achieving reasonable inpainting results requires a large number of training images. In [7], an image inpainting method based on contextual attention is proposed, which is very effective on large datasets such as the CelebA face dataset. However, our dataset does not contain enough images to train such models, and low-resolution grayscale images can further degrade the inpainting result. [8] proposes a context-encoder approach for image inpainting using a combination of the reconstruction (L2) loss and an adversarial loss [9]. Nevertheless, when applied to finger-vein images, it generates blurred veins without smooth edges.

In this paper, inspired by these methods, we propose an inpainting scheme for finger-vein images with spots or stains. The proposed scheme is as follows. First, a combination of Gabor filters and Weber's law is used for image enhancement, removing illumination variation in the finger-vein images. Second, we design a novel finger-vein image inpainting framework based on an encoder-decoder network. Finally, different loss functions are compared to optimize the inpainting framework. Experimental results show that the proposed method achieves better performance when inpainting finger-vein images with irregular incompleteness.

2 Finger-Vein Image Acquisition

The finger is among the most flexible parts of the human body, and finger-vein images can be captured by placing the finger into an imaging device. To obtain finger-vein images, we designed a homemade acquisition device [10], as shown in Fig. 2(a). The device uses NIR light to illuminate the finger, and the vascular network of the finger vein is acquired by an image sensor.

Extracting regions of interest (ROIs) is essential for improving the accuracy of finger-vein recognition. We employ the effective method proposed in [11] to locate ROIs in the finger-vein images, as shown in Fig. 2(b). Some finger-vein ROIs from the same subject are listed in Fig. 2(c).

Fig. 2. Image acquisition.

The homemade dataset includes 5,850 grayscale finger-vein images of the kind commonly used for biometric recognition. The ROIs of the captured finger-vein images are resized to \(91\times 200\) pixels. We enhance the grayscale images and resize them to \(96\times 192\) pixels. Most of the finger-vein images are complete, and only a few are incomplete due to the acquisition process; this imbalanced class distribution can hinder model training. Therefore, we manually added samples of finger-vein images with spots or stains, as shown in Fig. 3. They are incomplete finger-vein images with a square region, a single irregular region, or multiple irregular regions, and these incomplete cases need to be reconstructed in the experiments. The encoder-decoder network is trained to regress the corrupted pixel values and reconstruct complete images.
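The manual corruption step can be sketched as follows, assuming a simple random-blob model; the blob shapes, sizes, and fill value are illustrative assumptions, not the exact procedure used to build the dataset:

```python
import numpy as np

def add_random_stain(img, num_spots=1, max_radius=15, fill_value=255, seed=None):
    """Overlay irregular bright blobs on a grayscale image to simulate
    spots or stains (a hypothetical corruption model)."""
    rng = np.random.default_rng(seed)
    out = img.copy()
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    yy, xx = np.ogrid[:h, :w]
    for _ in range(num_spots):
        cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
        # A short random walk of overlapping discs yields an irregular blob.
        for _ in range(int(rng.integers(5, 20))):
            r = int(rng.integers(3, max_radius))
            mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2
            cy = int(np.clip(cy + rng.integers(-r, r + 1), 0, h - 1))
            cx = int(np.clip(cx + rng.integers(-r, r + 1), 0, w - 1))
    out[mask] = fill_value  # spots appear as large pixel values
    return out, mask
```

Pairs of corrupted output and the original image then serve as training input and ground truth.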

Fig. 3. Finger-vein images with spots or stains.

3 Method

3.1 Image Enhancement

In NIR imaging, finger-vein images are often severely degraded, resulting in particularly poor separation between vein and non-vein regions (see Fig. 4(a)). To reliably strengthen the finger-vein networks, the images must be effectively enhanced. Here, a bank of Gabor filters [12] with 8 orientations and the Weber's Law Descriptor (WLD) [13] are combined for vein-region enhancement and light-attenuation elimination (see Fig. 4(b)). The Gabor filter is a linear filter for edge extraction that is well suited to expressing and separating texture; this paper uses 8 orientations to extract features. The WLD is used to improve robustness to illumination.
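A minimal sketch of this enhancement step follows. The Gabor parameter values (kernel size, sigma, wavelength) are assumptions not specified in the paper, and a Weber's-law-style contrast normalization stands in for the full WLD:

```python
import numpy as np

def gabor_kernel(ksize=15, sigma=3.0, theta=0.0, lambd=8.0, gamma=0.5):
    # Real part of a Gabor kernel; the parameter values are illustrative.
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2)) \
        * np.cos(2 * np.pi * xr / lambd)

def conv2_same(img, k):
    # Circular 2-D convolution via FFT (adequate for a sketch).
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k, img.shape)))

def enhance(img):
    img = img.astype(np.float64)
    # Vein ridges respond strongly to at least one of the 8 orientations.
    bank = [gabor_kernel(theta=k * np.pi / 8) for k in range(8)]
    gabor = np.max([conv2_same(img, k) for k in bank], axis=0)
    # Weber's-law-style contrast: response relative to local intensity,
    # which suppresses slowly varying illumination.
    local = conv2_same(img, np.ones((5, 5)) / 25.0)
    return np.arctan(gabor / (local + 1e-6))
```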

Fig. 4. The results of image enhancement.

3.2 Image Inpainting Scheme

The inpainting scheme for finger-vein images with incomplete information consists of four steps. First, finger-vein images with spots or stains are fed into the encoder as input images; the spot or stain regions are represented by larger pixel values so that they appear more distinct, and latent features are learned from the input images. Second, the learned features are propagated to the decoder through a channel-wise fully-connected layer. Third, the decoder uses this feature representation to recover the image content under the spots or stains; the output images of the encoder-decoder network have the same size as the input images. Finally, the inpainted images are optimized by comparison with the ground-truth images. Figure 5 presents the overall architecture of the proposed image inpainting scheme.
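The four steps above can be exercised end-to-end as a single training step. This is an illustrative sketch: `net`, `optimizer`, and `loss_fn` are placeholders for the encoder-decoder network, optimizer, and loss described later in the paper.

```python
import torch
import torch.nn as nn

def train_step(net, optimizer, corrupted, ground_truth, loss_fn):
    """One optimization step of the inpainting pipeline:
    encode -> propagate features -> decode -> compare with ground truth."""
    optimizer.zero_grad()
    inpainted = net(corrupted)              # encoder-decoder forward pass
    loss = loss_fn(ground_truth, inpainted)  # compare with ground truth
    loss.backward()
    optimizer.step()
    return loss.item()
```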

Fig. 5. The overall process of finger-vein image inpainting.

Encoder-Decoder Generative Network. Figure 6 shows an overview of our encoder-decoder generative network architecture, which consists of three blocks: the encoder, a channel-wise fully-connected layer, and the decoder. The encoder is derived from the AlexNet architecture [14]; its role is to compress high-dimensional input data into a low-dimensional representation. The encoder has five convolutional layers with \(4\times 4\) kernels. The first convolutional layer uses a stride of [2, 4] to reduce the spatial dimensions, producing a square feature map of \(48\times 48\); the following four convolutional layers use a stride of [2, 2]. Given an input image of size \(96\times 192\), these five convolutional layers compress the image into a feature representation of dimension \(3\times 3\times 768\). The channel-wise fully-connected layer is a bridge that propagates information from encoder features to decoder features (see Fig. 7). The decoder reconstructs the input image using five up-sampling layers, which expand the \(3\times 3\times 768\) feature representation abstracted by the encoder into an image of size \(96\times 192\).
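A PyTorch sketch consistent with the stated shapes (\(96\times 192\) input, \(48\times 48\) after the first stride-[2, 4] convolution, a \(3\times 3\times 768\) bottleneck, five up-sampling layers back to \(96\times 192\)); the intermediate channel widths, normalization, and activations are assumptions:

```python
import torch
import torch.nn as nn

def down(cin, cout, stride=2):
    return nn.Sequential(nn.Conv2d(cin, cout, 4, stride, 1),
                         nn.BatchNorm2d(cout), nn.LeakyReLU(0.2))

def up(cin, cout):
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 4, 2, 1),
                         nn.BatchNorm2d(cout), nn.ReLU())

class ChannelWiseFC(nn.Module):
    """One dense map per channel over the 3x3 = 9 spatial positions,
    spreading information across the map with far fewer parameters
    than a full dense layer."""
    def __init__(self, channels, n):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(channels, n, n) / n ** 0.5)
    def forward(self, x):
        b, c, h, w = x.shape
        y = torch.einsum('bcn,cnm->bcm', x.flatten(2), self.weight)
        return y.view(b, c, h, w)

class InpaintingNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            down(1, 64, stride=(2, 4)),   # 96x192 -> 48x48
            down(64, 128),                # -> 24x24
            down(128, 256),               # -> 12x12
            down(256, 512),               # -> 6x6
            down(512, 768),               # -> 3x3x768 bottleneck
        )
        self.cwfc = ChannelWiseFC(768, 9)
        self.decoder = nn.Sequential(
            up(768, 512), up(512, 256), up(256, 128), up(128, 64),  # 3x3 -> 48x48
            nn.ConvTranspose2d(64, 1, 4, stride=(2, 4), padding=(1, 0)),  # -> 96x192
            nn.Sigmoid(),
        )
    def forward(self, x):
        return self.decoder(self.cwfc(self.encoder(x)))
```

With \(4\times 4\) kernels and padding 1, each stride-2 convolution halves a spatial dimension, which is what makes the stated shape sequence work out.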

Fig. 6. Overview of our basic encoder-decoder generative network architecture.

Fig. 7. Connection between encoder features and decoder features.

3.3 Loss Function

There are usually multiple plausible ways to fill image content occluded by spots or stains, and different loss functions lead to different inpainting results. The optimizer minimizes the loss between the inpainted images and the ground-truth images; a proper loss function makes the inpainted images realistic and consistent with the given context. In this paper, we employ the L1 loss to train the proposed finger-vein image inpainting model. In [6], the joint L2 and adversarial loss achieved good performance for image inpainting. Our comparative experiments use the L2 loss, the joint L2 and adversarial loss, and the joint L1 and adversarial loss, in the same way as the Context Encoder [8]. For each training image, the L1 and L2 losses are defined as:

$$\begin{aligned} L_{L1}(G)={E}_{x,x_{g}}[||x-G(x_{g})||_{1}], \end{aligned}$$
(1)
$$\begin{aligned} L_{L2}(G)={E}_{x,x_{g}}[||x-G(x_{g})||_{2}^{2}], \end{aligned}$$
(2)

where x represents the ground-truth image, \(x_{g}\) denotes a finger-vein image with spots or stains, G denotes the encoder-decoder generative network, and \(G(x_{g})\) represents the generated inpainted image.

The adversarial loss is defined as:

$$\begin{aligned} L_{adv}(G)={E}_{x_{g}}[-\log [D(G(x_{g}))+\sigma ]], \end{aligned}$$
(3)

where D is an adversarial discriminator that predicts the probability that the input image is a real image rather than a generated one, and \(\sigma \) is set to a small value to avoid taking the logarithm of zero.

The joint L2 loss with adversarial loss is defined as:

$$\begin{aligned} L=\mu L_{L2}(G)+(1-\mu )L_{adv}(G), \end{aligned}$$
(4)

The joint L1 loss with adversarial loss is defined as:

$$\begin{aligned} L=\mu L_{L1}(G)+(1-\mu )L_{adv}(G), \end{aligned}$$
(5)

where \(\mu \) is a weight used to balance the magnitudes of the two losses in our experiments.
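Eqs. (1)-(5) can be combined into a single function, sketched below; the values of \(\mu \) and \(\sigma \) here are illustrative placeholders, since the paper does not state them in this section:

```python
import torch

def joint_loss(x, g_out, d_prob, mu=0.999, sigma=1e-8, use_l1=True):
    """Joint reconstruction + adversarial loss, Eqs. (1)-(5).

    x      : ground-truth images
    g_out  : generator (encoder-decoder) outputs G(x_g)
    d_prob : discriminator probabilities D(G(x_g))
    mu, sigma are illustrative values, not the paper's.
    """
    # Eq. (1) or Eq. (2): reconstruction term
    rec = (x - g_out).abs().mean() if use_l1 else ((x - g_out) ** 2).mean()
    # Eq. (3): adversarial term, with sigma guarding log(0)
    adv = -torch.log(d_prob + sigma).mean()
    # Eq. (4) / Eq. (5): weighted combination
    return mu * rec + (1 - mu) * adv
```

Setting `use_l1=True` and dropping the adversarial term (`mu=1`) recovers the plain L1 loss the paper ultimately adopts.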

3.4 Evaluation

Peak Signal-to-Noise Ratio (PSNR) is a full-reference image quality index, computed here between the ground-truth image and the inpainted image:

$$\begin{aligned} MSE=\frac{1}{HW}\sum _{i=1}^H\sum _{j=1}^W(X(i,j)-Y(i,j))^2, \end{aligned}$$
(6)
$$\begin{aligned} PSNR=10\lg \frac{(2^n-1)^2}{MSE}, \end{aligned}$$
(7)

where X denotes the ground-truth image and Y the inpainted image; H and W are the height and width of the image; n is the number of bits per pixel, generally 8, i.e., 256 gray levels.
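Eqs. (6)-(7) translate directly into code:

```python
import numpy as np

def psnr(x, y, n_bits=8):
    """PSNR (dB) between ground-truth image x and inpainted image y."""
    # Eq. (6): mean squared error over all H*W pixels
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    peak = (2 ** n_bits - 1) ** 2
    # Eq. (7): ratio of peak signal power to MSE, in decibels
    return 10 * np.log10(peak / mse)
```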

We report the mean L1 loss, mean L2 loss, and PSNR on the test set; our method performs better on these metrics in the experiments.

4 Experiments

We evaluate the proposed inpainting model on the homemade dataset, which includes 5,850 finger-vein images: 5,616 for training, 117 for validation, and 117 for testing. Our encoder-decoder generative network is trained with four different loss functions to compare their performance; the other parameters of the encoder-decoder are set identically in each case. The four loss functions are: (a) L2+adv loss, (b) L1+adv loss, (c) L2 loss, and (d) L1 loss. In the result figures, the first row shows three input images with arbitrary incompleteness, rows (a)-(d) show the results of the corresponding loss functions, and the ground-truth images corresponding to the inputs are placed in the last row. In the experiments, the incomplete regions are randomly generated. In the following, the effectiveness of our method is illustrated both visually and numerically; in Sect. 4.3, the practicality of the proposed method is verified on finger-vein images with square-region incompleteness.

4.1 Single Irregular-Region Incomplete

We use the four methods discussed above to reconstruct finger-vein images with a single spot or stain, i.e., one irregular incomplete region needs to be reconstructed. The encoder-decoder generative network is trained with a constant learning rate of 0.0001. The inpainting results for single irregular-region incompleteness using the four loss functions are shown in Fig. 8. High-quality inpainting results are not only clear in the finger-vein vascular networks but also consistent with the surrounding regions, even though the spots or stains have different shapes. In practice, the L2+adv and L1+adv losses produce blurred images without smooth vein edges, and the pixel values of the vein regions are obviously lost with the L2 loss. Compared with the other methods, the method proposed in this paper generates a smooth and complete finger-vein network. Table 1 shows the quantitative results of these experiments: our method achieves the lowest mean L1 loss and the highest PSNR.

Fig. 8. Performance comparison of the four methods on single irregular-region inpainting.

Table 1. Numerical comparison of the four methods on single irregular-region incompleteness.

4.2 Multiple Irregular-Region Incomplete

Similarly, we use the four loss functions to reconstruct finger-vein images with multiple spots or stains, i.e., multiple regions need to be reconstructed. The inpainting results for multiple irregular-region incompleteness are shown in Fig. 9. In practice, the methods based on the L2+adv and L1+adv losses produce blurred images without smooth vein edges, with pixels clustered together without any regularity; the L2 loss again makes the loss of the original pixel values more obvious. In contrast, the inpainted images based on the L1 loss are closer to the ground-truth images than those of the other methods. As Table 2 shows, the PSNR value is also higher than that of the other methods. These results indicate that the proposed method achieves higher similarity to the ground-truth images than the other methods.

Fig. 9. Performance comparison of the four methods on multiple irregular-region inpainting.

Table 2. Numerical comparison of the four methods on multiple irregular-region incompleteness.

4.3 Square-Region Incomplete

The practicality and effectiveness of the proposed method are further verified on finger-vein images with square-region incompleteness. Here, we again use the four methods to reconstruct the images; the inpainting results are shown in Fig. 10. The results of our proposed method are closer to the ground-truth images, whereas blurred images are generated with the L2+adv and L1+adv losses and the pixel values are seriously lost in the masked vein regions. Both the visual results and the numerical data show that our proposed method is effective (Table 3).

Fig. 10. Performance comparison of the four methods on square-region inpainting.

Table 3. Numerical comparison of the four methods on square-region incompleteness.

5 Conclusion

In this paper, we propose a method based on the L1 loss function for inpainting finger-vein grayscale images with spots or stains. A series of experiments with four methods demonstrates that the proposed method is effective. As future work, we plan to extend the method so that it performs best on all evaluation metrics.