
1 Introduction

As one of the most important biometric technologies, face recognition plays a key role in many application scenarios, such as device unlocking, application login, and mobile payment. In the past decades, many face recognition algorithms have been reported and great progress has been made. In recent years, advances in deep learning have greatly boosted the performance of face recognition. One of the most important research topics is the extraction of more discriminative facial features with convolutional neural networks, which can be discussed from three aspects. Firstly, more powerful network architectures are adopted by introducing deeper or wider networks, from VGG [1] to ResNet [2]. Secondly, large and refined face datasets are constructed to simulate real-world scenarios; for example, the larger and wilder MS1M [4] dataset was constructed to replace CASIA-WebFace [3]. Finally, more rigorous loss functions are designed to enable the network to learn more discriminative face features.

In fact, most commercial face recognition systems require users to actively cooperate with the camera so as to acquire clear face images; that is, they only work well in constrained environments. However, in tougher application scenarios, it is almost impossible to meet these conditions. For example, face recognition systems used for public safety can hardly acquire a clear and complete face image. In this scenario, law enforcement agencies frequently need to compare ID document photos with spot face images: the face verification system must help the police identify criminals from spot images based on their ID document photos. This kind of face verification system is quite different from existing commercial face recognition systems. First, the images are generally acquired in more natural environments, in which factors such as the capturing view and lighting conditions are uncontrolled. Second, criminals tend to hide their faces, which further increases the difficulty of verification. Finally, the ID document photos are generally normative but low-quality, while the images from surveillance or spot cameras are generally high-quality but arbitrary. That is, the two types of faces are heterogeneous, which makes matching them more difficult. Figure 1 illustrates the difference between ID faces and spot faces. In this paper, we focus mainly on the ID-Spot face verification problem.

Fig. 1.

The sample photos show three situations. Each column is a matching pair of images; from left to right: normal, sunglasses, and mask. Note that the black rectangular blocks are added for privacy protection and are not part of the information the images carry.

In the ID-Spot face verification scenario, the ID document photo is standardized and clear, while the spot photo may be covered by items such as sunglasses and masks. Due to this special nature, standard face feature extraction methods cannot produce a sufficiently effective and discriminative face representation. Therefore, we employ a pseudo-siamese network to address the heterogeneity: the network for processing ID document photos is different from the one handling spot photos. That is, the two networks have the same architecture and are trained jointly, but they do not share parameters. In this manner, we can effectively enhance the discriminative ability of heterogeneous face representations. In addition, a global weight pooling method is proposed to suppress the negative effect of background and occlusion. With this method, the visible face regions are assigned larger weights than the background and occlusions, which makes the face representation more discriminative.

In brief, we aim to build a fast and effective face verification system for wild conditions. To achieve this goal, we propose a face representation method that achieves good performance. The main contributions of this article are as follows:

  1.

    We explored the face verification problem with partial occlusion and quantitatively analyzed the impact of occlusion on face verification.

  2.

    We adjusted the global average pooling in the CNN and achieved a performance improvement: at FAR = 0.01%, we increased the TAR from 47.58% to 57.63%.

  3.

    Our model achieved the best results on a Chinese ID-Spot dataset.

2 Related Works

2.1 Face Recognition Based on Deep Learning

Due to the emergence of massive data and the tremendous increase in computing power, deep learning has shown great vitality in the field of computer vision. Face recognition can be regarded as a special type of image classification task.

Using softmax to classify faces is the most basic method for face recognition. Since softmax is designed only for classification, it is weak at enlarging the distance between classes and reducing the distance within a class. Therefore, a series of methods such as center loss [5], SphereFace [6], CosineFace [7], and ArcFace [8] have appeared. Center loss [5] adds an additional supervisory signal to compress the intra-class distance. To enhance the softmax loss, the multiplicative margin and the additive margin are introduced into the angle space by SphereFace [6] and ArcFace [8], respectively. CosineFace [7] and AM-Softmax [9] add an additive margin in the cosine space to increase the penalty and obtain more discriminative facial representations. Softmax-based methods classify the faces of the training set globally. Their advantage is fast convergence; their disadvantage is that when the number of classes in the training set is large, more memory is needed.

DeepID [10, 11] combines softmax and verification signals to train the network. FaceNet [12] uses the triplet loss to learn facial representations on large-scale databases. Contrastive loss and triplet loss are both pair-based strategies, so data pairs must be built before training, and the way the pairs are built has a big impact on the results. Hard samples are often chosen to form triplets. Such methods search for optimal solutions in a local space and often require longer training time on large-scale training data.

2.2 ID Versus Spot

ID-Spot verification can be considered a special case of heterogeneous face verification. Although the two types of images share the same structure, their data distributions differ by a gap that is hard to cross. There are usually two kinds of methods for solving heterogeneous problems: one first converts the images so that the two types of data are similarly distributed, and the other maps the two types of images into the same shared feature space. Many researchers [13,14,15,16] have conducted extensive experiments and explorations on this issue.

Large-scale [17] and DocFace [18, 19] explored the ID-Spot verification problem. The two works adopted broadly similar strategies. The first step is to pre-train on open large-scale datasets to obtain a model that is sensitive to human faces. The model is then fine-tuned on the ID-Spot dataset (i.e., transfer learning) so that it can better handle heterogeneous ID photos and spot photos. More specifically, Large-scale adopted a classification-verification-classification strategy to gradually improve the performance of the model, while DocFace designed an optimization method called DWI to update the weights.

2.3 Face Verification with Partial Occlusion

In the development of face recognition, many researchers have explored and experimented with occlusion problems. Subspace regression methods handle occluded face recognition by regressing the unoccluded face image and the occlusion into their respective subspaces; robust error coding attempts to separate an occluded image into occluded and unoccluded regions; robust feature extraction methods decompose the image features to reduce mutual interference between them and provide sufficiently fine features for subsequent recognition.

In recent years, there have been many works on partially occluded faces. Robust LSTM [20] proposes a robust long short-term memory autoencoder model to restore occluded faces. DFI [21] recognizes a face by forming a facial star-network graph that connects key points of the face region. Enhancing [22] improves the recognition rate by finding areas that have a significant impact on recognition.

3 Approach

In this section, we describe our approach in detail. The proposed method improves face verification performance under partial occlusion.

3.1 The Impact of Occlusion on Face Verification

It is well known that partial occlusion such as sunglasses causes trouble for face verification. We conduct a quantitative analysis of the effect of sunglasses on face verification and report it numerically. We first compare matching similarities between images and obtain the cosine similarities of spot-spot, glasses-glasses, and spot-glasses pairs; all three groups consist of pairs from different identities. Second, we use the same model to evaluate the dataset with occlusion and the dataset without occlusion and compare the verification results. In this process, we use a MobileFaceNet [23] model trained with ArcFace [8].

3.2 Network Architecture

Global Weight Pooling (GWP).

In practice, face verification should achieve an ideal balance between speed and accuracy. MobileFaceNet [23] is a lightweight network designed for face recognition that can be deployed on mobile devices. MobileFaceNet draws on the MobileNetV1 [24], MobileNetV2 [25], and ShuffleNet [26] networks, which use many separable convolutions to reduce computation and parameters. At the end of the convolutional stem, MobileFaceNet uses a global separable convolution instead of global average pooling for downsampling.

We use the backbone network of MobileFaceNet, so the final feature map is of size 7×7 with 512 channels. In place of the global average pooling layer, we use a global pooling operation with weights. The weights on each channel are not shared, and the weight parameter is of size 512×7×7. After each training iteration, the 49 weights on each channel are processed to ensure that the result of each inference is a weighted average. See Fig. 2.

Fig. 2.

A 512×7×7 feature map is obtained after the image passes through the CNN stem. The CNN stem consists of four parts: the first part contains two convolution modules, followed sequentially by 4, 6, and 2 residual blocks, which contain many separable convolution operations. Usually, a global average pooling operation follows the convolution layers; instead, we use GWP, which constrains the sum of the weights of each channel to 1, in place of GAP (global average pooling). The physical meaning of GWP is to perform a weighted average operation, as its name suggests. An embedding layer consisting of BN-FC-BN is applied at the end.

The global average pooling layer is calculated as:

$$ Output_{GAP - c} = \frac{1}{W \times H}\sum\nolimits_{i,j} {F_{i,j,c} } $$
(1)

The global separable convolution is calculated as:

$$ Output_{GSC - c} = \sum\nolimits_{i,j} {W_{i,j,c} \cdot F_{i,j,c} } $$
(2)

The global weighted average pooling layer is calculated as:

$$ Output_{GWP - c} = \sum\nolimits_{i,j} {W_{i,j,c} \cdot F_{i,j,c} } $$
$$ \sum\nolimits_{i,j} {W_{i,j,c} } = 1 $$
(3)

where F is the input feature map of size W × H × C (W and H are the spatial width and height and C is the number of channels), W_{i,j,c} is the pooling weight at spatial position (i, j) of channel c, and the weight matrix has the same size W × H × C as F.
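To make the relationship between Eqs. (1)-(3) concrete, the following minimal NumPy sketch (our own illustration, not the paper's code) computes the GAP, GSC, and GWP outputs for a single feature map; the softmax normalization used here for GWP is just one of the weight-processing options discussed below.

```python
import numpy as np

# Toy feature map F of size H x W x C (7 x 7 x 512 in the paper).
H, W, C = 7, 7, 512
F = np.random.rand(H, W, C).astype(np.float32)

# Eq. (1): global average pooling, one value per channel.
gap = F.mean(axis=(0, 1))                       # shape (C,)

# Eq. (2): global separable convolution, unconstrained per-position weights.
W_gsc = np.random.randn(H, W, C).astype(np.float32)
gsc = (W_gsc * F).sum(axis=(0, 1))              # shape (C,)

# Eq. (3): global weighted pooling, same form as GSC but the weights of each
# channel are constrained to sum to 1, so the output is a weighted average.
W_raw = np.random.randn(H, W, C).astype(np.float32)
W_gwp = np.exp(W_raw) / np.exp(W_raw).sum(axis=(0, 1), keepdims=True)  # softmax over H*W
gwp = (W_gwp * F).sum(axis=(0, 1))              # shape (C,)

assert np.allclose(W_gwp.sum(axis=(0, 1)), 1.0, atol=1e-5)
```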

We use three methods to process the weights so that they form a weighted average (a minimal sketch of these options is given after the list). They are:

  • Option-A: apply the softmax function

  • Option-B: apply ReLU, then softmax

  • Option-C: apply ReLU, then rescale the weights to sum to 1
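The sketch below illustrates, under our own interpretation, how the raw 7×7 weight map of one channel could be normalized under each option; all three functions return weights that sum to 1.

```python
import numpy as np

def option_a(w):
    """Option-A: softmax over the spatial positions of one channel."""
    e = np.exp(w - w.max())
    return e / e.sum()

def option_b(w):
    """Option-B: ReLU first, then softmax over the spatial positions."""
    w = np.maximum(w, 0.0)
    e = np.exp(w - w.max())
    return e / e.sum()

def option_c(w, eps=1e-8):
    """Option-C: ReLU first, then rescale so the weights sum to 1."""
    w = np.maximum(w, 0.0)
    return w / (w.sum() + eps)

raw = np.random.randn(7, 7)          # raw weights of a single channel
for fn in (option_a, option_b, option_c):
    assert abs(fn(raw).sum() - 1.0) < 1e-5
```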

The GWP structure increases the weights of the effective face regions in the image, yielding more discriminative facial features; processing the parameter matrix into normalized weights highlights the importance of each face region.

Pseudo-siamese Network.

Most existing face verification networks are based on the siamese network, which is designed to handle inputs with similar distributions. In ID-Spot verification, although both inputs are face images, the ID photo and the spot photo differ considerably in data distribution and show obvious heterogeneous characteristics, so a pseudo-siamese network is better suited to the problem.

We use a pseudo-siamese network to handle the heterogeneity of ID photos and spot photos. Two networks with the same structure process the ID photo and the spot photo separately, and the two networks do not share parameters. In addition, a comparative experiment shows that sharing the embedding layer performs worse than keeping the two embedding layers independent. The embedding layer, shown in Fig. 2, has the structure BN-FC-BN. Therefore, we use two networks with completely independent parameters. Figure 3 shows the pipeline.
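The following Gluon sketch illustrates the pseudo-siamese arrangement; the stem is a small placeholder rather than the actual MobileFaceNet implementation, and the layer sizes are assumptions for illustration. The key point is that the two branches share an architecture but not parameters, and each ends with its own BN-FC-BN embedding layer.

```python
from mxnet import nd
from mxnet.gluon import nn

def make_branch(embedding_size=512):
    """One branch: a stand-in CNN stem followed by a BN-FC-BN embedding layer.
    The stem here is a placeholder; the paper uses the MobileFaceNet stem."""
    net = nn.HybridSequential()
    net.add(
        nn.Conv2D(64, kernel_size=3, strides=2, padding=1, use_bias=False),
        nn.BatchNorm(),
        nn.Activation('relu'),
        nn.GlobalAvgPool2D(),          # placeholder for the GWP / GSC layer
        # Embedding layer: BN-FC-BN, as in Fig. 2.
        nn.BatchNorm(),
        nn.Dense(embedding_size, use_bias=False),
        nn.BatchNorm(),
    )
    return net

# Pseudo-siamese: same architecture, independent parameters.
id_branch = make_branch()
spot_branch = make_branch()
id_branch.initialize()
spot_branch.initialize()

id_emb = id_branch(nd.random.uniform(shape=(1, 3, 112, 112)))
spot_emb = spot_branch(nd.random.uniform(shape=(1, 3, 112, 112)))
```

Because the parameters are created independently for each branch, gradients from ID photos only update the ID branch and gradients from spot photos only update the spot branch.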

Fig. 3.

The pipeline for extracting facial features with the pseudo-siamese network.

4 Experiments

Our code is based on the MXNet framework, and all experiments run on two NVIDIA 1080Ti GPUs (12G). The specific experimental settings are described in detail below.

4.1 Dataset

MS1M.

MS1M [4] is currently the largest open-source face dataset, containing approximately 100K identities and 10 million images. However, the original MS1M contains a lot of noise; ArcFace [8] cleaned it to obtain a refined dataset with approximately 85K identities and 5.8 million images. We use this refined dataset [27] for training.

LFW, AgeDB, CFP-FP.

LFW [28] is a well-known test dataset in the field of face recognition, containing 13,233 images of 5,749 identities collected online. AgeDB [29] is a test set focusing on age variation. CFP-FP [30] emphasizes profile (side-view) faces. We obtain 6,000, 6,000, and 7,000 image pairs from LFW, AgeDB, and CFP-FP, respectively, and use these three datasets as validation sets to select the optimal model.

IDSpot.

The IDSpot dataset is a private dataset that includes 19,500 pairs of matched images and 100K pairs of unmatched images. Each pair consists of an ID photo and a spot photo; matched pairs come from the same identity and unmatched pairs from different identities. The ID photos are uniform in size, degree of blur, and image type, while the spot photos vary greatly in style.

We resize the ID photos to 112×112 with image processing operations such as cropping and resizing. For the spot photos, we use MTCNN [31] to detect the faces, align them according to the 5 facial landmarks, and finally resize them to 112×112.
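As a rough illustration of the spot-photo preprocessing (assuming MTCNN has already returned the 5 landmarks; the reference template coordinates below are a commonly used 112×112 layout, not values reported in the paper), alignment could be done as follows.

```python
import cv2
import numpy as np
from skimage.transform import SimilarityTransform

# Commonly used 5-point reference template for 112x112 crops (assumed, not from the paper):
# left eye, right eye, nose tip, left mouth corner, right mouth corner.
REF_5PTS = np.array([
    [38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
    [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)

def align_face(image, landmarks_5):
    """Warp the detected face to 112x112 using a similarity transform
    estimated from the 5 MTCNN landmarks."""
    tform = SimilarityTransform()
    tform.estimate(np.asarray(landmarks_5, dtype=np.float32), REF_5PTS)
    M = tform.params[0:2, :]                     # 2x3 affine matrix
    return cv2.warpAffine(image, M, (112, 112), borderValue=0.0)
```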

Because it is difficult to collect enough labeled occlusion images, we build a synthetic occlusion dataset, IDSpot-paste, by image processing. We choose 100 sunglasses templates and 10 mask templates as occluders. For each face, we first randomly select one of three cases: no processing, sunglasses, or mask. In the latter two cases, we randomly select a template of the corresponding type to generate an occluded face and paste it onto the relevant face position; since the faces are aligned, the positions of the eyes and mouth corners are known. Figure 4 illustrates some examples of occluded faces generated by this method.
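A simplified sketch of the occlusion synthesis is given below; the template file names, the loader function, and the paste regions are illustrative assumptions, while the paper pastes templates at the known eye and mouth-corner positions of aligned faces.

```python
import random

SUNGLASSES = [f"sunglasses_{i}.png" for i in range(100)]   # 100 templates (paths assumed)
MASKS = [f"mask_{i}.png" for i in range(10)]                # 10 templates (paths assumed)

def occlude(face, load_template, eye_box, mouth_box):
    """Randomly keep the face unchanged, paste sunglasses over the eye region,
    or paste a mask over the mouth region. Boxes are (y0, y1, x0, x1) in the
    aligned 112x112 face, known from the alignment landmarks."""
    choice = random.choice(["none", "sunglasses", "mask"])
    if choice == "none":
        return face
    template_path = random.choice(SUNGLASSES if choice == "sunglasses" else MASKS)
    y0, y1, x0, x1 = eye_box if choice == "sunglasses" else mouth_box
    patch = load_template(template_path, size=(y1 - y0, x1 - x0))  # user-provided loader
    out = face.copy()
    out[y0:y1, x0:x1] = patch          # naive paste; alpha blending could be used instead
    return out
```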

Fig. 4.

Some spot photos and the corresponding generated occlusion photos. Note that the black rectangular blocks are added for privacy protection and are not part of the information the images carry.

We select 15,600 matched pairs for training and use the remaining 3,900 matched pairs together with the 100K unmatched pairs for verification, repeating this process 5 times for 5-fold cross-validation. Note that the identities appearing in the matched pairs do not appear in the unmatched pairs, which ensures that the same identity never appears in both the training and verification sets.

4.2 Occlusion Impact Analysis

Random Matching Experiment.

We randomly select 10,000 images from the spot photos and divide them into two groups, named A and B. Following the occlusion generation method, A and B are processed to obtain A_paste and B_paste, respectively. We use the pretrained MobileFaceNet model to extract features for all four groups. We then compare three groups of similarities: first, the 5,000 cosine similarities between A and B, for which we compute the mean and variance; second, the cosine similarities between A_paste and B_paste, again with mean and variance; and finally the cosine similarities between A and B_paste and between B and A_paste, giving 10,000 similarities, for which we also compute the mean and variance. Table 1 shows the experimental results.
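The comparison reduces to cosine similarities between feature groups. A minimal sketch (feature extraction is omitted and the arrays below are placeholders) could look like this.

```python
import numpy as np

def cosine_sim(x, y):
    """Row-wise cosine similarity between two feature matrices of equal shape."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return (x * y).sum(axis=1)

# feats_A, feats_B, feats_A_paste, feats_B_paste: (5000, 512) arrays extracted
# with the pretrained MobileFaceNet model (random placeholders here).
feats_A, feats_B = np.random.randn(5000, 512), np.random.randn(5000, 512)
feats_A_paste, feats_B_paste = np.random.randn(5000, 512), np.random.randn(5000, 512)

groups = {
    "spot-spot":   cosine_sim(feats_A, feats_B),                              # 5000 scores
    "paste-paste": cosine_sim(feats_A_paste, feats_B_paste),                  # 5000 scores
    "spot-paste":  np.concatenate([cosine_sim(feats_A, feats_B_paste),
                                   cosine_sim(feats_B, feats_A_paste)]),      # 10000 scores
}
for name, s in groups.items():
    print(f"{name}: mean={s.mean():.4f}, var={s.var():.4f}")
```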

Table 1. Random matching experiment results

Face Verification Experiment.

In this experiment, the pretrained MobileFaceNet model is used to perform 5-fold cross-validation on the IDSpot and IDSpot-paste datasets. Note that only the verification process is performed here; there is no training. We use TAR@FAR as the metric to explore the effect of occlusion. Table 2 shows the experimental results.

Table 2. Face verification experiment results. TAR@FAR is used to measure performance. TAR: true accept rate; FAR: false accept rate

The above two experiments explore the effect of partial occlusion. From Tables 1 and 2, we can see that occlusion has a large impact on face verification. The experimental results show that face verification with partial occlusion is a much harder task than conventional face verification.

4.3 Metrics

We use two evaluation metrics in this paper. We evaluate the models trained on MS1M on the LFW, AgeDB, and CFP-FP datasets using accuracy as the metric, and select the optimal model as the pretrained model according to accuracy. Accuracy is the ratio of correctly predicted samples to the total number of samples, reflecting the overall predictive ability of the model.

In addition, we use TAR@FAR as an evaluation metric: each false accept rate (FAR) corresponds to a true accept rate (TAR). This metric reflects the prediction ability of the model at a given operating point and can emphasize a particular aspect of performance, such as behavior at very low false accept rates.
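For clarity, here is a small sketch of how TAR at a fixed FAR can be computed from genuine (matched) and impostor (unmatched) similarity scores; this is our own illustration rather than the paper's evaluation code.

```python
import numpy as np

def tar_at_far(genuine_scores, impostor_scores, far=1e-4):
    """TAR at the threshold whose false accept rate on impostor pairs equals `far`."""
    impostor_sorted = np.sort(impostor_scores)
    # Threshold such that a fraction `far` of impostor scores exceed it.
    threshold = impostor_sorted[int(np.ceil((1.0 - far) * len(impostor_sorted))) - 1]
    return (np.asarray(genuine_scores) > threshold).mean(), threshold

genuine = np.random.normal(0.6, 0.1, 3900)      # placeholder matched-pair scores
impostor = np.random.normal(0.1, 0.1, 100000)   # placeholder unmatched-pair scores
tar, thr = tar_at_far(genuine, impostor, far=1e-4)   # FAR = 0.01%
print(f"TAR@FAR=0.01%: {tar:.4f} (threshold {thr:.4f})")
```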

4.4 Training on MS1M [4]

Following ArcFace [8], we train the MobileFaceNet [23] network on the refined MS1M dataset [27]. We take ArcFace with a margin of 0.5 as the loss function. We set the batch size to 128; the learning rate is divided by 10 at 50K, 80K, and 100K iterations, and the total number of iterations is 140K. We set the momentum to 0.9 and the weight decay to 5e-4.
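As a reminder of the loss used for pretraining, the sketch below computes ArcFace-style additive angular margin logits with m = 0.5 in NumPy; the scale s = 64 is a commonly used value and not a number reported in this paper.

```python
import numpy as np

def arcface_logits(features, weights, labels, s=64.0, m=0.5):
    """Additive angular margin logits: s * cos(theta + m) for the target class,
    s * cos(theta) for the others. features: (N, d), weights: (C, d),
    labels: (N,) integer class indices."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(f @ w.T, -1.0, 1.0)                  # (N, C) cosine similarities
    theta = np.arccos(cos)
    target_cos = np.cos(theta + m)                     # margin added to target angles
    logits = cos.copy()
    rows = np.arange(len(labels))
    logits[rows, labels] = target_cos[rows, labels]
    return s * logits  # feed into standard softmax cross-entropy
```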

Finally, we obtain the pretrained model, which achieves accuracies of 99.47%, 95.67%, and 93.45% on LFW [28], AgeDB [29], and CFP-FP [30], respectively.

4.5 Transfer Learning

In this part of the experiments, we adopt the triplet loss. We use hard sample mining to construct triplets: each time, the hardest sample in a batch is chosen as the negative. We set up a 5-fold cross-validation experiment. Each experiment runs for 50 epochs; the learning rate starts at 0.1 and drops to 0.01 and 0.001 at the beginning of the 11th and 26th epochs, respectively. We set the batch size to 32.
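A rough sketch of the batch-hard negative selection described above, under our own simplifications: embeddings are L2-normalized, distances are cosine distances, the anchor-positive assignment is assumed to be given, and the margin value is illustrative.

```python
import numpy as np

def hardest_negatives(anchor_emb, candidate_emb, anchor_ids, candidate_ids):
    """For each anchor, pick the candidate with a *different* identity that is
    closest to the anchor (i.e. the hardest negative in the batch)."""
    a = anchor_emb / np.linalg.norm(anchor_emb, axis=1, keepdims=True)
    c = candidate_emb / np.linalg.norm(candidate_emb, axis=1, keepdims=True)
    dist = 1.0 - a @ c.T                               # cosine distance matrix (N, M)
    same_id = np.asarray(anchor_ids)[:, None] == np.asarray(candidate_ids)[None, :]
    dist[same_id] = np.inf                             # never pick a same-identity sample
    return dist.argmin(axis=1)                         # index of hardest negative per anchor

def triplet_loss(a, p, n, margin=0.35):
    """Standard triplet loss with a margin."""
    d_ap = np.linalg.norm(a - p, axis=1)
    d_an = np.linalg.norm(a - n, axis=1)
    return np.maximum(d_ap - d_an + margin, 0.0).mean()
```

Selecting the closest different-identity sample makes each triplet as hard as possible within the batch, which is the strategy referred to above.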

For GWP, we evaluate the three ways of processing the weights described in Sect. 3.2: softmax, ReLU followed by softmax, and ReLU followed by rescaling (Table 3).

Table 3. Results of the GAP, GSC, and GWP experiments. TAR@FAR is used to measure performance. TAR: true accept rate; FAR: false accept rate. A, B, C denote Option-A, Option-B, and Option-C. S denotes the pseudo-siamese network. GAP-S(EMB) denotes the pseudo-siamese network sharing embedding layer parameters.

From the experimental results, we can see that the pseudo-siamese network effectively improves the representation of heterogeneous faces, which also confirms that the data distributions of ID photos and spot photos are very different. Sharing the embedding layer leads to a significant drop in performance, indicating that data with different distributions should undergo different linear transformations. Both GSC and GWP achieve performance improvements over GAP. At FAR = 0.01%, GWP performs best, indicating that GWP works better under low false accept rates.

5 Conclusion

In this paper, we explore face verification with partial occlusion and quantitatively analyze the impact of occlusion on face verification. We improve the face representation by adjusting the GAP part of the CNN. In addition, a pseudo-siamese network is used to explore and handle the heterogeneity of ID photos and spot photos.