Abstract
Although facial recognition is an active research area, it still faces significant challenges from variations in aging, pose, occlusion, resolution, and appearance. In this paper, we propose a Multi-feature Deep Learning Network (MDLN) architecture that combines modalities from the facial and periocular regions with texture descriptors to improve recognition performance. Specifically, MDLN is designed as a feature-level fusion approach that correlates the multimodal biometric data with the texture descriptors to create a new feature representation. The proposed MDLN model therefore draws on more information through this fused representation, overcoming limitations that persist in existing unimodal deep learning approaches. We evaluated the proposed model on several public datasets, and our experiments show that MDLN improves biometric recognition performance under challenging conditions, including variations in illumination, appearance, and pose misalignment.
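As a rough illustration of what feature-level fusion means in general, the sketch below normalizes several feature vectors and concatenates them into one representation. It is not the MDLN architecture: the function names (`l2_normalize`, `fuse_features`) and the toy vectors are hypothetical, and plain concatenation is assumed here only as the simplest stand-in for a learned fusion layer.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit Euclidean length (zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def fuse_features(*feature_vectors: Sequence[float]) -> List[float]:
    """Feature-level fusion by concatenating L2-normalized vectors.

    Normalizing first keeps one stream from dominating the fused
    representation just because its raw values are larger.
    """
    fused: List[float] = []
    for vec in feature_vectors:
        fused.extend(l2_normalize(vec))
    return fused


# Hypothetical toy values standing in for per-stream features
# (face, periocular, and texture descriptor); not data from the paper.
face_features = [3.0, 4.0]
periocular_features = [0.0, 2.0, 0.0]
texture_features = [1.0, 1.0, 1.0, 1.0]

fused = fuse_features(face_features, periocular_features, texture_features)
print(len(fused))
```

In a deep network the concatenated vector would typically feed further trainable layers, so that correlations between the streams are learned rather than fixed as they are in this sketch.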
Cite this article
Tiong, L.C.O., Kim, S.T. & Ro, Y.M. Implementation of multimodal biometric recognition via multi-feature deep learning networks and feature fusion. Multimed Tools Appl 78, 22743–22772 (2019). https://doi.org/10.1007/s11042-019-7618-0
DOI: https://doi.org/10.1007/s11042-019-7618-0