Face detection and recognition in an unconstrained environment for mobile visual assistive system
Introduction
Computer vision algorithms execute some of the most computationally intensive tasks in problems such as pattern recognition and motion analysis [1]. Even simple object detection algorithms [2], [3] require a significant amount of computational power because of the volume of data that must be processed in large-scale applications. Modern desktop computers can execute these applications in real time without major issues; the challenge lies with mobile devices, where computationally intensive tasks generate heat and rapidly drain the battery. Mobile devices can harness the full power of real-world computer vision applications when those applications are built with these limitations in mind [4].
Mobile object detection systems have a wide range of applications due to their portability [5], [6]. While detecting static objects is a relatively easy task, detecting moving objects is more challenging [1]. Examples of mobile object detection include assistive systems for disabled persons [6] and iris recognition systems [5]. Motion introduces major difficulties into computer vision applications, including blur, constant changes in scale and position, obstructions, and illumination changes [3]. Advanced detection methods such as neural networks are required to address these challenges and achieve satisfactory performance [7], [8]. The SmartVision prototype [9] is an example of a mobile-based assistive system that provides navigation for disabled persons. It used a combination of computer vision, a geographic information system and a global positioning system for object, obstacle and path detection. Moreover, Willis et al. presented a mobile-based assistive system that allowed users to navigate an environment using a radio-frequency identification (RFID) tag grid [10]. The RFID tags in this grid were programmed with coordinates and descriptions of the surroundings to provide navigation to users. Furthermore, a mobile iris recognition system has been presented that provided pupil and iris segmentation with a detection rate of 99% [11].
Neural networks consist of interconnected processors called neurons, which are loosely modelled on biological neurons [8]. Convolutional neural networks (CNNs) are specialised neural networks designed primarily for image recognition tasks [12]. These tasks include face detection, expression recognition, object detection and object recognition [13], [14], [15].
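As an illustration only, the core operation of a convolutional layer can be sketched in plain Python: a small kernel is slid over the image and a non-linearity is applied to the result. This is a minimal sketch, not the network used in this paper; the function names, the 4x4 patch and the hand-picked edge kernel are all assumptions made for the example (in a trained CNN the kernel weights are learned from data).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectified linear unit: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 grey-level patch with a vertical edge between columns 1 and 2.
patch = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked vertical-edge kernel (an assumption; real CNN kernels are learned).
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d_valid(patch, edge_kernel))
for row in feature_map:
    print(row)
```

The feature map responds only where the kernel straddles the edge, which is the sense in which a convolutional neuron reacts to a local pattern in its receptive field.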
CNNs are well suited to difficult recognition and detection problems [12] and can also be applied to large-scale video classification [16]. However, they have mostly been deployed in constrained, indoor vision applications that do not suffer from the motion blur and noise produced by a moving camera. Deploying them on mobile devices therefore remains a challenge. A cloud-based support system can address the problems of portability and computational power; however, a good internet connection would be required for real-time operation. Although mobile face detection and recognition has been gaining popularity [17], our review of the literature found little work on mobile face detection and recognition in unconstrained environments [18], [19]. Mobile face detection and recognition involves detecting and recognising stationary and moving subjects from a mobile source, which leads to input that contains motion blur and noise.
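To make the effect of a moving camera concrete, the following minimal sketch degrades a sharp edge with a horizontal blur and additive Gaussian noise. It is a simplified model assumed for illustration, not the acquisition process of the dataset in this paper; the function names, the blur length, the noise level `sigma` and the toy image are all hypothetical.

```python
import random


def horizontal_motion_blur(image, length):
    """Average each pixel with its `length - 1` right-hand neighbours.

    Indices past the right border are clamped to the last column.
    """
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for c in range(width):
            window = [row[min(c + k, width - 1)] for k in range(length)]
            new_row.append(sum(window) / length)
        blurred.append(new_row)
    return blurred


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise and clip the result to the range [0, 255]."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        [min(255.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
        for row in image
    ]


# Hypothetical two-row image containing a sharp dark-to-bright edge.
sharp = [
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
]

blurred = horizontal_motion_blur(sharp, length=3)
degraded = add_gaussian_noise(blurred, sigma=5.0)
print(blurred[0])
```

The one-pixel edge in `sharp` becomes a three-pixel ramp in `blurred`, and the noise then perturbs every pixel; both effects remove the crisp local structure that detectors trained on constrained imagery rely on.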
This paper presents a visual assistive system that features mobile face detection and recognition in an unconstrained environment from a mobile source using CNNs. The goal of the system is to effectively detect and recognise individuals who approach while facing the person equipped with the system. Due to the shortage of related datasets, we present a dataset of videos captured from a mobile camera in an unconstrained environment, featuring motion blur and noise. This makes the application a particularly challenging case of face detection and recognition in unconstrained environments. Detection and recognition performance is evaluated using CNNs and cascade classifiers under different lighting conditions, namely artificial light, daylight and moonlight.
The proposed approach contributes to a larger system designed to aid visually impaired persons through mobile face detection and recognition. We also provide a framework for implementing the system with smartphones and wearable devices, which supply video input and deliver auditory notifications from the system. This paper extends previous work that focused on face detection with CNNs [20] and on a mobile application framework [57].
The rest of the paper is organised as follows. We present the background and related work in Section 2 and the proposed mobile visual assistive system in Section 3. Section 4 describes the experimental design and presents the results. Section 5 gives a discussion and Section 6 concludes the paper with directions for future work.
Section snippets
Face detection and recognition
Face detection and recognition are the processes of verifying faces in a given environment via computer vision algorithms that usually involve machine learning [15], [19]. Face recognition is performed in a wide range of conditions based on facial features, emerging technologies and learning algorithms [18], [21]. Some of these methods use emerging technologies such as infrared cameras [22] and three-dimensional face recognition systems [23]. Some of the related methods for this paper
Mobile visual assistive system
The proposed face detection and recognition approach for unconstrained environments is part of a mobile visual assistive system, delivered through a mobile application designed to assist visually impaired persons. We first describe the components of the system architecture and their interaction, and then provide their implementation details. We note that the major component is the intelligent systems module which can be implemented using either CNNs or cascade classifiers, depending on their performance from simulation
Simulation and results
We present a simulation study of the proposed intelligent systems module, which features detection and recognition using CNNs. Cascade classifiers are used for further comparison of the results. We use the simulation study methodology and the video dataset, covering a wide range of conditions, described in the previous section.
Discussion
In the experiments, CNNs were compared to cascade classifiers. The performance of the cascade classifier recognition system makes it suitable for deployment on mobile devices since it performs computation in a short amount of time and therefore uses less energy. This makes cascade classifiers suitable for use in scenarios where internet connectivity is not available due to weak mobile network signals, weather conditions, etc. In this case, the mobile application can use cascade classifier-based
Conclusions and future work
This paper presented a visual assistive system that features mobile face detection and recognition in an unconstrained environment from a mobile source using CNNs. The system's intelligent systems module included a detection module for face detection and a recognition module for face recognition. The performance of the modules was evaluated using CNNs and cascade classifiers in different lighting conditions which included artificial light, daylight and moonlight. A dataset of videos captured from
References (57)
- et al. Contour based object detection using part bundles. Comput. Vis. Image Underst. (2010)
- et al. A field study of the accuracy and reliability of a biometric iris recognition system. Sci. Justice (2013)
- et al. Realtime local navigation for the blind: detection of lateral doors and sound interface. Proc. Comput. Sci. (2012)
- et al. Definition of artificial neural networks with comparison to other networks. Proc. Comput. Sci. (2011)
- Mobile iris recognition systems: an emerging biometric technology. Proc. Comput. Sci. (2010)
- et al. Subject independent facial expression recognition with robust face detection using a convolutional neural network. Neural Netw. (2003)
- et al. IR and visible light face recognition. Comput. Vis. Image Underst. (2005)
- et al. An efficient 3d face recognition approach using local geometrical signatures. Pattern Recognit. (2014)
- et al. Face recognition for web-scale datasets. Comput. Vis. Image Underst. (2014)
- et al. Multimodal person verification system using face and speech. Proc. Comput. Sci. (2010)
- Multi-resolution feature fusion for face recognition. Pattern Recognit.
- Moving vehicle detection for automatic traffic monitoring. IEEE Trans. Veh. Technol.
- Moving obstacle detection in highly dynamic scenes
- Dynamic thermal management in mobile devices considering the thermal coupling between battery and application processor
- Rapid object detection using a boosted cascade of simple features
- The smartvision navigation prototype for blind users. JDCTA Int. J. Dig. Content Technol. Appl.
- RFID information grid for blind navigation and wayfinding
- Convolutional networks and applications in vision
- Flexible, high performance convolutional neural networks for image classification
- Deep convolutional network cascade for facial point detection
- Large-scale video classification with convolutional neural networks
- Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Tech. rep., Technical Report 07-49
- A survey of face recognition techniques. J. Inf. Process. Syst.
- A survey of recent advances in face detection. Tech. Rep. MSR-TR-2010-66
- Unconstrained Face Detection from a Mobile Source Using Convolutional Neural Networks
- Face recognition with local binary patterns
- Principal component analysis. Wiley Interdiscip. Rev.: Comput. Stat.
- The use of multiple measurements in taxonomic problems. Ann. Eugen.