Abstract
Identifying soft biometric traits such as gender, age group, handedness, hand(s) used, typing skill and emotional state from typing patterns, and including these traits as additional features in user recognition, is a recent research direction for improving the performance of keystroke dynamics. Combining knowledge-based user authentication with keystroke dynamics as a biometric characteristic addresses user authentication/identification issues in cloud computing based applications. Our approach is one way to improve the performance of keystroke dynamics in user recognition: it uses soft biometric traits, which provide additional information about the user and can be extracted from the typing pattern on a computer keyboard or touch screen phone. These soft biometric traits have low discriminating power but can be used to enhance user recognition in both accuracy and time efficiency. In this paper, we apply this technique to a static keystroke dynamics user authentication system. We observe that age group (18−30/30+ or <18/18+), gender (male/female), handedness (left-handed/right-handed), hand(s) used (one hand/both hands), typing skill (touch/others) and emotional state (anger/excitation) can be extracted from the way a single predefined text is typed on a computer keyboard. Adding this soft biometric information as extra features decreases the Equal Error Rate (EER). We used two leading machine learning approaches, Support Vector Machine with Radial Basis Function kernel (SVM-RBF) and Fuzzy-Rough Nearest Neighbour with Vaguely Quantified Rough Set (FRNN-VQRS), on multiple publicly available, authentic and recognized keystroke dynamics datasets. Our results on the CMU keystroke dynamics dataset indicate the impact of soft biometric traits.
Keywords
- Keystroke dynamics
- Soft biometric
- Machine learning
- Fuzzy Rough NN (FRNN)
- Vaguely Quantified Rough Set (VQRS)
- LibSVM
- Anomaly detector
1 Introduction
Cloud computing based technologies such as e-mail, social networking, storage, application, web hosting and TV services are part of our daily lives. At present, knowledge-based user authentication/verification methods are applied in cloud based technology. A more secure solution is needed because of the risks associated with this technique, such as shoulder surfing, dictionary and brute-force attacks. Keystroke dynamics combined with knowledge-based authentication could be used in practice as a security solution, since keystroke dynamics identifies people by the way they type. It has been established that the habitual typing pattern is a behavioral biometric trait that can address issues in user identification/authentication. Being nonintrusive and cost effective, keystroke dynamics is a strong alternative to other biometric modalities, and it can be easily integrated into any existing knowledge-based user authentication system with minor alteration. The accuracies obtained in previous studies are impressive but not acceptable in practice because of the high Failure to Enroll Rate (FER) and intra-class variation. Behavioral biometrics are less accurate than morphological biometric modalities, and acceptable accuracy is very hard to achieve. The European standard for access control systems mandates a false alarm rate of less than 1% and a miss rate of no more than 0.001% [1]. Among behavioral biometric characteristics, keystroke dynamics suffers from high intra-class variation (aging, mental state while typing, hand injury, tiredness, …) and from differences in data acquisition (cross-device validation, timing resolution of the system, feature selection, keyboard position, hand(s) used, …), both of which increase the error rate of keystroke dynamics user authentication.
Recent keystroke dynamics studies found that personal traits such as age, gender, dominant hand, hand(s) used, emotional state, and typing skill can be inferred from the typing pattern on a computer keyboard [6]. A few studies found that only age and gender can be inferred from behavior on a touch screen. The rationale is that a user's physical structure, hand weight, fingertip size, and neurophysiological and neuropsychological factors are reflected in interaction with the keyboard and make typing patterns distinguishable.
These personal traits affect typing characteristics and consequently affect classification performance. For instance, the touch area of a child user might be smaller than that of an adult user. Similarly, the fingers of female users might be longer than those of male users. A right-handed user might type digraphs consisting of keys from the right side of the keyboard more quickly than digraphs consisting of keys from the left side. For a digraph with one key on the left side and the other on the right side of the keyboard, the typing pattern might differ between users typing with one hand and users typing with both hands. If the user is distracted or frustrated by unexpected behavior of the computer, the typing pattern changes massively. These soft biometric features thus affect typing characteristics and, in turn, classification performance. The type and length of the text, the clock resolution of the system, the number and type of running programs, and the keyboard type, size and layout also affect performance. Therefore, although experimental results are impressive in lab-based environments, keystroke dynamics is not 100% accurate in practice, for example in web-based applications. To improve performance, including personal traits such as age, gender, handedness, hand(s) used and typing skill is a new direction of keystroke dynamics research. Predicting such traits not only improves keystroke dynamics user recognition; it also has separate applications ranging from social network sites to e-business. The accuracy with which the personal traits are identified is critical: if the traits are predicted poorly, classification performance might decrease instead.
Some studies have sought to improve the prediction of personal traits so that the technique can be used in real-life, web-based applications. Some studies went a step further and identified user traits from the typing pattern on a computer keyboard, but did not provide sufficient evidence that the approach works on a touchscreen smartphone, the most common and popular electronic gadget. Social networking is becoming more popular as a way to keep in touch with a large and diverse body of people and groups. The number of social network users under the age of 18 is increasing rapidly. They can easily reach content that they are not supposed to access or that is not suitable for them. They share personal information with strangers, and most of them have a large number of strangers in their friend lists. Social networking administrators delete thousands of profiles of people who do not meet the age requirement or who behave unnaturally on the site. However, no method has yet been applied to identify age group and gender automatically from the user's typing pattern on a computer keyboard or touch screen smartphone, instead of accepting self-reported age and gender on trust. Beyond age, the technique can also be used to detect fraudulent claims of handedness and typing skill.
Keystroke dynamics research began in 1980. Over these three decades, more than 500 papers have been published as journal articles, conference proceedings, and theses; still, the accuracy of the technique has not reached its goal. More research is needed before the technique can be used in practice. Ancillary information can significantly improve the recognition performance of biometric systems.
The studies in the literature summarized in Table 1 were conducted in lab-based environments. They use AZERTY and/or QWERTY keyboards as the sensing device. The text patterns vary across studies: some texts are short and others long; some are simple common words and others are logically complex. Some studies evaluated performance with 5-fold cross-validation, whereas others used a training-testing split. The number of samples differs across studies, and some of them maintained sessions during data acquisition. It is clear from Table 1 that the studies were conducted on different datasets with the aim of extracting personal traits from typing patterns, rather than of improving the performance of trait identification.
Machine learning is the common classification approach in all the studies listed in Table 1. Selecting an appropriate method is an important issue in the keystroke dynamics domain, where accuracy can jump from 65% to 90% depending on the method. In our study, we applied FRNN with VQRS. Its performance is consistent and significantly better than that of SVM, the leading machine learning method used previously. In this study, we compare our approach with SVM with an RBF kernel.
The main objective of this study is to develop a model that identifies the gender, age group, handedness, and typing skill of users from their typing pattern on a keyboard or touch screen for a predefined text, and to improve accuracy by using this soft biometric information as extra features in keystroke dynamics user authentication.
The objectives and contributions of this paper are listed below:
- We provide an efficient approach to recognizing ancillary information from typing patterns. Its performance is comparable with other approaches in the literature.
- We evaluate the performance of leading machine learning approaches in determining soft biometric information.
- We evaluate and compare the performance of 9 leading anomaly detectors with and without soft biometric information.
We used the authentic and shared CMU keystroke dynamics dataset [8] along with a dataset collected through an Android handheld device [9]. The details of the datasets are described in Table 1.
2 Static and Shared Keystroke Dynamics Datasets
Many keystroke dynamics datasets have been created in the last 30 years, but only some of them are available on the Internet or can be downloaded on request. Details of the publicly available datasets are summarized in Table 2.
Soft biometric information is not included in all the datasets. The datasets created by Killourhy et al. [2], Idrus et al. [7], Uzun et al. [5] and El-Abed et al. [11] provide soft biometric information alongside the keystroke dynamics data and are therefore the most suitable for our experiment. In this paper, we name each dataset according to its text type and data acquisition method so that it can be identified easily throughout. Details are in Table 3.
M, F, L, R, T, and O denote male, female, left-handed, right-handed, touch typist, and other typing styles, respectively.
3 Research Methodology
The proposed methodology is described in the following subsections. The first objective is to identify personal traits from typing patterns on different datasets collected in different environments, and to improve keystroke dynamics recognition performance by including these personal traits as additional features. To solve this problem, we followed the steps below.
We used the following equations to extract the features from the selected datasets. Because some of the features are not present in every dataset, we recalculated all 8 features with these equations:
The timing features of the keystroke dynamics are as follows:
Here P and R represent the press and release times of the i-th key event. We tried different combinations of features to find the best feature subset; we did not apply any filter or wrapper approach to select the features. We normalized all the datasets to the range [−1, +1] in order to speed up processing. We used two leading machine learning approaches: SVM and FRNN. The fuzzy-rough nearest neighbor (FRNN) classification algorithm [17] is an alternative to Sarkar's fuzzy-rough ownership function (FRNN-O) approach [18].
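As an illustration of how timing features can be derived from P and R, the following Python sketch computes four commonly used keystroke features (hold time and the down-down, up-down and up-up latencies) and rescales a feature column to [−1, +1]. The choice of these four features, the function names and the example timestamps are assumptions made for illustration; they are not necessarily the 8 features used in this study.

```python
def timing_features(press, release):
    """Derive common timing features from press (P) and release (R) timestamps.

    press[i] and release[i] are the press and release times of the i-th key,
    in any consistent unit (milliseconds in the example below).
    """
    n = len(press)
    return {
        # Hold (dwell) time of each key: R_i - P_i
        "hold": [release[i] - press[i] for i in range(n)],
        # Down-down latency between consecutive keys: P_{i+1} - P_i
        "down_down": [press[i + 1] - press[i] for i in range(n - 1)],
        # Up-down (flight) latency: P_{i+1} - R_i, negative when keys overlap
        "up_down": [press[i + 1] - release[i] for i in range(n - 1)],
        # Up-up latency: R_{i+1} - R_i
        "up_up": [release[i + 1] - release[i] for i in range(n - 1)],
    }


def scale_to_range(column):
    """Min-max scale one feature column to [-1, +1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to the midpoint.
        return [0.0 for _ in column]
    return [2 * (v - lo) / (hi - lo) - 1 for v in column]


# Hypothetical three-key sample with timestamps in milliseconds.
features = timing_features(press=[0, 150, 320], release=[90, 260, 400])
scaled_hold = scale_to_range(features["hold"])
```

In practice the minimum and maximum used for scaling would be taken from the training data only and then reused for test samples, so that the evaluation does not leak information from the test set.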
Several anomaly detection algorithms were applied to the keystroke dynamics patterns with personal traits included manually. The results show that including personal traits improves the performance of the keystroke dynamics user recognition system.
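A minimal sketch of this idea, under stated assumptions, is given below in Python: encoded soft traits are appended to the timing vector, and a probe is scored by its Manhattan distance to a per-user template. The numeric trait codes, the two traits chosen, the median template and all function names are illustrative assumptions, not the exact configuration evaluated in this study.

```python
from statistics import median

# Hypothetical numeric codes for two soft traits, on the same [-1, +1] scale
# as the normalized timing features.
TRAIT_CODES = {
    "gender": {"M": -1.0, "F": 1.0},
    "handedness": {"L": -1.0, "R": 1.0},
}


def augment(timing_vector, traits):
    """Append encoded soft traits to a timing feature vector."""
    extra = [TRAIT_CODES[name][traits[name]] for name in sorted(TRAIT_CODES)]
    return list(timing_vector) + extra


def median_template(samples):
    """Per-feature median over a user's enrolment samples."""
    return [median(column) for column in zip(*samples)]


def manhattan_score(template, probe):
    """Anomaly score: larger values mean the probe is less like the template."""
    return sum(abs(t - p) for t, p in zip(template, probe))


# Enrolment samples of one hypothetical user (male, right-handed).
owner = {"gender": "M", "handedness": "R"}
enrolment = [augment(v, owner) for v in ([0.10, 0.20], [0.12, 0.22], [0.11, 0.19])]
template = median_template(enrolment)

# Two probes with identical timing but different soft traits.
genuine = augment([0.11, 0.21], owner)
impostor = augment([0.11, 0.21], {"gender": "F", "handedness": "R"})
print(manhattan_score(template, genuine), manhattan_score(template, impostor))
```

With identical timing values, the probe whose soft traits disagree with the template receives a larger anomaly score, which is the mechanism by which additional trait features can help separate users whose timing patterns are similar.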
4 Experimental Results
Two leading machine learning algorithms were applied to each dataset to predict the soft biometric information, and the accuracies obtained with 10-fold cross-validation are listed in Tables 4, 5, 6, 7, 8 and 9. According to these results, FRNN with VQRS is the more suitable learning method in both desktop and Android environments. Accuracies were recorded with Weka 3.7.4 using default parameter values.
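The evaluation protocol can be sketched in Python as follows. The classifier is a simplified reading of fuzzy-rough nearest neighbour classification with vaguely quantified rough sets in the general form described in [17]; it is not the Weka implementation used for the reported results. The quantifier parameters (0.1, 0.6) and (0.2, 1.0), the number of neighbours k, the interleaved fold assignment, the toy data and all function names are assumptions for illustration.

```python
def similarity(x, y, ranges):
    """Fuzzy similarity: minimum over attributes of 1 - |x_a - y_a| / range_a, clipped at 0."""
    return min(
        max(0.0, 1.0 - abs(a - b) / r) if r > 0 else 1.0
        for a, b, r in zip(x, y, ranges)
    )


def quantifier(alpha, beta):
    """Smooth fuzzy quantifier rising from 0 at alpha to 1 at beta."""
    def q(p):
        if p <= alpha:
            return 0.0
        if p >= beta:
            return 1.0
        if p <= (alpha + beta) / 2:
            return 2 * (p - alpha) ** 2 / (beta - alpha) ** 2
        return 1 - 2 * (p - beta) ** 2 / (beta - alpha) ** 2
    return q


Q_SOME = quantifier(0.1, 0.6)  # assumed parameters, used for the upper approximation
Q_MOST = quantifier(0.2, 1.0)  # assumed parameters, used for the lower approximation


def frnn_vqrs_predict(train_X, train_y, query, k=10):
    """Predict the class whose averaged lower/upper approximation is highest."""
    ranges = [max(col) - min(col) for col in zip(*train_X)]
    # Keep the k training objects most similar to the query.
    neighbours = sorted(
        ((similarity(x, query, ranges), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(s for s, _ in neighbours)
    best_label, best_score = None, -1.0
    for label in sorted(set(train_y)):
        # Share of the neighbourhood similarity mass that belongs to this class.
        share = sum(s for s, l in neighbours if l == label) / total if total > 0 else 0.0
        score = (Q_MOST(share) + Q_SOME(share)) / 2
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracy(X, y, folds=10, k=10):
    """Accuracy under k-fold cross-validation with interleaved, unshuffled folds."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [c for i, c in enumerate(y) if i not in test_idx]
        for i in test_idx:
            correct += frnn_vqrs_predict(train_X, train_y, X[i], k) == y[i]
    return correct / len(X)


# Toy data: two well-separated groups standing in for a binary trait such as gender.
X = [[0.10 + 0.01 * i, 0.20 + 0.01 * i] for i in range(10)] + \
    [[0.80 + 0.01 * i, 0.90 + 0.01 * i] for i in range(10)]
y = ["M"] * 10 + ["F"] * 10
print(k_fold_accuracy(X, y, folds=10, k=5))
```

A fuller evaluation would shuffle and stratify the folds so that each fold preserves the class proportions, which matters for unbalanced traits such as handedness.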
Tables 10, 11 and 12 present the performance of the 9 anomaly detectors described in [8] after soft biometric information is taken into account. We observed that using multiple soft biometric traits, rather than gender alone, decreases the EER significantly. Here, we take the median of the samples; in the keystroke dynamics domain, proximity to the median is a better measure than proximity to the mean.
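For reference, the EER can be estimated from detector scores as in the following Python sketch. It assumes distance-like anomaly scores (lower means more similar to the genuine user), sweeps the observed scores as candidate thresholds, and reports the point where the false acceptance and false rejection rates are closest. The function names and example scores are hypothetical; evaluation toolkits typically interpolate between thresholds rather than picking the nearest one.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores at or below the threshold are accepted."""
    frr = sum(s > threshold for s in genuine) / len(genuine)      # genuine users rejected
    far = sum(s <= threshold for s in impostor) / len(impostor)   # impostors accepted
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER as the mean of FAR and FRR where they are closest."""
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, threshold)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Hypothetical anomaly scores for genuine and impostor attempts.
genuine_scores = [1, 2, 3, 4]
impostor_scores = [3, 5, 6, 7]
print(equal_error_rate(genuine_scores, impostor_scores))  # 0.25
```

In this example the threshold 3 accepts one of four impostors and rejects one of four genuine attempts, so both error rates equal 0.25.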
5 Conclusion
Our experiments show, with impressive results, that it is possible to predict the gender, age group, handedness, hand(s) used, and typing skill of a user from the way of typing. Since typing is a common, measurable activity, keystroke dynamics can be used for gender and age group prediction in web-based applications. Activity on the keyboard and touch screen is a behavioral biometric characteristic; predicting gender and age group from it could help deal with the problem of fake accounts and give social networking sites a fake-free, genuine and more loyal user base. Automatically identified traits can also be included as additional soft biometric information to reduce the error rate of keystroke dynamics user recognition systems. This technique could also help e-commerce sites reach the right clients and, similarly, avoid showing unsuitable products, based on gender and age group. It can also be very useful in web-based environments for automatic profiling of users. The results also show that the under-18 age group can be identified from typing patterns, which can be used to protect children from Internet threats.
We used two leading machine learning methods to predict personal traits on multiple publicly available, authentic datasets. Our proposed approach, FRNN-VQRS, a new approach to FRNN, achieved results significantly better than the previously used SVM with RBF kernel in determining personal traits in desktop and Android environments. This is a very positive outcome for keystroke dynamics systems with a single predefined text: used as additional soft biometric features in user identification/authentication, the predicted traits decrease the EER from 10.69 to 2.53 on the CMU keystroke dynamics dataset. This is a modest yet efficient approach to keystroke dynamics user authentication which could be used in cloud computing based technologies.
References
1. CENELEC: Alarm systems - Access control systems for use in security applications – Part 1: System requirements, EN 50133-1 edition (1996)
2. Killourhy, K.: A Scientific Understanding of Keystroke Dynamics. Doctoral Thesis, Computer Science Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213 (2012)
3. Epp, C., Lippold, M., Mandryk, R.L.: Identifying emotional states using keystroke dynamics. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 715–724 (2011)
4. Frank, E., Hall, M.A., Witten, I.H.: The Weka Workbench. Data Mining: Practical Machine Learning Tools and Techniques, 4th ed. (1999)
5. Giot, R., Rosenberger, C.: A new soft biometric approach for keystroke dynamics based on gender recognition. Int. J. Inf. Technol. Manag. Spec. Issue Adv. Trends Biometrics 11, 1–16 (2012)
6. Idrus, S.Z.S., Cherrier, E., Rosenberger, C., Bours, P.: Soft biometrics for keystroke dynamics. In: Kamel, M., Campilho, A. (eds.) ICIAR 2013. LNCS, vol. 7950, pp. 11–18. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39094-4_2
7. Uzun, Y., Bicakci, K., Uzunay, Y.: Could we distinguish child users from adults using keystroke dynamics? (2014)
8. Killourhy, K.S., Maxion, R.A.: Comparing anomaly-detection algorithms for keystroke dynamics. In: Proceedings of the International Conference on Dependable Systems and Networks, pp. 125–134 (2009)
9. El-Abed, M., Dafer, M., El Khayat, R.: RHU keystroke: a mobile-based benchmark for keystroke dynamics systems. In: 2014 International Carnahan Conference on Security Technology, pp. 1–4 (2014)
10. Antal, M., Szabo, L.Z.: An evaluation of one-class and two-class classification algorithms for keystroke dynamics authentication on mobile devices. In: Proceedings of the 2015 20th International Conference on Control Systems and Computer Science, CSCS 2015, pp. 343–350 (2015)
11. Loy, C.C., Lim, C.P., Lai, W.K.: Pressure-based typing biometrics user authentication using the fuzzy ARTMAP neural network. In: International Conference on Neural Information Processing (ICONIP) (2005)
12. Roth, J., Liu, X., Ross, A., Metaxas, D.: Biometric authentication via keystroke sound. In: 2013 International Conference on Biometrics, pp. 1–8 (2013)
13. Montalvão, J., Freire, E.O., Bezerra, M.A., Garcia, R.: Contributions to empirical analysis of keystroke dynamics in passwords. Pattern Recogn. Lett. 52, 80–86 (2015)
14. Bello, L., Bertacchini, M.: Collection and publication of a fixed text keystroke dynamics dataset. In: CACIC 2010 - XVI Congreso Argentino de Ciencias de la Computación, pp. 822–831 (2010)
15. Idrus, S.Z.S., Cherrier, E., Rosenberger, C., Bours, P.: Soft biometrics database: a benchmark for keystroke dynamics biometric systems. In: 2013 International Conference of the Biometrics Special Interest Group (BIOSIG), pp. 1–8, September 2013
16. Giot, R., El-Abed, M., Rosenberger, C.: Web-based benchmark for keystroke dynamics biometric systems: a statistical analysis. In: Intelligent Information Hiding and Multimedia Signal Processing, pp. 11–15 (2012)
17. Jensen, R., Cornelis, C.: Fuzzy rough nearest neighbour classification and prediction. Theor. Comput. Sci. 412(42), 5871–5884 (2011)
18. Sarkar, M.: Fuzzy-rough nearest neighbor algorithms in classification. Fuzzy Sets Syst. 158(19), 2134–2152 (2007)
© 2017 Springer International Publishing AG
Cite this paper
Roy, S., Roy, U., Sinha, D.D. (2017). Efficacy of Typing Pattern Analysis in Identifying Soft Biometric Information and Its Impact in User Recognition. In: Battiato, S., Farinella, G., Leo, M., Gallo, G. (eds) New Trends in Image Analysis and Processing – ICIAP 2017. ICIAP 2017. Lecture Notes in Computer Science(), vol 10590. Springer, Cham. https://doi.org/10.1007/978-3-319-70742-6_30
DOI: https://doi.org/10.1007/978-3-319-70742-6_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70741-9
Online ISBN: 978-3-319-70742-6
eBook Packages: Computer Science, Computer Science (R0)