Efficacy of Typing Pattern Analysis in Identifying Soft Biometric Information and Its Impact in User Recognition

Roy, Soumen; Roy, Utpal; Sinha, D. D.

doi:10.1007/978-3-319-70742-6_30

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10590)

Included in the following conference series:

International Conference on Image Analysis and Processing

Abstract

Identifying soft biometric traits such as gender, age group, handedness, hand(s) used, typing skill and emotional state from typing patterns, and including these traits as additional features in user recognition, is a recent research area that aims to improve the performance of keystroke dynamics. Combining knowledge-based user authentication with keystroke dynamics as a biometric characteristic addresses user authentication/identification issues in cloud computing based applications. Our approach is one way to improve the performance of keystroke dynamics in user recognition: soft biometric traits provide additional information about the user that can be extracted from the typing pattern on a computer keyboard or touchscreen phone. These traits have low discriminating power on their own but can enhance user recognition in both accuracy and time efficiency. In this paper, we apply this technique to a static keystroke dynamics user authentication system. We observe that age group (18−30/30+ or <18/18+), gender (male/female), handedness (left-handed/right-handed), hand(s) used (one hand/both hands), typing skill (touch/others) and emotional state (anger/excitation) can be extracted from the way a user types a single predefined text on a computer keyboard. Using this soft biometric information as extra features decreases the Equal Error Rate (EER). We use two leading machine learning approaches, Support Vector Machine with Radial Basis Function kernel (SVM-RBF) and Fuzzy-Rough Nearest Neighbour with Vaguely Quantified Rough Sets (FRNN-VQRS), on multiple publicly available, authentic and recognized keystroke dynamics datasets. Our results on the CMU keystroke dynamics dataset indicate the impact of soft biometric traits.

1 Introduction

Cloud computing based technologies such as e-mail, social networks, storage, application services, web hosting, TV services and more are part of everyday use. At present, knowledge-based user authentication/verification methods are applied in cloud based technology. A more secure solution is needed because of the risks associated with this technique, such as shoulder surfing, dictionary and brute-force attacks. Keystroke dynamics combined with knowledge-based authentication could be used in practice as such a solution, since keystroke dynamics identifies people through their way of typing. It has been established that the habitual typing pattern is a behavioral biometric trait that can be used for user identification/authentication. Being nonintrusive and cost effective, keystroke dynamics is a strong alternative to other biometric modalities and can be integrated into any existing knowledge-based user authentication system with minor alteration. The accuracies obtained in previous studies are impressive but not acceptable in practice because of the high Failure to Enroll Rate (FER) and intra-class variation. Behavioral biometrics are less accurate than morphological biometric modalities, and acceptable accuracy is very hard to achieve. The European standard for access control systems mandates a false-alarm rate of less than 1% and a miss rate of no more than 0.001% [1]. Among behavioral biometric characteristics, keystroke dynamics suffers from a high rate of intra-class variation (aging, mental state while typing, hand injury, tiredness, …) and from differences in data acquisition (cross-device validation, timing resolution of the system, feature selection, keyboard position, hand(s) used, …), both of which increase the error rate of keystroke dynamics user authentication.

Recent keystroke dynamics studies found that personal traits such as age, gender, dominant hand, hand(s) used, emotional state and typing skill can be inferred from the typing pattern on a computer keyboard [6]. A few studies found that only age and gender can be inferred from behavior on a touchscreen. The rationale is that a user's physical structure, hand weight, fingertip size, and neurophysiological and neuropsychological factors are reflected in the way the keyboard is used, which makes typing patterns distinguishable.

These personal traits affect typing characteristics and consequently classification performance. For instance, the touch area of a child user might be smaller than that of an adult user. Likewise, the fingers of female users might be longer than those of male users. A right-handed user might type digraphs consisting of keys on the right side of the keyboard more quickly than digraphs consisting of keys on the left side. For a digraph with one key on the left side and the other on the right side of the keyboard, the typing pattern might differ between users who type with one hand and users who type with both hands. If the user is distracted or frustrated by unexpected behavior of the computer, the typing pattern changes massively. These soft biometric features thus affect the typing characteristics and, with them, classification performance. The type and length of the text, the clock resolution of the system, the number and type of running programs, and the keyboard type, size and layout also affect performance. Therefore, results that look impressive in a lab-based environment do not carry over fully to web-based applications, and keystroke dynamics is not 100% accurate in practice. Including personal traits such as age, gender, handedness, hand(s) used and typing skill is a new direction of keystroke dynamics research aimed at improving performance. Predicting such traits is useful not only for improving keystroke dynamics user recognition; it has separate advantages ranging from social network sites to e-business. The accuracy with which the traits are identified matters, because inaccurate trait predictions may decrease classification performance instead.

Some studies have tried to improve the prediction of personal traits so that the technique can be used in real-life, web-based applications. Some went a step further and identified user traits from the typing pattern on a computer keyboard, but did not provide sufficient evidence that the technique works on a touchscreen smartphone, the most common and popular electronic gadget. Social networking is becoming more popular as a way to keep in touch with a large and diverse body of people and groups. The number of social network users under 18 is increasing rapidly. They can easily reach content that they are not supposed to access or that is not suited to them. They share personal information with strangers, and most of them have a large number of strangers in their friend lists. Social networking administrators delete thousands of profiles of people who do not meet the age requirement or who behave unnaturally on the site. However, no effective method has been applied to identify age group and gender automatically from the user's typing pattern on a computer keyboard or touchscreen smartphone, instead of accepting age and gender information on trust. Beyond age, such a method can also be used to detect fraudulent claims about handedness and typing skill.

Keystroke dynamics research started in 1980. Over these three decades, more than 500 papers have been published as journal articles, conference proceedings and theses; still, the accuracy of the technique has not reached its goal. More research is needed before the technique can be used in practice. Ancillary information can significantly improve the recognition performance of biometric systems.

The studies in the literature summarized in Table 1 were conducted in lab-based environments. They use AZERTY and/or QWERTY keyboards as the sensing device. The text patterns vary across studies: some texts are short and some are long; some are simple common words and some are logically complex. Some studies used 5-fold cross-validation, whereas others used a training-testing split for performance evaluation. The number of examples differs between studies, and some of them maintained sessions during data acquisition. It is clear from Table 1 that the studies were conducted on different datasets with the aim of extracting personal traits from typing patterns rather than improving the performance of trait identification.

Table 1. Success achieved by researchers to recognize the soft biometric traits on keystroke dynamics datasets


A machine learning technique is used as the classification method in all the studies listed in Table 1. Selecting an appropriate method is an important issue in the keystroke dynamics domain, where accuracy can jump from 65% to 90% depending on the method. In our study, we apply FRNN with VQRS. Its performance is impressive, consistent, and significantly better than that of SVM, the leading machine learning method used previously. In this study, we compare our approach with SVM with an RBF kernel.

The main objective of this study is to develop a model that identifies the gender, age group, handedness and typing skill of users from their typing pattern on a keyboard or touchscreen for a predefined text, and to improve accuracy by using this soft biometric information as extra features in keystroke dynamics user authentication.

The objectives and contributions of this paper are listed below:

This study provides an efficient approach to recognizing ancillary information through typing patterns. Its performance is comparable with other approaches in the literature.
It evaluates the performance of leading machine learning approaches in determining soft biometric information.
It evaluates and compares the performance of 9 leading anomaly detectors with and without soft biometric information.

We use the authentic and shared CMU keystroke dynamics dataset [8] along with a dataset collected through an Android handheld device [9]. The details of the datasets are described in Table 1.

2 Static and Shared Keystroke Dynamics Datasets

Many keystroke dynamics datasets have been created in the last 30 years, but only some of them are available on the Internet or can be downloaded on request. Details of the publicly available datasets are summarized in Table 2.

Table 2. Details of static and shared datasets on keystroke dynamics through keyboard


Soft biometric information is not included in all the datasets. The datasets created by Killourhy et al. [2], Idrus et al. [7], Uzun et al. [5] and El-Abed et al. [11] provide soft biometric information along with keystroke dynamics data, which makes them the most suitable datasets for our experiment. In this paper we name each dataset according to its text type and data acquisition method so that it can be identified easily throughout the paper. Details are in Table 3.

Table 3. Details of static and shared datasets used in this study


M, F, L, R, T and O represent Male, Female, Left-handed, Right-handed, Touch and Others, respectively.

3 Research Methodology

The proposed methodology is described below. The first objective is to identify personal traits from typing patterns on datasets collected in different environments, and to improve keystroke dynamics recognition performance by including these personal traits as additional features. To this end, we followed the steps below.

We used the following equations to extract the features from the selected datasets. Because some of the features are not present in every dataset, we recalculated all 8 features with these equations.

The timing features of the keystroke dynamics are as follows:

$$ \text{Key Duration (KD)} = R_{i} - P_{i} \tag{1} $$

$$ \text{UpUp Key Latency (UU)} = R_{i + 1} - R_{i} \tag{2} $$

$$ \text{DownDown Key Latency (DD)} = P_{i + 1} - P_{i} \tag{3} $$

$$ \text{UpDown Key Latency (UD)} = P_{i + 1} - R_{i} \tag{4} $$

$$ \text{DownUp Key Latency (DU)} = R_{i + 1} - P_{i} \tag{5} $$

$$ \text{Total Time Key Latency (t)} = R_{n} - P_{1} \tag{6} $$

$$ \text{Tri-graph Latency (T)} = R_{i + 2} - P_{i} \tag{7} $$

$$ \text{Four-graph Latency (F)} = R_{i + 3} - P_{i} \tag{8} $$

Here P_i and R_i represent the press and release times of the i-th key event, and n is the number of keys in the text. We tried different combinations of features to find the best feature subset; we did not apply any filter or wrapper approach for feature selection. We normalized all the datasets to the range [−1, +1] in order to speed up processing. We used two leading machine learning approaches: SVM and FRNN. The fuzzy-rough nearest neighbor (FRNN) classification algorithm [17] is an alternative to Sarkar's fuzzy-rough ownership function (FRNN-O) approach [18].
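As an illustration of Eqs. (1) to (8), the following minimal Python sketch computes the eight timing features from the press and release timestamps of one typed sample. The function names `timing_features` and `scale_to_unit_range` are hypothetical, and the min-max formula used for the [−1, +1] normalization is an assumption, since the exact normalization method is not stated here.

```python
from typing import Dict, List, Sequence


def timing_features(press: Sequence[float], release: Sequence[float]) -> Dict[str, List[float]]:
    """Compute the eight timing features of Eqs. (1)-(8) for one typed sample.

    press[i] and release[i] hold the press time P and release time R of the
    i-th key (0-based here, 1-based in the equations).
    """
    if len(press) != len(release) or not press:
        raise ValueError("press and release must be non-empty and equally long")
    n = len(press)
    return {
        "KD": [release[i] - press[i] for i in range(n)],            # Eq. (1)
        "UU": [release[i + 1] - release[i] for i in range(n - 1)],  # Eq. (2)
        "DD": [press[i + 1] - press[i] for i in range(n - 1)],      # Eq. (3)
        "UD": [press[i + 1] - release[i] for i in range(n - 1)],    # Eq. (4)
        "DU": [release[i + 1] - press[i] for i in range(n - 1)],    # Eq. (5)
        "t": [release[n - 1] - press[0]],                           # Eq. (6)
        "T": [release[i + 2] - press[i] for i in range(n - 2)],     # Eq. (7)
        "F": [release[i + 3] - press[i] for i in range(n - 3)],     # Eq. (8)
    }


def scale_to_unit_range(column: Sequence[float]) -> List[float]:
    """Min-max scale one feature column to [-1, +1] (assumed normalization).

    A constant column carries no information and is mapped to 0.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


# Example: four keys, timestamps in milliseconds.
features = timing_features([0, 300, 550, 900], [100, 420, 700, 1000])
```

The scaling would be applied per feature column across all samples of a dataset, so that every feature contributes on a comparable scale to distance-based classifiers.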

Several anomaly detection algorithms were applied to keystroke dynamics patterns with personal traits included manually. The results show that including personal traits increases the performance of the keystroke dynamics user recognition system.
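One plausible way to include personal traits as additional features is to encode each trait as a number and append it to the timing vector. The sketch below is only an illustration under that assumption: the trait names, the ±1 encoding and the helper `append_traits` are hypothetical and are not taken from the datasets.

```python
from typing import Dict, List, Sequence

# Hypothetical binary encoding, chosen to match the [-1, +1] feature range.
TRAIT_CODES: Dict[str, Dict[str, float]] = {
    "gender": {"male": -1.0, "female": 1.0},
    "age_group": {"under_30": -1.0, "30_plus": 1.0},
    "handedness": {"left": -1.0, "right": 1.0},
    "hands_used": {"one": -1.0, "both": 1.0},
    "typing_skill": {"others": -1.0, "touch": 1.0},
}


def append_traits(timing_vector: Sequence[float], traits: Dict[str, str]) -> List[float]:
    """Return the timing vector extended with one numeric value per trait.

    Traits are appended in the fixed order of TRAIT_CODES so that every
    sample ends up with the same feature layout.
    """
    extra = [TRAIT_CODES[name][traits[name]] for name in TRAIT_CODES]
    return list(timing_vector) + extra


sample = append_traits(
    [0.2, -0.4],
    {"gender": "female", "age_group": "under_30", "handedness": "right",
     "hands_used": "both", "typing_skill": "touch"},
)
```

The extended vector can then be passed to an anomaly detector in place of the timing-only vector.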

4 Experimental Results

Two leading machine learning algorithms were applied to each dataset to predict the soft biometric information; the accuracies obtained with 10-fold cross-validation are listed in Tables 4, 5, 6, 7, 8 and 9. According to the results, FRNN with VQRS proved to be the more suitable learning method in both desktop and Android environments. Accuracies were recorded with Weka 3.7.4 using default parameter values.
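The evaluation protocol can be sketched in plain Python as follows. This is not the Weka implementation used in the study: the classifier is a simplified fuzzy-rough nearest neighbour rule (plain lower and upper approximations with the Kleene-Dienes implicator and the minimum t-norm, without the VQRS quantifiers), the similarity measure is an assumed choice for features scaled to [−1, +1], and the folds are shuffled but not stratified. All function names are hypothetical.

```python
import random
import statistics
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def similarity(a: Vector, b: Vector) -> float:
    """Fuzzy similarity in [0, 1]; assumes features already scaled to [-1, +1]."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / (2.0 * len(a))


def frnn_predict(train_x: Sequence[Vector], train_y: Sequence[str],
                 query: Vector, k: int = 3) -> str:
    """Simplified FRNN: pick the class with the highest mean of the
    lower and upper approximation memberships over the k nearest neighbours."""
    neighbours = sorted(
        ((similarity(query, x), label) for x, label in zip(train_x, train_y)),
        key=lambda pair: pair[0], reverse=True)[:k]
    best_label, best_score = "", -1.0
    for cls in sorted(set(train_y)):
        # Lower approximation: min over neighbours of max(1 - R, C(z));
        # neighbours inside the class contribute 1, so only outsiders matter.
        lower = min((1.0 - s for s, lab in neighbours if lab != cls), default=1.0)
        # Upper approximation: max over neighbours of min(R, C(z)).
        upper = max((s for s, lab in neighbours if lab == cls), default=0.0)
        score = (lower + upper) / 2.0
        if score > best_score:
            best_label, best_score = cls, score
    return best_label


def cross_validate(xs: Sequence[Vector], ys: Sequence[str],
                   folds: int = 10, seed: int = 0) -> Tuple[float, float]:
    """Return mean accuracy and standard deviation over the folds."""
    if len(xs) < folds:
        raise ValueError("need at least one sample per fold")
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    accuracies: List[float] = []
    for f in range(folds):
        test_idx = set(order[f::folds])
        train_x = [xs[i] for i in order if i not in test_idx]
        train_y = [ys[i] for i in order if i not in test_idx]
        hits = sum(frnn_predict(train_x, train_y, xs[i]) == ys[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return statistics.mean(accuracies), statistics.pstdev(accuracies)
```

Reporting the mean together with the standard deviation over folds matches the "accuracy with standard deviation" format of the tables that follow.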

Table 4. Accuracy with standard deviation in identifying gender on different datasets

Table 5. Accuracy with standard deviation in identifying age group (<30/30+) on different datasets

Table 6. Accuracy with standard deviation in identifying age group (<18/18+) on different datasets

Table 7. Accuracy with standard deviation in identifying handedness on different datasets

Table 8. Accuracy with standard deviation in identifying typing skill (touch/others) on different datasets

Table 9. Accuracy with standard deviation in identifying hand(s) used on different datasets

Tables 10, 11 and 12 show the performance of the 9 anomaly detectors described in [8] after soft biometric information is taken into account. We observed that using multiple soft biometric traits, instead of gender only, decreases the EER significantly. Here, we take the median of the samples; in the keystroke dynamics domain, the median works better than the mean.
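To make the EER metric and the use of the median concrete, the sketch below builds a per-user template from the feature-wise median of the training samples, scores a test sample by its Manhattan distance to that template, and estimates the EER from genuine and impostor scores. The function names are hypothetical, the Manhattan detector is used only as one simple example of an anomaly detector, and the EER is approximated at the threshold where the false-alarm and miss rates are closest rather than by interpolation.

```python
import statistics
from typing import List, Sequence


def median_template(samples: Sequence[Sequence[float]]) -> List[float]:
    """Feature-wise median of a user's training samples."""
    return [statistics.median(column) for column in zip(*samples)]


def manhattan_score(template: Sequence[float], sample: Sequence[float]) -> float:
    """Anomaly score: larger means less similar to the genuine user."""
    return sum(abs(t - s) for t, s in zip(template, sample))


def equal_error_rate(genuine_scores: Sequence[float],
                     impostor_scores: Sequence[float]) -> float:
    """Approximate EER as a fraction in [0, 1].

    A sample is rejected when its score exceeds the threshold, so the
    false-alarm rate counts rejected genuine samples and the miss rate
    counts accepted impostor samples.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(genuine_scores) | set(impostor_scores)):
        false_alarm = sum(s > thr for s in genuine_scores) / len(genuine_scores)
        miss = sum(s <= thr for s in impostor_scores) / len(impostor_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2.0
    return eer


template = median_template([[100, 200], [110, 190], [300, 210]])
score = manhattan_score(template, [100, 205])
```

In this toy template the outlying value 300 in the first feature does not move the median, which illustrates why a median-based template can be more robust than a mean-based one when a user's samples contain occasional hesitations.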

Table 10. Comparative analysis of anomaly detectors with inclusion of personal traits, using the Equal Error Rate (EER) in % as performance metric, on Dataset A

Table 11. Comparative analysis of anomaly detectors with inclusion of personal traits, using the Equal Error Rate (EER) in % as performance metric, on Dataset B

Table 12. Comparative analysis of anomaly detectors with inclusion of personal traits, using the Equal Error Rate (EER) in % as performance metric, on Dataset C

5 Conclusion

Our experiments show, with impressive results, that the gender, age group, handedness, hand(s) used and typing skill of a user can be predicted from the way of typing. Since keystroke dynamics is a commonly measurable activity, such prediction can be used in web-based applications. Activity on the keyboard and touchscreen is a behavioral biometric characteristic that could be used to predict gender and age group in order to deal with the problem of fake accounts, helping social networking sites build a genuine, fake-free and more loyal user base. Automatically identified traits can also serve as additional soft biometric information to reduce the error rate of a keystroke dynamics user recognition system. The technique could also help e-commerce sites reach the right clients and, similarly, keep unsuitable products away from users based on gender and age group. It can also be very useful in web-based environments for automatic user profiling. The results also show that users under 18 can be identified from their typing pattern, which can be used to protect children from Internet threats.

We used two leading machine learning methods to predict personal traits on multiple publicly available authentic datasets. Our proposed approach, FRNN-VQRS, a newer variant of FRNN, achieved impressive results that are significantly better than those of the previously used SVM with RBF kernel in determining personal traits in desktop and Android environments. This is a very positive outcome for keystroke dynamics with a single predefined text: the predicted traits can be used as additional soft biometric features in user identification/authentication, which decreases the EER from 10.69 to 2.53 on the CMU keystroke dynamics dataset. This is a modest yet efficient approach to keystroke dynamics user authentication that could be used in cloud computing based applications.

References

1. CENELEC: Alarm systems - Access control systems for use in security applications – Part 1: System requirements, EN 50133-1 edition (1996)
2. Killourhy, K.: A Scientific Understanding of Keystroke Dynamics. Doctoral Thesis, Computer Science Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213 (2012)
3. Epp, C., Lippold, M., Mandryk, R.L.: Identifying emotional states using keystroke dynamics. In: Proceedings of SIGCHI Conference on Human Factors Computer System, pp. 715–724 (2011)
4. Frank, E., Hall, M.A., Witten, I.H.: The Weka Workbench Data Mining Practical Machine Learning Tools and Techniques, 4th ed. (1999)
5. Giot, R., Rosenberger, C.: A new soft biometric approach for keystroke dynamics based on gender recognition. Int. J. Inf. Technol. Manag. Spec. Issue Adv. Trends Biometrics 11, 1–16 (2012)
6. Idrus, S.Z.S., Cherrier, E., Rosenberger, C., Bours, P.: Soft biometrics for keystroke dynamics. In: Kamel, M., Campilho, A. (eds.) ICIAR 2013. LNCS, vol. 7950, pp. 11–18. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39094-4_2
7. Uzun, Y., Bicakci, K., Uzunay, Y.: Could we distinguish child users from adults using keystroke dynamics? (2014)
8. Killourhy, K.S., Maxion, R.A.: Comparing anomaly-detection algorithms for keystroke dynamics. In: Proceedings of the International Conference on Dependable Systems and Networks, pp. 125–134 (2009)
9. El-Abed, M., Dafer, M., El Khayat, R.: RHU keystroke: a mobile-based benchmark for keystroke dynamics systems. In: 2014 International Carnahan Conference on Security Technology, pp. 1–4 (2014)
10. Antal, M., Szabo, L.Z.: An evaluation of one-class and two-class classification algorithms for keystroke dynamics authentication on mobile devices. In: Proceedings - 2015 20th International Conference Control System Comput. Science CSCS 2015, pp. 343–350 (2015)
11. Loy, C.C., Lim, C.P., Lai, W.K.: Pressure-based typing biometrics user authentication using the fuzzy ARTMAP neural network. In: International Conference on Neural Information Processing (ICONIP) (2005)
12. Roth, J., Liu, X., Ross, A., Metaxas, D.: Biometric authentication via keystroke sound. In: 2013 International Conference Biometrics, pp. 1–8 (2013)
13. Montalvão, J., Freire, E.O., Bezerra, M.A., Garcia, R.: Contributions to empirical analysis of keystroke dynamics in passwords. Pattern Recogn. Lett. 52, 80–86 (2015)
14. Bello, L., Bertacchini, M.: Collection and publication of a fixed text keystroke dynamics dataset. In: CACIC 2010 - XVI Congreso ARGENTINO CIENCIAS LA Computer, pp. 822–831 (2010)
15. Idrus, S.Z.S., Cherrier, E., Rosenberger, C., Bours, P.: Soft biometrics database: a benchmark for keystroke dynamics biometric systems. In: 2013 International Conference of the Biometrics Special Interest Group (BIOSIG), pp. 1–8, September 2013
16. Giot, R., El-Abed, M., Rosenberger, C.: Web-based benchmark for keystroke dynamics biometric systems: a statistical analysis. In: Intelligent Information Hiding and Multimedia Signal Processing, pp. 11–15 (2012)
17. Jensen, R., Cornelis, C.: Fuzzy rough nearest neighbour classification and prediction. Theor. Comput. Sci. 412(42), 5871–5884 (2011)
18. Sarkar, M.: Fuzzy-rough nearest neighbor algorithms in classification. Fuzzy Sets Syst. 158(19), 2134–2152 (2007)

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Calcutta, 92 APC Road, Calcutta, 700 009, India
Soumen Roy & D. D. Sinha
Department of Computer and System Sciences, Visva-Bharati, Santiniketan, 731235, India
Utpal Roy

Corresponding author

Correspondence to Soumen Roy.

Editor information

Editors and Affiliations

University of Catania, Catania, Italy
Sebastiano Battiato
University of Catania, Catania, Italy
Giovanni Maria Farinella
University of Catania, Catania, Italy
Marco Leo
University of Catania, Catania, Italy
Giovanni Gallo

Cite this paper

Roy, S., Roy, U., Sinha, D.D. (2017). Efficacy of Typing Pattern Analysis in Identifying Soft Biometric Information and Its Impact in User Recognition. In: Battiato, S., Farinella, G., Leo, M., Gallo, G. (eds) New Trends in Image Analysis and Processing – ICIAP 2017. ICIAP 2017. Lecture Notes in Computer Science, vol 10590. Springer, Cham. https://doi.org/10.1007/978-3-319-70742-6_30

DOI: https://doi.org/10.1007/978-3-319-70742-6_30
Published: 31 December 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70741-9
Online ISBN: 978-3-319-70742-6
eBook Packages: Computer Science, Computer Science (R0)
