The field of continuous touch-based authentication has developed rapidly over the last decade, leaving a fragmented body of work that is difficult to navigate for researchers and application developers alike. In this study, we perform a systematic literature analysis of 30 studies, covering the techniques used for feature extraction, classification, and aggregation in continuous touch-based authentication systems as well as the performance metrics reported by each study. Based on our findings, we design a set of experiments to compare the performance of the most frequently used techniques in the field under clearly defined conditions. In addition, we introduce two new techniques for continuous touch-based authentication: an expanded feature set (consisting of 149 unique features) and a multi-algorithm ensemble-based classifier. The comparison includes 13 feature sets, 11 classifiers, and 5 aggregation methods. In total, we examine 204 model configurations and show that our novel techniques outperform the current state of the art in each category. The results are also validated across three different publicly available datasets. Our best-performing model achieves an equal error rate (EER) of 4.8% using 16 consecutive strokes. Finally, we discuss the findings of our investigation with the aim of making the field more understandable and accessible for researchers and practitioners.
This work was generously supported by a grant from the Engineering and Physical Sciences Research Council [grant number EP/P00881X/1].
AB - AdaBoost
ACC - Accuracy
ANGA - Average Number of Genuine Actions
ANIA - Average Number of Impostor Actions
AUC - Area Under Curve
BN - Bayesian Network
CPANN - Counter Propagation Artificial Neural Network
DT - Decision Tree
EE - Elliptic Envelope
ENS - Ensemble
FAR - False Acceptance Rate
FRR - False Rejection Rate
GB - Gradient Boosting
HMM - Hidden Markov Models
HTER - Half Total Error Rate
IF - Isolation Forest
KDTGR - Kernel Dictionary-based Touch Gesture Recognition
kNN - k-Nearest Neighbors
KSRC - Kernel Sparse Representation-based Classification
LOF - Local Outlier Factor
LR - Logistic Regression
NB - Naive Bayes
NN - Neural Networks
OC-SVM - One-Class Support Vector Machine
PSO-RBFN - Particle Swarm Optimization Radial Basis Function Network
RC - Random Committee
RF - Random Forest
ROC - Receiver Operating Characteristic
SM - Scaled Manhattan
StrOUD - Strangeness-based OUtlier Detection
SVM - Support Vector Machine
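
To make the error-rate abbreviations above concrete, the following minimal Python sketch shows how FAR, FRR, HTER, and an approximate equal error rate (EER) can be computed from per-stroke classifier scores, and how scores from consecutive strokes might be aggregated. It is an illustration only, not the evaluation code used in this study. The function names are hypothetical, the convention that a higher score means "more likely genuine" is an assumption, and the sliding-window mean is just one possible aggregation method.

```python
from statistics import mean


def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) at a threshold.

    Assumption: a score >= threshold means the sample is accepted as genuine.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def hter(genuine, impostor, threshold):
    """Half Total Error Rate: the average of FAR and FRR at a threshold."""
    far, frr = far_frr(genuine, impostor, threshold)
    return (far + frr) / 2


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping every observed score as a threshold.

    Picks the threshold where |FAR - FRR| is smallest and returns the mean
    of FAR and FRR at that point.
    """
    best_gap, best_rate = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2
    return best_rate


def aggregate_mean(scores, window):
    """Average each run of `window` consecutive stroke scores (sliding window)."""
    return [mean(scores[i:i + window]) for i in range(len(scores) - window + 1)]


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.1, 0.2, 0.3, 0.6]
    print(far_frr(genuine, impostor, 0.5))
    print(equal_error_rate(genuine, impostor))
    print(aggregate_mean([0.0, 0.5, 1.0, 0.5], 2))
```

In a setup like this, aggregated scores would be passed to `equal_error_rate` in place of single-stroke scores, so a decision is based on several consecutive strokes rather than one.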
