Abstract
Identifying temporal variations in mental workload level (MWL) is crucial for enhancing the safety of human–machine system operations, especially when the human operator is cognitively overloaded or inattentive. This paper proposes a cost-sensitive majority weighted minority oversampling strategy to address the imbalanced MWL data classification problem. Both the inter-class and intra-class imbalance problems are considered. For the former, an imbalance ratio is defined to determine the number of synthetic samples in the minority class. The latter problem is addressed by assigning different weights to borderline samples in the minority class based on distance and density measures of the sample distribution. Furthermore, a multi-label classifier is designed based on an ensemble of binary classifiers. The results of analyzing 21 imbalanced UCI multi-class datasets showed that the proposed approach can effectively cope with the imbalanced classification problem in terms of several performance metrics, including geometric mean (G-mean) and average accuracy (ACC). Moreover, the proposed approach was applied to the analysis of the EEG data of eight experimental participants subjected to fluctuating levels of mental workload. The comparative results showed that the proposed method provides a competitive alternative to several existing imbalanced learning algorithms and significantly outperforms the baseline (reference) method that ignores the imbalanced nature of the dataset.
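To make two of the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the common definitions: the imbalance ratio is taken as the majority class count divided by the minority class count, the number of synthetic samples is assumed to be whatever brings each class up to the majority count, and the multi-class G-mean is taken as the geometric mean of the per-class recalls. The paper's exact definitions are not given in this excerpt, and the function names are hypothetical.

```python
from collections import Counter
from math import prod


def imbalance_ratio(labels):
    """Majority count divided by minority count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def synthetic_samples_needed(labels):
    """Samples to synthesize per class so every class matches the largest one.

    Balancing up to the majority count is an assumption for illustration;
    the paper may use a different target derived from its imbalance ratio.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (a common multi-class G-mean)."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return prod(recalls) ** (1 / len(recalls))


if __name__ == "__main__":
    labels = ["low"] * 6 + ["high"] * 2
    print(imbalance_ratio(labels))
    print(synthetic_samples_needed(labels))
    print(g_mean([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0]))
```

Because the G-mean multiplies per-class recalls, a classifier that ignores the minority class scores zero regardless of its overall accuracy, which is why it is often reported alongside accuracy on imbalanced data.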
Acknowledgements
The authors would like to thank Mr. Sunan Li for useful discussions on this work and the developers of the aCAMS software, which was used in the data collection experiments. The work was supported in part by the National Natural Science Foundation of China under Grant No. 61075070 and Key Grant No. 11232005.
Cite this article
Zhang, J., Cui, X., Li, J. et al. Imbalanced classification of mental workload using a cost-sensitive majority weighted minority oversampling strategy. Cogn Tech Work 19, 633–653 (2017). https://doi.org/10.1007/s10111-017-0447-x
DOI: https://doi.org/10.1007/s10111-017-0447-x