Abstract
Data mining techniques based on Random forests are explored to gain knowledge about data in a Field Operational Test (FOT) database. We compare the performance of a Random forest, a Support Vector Machine, and a Neural network in separating drowsy from alert drivers. Twenty-five variables from the FOT data were used to train the models. It is shown experimentally that the Random forest outperforms the other methods on this task. It is also shown how the Random forest can be used for variable selection to find a subset of the variables that improves the classification accuracy. Furthermore, it is shown that the data proximity matrix estimated from the Random forest trained on these variables can be used to improve classification accuracy, outlier detection, and data visualization.
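The proximity matrix mentioned above can be illustrated with a minimal sketch. It assumes the usual Random forest definition, in which the proximity of two samples is the fraction of trees where both fall into the same leaf; the abstract does not spell out the exact procedure used in the paper. The function names, the toy leaf assignments, and the raw outlier score (total sample count divided by the summed squared proximities to same-class samples, in the spirit of Breiman's Random forest manual) are illustrative assumptions, not the authors' implementation.

```python
from typing import Hashable, List, Sequence


def proximity_matrix(leaf_ids: Sequence[Sequence[int]]) -> List[List[float]]:
    """Return the sample-by-sample proximity matrix.

    leaf_ids[i][t] is the index of the leaf that sample i reaches in tree t.
    Proximity is the fraction of trees in which two samples share a leaf.
    """
    n = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def raw_outlier_scores(
    prox: Sequence[Sequence[float]], labels: Sequence[Hashable]
) -> List[float]:
    """Assumed raw outlyingness: N / sum of squared proximities to the
    other samples of the same class. Larger values mean more isolated."""
    n = len(labels)
    scores = []
    for i in range(n):
        squared = sum(
            prox[i][j] ** 2
            for j in range(n)
            if j != i and labels[j] == labels[i]
        )
        scores.append(float("inf") if squared == 0 else n / squared)
    return scores


# Hypothetical leaf assignments: 4 samples, 3 trees.
toy_leaves = [
    [1, 2, 1],
    [1, 2, 3],
    [1, 2, 1],
    [4, 5, 6],  # never shares a leaf with the others
]
toy_labels = ["alert", "alert", "alert", "alert"]

toy_prox = proximity_matrix(toy_leaves)
toy_scores = raw_outlier_scores(toy_prox, toy_labels)
```

With a fitted scikit-learn forest, the leaf assignments could be obtained from the `apply` method of `RandomForestClassifier`, and `1 - proximity` could serve as a dissimilarity for multidimensional scaling or a similar embedding when visualizing the data.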
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Englund, C., Kovaceva, J., Lindman, M., Grönvall, JF. (2012). Using Random Forests for Data Mining and Drowsy Driver Classification Using FOT Data. In: Meersman, R., et al. On the Move to Meaningful Internet Systems: OTM 2012. OTM 2012. Lecture Notes in Computer Science, vol 7566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33615-7_20
DOI: https://doi.org/10.1007/978-3-642-33615-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33614-0
Online ISBN: 978-3-642-33615-7
eBook Packages: Computer Science, Computer Science (R0)