Using Random Forests for Data Mining and Drowsy Driver Classification Using FOT Data

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2012 (OTM 2012)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7566))

Abstract

Data mining techniques based on Random forests are explored to gain knowledge about data in a Field Operational Test (FOT) database. We compare the performance of a Random forest, a Support Vector Machine and a Neural network in separating drowsy from alert drivers. Twenty-five variables from the FOT data were used to train the models. It is experimentally shown that the Random forest outperforms the other methods at separating drowsy from alert drivers. It is also shown how the Random forest can be used for variable selection to find a subset of the variables that improves the classification accuracy. Furthermore, it is shown that the data proximity matrix estimated from the Random forest trained using these variables can be used to improve classification accuracy, outlier detection and data visualization.
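To make the idea of a Random forest proximity matrix concrete, here is a minimal sketch. It is not the paper's implementation. It assumes the common definition in which the proximity of two samples is the fraction of trees where both land in the same leaf, and a raw outlier score in the style usually attributed to Breiman (sample count divided by the sum of squared within-class proximities). The function names and the toy leaf assignments are hypothetical; in practice the leaf indices could come from a trained forest, for example the array returned by scikit-learn's `RandomForestClassifier.apply`.

```python
from collections import defaultdict


def proximity_matrix(leaf_ids):
    """Return the sample-by-sample proximity matrix.

    leaf_ids[i][t] is the leaf that sample i reaches in tree t.
    Proximity of (i, j) is the fraction of trees in which both
    samples end up in the same leaf (assumed definition).
    """
    n_samples = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    prox = [[0.0] * n_samples for _ in range(n_samples)]
    for t in range(n_trees):
        # Group sample indices by the leaf they reach in tree t.
        groups = defaultdict(list)
        for i in range(n_samples):
            groups[leaf_ids[i][t]].append(i)
        # Every pair sharing a leaf gets one co-occurrence count.
        for members in groups.values():
            for i in members:
                for j in members:
                    prox[i][j] += 1.0
    return [[count / n_trees for count in row] for row in prox]


def raw_outlier_scores(prox, labels):
    """Raw outlier score per sample: n / sum of squared proximities
    to the other samples of the same class. Larger means the sample
    is less similar to its own class (assumed definition)."""
    n = len(labels)
    scores = []
    for i in range(n):
        total = sum(
            prox[i][j] ** 2
            for j in range(n)
            if j != i and labels[j] == labels[i]
        )
        scores.append(n / total if total > 0 else float("inf"))
    return scores


# Hypothetical leaf assignments: 4 samples, 3 trees.
leaf_ids = [
    [0, 1, 2],
    [0, 1, 2],
    [0, 1, 3],
    [4, 5, 6],  # never shares a leaf with the other samples
]
labels = ["alert", "alert", "alert", "alert"]

prox = proximity_matrix(leaf_ids)
scores = raw_outlier_scores(prox, labels)
```

In this toy data the last sample never shares a leaf with any other sample, so its proximities to them are zero and its raw outlier score is unbounded. One minus the proximity can also serve as a dissimilarity for visualization methods such as multidimensional scaling.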

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Englund, C., Kovaceva, J., Lindman, M., Grönvall, JF. (2012). Using Random Forests for Data Mining and Drowsy Driver Classification Using FOT Data. In: Meersman, R., et al. On the Move to Meaningful Internet Systems: OTM 2012. OTM 2012. Lecture Notes in Computer Science, vol 7566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33615-7_20

  • DOI: https://doi.org/10.1007/978-3-642-33615-7_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33614-0

  • Online ISBN: 978-3-642-33615-7

  • eBook Packages: Computer Science, Computer Science (R0)
