Classifying Imbalanced Road Accident Data Using Recurring Concept Drift

  • Conference paper
Data Mining (AusDM 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1127)

Abstract

In New Zealand, road accident casualties have been increasing. Factor analyses and time series analyses show what types of accidents result in casualties, but their results can become outdated. We propose a stream classification framework with drift detection to signal and adapt when the factors associated with crash casualties change over time. First, we propose a drift detection framework, G-mean Adaptive drift Detection (GAD), which adapts a classifier threshold to maximise G-mean. This metric rewards maximising accuracy on each class while keeping these accuracies balanced. As a result, GAD can make concept drift in the minority class easier to detect. Second, we propose a recurring concept classification framework, G-mean Concept Profiling Framework (GCPF), which reuses previously trained classifiers and uses GAD’s approach to drift detection. Through experimentation, we show that GAD improves G-mean without increasing false positive drift detections on imbalanced synthetic and real-world datasets. We also show that GCPF achieves better G-mean than other state-of-the-art stream classification approaches on the NZ crash dataset.
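
To make the G-mean objective in the abstract concrete, the following is a minimal Python sketch and not the authors' GAD implementation. It assumes a binary problem in which G-mean is the square root of the product of the two per-class recalls, and it picks a decision threshold by simple search over a fixed grid on one window of scored instances. The function names `g_mean` and `best_threshold`, the candidate grid, and the window data are all hypothetical and chosen only for illustration.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall for binary labels (1 = minority class)."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        return 0.0  # assumption: treat the single-class case as 0
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    return math.sqrt((tp / pos) * (tn / neg))

def best_threshold(y_true, scores, candidates):
    """Return (threshold, g_mean) for the candidate that maximises G-mean."""
    best = (None, -1.0)
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        g = g_mean(y_true, preds)
        if g > best[1]:
            best = (t, g)
    return best

# Hypothetical window: 8 majority (0) and 2 minority (1) instances,
# with classifier scores for the minority class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.45, 0.35, 0.40]

# A fixed 0.5 threshold predicts the majority class for every instance.
default_preds = [1 if s >= 0.5 else 0 for s in scores]
default_g = g_mean(y_true, default_preds)

# Searching a grid of thresholds (an illustrative choice) recovers the minority class.
threshold, tuned_g = best_threshold(y_true, scores, [k / 10 for k in range(1, 10)])
print(default_g, threshold, round(tuned_g, 3))
```

In this toy window the fixed 0.5 threshold is 80% accurate yet has a G-mean of zero, because every minority instance is missed. Lowering the threshold trades one majority error for full minority recall, which is the kind of balance the metric rewards.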

Notes

  1. https://www.nzta.govt.nz/safety/safety-resources/road-safety-information-and-tools/disaggregated-crash-data/.

Author information

Corresponding author

Correspondence to Robert Anderson.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Anderson, R., Koh, Y.S., Dobbie, G. (2019). Classifying Imbalanced Road Accident Data Using Recurring Concept Drift. In: Le, T., et al. Data Mining. AusDM 2019. Communications in Computer and Information Science, vol 1127. Springer, Singapore. https://doi.org/10.1007/978-981-15-1699-3_12

  • DOI: https://doi.org/10.1007/978-981-15-1699-3_12

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1698-6

  • Online ISBN: 978-981-15-1699-3

  • eBook Packages: Computer Science, Computer Science (R0)
