Twitter Mining for Multiclass Classification Events of Traffic and Pollution

  • Conference paper
Human Systems Engineering and Design II (IHSED 2019)

Abstract

Over the last decade, social media have generated vast amounts of data, which serve as a primary information resource for many applications. Analyzing this information lets us discover unusual situations almost immediately, such as traffic jams, traffic accidents, and the state of the roads. This research proposes an approach for automatically classifying tweets about pollution and traffic. Using the information in tweets, it evaluates several supervised machine learning algorithms for text classification and finds that the support vector machine (SVM) algorithm achieves the highest accuracy, 85.8%, when classifying events as traffic or non-traffic. Furthermore, to determine which events correspond to traffic or pollution, we perform a multiclass classification, which obtains an accuracy of 78.9%.
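The multiclass setting described in the abstract can be illustrated with a minimal sketch. The abstract does not state the features, the SVM variant, or the multiclass strategy used, so everything here is an assumption: binary bag-of-words features, a one-vs-rest linear SVM trained by subgradient descent on the regularized hinge loss, a hypothetical third label `other`, and a handful of invented toy tweets. The names `TRAIN`, `TEST`, `tokenize`, `train_one_vs_rest`, `predict`, and `accuracy` are illustrative and do not come from the paper.

```python
# Hypothetical toy tweets; the paper's dataset is not reproduced here.
TRAIN = [
    ("heavy traffic jam downtown", "traffic"),
    ("accident blocks highway lanes", "traffic"),
    ("traffic accident near bridge", "traffic"),
    ("smog warning air quality poor", "pollution"),
    ("air pollution levels rising", "pollution"),
    ("factory smoke pollution worsens", "pollution"),
    ("great coffee this morning", "other"),
    ("watching movie tonight", "other"),
    ("happy birthday dear friend", "other"),
]

# Hypothetical held-out tweets used only to compute accuracy.
TEST = [
    ("traffic jam near highway", "traffic"),
    ("air pollution warning", "pollution"),
    ("coffee with friend tonight", "other"),
]


def tokenize(text):
    """Lowercase whitespace tokenization into a set of binary word features."""
    return set(text.lower().split())


def train_one_vs_rest(data, epochs=20, lr=0.1, lam=0.01):
    """Train one linear SVM per class (assumed strategy: one-vs-rest).

    Each classifier minimizes the L2-regularized hinge loss with plain
    subgradient descent; the hyperparameters are illustrative defaults.
    """
    labels = sorted({label for _, label in data})
    weights = {label: {} for label in labels}
    for _ in range(epochs):
        for text, gold in data:
            feats = tokenize(text)
            for label in labels:
                w = weights[label]
                y = 1.0 if gold == label else -1.0
                margin = y * sum(w.get(f, 0.0) for f in feats)
                # The L2 penalty shrinks every existing weight slightly.
                for f in w:
                    w[f] *= 1.0 - lr * lam
                # The hinge loss contributes only when the margin is below 1.
                if margin < 1.0:
                    for f in feats:
                        w[f] = w.get(f, 0.0) + lr * y
    return weights


def predict(weights, text):
    """Return the label whose classifier gives the highest score."""
    feats = tokenize(text)
    scores = {
        label: sum(w.get(f, 0.0) for f in feats)
        for label, w in weights.items()
    }
    return max(scores, key=scores.get)


def accuracy(weights, data):
    """Fraction of tweets whose predicted label matches the gold label."""
    correct = sum(predict(weights, text) == gold for text, gold in data)
    return correct / len(data)


model = train_one_vs_rest(TRAIN)
print("held-out accuracy:", accuracy(model, TEST))
```

Accuracy here is the same metric the abstract reports: the share of tweets assigned the correct class. On real tweets, a tested library implementation and a held-out evaluation protocol such as cross-validation would be the more sensible choice than this toy trainer.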

Author information

Corresponding author

Correspondence to Richard Rivera.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Chamorro, V. et al. (2020). Twitter Mining for Multiclass Classification Events of Traffic and Pollution. In: Ahram, T., Karwowski, W., Pickl, S., Taiar, R. (eds) Human Systems Engineering and Design II. IHSED 2019. Advances in Intelligent Systems and Computing, vol 1026. Springer, Cham. https://doi.org/10.1007/978-3-030-27928-8_153
