
Iktishaf: a Big Data Road-Traffic Event Detection Tool Using Twitter and Spark Machine Learning

Mobile Networks and Applications

This article has been updated

Abstract

Road transportation is the backbone of modern economies, despite annually costing millions of deaths and injuries and trillions of dollars. Twitter is a powerful information source for transportation, but major challenges in big data management and Twitter analytics need addressing. We propose Iktishaf, a big data tool developed over Apache Spark for detecting traffic-related events from Twitter data in Saudi Arabia. It uses three machine learning (ML) algorithms to build multiple classifiers that detect eight event types. The classifiers are validated using widely used criteria and against external sources. The Iktishaf Stemmer improves text preprocessing, event detection, and the feature space. Using 2.5 million tweets, we detect events without prior knowledge, including the KSA national day, a fire in Riyadh, rains in Makkah and Taif, and the inauguration of the Al-Haramain train. We are not aware of any work, apart from ours, that uses big data technologies to detect road traffic events from tweets in Arabic. Iktishaf provides hybrid human-ML methods and is a prime example of bringing together AI theory, big data processing, and human cognition applied to a practical problem.
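
To make the idea of classifying tweets into event types concrete, the following is a minimal sketch of a supervised text classifier. It is not the Iktishaf implementation: the abstract does not name the three ML algorithms, so multinomial Naive Bayes is used here purely as an assumed stand-in, plain Python replaces Apache Spark, the tweets are invented English toy examples rather than Arabic data, and the three labels are hypothetical rather than the paper's eight event types.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenization only; the paper's Arabic preprocessing
    # and the Iktishaf Stemmer are NOT reproduced here.
    return text.lower().split()


def train(labelled_tweets):
    """Count documents and tokens per label and collect the vocabulary."""
    doc_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_tweets:
        doc_counts[label] += 1
        for tok in tokenize(text):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return doc_counts, token_counts, vocab


def predict(model, text):
    """Return the label with the highest Naive Bayes log-posterior."""
    doc_counts, token_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior of the label.
        score = math.log(doc_counts[label] / total_docs)
        total_tokens = sum(token_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # ignore tokens never seen in training
                # Laplace (add-one) smoothed token likelihood.
                count = token_counts[label][tok] + 1
                score += math.log(count / (total_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: invented tweets and labels for illustration only.
TOY_TWEETS = [
    ("accident on the ring road two cars", "accident"),
    ("car crash near the bridge accident", "accident"),
    ("heavy rain flooding the main road", "weather"),
    ("rain and fog low visibility today", "weather"),
    ("fire in a building smoke everywhere", "fire"),
    ("civil defense fighting a fire downtown", "fire"),
]

model = train(TOY_TWEETS)
print(predict(model, "heavy rain on the road"))
```

In a Spark-based setting, the same train-then-predict structure would typically be expressed as a distributed pipeline over a far larger labelled corpus; the sketch only shows the classification step on a single machine.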



Change history

  • 23 October 2020

    Springer Nature’s version of this paper was updated to present the correct Arabic characters in the text body.


Acknowledgments

The work carried out in this paper is supported by the HPC Center at King AbdulAziz University (KAU). The experiments reported were performed on the Aziz supercomputer at KAU.

Author information

Corresponding author

Correspondence to Ebtesam Alomari.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Alomari, E., Katib, I. & Mehmood, R. Iktishaf: a Big Data Road-Traffic Event Detection Tool Using Twitter and Spark Machine Learning. Mobile Netw Appl 28, 603–618 (2023). https://doi.org/10.1007/s11036-020-01635-y

  • DOI: https://doi.org/10.1007/s11036-020-01635-y
