Abstract
Over the past decade, social media networks have attracted considerable attention from the general public, agencies, and researchers. Twitter is one of the fastest-growing social media tools. Using the Twitter application on smartphones, users can report events happening around them in real time. The information posted by millions of active users every day forms a continuously updated database covering a wide range of topics. Twitter data can therefore serve as a major source of traffic data alongside conventional sensors. In this context, this paper presents a novel firefly algorithm-based feature selection with a deep learning model for traffic flow analysis (FFAFS-DLTFA) using Twitter data. The goal of FFAFS-DLTFA is to determine class labels for tweets with respect to traffic events. The proposed FFAFS-DLTFA comprises four processes: preprocessing, feature extraction, feature selection, and classification. First, tweets are preprocessed through tokenization, stop-word removal, and stemming. Next, three types of embedding vectors are used: unigram, bigram, and part-of-speech (POS) features. The firefly algorithm (FFA) is then applied to select an optimal subset of features. Finally, a deep neural network (DNN) model classifies the tweets into three classes: positive, neutral, and negative. The performance of FFAFS-DLTFA is validated on a benchmark dataset from the Kaggle repository, and the results are examined from several perspectives. The experimental results show that FFAFS-DLTFA outperforms the other techniques, with a maximum accuracy of 98.83%.
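To make the feature-selection step more concrete, the following is a minimal sketch of a firefly search over feature subsets. It is not the authors' implementation. The threshold encoding (a feature is selected when its coordinate exceeds 0.5), the parameter values `beta0`, `gamma`, and `alpha`, and the `toy_fitness` function with its `INFORMATIVE` index set are all assumptions made for illustration; in a real pipeline a fitness based on classifier performance would typically take the place of `toy_fitness`.

```python
import math
import random


def firefly_feature_selection(fitness, n_features, n_fireflies=10, n_iters=30,
                              beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Return (best_mask, best_score) found by a basic firefly search.

    Each firefly is a point in [0, 1]^n_features. A feature counts as
    selected when its coordinate exceeds 0.5 (an assumed encoding).
    """
    rng = random.Random(seed)

    def to_mask(position):
        return [1 if x > 0.5 else 0 for x in position]

    # Random initial swarm and the brightness (fitness) of each firefly.
    swarm = [[rng.random() for _ in range(n_features)]
             for _ in range(n_fireflies)]
    light = [fitness(to_mask(p)) for p in swarm]
    best_idx = max(range(n_fireflies), key=light.__getitem__)
    best_mask, best_score = to_mask(swarm[best_idx]), light[best_idx]

    for _ in range(n_iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # move only toward strictly brighter fireflies
                dist_sq = sum((a - b) ** 2
                              for a, b in zip(swarm[i], swarm[j]))
                # Attractiveness decays with squared distance.
                attract = beta0 * math.exp(-gamma * dist_sq)
                # Step toward firefly j plus a small random perturbation,
                # clipped back into [0, 1].
                swarm[i] = [
                    min(1.0, max(0.0, a + attract * (b - a)
                                 + alpha * (rng.random() - 0.5)))
                    for a, b in zip(swarm[i], swarm[j])
                ]
                light[i] = fitness(to_mask(swarm[i]))
                if light[i] > best_score:
                    best_mask, best_score = to_mask(swarm[i]), light[i]
    return best_mask, best_score


# Hypothetical fitness: reward "informative" features, penalize subset size.
INFORMATIVE = {0, 3, 4}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_score = firefly_feature_selection(toy_fitness, n_features=8)
print(best_mask, best_score)
```

The search tracks the best subset seen so far, so the returned score never falls below the best score in the initial swarm. The size penalty in `toy_fitness` stands in for the usual trade-off between classification quality and the number of selected features.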











Funding
The authors received no specific funding for this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest to report regarding the present study.
About this article
Cite this article
Mounica, B., Lavanya, K. Feature selection with a deep learning based high-performance computing model for traffic flow analysis of Twitter data. J Supercomput 78, 15107–15122 (2022). https://doi.org/10.1007/s11227-022-04468-6
DOI: https://doi.org/10.1007/s11227-022-04468-6