Twitter trends: A ranking algorithm analysis on real time data
Introduction
User-generated content has transformed the conventional World Wide Web (WWW) into the social web. Social web platforms have introduced new ways to interact and have changed the way people live. People can now easily communicate with each other all over the world and share their views on topics such as politics, education, and health. The transmission of news is the best example of this rapid, worldwide exchange of information. It has freed people from dependence on government-run or large-organization media, such as television channels, for spreading news and information globally in an instant. Devices such as mobile phones, tablets, and smartphones are used every day to create a plethora of data and spread it over the internet through blogs, websites, and social network services (SNSs) such as Twitter, YouTube, and Facebook. These online networks contain information about people’s personal experiences, opinions, ideas, and thoughts in various modes (Ye, Law, Gu, & Chen, 2011). The opinions and behaviors of people can be predicted from the content they post on these social networks (Figueiredo, Almeida, Gonçalves, & Benevenuto, 2016). For example, Zhang et al. (2012) argue that real-time user-generated content related to vote casting can be collected and used to foresee election results. An SNS is a platform that uses user-generated data to create relationships and social networks. These networks operate on the Internet and had 2.46 billion users worldwide in 2017 (Pesonen, 2018). They have not only become a part of everyday modern life but are also now an important area of research.
These networks allow researchers to study a range of topics: the circulation of information in SNS (Scanfeld et al., 2010; Naveed et al., 2011; Weeks et al., 2013), social network structures (Krumm et al., 2008; Kwak et al., 2010; Leskovec et al., 2009), predictions and speculations (Paul and Dredze, 2011; Becker et al., 2010; Figueiredo et al., 2016; Papacharissi et al., 2012), and the influence SNS have on other resources (Hermida and Thurman, 2008; Ye et al., 2011; Zhang et al., 2012; Papacharissi et al., 2012; Bruns and Burgess, 2012; Lee and Ma, 2012).
The intensive research on SNS is driven by the importance of the information they can provide. Many academic researchers believe that it is important to understand the relationship between SNS and news and how they influence each other (Weeks et al., 2013; Lerman and Ghosh, 2010; Lee and Ma, 2012; Nielsen and Schrøder, 2014; Tsagkias et al., 2011), but no one has examined the characteristics of the content of each field. Tsagkias, Weerkamp, and De Rijke (2009) state that it is important to study and analyze the characteristics of multiple closely related datasets to understand the hidden informative patterns.
The language of SNS is informal, as is its grammar; its keywords are often unrelated to the content, and its topics are personal and range well beyond any fixed topic boundary. This makes it difficult to capture any specific event. Text analysis and mining algorithms extract current trends from various text sources, and these trends depict the prevailing topics in a particular society. This paper performs trend analysis using a framework that consists of four steps. The first step is data extraction, which is done using the Twitter streaming API (Application Programming Interface). The second step is preprocessing, which includes data cleaning, data integration, and data reduction. The third step is to calculate Term Frequency-Inverse Document Frequency (TF-IDF), which consists of calculating Term Frequency (TF), Inverse Document Frequency (IDF), and the Combined Component Approach (CCA). The fourth and last step is the ranking of top topics. This paper analyzes Twitter data using ranking algorithms. It further extracts useful characteristic features of SNS that can also be used in future research. Overall, the major research contributions of this work are as follows:
- To find the relevant trends, we use the TF-IDF and CCA models to identify the relevant topics for the most frequently occurring keywords in the collection of tweets.
- We use the Biterm Topic Model (BTM) to find the topics in the tweet collection across different categories. BTM is well suited to short texts, especially tweets, because it learns topics from word co-occurrence patterns aggregated over the whole corpus.
- We also analyze the frequent keywords of the trending topics from different perspectives and then discuss their characteristics in detail.
- We perform a detailed analysis of how Twitter trends change over a particular period of time using real-time data. In this regard, we analyze data on trending Twitter topics that change continuously with time.
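The TF-IDF ranking mentioned in the contributions above can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the natural-log IDF, the choice to sum per-tweet scores into one score per term, and the names `tokenize` and `rank_terms` are all assumptions made for illustration. The CCA component is not reproduced because it is not defined in this excerpt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_terms(tweets, top_k=5):
    """Rank terms by their TF-IDF score summed over all tweets.

    Each tweet is treated as one document. TF is the term count divided by
    the tweet length; IDF is log(N / df), so a term that appears in every
    tweet receives a score of zero.
    """
    docs = [tokens for tokens in (tokenize(t) for t in tweets) if tokens]
    n_docs = len(docs)

    # Document frequency: in how many tweets does each term occur?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    scores = Counter()
    for tokens in docs:
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / df[term])
            scores[term] += tf * idf

    return scores.most_common(top_k)


# Hypothetical example tweets, not taken from the paper's dataset.
sample = [
    "election results tonight",
    "election debate tonight",
    "football match tonight",
]
for term, score in rank_terms(sample, top_k=3):
    print(f"{term}: {score:.3f}")
```

In a corpus this small, rare terms outrank moderately frequent ones; on a realistic tweet collection the summed score rewards terms that recur across many tweets without appearing in nearly all of them.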
The rest of the paper is organized as follows. Section 2 discusses the related work. Section 3 presents the problem statement and research objectives. Section 4 discusses the research methodology, while Section 5 discusses the experimental setup. Section 6 presents the analysis of the datasets: it analyzes the common trends of the SNS data and explores the efficiency of the ranking algorithm. Section 7 discusses the research implications, and the final section concludes the paper.
Related work
Numerous academic researchers have studied the relationship between SNS datasets and real-world topics (Paul and Dredze, 2011, Wu et al., 2017, Wu et al., 2017). They have used SNS data to explore the causes and effects of particular phenomena and to understand patterns of public behavior in certain circumstances.
Ye et al. (2011) explored the connection between hotel reservations and traveler reviews. Their study revealed that if positive reviews about
Problem statement and research objectives
This section discusses the problem statement and major objective of this research.
Research methodology
The approach used in this research is based on a similar model built on news and social media services (Jang and Yoon, 2018). Fig. 1 illustrates the proposed framework, showing the steps of the research methodology carried out in this study. First, the data extraction module prepares the data using the Twitter streaming API and stores the results in an .xlsx file. The next step covers data preprocessing steps such as data
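The preprocessing rules are not spelled out in this excerpt, so the following is only a sketch of what a tweet-cleaning step might look like. The stop-word list, the regular expressions, the duplicate removal, and the function names `clean_tweet` and `preprocess` are assumptions for illustration, not the authors' pipeline.

```python
import re

# Assumed, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "on", "and", "rt"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^a-z0-9#\s]")


def clean_tweet(text):
    """Data cleaning: lowercase, then strip URLs, @mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)
    # Keep the word behind a hashtag but drop the '#' marker itself.
    tokens = [token.lstrip("#") for token in text.split()]
    return [t for t in tokens if t and t not in STOPWORDS]


def preprocess(tweets):
    """Data reduction: drop empty tweets and exact duplicates after cleaning."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        tokens = clean_tweet(tweet)
        key = tuple(tokens)
        if tokens and key not in seen:
            seen.add(key)
            cleaned.append(tokens)
    return cleaned


# Hypothetical input, not taken from the paper's dataset.
print(preprocess([
    "RT @user: The #Election results are in! https://t.co/abc",
    "Hello world",
    "hello WORLD!",
]))
```

Removing duplicates after cleaning, rather than before, means retweets and re-posts that differ only in case, punctuation, or links collapse into a single entry.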
Experimental setup
This section introduces the method used to select the dataset for the collected corpus and then presents the general characteristics of the SNS data. Table 1 summarizes the sources of the collected Social Network Services (SNS) data. Data is collected using the Twitter application programming interface (API), which extracts data from Twitter sources. Datasets were collected for 15 days from November 13 to November 28, 2018. The Twitter API collected 63,538 tweets with a total volume of
Results and discussions
This section analyzes the characteristics of social network service data and examines the efficiency of the ranking algorithm. Salton and Buckley (1988) summarized the gains in automatic term weighting and provided different term indexing models against which content analysis procedures can be compared. Robertson and Walker (1999) presented a Basic Search System (BSS) that uses weighting functions and term ranking for selection. They compared the significance of
Research implication
The main research implications of Twitter trend detection include event detection, which helps in finding out what is going on around the world and in the country; top news viewing, since most Twitter users talk about news; and influential user detection, since particular users are sometimes behind the trends. Other applications include timeline ranking and search query expansion.
Conclusion and future work
This study was conducted to analyze the social trends of people regarding different aspects of life. Activity on social media networks, blogs, and microblogs (such as Twitter) is mainly carried out through text, links, informative content, and images. Some content attracts more attention, especially content involving visual images, tagging, commenting, links, and sharing. Content that is popular among a particular set of people indicates current trends, high-alert news, interests, or information.
CRediT authorship contribution statement
Hikmat Ullah Khan: Conceptualization, Validation, Formal analysis, Investigation, Writing - review & editing, Resources, Supervision, Project administration. Shumaila Nasir: Methodology, Formal analysis, Data curation, Writing - original draft. Kishwar Nasim: Software, Data curation, Formal analysis. Danial Shabbir: Methodology, Visualization. Ahsan Mahmood: Methodology, Software, Writing - review & editing, Visualization.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (53)
- et al. (2017). The tweeter matters: Factors that affect false memory from Twitter. Computers in Human Behavior.
- et al. (2019). Real-time event detection from the Twitter data stream using the TwitterNews+ Framework. Information Processing & Management.
- et al. (2018). Predicting information diffusion on Twitter-Analysis of predictive features. Journal of Computational Science.
- et al. (2018). The evolution of sentiment analysis—A review of research topics, venues, and top cited papers. Computer Science Review.
- Abel, F., Gao, Q., Houben, G. -J. & Tao, K. (2011). Analyzing user modeling on Twitter for personalized news...
- et al. (2012). Who is retweeting the tweeters? modeling, originating, and promoting behaviors in the Twitter network. ACM Transactions on Management Information Systems (TMIS).
- et al. (2017). Social media and fake news in the 2016 election. Journal of Economic Perspectives.
- et al. (2015). Communities of followers in tourism Twitter accounts of European countries. European Journal of Tourism, Hospitality and Recreation.
- Becker, H., Naaman, M. & Gravano, L. (2010). Learning similarity metrics for event identification in social media....
- et al. (2018). Multi-class sentiment analysis in Twitter: What if classification is not the answer. IEEE Access.
- Tweeters on campus: Twitter a learning tool in classroom? Journal of Universal Computer Science.
- Good friends, bad news-affect and virality in Twitter.
- Characteristics analysis of data from news and social network services. IEEE Access.
- Twitter as a teaching practice to enhance active and informal learning in higher education: The case of sustainable tweets. Active Learning in Higher Education.
- On Modelling for Bias-Aware Sentiment Analysis and Its Impact in Twitter. Journal of Web Engineering.