Studies on a multidimensional public opinion network model and its topic detection algorithm
Introduction
The widespread application of Internet technology has made the Internet an important channel through which people release, exchange, and obtain information. By the end of 2016, the number of netizens worldwide had reached 3.42 billion (Internet Live Stats, 2016), and the number in China had reached 731 million. The Internet penetration rate in China is considerably higher than the world average (CNNIC, 2016). The public opinion information on “We the Media” networks, which are currently the main carriers for transmitting public opinions, originates from various sources. In the “We the Media” era, the voices of “mainstream media” have gradually weakened; people are no longer willing to rely solely on a “unified voice” to learn what is true and what is false, and instead tend to make judgments based on independently obtained information. Because public opinions on “We the Media” are transmitted in a concealed and complicated manner, the traditional “point to surface” transmission routes have been replaced with “point to point” peercasting, and the public opinions thus transmitted via the Internet are in the spotlight. The endless stream of public opinion topics is causing increasingly heated discussions, and rumors spread unchecked, driven by unhealthy social psychology. For instance, the PRISM-gate scandal in 2013, the disappearance of Malaysia Airlines Flight MH370 in 2014, and the Tianjin Port explosion in 2015 all generated extensive discussions on “We the Media” platforms, such as Microblog, Facebook, and Twitter. Moreover, they continue to generate many unpredictable secondary public opinion topics. Therefore, studies of online topic detection on “We the Media” networks have gradually attracted attention from scholars, management experts, and the public.
Research on topic detection is a complex, systematic, and multidisciplinary undertaking. Such studies can be conducted from the perspectives of social psychology, systems science, computer science, etc. Studies of topic detection from a social psychology perspective mine public opinion content primarily by counting the number of news items and posts on various platforms, such as websites and microblogs. The totals are then used as indicators of the attention that a topic has attracted (Ali et al., 2016, Griffiths and Steyvers, 2004). The topics are then ranked by popularity based on frequency and time (Galam, 2008, He et al., 2006, Sobkowicz, 2011). Studies from the perspective of systems science are conducted on the basis of the structure of the networks on which public opinions are released and transmitted (Tian, Zhang, & Liu, 2015). These studies use parameters from social network models, such as the modular coefficient, point centrality, and eigenvector centrality (Freeman, 1979, Li and Liu, 2016, Lorenz and Urbig, 2007), to identify opinion leaders or extract hotspot keywords (Choi and Han, 2013, Xu et al., 2014). Studies from a computer science perspective give more attention to the selection and optimization of topic clustering methods. The clustering algorithms most frequently used in early topic detection attempts include the K-means algorithm (Nguyen, 1998, Papka, 1999) and the single-pass algorithm (Allah et al., 2007, Papka and Allan, 2002). Subsequent studies compared the effects of the respective clustering algorithms and improved various algorithms to obtain better results (Chen and Jin, 2016, Makkonen et al., 2004). Specifically, these optimized algorithms include the incremental hierarchical clustering algorithm (Trieschnigg & Kraaij, 2004), the incremental K-means algorithm (Yang, Carbonell, & Brown, 1999), and the K-nearest neighbor classification algorithm (Hong, Zhang, & Liu, 2007).
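To make the single-pass idea mentioned above concrete, the following is a minimal sketch rather than any of the cited implementations. It assumes whitespace tokenization, raw term counts as document vectors, cosine similarity, and a hypothetical threshold of 0.3; the function names `cosine` and `single_pass` and the sample posts are illustrative only.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(docs, threshold=0.3):
    """Assign each document, in arrival order, to its most similar cluster.

    A cluster is represented by the summed term counts of its members.
    A document that is not similar enough to any cluster starts a new one.
    Returns one cluster index per document.
    """
    centroids = []  # one Counter per cluster
    labels = []
    for text in docs:
        vec = Counter(text.lower().split())
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)  # fold the document into the cluster
            labels.append(best)
        else:
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels


# Illustrative posts (invented for this sketch, not data from the paper).
posts = [
    "explosion at tianjin port warehouse",
    "tianjin port explosion casualties rise",
    "flight mh370 search continues",
    "search for missing flight mh370",
]
labels = single_pass(posts)
print(labels)
```

Because each document is seen only once, the result depends on arrival order and on the threshold, which is one reason the later studies cited above compare and refine such algorithms.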
As the number of in-depth studies on public opinion topic detection has grown, related conferences and exchanges have also become more frequent. At the 24th International World Wide Web Conference (WWW, 2015), as many as 321 papers had web mining, social network, or content analysis themes, some 36.5% of all the papers submitted to the conference. At top-level academic meetings in the field of information retrieval, including SIGIR (Special Interest Group on Information Retrieval) seminars and workshops, studies related to online information mining are regarded as increasingly important. For example, the paper “A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models,” which won the best paper award at SIGIR 2017 (Wang, Yu, & Zhang, 2017), was the first to propose integrating the models of different schools of information mining through reciprocal training to make the retrieved documents more precise. The ACM International Conference on Web Search and Data Mining (WSDM) focuses on algorithms, assessments, and applications concerning online data retrieval and data mining. A paper titled “Information Evolution in Social Networks,” from a research program at the University of Michigan, was nominated for the best paper award in 2016 (Adamic, Lento, & Adar, 2016). Moreover, close attention is being paid to the retrieval and mining of public opinion information at artificial intelligence and information retrieval conferences such as CIKM, NAACL, IJCAI, and AAAI.
The choice of research approach and the analysis of the sentiment tendency of public opinion are important aspects of topic detection research. To some extent, the precision of topic detection is constrained by the approach taken to online public opinion research. The topic detection algorithms mentioned above are based on probabilistic statistical distribution models (Allan et al., 1998, You et al., 2004), on natural language-processing models (Makkonen et al., 2004, Porte and Chisholm, 1980), or on relational network models (Cheng et al., 2011, Qi et al., 2015, Tan et al., 2011). Through the analysis of text content similarity and text transmission authority (Chakraborty and Chakraborty, 2007, Fiscus and Doddington, 2002), the online topics published by traditional news websites can be detected by using the first two approaches (Hong et al., 2007, You et al., 2004). The relational network model is the main approach used for online topic detection on “We the Media” networks because such models meet the requirement for interactive detection of public opinion topics (Fersini et al., 2017, Li and Du, 2011, Xu et al., 2014). In related research projects, single-layered viewpoint networks or single-layered social networks have been proposed based on the keyword coexistence network model (Cheng et al., 2011, Suchecki and Eguiluz, 2005, Yang et al., 2016). Topic sentiment tendency analysis is an important branch of semantic mining. In the existing studies, by combining the advantages of various machine learning methods (Wu, Wu, & Wu, 2018), scholars have constructed unsupervised models such as the Joint Sentiment/Topic Model (Lin & He, 2009) and the Aspect and Sentiment Unification Model (Jo & Oh, 2011) to jointly detect text sentiment and topics. In recent years, some scholars have begun to pay attention to the network environments behind public opinions; this line of work uses the relationships formed when users follow or praise one another to construct relational models for sentiment analysis (Nozza et al., 2014, Sathik and Rasheed, 2010). For example, Fersini, Pozzi, and Messina (2017) built an approval network and proposed a Network Aspect Sentiment model to judge topic sentiment tendency.
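The keyword coexistence network and the centrality measures discussed earlier can be illustrated with a short sketch. It is not taken from any of the cited studies: it assumes that keywords have already been extracted from each post, that an edge weight counts the posts in which two keywords appear together, and that hotspot keywords are ranked by normalized degree centrality. The function names and sample keywords are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(posts):
    """Build an undirected weighted graph over keywords.

    Keywords are nodes; the weight of an edge is the number of posts
    in which both keywords appear.
    """
    weights = defaultdict(int)
    for keywords in posts:
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(weights):
    """Normalized degree centrality: number of neighbours / (n - 1)."""
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {k: 0.0 for k in neighbours}
    return {k: len(v) / (n - 1) for k, v in neighbours.items()}


# Illustrative keyword lists (invented for this sketch).
posts = [
    ["explosion", "port", "rescue"],
    ["explosion", "port", "chemicals"],
    ["explosion", "rumor"],
]
graph = cooccurrence_network(posts)
centrality = degree_centrality(graph)
hotspot = max(centrality, key=centrality.get)
print(hotspot, centrality[hotspot])
```

A single-layered network of this kind captures which viewpoints coexist, but not who expressed them or in what psychological state, which is the limitation the multidimensional model is meant to address.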
Because “We the Media” networks are real time, open, and lack a gatekeeper system, the detection of online topics is even more complicated. In most of the existing studies, a topic is mainly defined as an “event” from the perspective of the news-media information flow, and most of the specific algorithms for topic detection are based on probabilistic statistical models or on natural language-processing models. However, a single public opinion event on “We the Media” networks often contains many different views or topics. Thus, it is necessary to fully consider the real process by which topics are generated within the same event. Although some scholars have attempted to use the structural similarity of social networks or viewpoint networks to design topic detection algorithms, no analysis has been conducted on the inherent paradigmatic relations between different attribute networks that focuses on the systematic transmission of public opinions. These research projects have paid attention to the interactions between netizens or emphasized the coexistence of core viewpoints in public opinions; however, such approaches make it difficult to fulfill the need for scientific detection of topics on “We the Media” networks. In the present paper, a multidimensional public opinion model is proposed based on the process by which public opinions in “We the Media” are transmitted. This model also attempts to overcome a number of limitations that traditional analytical tools impose on public opinion studies. In addition, a topic detection algorithm oriented toward multidimensional networks is designed based on the joint analysis of structure similarity and content similarity both inside and outside the subnetworks. Finally, the developed algorithm is applied to the detection of multiple topics in a specific public opinion event on “We the Media” networks.
Section snippets
The significance of this research
In the “We the Media” era, negative public opinion topics spread and evolve rapidly. Thus, scholars in several related fields have focused on timely detection, accurate identification, and scientific judgment of topics. Such efforts are useful for understanding the voices of netizens and the penetration of negative topics in real time. Research on public opinion topic detection on “We the Media” networks is therefore of great practical and theoretical significance.
Multidimensional public opinion network models oriented at “We the Media”
In the “We the Media” era, public opinion information transmission is affected by the complex differences resulting from the varied social roles of netizens. In this paper, a multidimensional network model aimed at revealing the topological rules for public opinions on “We the Media” is proposed by applying the concept of an element matrix to analyze the associations between public opinion elements and by using element attribute classifications from social psychology (Milburn, 1991) and
Topic detection algorithms for multidimensional public opinion networks
Public opinion topics on “We the Media” are complex and changeable. In this section, a topic detection algorithm for multidimensional public opinion networks is proposed that considers the structures of the social dimension subnetwork, the psychological dimension subnetwork, and the viewpoint dimension subnetwork as well as the roles of their content attributes in driving topic formation on “We the Media” networks.
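The snippet above does not give the algorithm itself, so the following is only a sketch of how a structural measure and a content measure might be combined for two nodes of a subnetwork. Everything specific here is an assumption rather than the authors' method: Jaccard overlap of neighbour sets stands in for structure similarity, cosine similarity of keyword weights stands in for content similarity, and `alpha` is a hypothetical mixing weight.

```python
import math


def jaccard(a, b):
    """Structural similarity: overlap between two neighbour sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Content similarity between two keyword-weight dictionaries."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def combined_similarity(node_a, node_b, alpha=0.5):
    """Weighted mix of structural and content similarity.

    `alpha` is a hypothetical mixing weight, not a value from the paper:
    alpha = 1 uses structure only, alpha = 0 uses content only.
    """
    structure = jaccard(node_a["neighbours"], node_b["neighbours"])
    content = cosine(node_a["keywords"], node_b["keywords"])
    return alpha * structure + (1 - alpha) * content


# Two illustrative nodes (invented for this sketch).
a = {"neighbours": {"u1", "u2", "u3"}, "keywords": {"explosion": 2, "port": 1}}
b = {"neighbours": {"u2", "u3", "u4"}, "keywords": {"explosion": 1, "port": 2}}
score = combined_similarity(a, b)
print(round(score, 2))
```

A score of this kind could feed any clustering step that groups nodes into topics; how the paper weights the social, psychological, and viewpoint dimensions is not stated in the snippet and is not assumed here.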
Empirical analysis on multidimensional network topic detection algorithms: a case study
To conduct empirical studies, we selected a case study of public opinion concerning the Tianjin Port explosion, whose public opinion development process is relatively complete, and applied the proposed multidimensional network topic detection algorithm.
Conclusion
With their rapid development, “We the Media” networks have increasingly become the main medium for the generation and dissemination of public opinion topics. By scientifically detecting public opinion topics concerning an incident on “We the Media” networks, it is possible to understand people's opinions and their negative views in a timely fashion. This capability is therefore significant for cultivating a favorable online environment. In the existing topic detection studies, a topic is
Acknowledgments
This research is supported by the National Natural Science Foundation of China (NSFC) (71603250), and the Key projects of state key research and development plan of Ministry of Science And Technology, China (2016YFC0503407).
References (69)
- et al.
Opinion mining based on fuzzy domain ontology and support vector machine: A proposal to automate online review classification
Applied Soft Computing
(2016)
- et al.
A fuzzy clustering methodology for linguistic opinions in group decision making
Applied Soft Computing
(2007)
- et al.
Representative reviewers for Internet social media
Expert Systems with Applications
(2013)
Centrality in social networks: I. conceptual clarification
Social Networks
(1979)
- et al.
Similarity measures in scientometric research: The Jaccard index versus Salton's cosine formula
Information Processing & Management
(1989)
- et al.
Who is talking? An ontology-based opinion leader identification framework for word-of-mouth marketing in online social blogs
Decision Support Systems
(2011)
- et al.
Superedge prediction: What opinions will be mined based on an opinion supernetwork model?
Decision Support Systems
(2014)
- et al.
Subjective well-being measurement based on Chinese grassroots blog text sentiment analysis
Information & Management
(2015)
- et al.
SSIC model: A multi-layer model for intervention of online rumors spreading
Physica A Statistical Mechanics & Its Applications
(2015)
- et al.
Tracing public opinion online: An example of use for social network analysis in communication research
Procedia - Social and Behavioral Sciences
(2013)
Superedge coupling algorithm and its application in coupling mechanism analysis of online public opinion supernetwork
Expert Systems with Applications
A hybrid unsupervised method for aspect term and opinion target extraction
Knowledge-Based Systems
Effect of unfolding on the spectral statistics of adjacency matrices of complex networks
Procedia Computer Science
Information evolution in social networks
On-line single-pass clustering based on diffusion maps
Topic detection and tracking pilot study final report
Virtual round table on ten leading questions for network research
European Physical Journal B
New Avenues in opinion mining and sentiment analysis
IEEE Intelligent Systems
Growth trends prediction of online forum topics based on artificial neural networks
Journal of Convergence Information Technology
Weibo topic detection based on improved TF-IDF algorithm
Science & Technology Review
Research on method of public opinion topic evolution analysis based on time sliced topic
Journal of Central China Normal University
Statistical report on Internet development in China
Approval network: A novel approach for sentiment analysis in social networks
World Wide Web-Internet & Web Information Systems
Topic detection and tracking evaluation overview
Sociophysics: A review of Galam models
International Journal of Modern Physics C
Community structure in social and biological networks
Proceedings of the National Academy of Sciences of the United States of America
Finding scientific topics
Proceedings of the National Academy of Sciences of the United States of America
Research and design of Internet public opinion analysis system
Microcomputer Information
Semi-automatic hot event detection
Topic detection and tracking review
Journal of Chinese Information Processing
Internet users in the world
Aspect and sentiment unification model for online review analysis
Opinion mining using decision tree based feature selection through Manhattan hierarchical cluster measure
Journal of Theoretical & Applied Information Technology
Aspect based topic and opinion mining
International Journal of Computer Trends & Technology