SCIFNET: Stance community identification of topic persons using friendship network analysis
Introduction
With the prevalence of telecommunication technologies and the explosive growth in media digitization, there are now enormous amounts of information on the Internet. As a result, people worldwide can easily obtain information about the latest topics, such as global economic trends, political events, and sports tournament results, via the Internet. Usually, people are interested in topics that involve communities with different competing viewpoints or stances. However, they are often overwhelmed by the large number of topic documents that cover every detail of the different stance communities. For example, for the topic of the selection of a new International Monetary Fund (IMF) president in 2011, Google News collected hundreds of topic documents that reported the development of the campaign. Although the documents covered all perspectives on the topic (i.e., from the interactions between the candidates to the viewpoints of the general public), readers generally had difficulty assimilating the enormous amount of information in the documents. To ease the burden of reading so many topic documents, several topic mining techniques have been developed. For instance, Nallapati et al. [35] grouped topic documents into clusters, each of which presents a theme of a topic; Feng and Allan [22] extracted informative sentences from themes to summarize a topic; and Chen and Chen [5], [6] further organized themes and summaries chronologically to depict the storyline of a topic. These techniques successfully condense the content of a topic. However, readers who are not familiar with the topic still need to invest a lot of time in digesting the generated summaries.
A topic is basically associated with persons, times, and places [35]. Learning the associations between the persons mentioned in a set of topic documents (called topic persons hereafter) can help readers construct the background knowledge of the topic and digest the information quickly. For instance, in the above-mentioned topic about the new IMF president selection, if readers had known that Angela Merkel supported Christine Lagarde (i.e., they are detected in the same community), they would have understood why she said “Christine Lagarde is an ideal embodiment of economics.”
In this paper, we investigate the stance community identification problem, which involves clustering topic persons into stance-coherent communities. For instance, given the documents about the selection of the new IMF president in 2011, the stance community identification method discovers communities of persons, which represent the camps of the different candidates running for election, as shown in Fig. 1. Identifying stance communities of topic persons is a new research area, and to the best of our knowledge, only Chen et al. [7], [8] have addressed the stance community identification problem. They proposed using Principal Component Analysis (PCA) [4]. Specifically, they examine the signs of the entries in the eigenvector associated with the largest eigenvalue to recognize stance communities of topic persons. The method can only handle two-stance topics; however, in practice, many topics involve more than two stances. Here, we present a novel stance community identification method called SCIFNET (Stance Community Identification based on Friendship NETwork), which analyzes a set of topic documents to identify stance communities and the corresponding persons in a topic. First, SCIFNET constructs a friendship network in which the nodes represent topic persons. The co-occurrence of the persons in the topic documents, the documents’ stance orientation, and the co-neighboring level between nodes are leveraged to define the friendship strength between persons (i.e., the edge weights). We model stance community identification as a community detection task and design an objective function to evaluate the results. Stance community expansion and stance community refinement techniques, which are based on the objective function, are designed to iteratively cluster topic persons into stance-coherent communities and detect persons that are stance-irrelevant to the topic of interest. We present convergence proofs for both techniques, showing that the identification result converges to a local optimum. Evaluations based on real-world topics demonstrate the effectiveness of SCIFNET, and show that it outperforms well-known clustering and community detection approaches.
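The sign-based idea attributed to Chen et al. [7], [8] above can be illustrated with a minimal sketch. It is not their implementation: the person-by-document occurrence counts, the placeholder names P1 to P4, the row centering, and the use of power iteration to approximate the principal eigenvector are all assumptions made for illustration. The sketch only shows why reading eigenvector signs yields exactly two groups, which is the limitation noted above.

```python
from math import sqrt

def principal_eigenvector(matrix, iterations=200):
    """Approximate the eigenvector of the largest eigenvalue by power iteration."""
    n = len(matrix)
    # Deterministic, non-uniform start vector (an arbitrary illustrative choice).
    vec = [1.0 / (i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # start vector orthogonal to every eigenvector with nonzero eigenvalue
            break
        vec = [x / norm for x in nxt]
    return vec

def bipolarize(occurrence):
    """Split persons into two groups by the sign of their principal-eigenvector entry."""
    names = list(occurrence)
    centered = []
    for name in names:
        row = occurrence[name]
        mean = sum(row) / len(row)
        centered.append([x - mean for x in row])
    # Person-by-person (unnormalized) covariance of the occurrence patterns.
    cov = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered] for r1 in centered]
    vec = principal_eigenvector(cov)
    positive = {name for name, v in zip(names, vec) if v >= 0}
    return positive, set(names) - positive

# Hypothetical data: each list marks in which of six documents a person appears.
occurrence = {
    "P1": [1, 1, 1, 0, 0, 0],
    "P2": [1, 1, 0, 0, 0, 0],
    "P3": [0, 0, 0, 1, 1, 1],
    "P4": [0, 0, 0, 0, 1, 1],
}
camp_a, camp_b = bipolarize(occurrence)
print(sorted(camp_a), sorted(camp_b))
```

Because an eigenvector entry is either non-negative or negative, this kind of split can never produce a third group, whichever camp ends up with the positive sign.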
The proposed method has the following advantages over current community detection research. First, most iterative clustering-based community detection methods, such as those in [20], [31], [48], suffer from the early merging problem: a node in a network tends to be merged (clustered) with a community simply because it is close to the community's seed. To alleviate this problem, we design a stance community refinement step that iteratively refines the detected communities. Second, nodes in a social (friendship) network can play different roles. In contrast to the overlapping, bridge, and hub nodes investigated in [13], [14], [21], the proposed method is able to identify stance-irrelevant nodes, which stand for persons neutral to the stances of a topic. Finally, since topic persons may have opposing orientations, the constructed friendship network can have negative edges. While several community detection methods, such as [13], [21], [32], analyze network structures to infer communities, our method further examines edge signs to correctly detect stance communities of topic persons.
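The interplay of early merging, refinement, and edge signs described above can be sketched on a toy signed network. This is not SCIFNET's objective function or refinement procedure, neither of which is reproduced in this excerpt. The objective used here (the sum of signed edge weights inside communities), the greedy move rule, the placeholder names p1 to p4, and the edge weights are all assumptions chosen for illustration.

```python
def objective(weights, assignment):
    """Sum of signed edge weights whose two endpoints share a community."""
    return sum(
        w for pair, w in weights.items()
        if len({assignment[n] for n in pair}) == 1
    )

def gain(weights, assignment, node, label):
    """Signed weight between `node` and the other members of community `label`."""
    return sum(
        weights.get(frozenset((node, other)), 0.0)
        for other, c in assignment.items()
        if c == label and other != node
    )

def refine(weights, assignment):
    """Greedy local search: move a node only if the move strictly raises the objective."""
    assignment = dict(assignment)
    labels = sorted(set(assignment.values()))
    moved = True
    while moved:
        moved = False
        for node in sorted(assignment):
            current = assignment[node]
            best = max(labels, key=lambda lab: gain(weights, assignment, node, lab))
            if gain(weights, assignment, node, best) > gain(weights, assignment, node, current):
                assignment[node] = best
                moved = True
    return assignment

# Hypothetical signed friendship network: positive = friendly, negative = opposing.
weights = {
    frozenset(("p1", "p2")): 2.0,
    frozenset(("p3", "p4")): 2.0,
    frozenset(("p1", "p3")): -2.0,
    frozenset(("p1", "p4")): -2.0,
    frozenset(("p2", "p3")): -1.0,
    frozenset(("p2", "p4")): -1.0,
}
# p2 starts in the wrong community, mimicking an early merge.
initial = {"p1": 0, "p2": 1, "p3": 1, "p4": 1}
refined = refine(weights, initial)
print(refined, objective(weights, initial), objective(weights, refined))
```

Each accepted move changes this toy objective by the difference between the two gains, which is strictly positive, and the number of possible assignments is finite, so the loop terminates at a local optimum. That mirrors, in a much simpler setting, the kind of convergence property the paper claims for its own techniques.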
The remainder of this paper is organized as follows. In the next section, we review related work. Then, we describe SCIFNET in detail and demonstrate its efficiency in the experimental section. The final section contains our conclusions.
Related work
Our research is related to community detection [41]. Given a network of interests, the community detection task involves identifying sub-networks, each of which represents a coherent community [12], [24], [36], [39]. For instance, given a social network, community detection methods identify groups of people with similar preferences [41]. The identified communities are useful to comprehend various social phenomena, such as epidemic spreading [43], and human interactions [14], [15], [40], [42].
Methodology
We propose a stance community identification method, SCIFNET, which clusters the persons mentioned in topic documents into stance-coherent communities. Fig. 2 shows SCIFNET's system architecture, which comprises three components: friendship network construction, stance community expansion, and stance community refinement. Specifically, given a set of documents reporting a topic with K stance communities, SCIFNET first extracts the topic persons mentioned in the documents. Then, it
Experiment
In this section, we introduce the data corpus used in the experiments; demonstrate the effectiveness of each system component; and compare our method's performance with those of other well-known community detection methods and clustering algorithms. Then, we present a stance community identification result and discuss the stance-irrelevant persons detected by our method.
Concluding remarks
The Internet has become a crucial medium for disseminating and acquiring the latest information about topics. However, users are often overwhelmed by the enormous number of topic documents. Basically, times, places, and persons are the key elements of topics. Knowing the associations of topic persons can help readers construct the background knowledge of a topic and comprehend numerous topic documents quickly. In this paper, we defined the problem of stance community identification, which
Acknowledgement
This research was supported in part by MOST 103-2221-E-002-106-MY2 from the Ministry of Science and Technology, Republic of China.
Reference (56)
- et al. Uncovering overlapping community structures by the key bi-community and intimate degree in bipartite networks. Physica A (2014)
- et al. Detecting overlapping communities in networks using the maximal sub-graph and the clustering coefficient. Physica A (2014)
- et al. Detecting community structure via the maximal sub-graphs and belonging degrees in complex networks. Physica A (2014)
- et al. Approximating web communities using sub-space decomposition. Knowl. Based Syst. (2014)
- et al. Community detection using local neighborhood in complex networks. Physica A (2015)
- et al. Overlapping community detection using neighborhood ratio matrix. Physica A (2015)
- et al. Co-authorship networks in the digital library research community. Inf. Process. Manag. (2005)
- et al. Detecting overlapping communities by seed community in weighted complex networks. Physica A (2013)
- et al. Uncovering the overlapping community structure of complex networks by maximal cliques. Physica A (2014)
- et al. An automated system for grammatical analysis of Twitter messages: A learning task application. Knowl. Based Syst. (2016)
- Detecting communities by the core-vertex and intimate degree in complex networks. Physica A
- Topic detection and tracking pilot study final report
- Communities and balance in signed networks: a spectral approach
- Simrank++: query rewriting through link analysis of the click graph
- Bayesian Reasoning and Machine Learning
- TSCAN: a novel method for topic summarization and content anatomy
- TSCAN: a content anatomy approach to temporal topic summarization. IEEE Trans. Knowl. Data Eng.
- An unsupervised approach for person name bipolarization using principal component analysis. IEEE Trans. Knowl. Data Eng.
- Bipolar person name identification of topic documents using principal component analysis
- Detecting communities in social networks using max-min modularity
- Local Community Identification in Social Networks
- A social hypertext model for finding community in blogs
- Finding community structure in very large networks. Phys. Rev. E
- A min-max cut algorithm for graph partitioning and data clustering
- Overlapping community detection based on network decomposition. Sci. Rep.
- Lower bounds for the partitioning of graphs. IBM J. Res. Dev.
- Finding and linking incidents in news
- On community outliers and their efficient detection in information networks