Event prediction based on evolutionary event ontology knowledge
Introduction
Social media platforms such as Twitter, Facebook, and Weibo generate a large amount of online news through which users discover, report, share, and discuss public events. Bursts of netizen activity on the web can be seen as a valuable real-time reflection of events as they happen. Modeling evolutionary events and predicting subsequent events is crucial to many text mining applications. Recent developments in textual inference and analytics have led to renewed interest in event prediction, which is becoming an increasingly important issue in a broad range of fields, including finance, security, policy/governance, NGO planning, and disaster coordination efforts [1], [2]. Some related works focus on predicting the importance and impact of news events [3], [4] based on keyword volume approaches [5], [6] and sequential clustering [7]. Other prevailing event prediction methods focus on dynamic web content with the chronological sequence or spatial location of event statements, including time-series (TS) methods [8], [9], [10] and spatio-temporal (ST) methods [11], [12]. Several works track social media for event prediction in the stock market [13], [14], traffic prediction [15], [16], spatial crime analysis [17], and malware propagation [18], [19]. Some studies model the latent factors of user activity histories while simultaneously exploiting timestamped interaction information among users to predict real-time events [20].
Events are presented in online news through anchor or trigger words indicating that something has happened. These main words that evoke events are called event mentions [21]. To make event mentions useful (e.g., for knowledge extraction and event prediction), some recent works leverage knowledge represented in unstructured text for event prediction. Unstructured text contains contextual explanations for much of the event knowledge and many of the evolutionary patterns that correlate with news events. Recent literature reports explicit or implicit causal relations between pairs of events, which are used to construct causal patterns among events. Fig. 1 shows an event evolutionary pattern whose seed event is ‘a fire outbreak somewhere’. Each event is in fact a generalized schema that integrates event instances. For example, the event instance ‘cause[fire, 2 people are badly burnt]’ and another instance ‘cause[fire, 3 people died]’ can both be mapped to the event knowledge ‘cause[fire, injuries and deaths]’. Subsequent events are then chained one by one through relations to form an evolutionary pattern.
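The mapping from event instances to a generalized schema can be illustrated with a minimal sketch. It assumes a hand-written cue lexicon; the names `Event`, `OUTCOME_LEXICON`, and `generalize` are hypothetical and are not taken from the paper's system, which relies on semantic analysis tools rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    relation: str  # relation type, e.g. "cause"
    trigger: str   # seed event, e.g. "fire"
    outcome: str   # instance-level text or generalized outcome


# Hypothetical lexicon: surface cue -> generalized outcome (an assumption).
OUTCOME_LEXICON = {
    "burnt": "injuries and deaths",
    "died": "injuries and deaths",
    "injured": "injuries and deaths",
}


def generalize(instance: Event) -> Event:
    """Map an event instance to its generalized schema, if a cue matches."""
    for cue, schema in OUTCOME_LEXICON.items():
        if cue in instance.outcome:
            return Event(instance.relation, instance.trigger, schema)
    return instance  # no cue matched: keep the instance unchanged


instances = [
    Event("cause", "fire", "2 people are badly burnt"),
    Event("cause", "fire", "3 people died"),
]
# Both instances collapse into one generalized event.
schemas = {generalize(e) for e in instances}
```

Because `Event` is a frozen dataclass, generalized events are hashable, so duplicate schemas merge naturally when collected in a set.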
These evolutionary event patterns extracted from event texts are used to predict special events. Esteban et al. [22] map a knowledge pattern graph to a tensor representation for event prediction. The approach of using causal patterns is called textual semantic events prediction (TSEP), which stands out among other techniques such as TS and ST. Given the events that have occurred, TSEP aims to predict plausible subsequent events through knowledge inference. For example, suppose it is observed that when an earthquake occurs in some areas, plagues also break out. When an earthquake occurs in another area whose environment and post-disaster measures are similar to those of the former, we can expect an outbreak of plague in that area as well.
However, generating a plausible subsequent event for prediction poses the following difficulties and challenges: (1) Lack of dynamic extensibility in event extraction and prediction. Existing TSEP methods construct patterns of evolutionary events at lexical granularity for event prediction. These methods are limited to the source text and cannot be automatically extended with knowledge. (2) Lack of ontology knowledge of evolutionary events. Specific patterns exist in evolutionary events, and an ontology of evolutionary events can support the modeling of such patterns. (3) Lack of appropriate knowledge-augmented computational models. Modeling event prediction requires computational frameworks that can ingest extensive source data for event extraction and integrate evolutionary event patterns.
Our work differs in many aspects from the TS, ST, and TSEP methods mentioned above. We introduce a semantic, language-independent system that predicts news events using logical knowledge that is not restricted to specific textual scenarios. Based on informative news texts from Baidu Encyclopedia entries, we extract a large number of evolutionary event patterns and represent them in the standard Web Ontology Language (OWL). The events extracted from entries are transformed into OWL structures and merged into the knowledge base. OWL is inherently suited to machine retrieval, reasoning, and ultimately prediction. Event knowledge formulated in OWL is sufficient to understand a text semantically through logical rules or patterns, and it makes logical inference easier to represent over large-scale text corpora. Our study aims to build a predictive model of news events that incorporates logic patterns. We exploit the advantages of knowledge representation to predict a subsequent event in generalized logic schemas.
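To make the idea of representing event patterns in OWL concrete, the following sketch serializes a few (head, relation, tail) patterns as Turtle text. The `ex:` namespace, the relation name `causes`, and the choice to model generalized events as named individuals of a single `ex:Event` class are all assumptions made for illustration; the paper's actual ontology schema may differ.

```python
# Standard OWL/RDF/RDFS namespaces plus a hypothetical example namespace.
PREFIXES = """@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/eeok#> .
"""


def local_name(label):
    """Turn a label such as 'injuries and deaths' into 'InjuriesAndDeaths'."""
    return "".join(word.capitalize() for word in label.split())


def to_turtle(patterns):
    """Serialize (head, relation, tail) event patterns as OWL in Turtle syntax."""
    lines = [PREFIXES, "ex:Event rdf:type owl:Class ."]
    # Each relation type becomes an object property between events.
    for rel in sorted({r for _, r, _ in patterns}):
        lines.append(
            f"ex:{rel} rdf:type owl:ObjectProperty ; "
            "rdfs:domain ex:Event ; rdfs:range ex:Event ."
        )
    # Each generalized event becomes a labeled individual of ex:Event.
    for event in sorted({e for h, _, t in patterns for e in (h, t)}):
        lines.append(
            f"ex:{local_name(event)} rdf:type owl:NamedIndividual , ex:Event ; "
            f'rdfs:label "{event}" .'
        )
    # Finally, assert the relations themselves.
    for head, rel, tail in patterns:
        lines.append(f"ex:{local_name(head)} ex:{rel} ex:{local_name(tail)} .")
    return "\n".join(lines)


turtle_doc = to_turtle([("fire", "causes", "injuries and deaths")])
```

The resulting text can be loaded by any Turtle-aware ontology tool; labels containing quotation marks would need escaping, which this sketch omits.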
Specifically, we construct evolutionary event ontology knowledge (denoted as EEOK) with semantic analysis tools and represent it in the machine-readable ontology language OWL for predicting future events. EEOK can be illustrated as a directed cyclic graph with event chains, whose nodes stand for generalized events and whose edges stand for the relations (e.g., temporal, causal) between two events. The ontology knowledge (logic patterns) of evolutionary events can be regarded as a generalization of a variety of evolutionary event instances, which are specific events and their arguments. To the best of our knowledge, there is no existing evolutionary event ontology knowledge of emergency events for Chinese corpora. Our study proposes solutions to fill this gap.
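The graph view of EEOK described above can be sketched as a small adjacency structure. This is only an illustration under stated assumptions: the class `EventGraph`, its methods, and the example events other than the fire outcome are hypothetical, and the traversal simply avoids revisiting nodes because the graph may contain cycles.

```python
from collections import defaultdict


class EventGraph:
    """Directed graph: nodes are generalized events, edges are typed relations."""

    def __init__(self):
        self._edges = defaultdict(list)  # head -> list of (relation, tail)

    def add_relation(self, head, relation, tail):
        self._edges[head].append((relation, tail))

    def successors(self, event, relation=None):
        """Candidate subsequent events, optionally filtered by relation type."""
        return [
            tail
            for rel, tail in self._edges.get(event, [])
            if relation is None or rel == relation
        ]

    def chains(self, seed, max_len=3):
        """Enumerate event chains from a seed, never revisiting an event."""
        results = []

        def walk(path):
            nexts = [t for t in self.successors(path[-1]) if t not in path]
            if len(path) == max_len or not nexts:
                results.append(path)
                return
            for tail in nexts:
                walk(path + [tail])

        walk([seed])
        return results


graph = EventGraph()
graph.add_relation("fire", "causal", "injuries and deaths")
graph.add_relation("fire", "temporal", "firefighting")         # hypothetical
graph.add_relation("injuries and deaths", "temporal", "rescue")  # hypothetical
graph.add_relation("rescue", "temporal", "fire")                # cycle, hypothetical
fire_chains = graph.chains("fire")
```

Here `graph.successors("fire", "causal")` returns the causally related candidates for the seed event, and `chains` yields the event chains from which a next event could be selected.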
It should be noted that both the generalized event knowledge and the specific event knowledge are extracted from the same news text. The semantic consistency of the two kinds of knowledge is therefore maintained, which naturally reduces knowledge alignment errors. In this way, EEOK obtains the most appropriate generalization of an occurred event and its textual environment into formulated knowledge patterns. Human experts can inject these patterns into the machine, independently of natural language, in OWL formalism as a priori or practical knowledge. The subsequent event can then be reasonably inferred from manually calibrated rules that serve as ontology knowledge of evolutionary events and carry valuable common-sense knowledge. The main contributions of this work are threefold:
(1) We build EEOK, which represents evolutionary event knowledge in the standard ontology language OWL as a set of evolutionary patterns.
(2) We use state-of-the-art NLP tools to extract specific evolutionary event instances and their corresponding generalized evolutionary event patterns. After these are merged into the constructed EEOK, candidate events are selected for further prediction. The subsequent event can be integrated into the knowledge base to update its knowledge.
(3) Because different event domains exhibit different patterns, we offer a domain-aware event prediction method. A series of experimental studies has been conducted, and our domain-aware method shows superior performance.
The code of this work is available at https://github.com/RingBDStack/KGEvetPred.
The remainder of this paper is structured as follows: Section 2 describes the related works in TSEP. Section 3 includes the critical components of our model. Section 4 presents the integral system of event prediction. Section 5 provides a problem description. Section 6 describes our model formulation for event prediction modeling. Section 7 provides quantitative and qualitative evaluation, and the last section summarizes our findings and concludes with suggestions for our future work.
Preliminary
This section introduces the definitions, notations, and problem statement for constructing the evolutionary event ontology and using it to predict a subsequent event. The event variables are listed in Table 1.
Definition 1 (Event Trigger). A trigger action represents an event; it is a predicate used to identify the event.
Definition 2 (Event Arguments). Event arguments are mentioned entities, such as temporal expressions or values (e.g., location, organization), that serve as participants or attributes with a specific role in an event. The arguments of a
System description
The overview of the system architecture is shown in Fig. 3. News entries are first processed continuously by a pipeline of text semantic processes. Based on the extracted semantic feature tags, the news text then undergoes event extraction and evolutionary event recognition to generalize evolutionary event knowledge. Evolutionary event knowledge is constructed from evolutionary event instances by mapping them into canonical forms. As shown in Fig. 2, the top layer is EEOK, which consists of
Prediction model
We propose a neural compositional prediction structure that models a nonlinear order relation among events. The main modeling steps are: (1) event representation, (2) order modeling of event chains, (3) relation score calculation, and (4) model training.
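The first three steps can be illustrated with a toy sketch in plain Python. It is not the paper's architecture: the hand-set three-dimensional vectors in `EMB`, the mean-then-tanh composition, the position weighting, and the cosine score are all assumptions standing in for learned components, and step (4), model training, is omitted.

```python
import math

# Hypothetical 3-dimensional word vectors; a real system would learn these.
EMB = {
    "fire": [0.9, 0.1, 0.0],
    "building": [0.5, 0.2, 0.1],
    "injure": [0.7, 0.6, 0.1],
    "people": [0.3, 0.7, 0.2],
    "rescue": [0.6, 0.7, 0.2],
    "celebrate": [0.0, 0.1, 0.9],
    "festival": [0.1, 0.0, 0.8],
}


def represent(trigger, args):
    """(1) Event representation: tanh of the mean trigger/argument vector."""
    vecs = [EMB[trigger]] + [EMB[a] for a in args]
    return [math.tanh(sum(col) / len(vecs)) for col in zip(*vecs)]


def encode_chain(events):
    """(2) Order modeling: later events in the chain receive larger weights."""
    weights = [i + 1 for i in range(len(events))]
    total = sum(weights)
    dims = len(events[0])
    return [
        sum(w * ev[d] for w, ev in zip(weights, events)) / total
        for d in range(dims)
    ]


def score(chain_vec, cand_vec):
    """(3) Relation score: cosine similarity of chain and candidate vectors."""
    dot = sum(a * b for a, b in zip(chain_vec, cand_vec))
    norm_chain = math.sqrt(sum(a * a for a in chain_vec))
    norm_cand = math.sqrt(sum(b * b for b in cand_vec))
    return dot / (norm_chain * norm_cand)


chain = [represent("fire", ["building"]), represent("injure", ["people"])]
candidates = {
    "rescue people": represent("rescue", ["people"]),
    "celebrate festival": represent("celebrate", ["festival"]),
}
context = encode_chain(chain)
ranked = sorted(
    candidates, key=lambda name: score(context, candidates[name]), reverse=True
)
```

With these toy vectors, the candidate whose representation lies closest to the chain context is ranked first; in a trained model the composition and scoring functions would be learned from labeled event chains.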
Datasets
We evaluate the effectiveness of our proposed approaches on a real-world corpus of Baidu Encyclopedia event entries across a diverse range of domains: explosion, conflagration, geological hazard, traffic accident, and personal injury. The event prediction dataset is labeled manually. Two groups of annotators familiar with evolutionary event knowledge revised the extracted events for evaluation. The first group was asked to annotate the merged ontology knowledge of evolutionary
News event prediction
An early representative study [43] applied semantic natural language modeling techniques to news titles. It uses specific predefined causality patterns (such as ‘X because Y,’ ‘X causes Y,’ etc.) to identify pairs of structured events. World knowledge is drawn from several well-known ontologies, namely ConceptNet [44], WordNet [45], and LinkedData [46], to build the entity graph from the events. Most systems that learn causality for event prediction proceed as follows: (1)
Conclusion
We explored a framework for the automatic acquisition of event knowledge from text and constructed an evolutionary event knowledge ontology, with a focus on next-event prediction. With this framework, event instances and the corresponding event knowledge extracted from uncertain sources of event news can be incorporated together into a standard EEOK and ultimately used for event prediction. To the best of our knowledge, this is the first work covering ontology knowledge of evolutionary events and using it for
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported by the National Key R&D Program of China (2018YFC0830804), NSFC (No. 61772151 and No. 61872022), and the Academic Excellence Foundation of BUAA for Ph.D. Students. The co-author Min He is supported by the National Key R&D Program of China (No. 2017YFB0803305). We also thank our anonymous reviewers for their constructive comments.
References (53)
- Combining time-series and textual data for taxi demand prediction in event areas: A deep learning approach. Inf. Fusion (2019).
- Improving stock market prediction via heterogeneous information fusion. Knowl.-Based Syst. (2018).
- Fine-grained event categorization with heterogeneous graph convolutional networks.
- Event detection and evolution in multi-lingual social streams. Frontiers Comput. Sci. (2020).
- Predicting the future impact of news events.
- Predicting the spatial impact of planned special events.
- Event early embedding: Predicting event volume dynamics at early stage.
- Event detection in Twitter: A keyword volume approach.
- Sequential clustering for event sequences and its impact on next process step prediction.
- A rough set approach to events prediction in multiple time series.
- Modeling extreme events in time series prediction.
- Predicting soccer highlights from spatio-temporal match event streams.
- Deep mixture point processes: Spatio-temporal event prediction with rich contextual information.
- Tracking multiple social media for stock market event prediction.
- Computing urban traffic congestions by incorporating sparse GPS probe data and social media data. ACM Trans. Inf. Syst.
- TrafficGAN: Network-scale deep traffic prediction with generative adversarial nets. IEEE Trans. Intell. Transp. Syst.
- Integration of social media in spatial crime analysis and prediction models for events.
- Prediction of malware propagation and links within communities in social media based events.
- Learning graph embedding with adversarial training methods. IEEE Trans. Cybern.
- Incremental learning with social media data to predict near real-time events.
- Predicting the co-evolution of event and knowledge graphs.
- LTP: A Chinese language technology platform.
- SQuAD: 100,000+ questions for machine comprehension of text.
- Know what you don’t know: Unanswerable questions for SQuAD.
- Skip n-grams and ranking functions for predicting script events.
Qianren Mao is currently a Ph.D. candidate at the School of Computer Science and Engineering at Beihang University (BUAA), China. His research interests include event knowledge graphs, text generation, abstractive summarization, and deep learning.
Xi Li is a graduate student of Beihang University, Beijing, China. Her research interests include knowledge graph, text summarization and deep learning.
Hao Peng received his Ph.D. degree from the School of Computer Science and Engineering at Beihang University, Beijing, China. His research interests include deep learning, representation learning, big data computing, and social network analysis.
Jianxin Li is a professor at the School of Computer Science and Engineering, Beihang University, China. He received his Ph.D. degree from Beihang University in 2008. He was a visiting scholar in the Machine Learning Department of Carnegie Mellon University, USA, in 2015, and a visiting researcher at MSRA in 2011. His current research interests include data analysis and processing, distributed systems, and system virtualization.
Dongxiao He received her B.S., M.S., and Ph.D. degrees in computer science from Jilin University, Changchun, China, in 2007, 2010, and 2014, respectively. She was a Post-Doctoral Research Fellow in the Department of Computer Science, Dresden University of Technology, Germany, from 2014 to 2015. She is an Associate Professor with the School of Computer Science and Technology, Tianjin University, Tianjin, China. She has published over 40 international journal and conference papers. Her current research interests include data mining and analysis of complex networks.
Shu Guo received her Ph.D. degree from the Institute of Information Engineering, Chinese Academy of Sciences. Her current research interests include knowledge graph embedding, meta learning, and relational data analysis.
Min He received her M.Sc. and Ph.D. degrees from the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, in 2007 and 2016, respectively. She is currently an advanced engineer at the National Computer Network Emergency Response Technical Team/Coordination Center of China. Her main research focus is on natural language processing, Web mining and information.
Lihong Wang is currently a Professor at the National Computer Network Emergency Response Technical Team/Coordination Center of China. Her current research interests include information security, cloud computing, big data mining and analytics, information retrieval, and data mining.