Abstract
With more and more news articles appearing on the Internet, discovering causal relations between news articles is important for helping people understand how news develops. Extracting causal relations between news articles is an inter-document relation extraction task. Existing work on relation extraction cannot solve it well for two reasons: (1) Most relation extraction models are intra-document models that focus on relations between entities. News articles, however, are many times longer and more complex than entities, which makes inter-document relation extraction harder than intra-document extraction. (2) Existing inter-document relation extraction models rely on similarity information between news articles, which can limit extraction performance. In this paper, we propose an inter-document model based on storytree information to extract causal relations between news articles. We incorporate storytree information into integer linear programming (ILP) and design storytree constraints for the ILP objective function. Experimental results show that all the constraints are effective and that the proposed method outperforms widely used machine learning models and a state-of-the-art deep learning model, improving F1 by more than 5% on three different datasets. Further analysis shows that the five constraints in our model improve the results to varying degrees and that their effects differ across the three datasets. An experiment on link features also suggests that link information has a positive influence.
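To make the idea of constraining pairwise causal predictions with tree structure concrete, the following is a minimal sketch, not the model proposed in the paper. It assumes each ordered article pair has a local classifier score, and it uses two illustrative constraints that are assumptions made here (causality is not mutual; an article lower in a story tree cannot cause its ancestor). They are not the five storytree constraints of the paper. The function name `solve_causal_ilp`, the `scores` and `parent` inputs, and the toy values are all hypothetical, and the 0/1 program is solved by brute-force enumeration rather than by an ILP solver.

```python
from itertools import product


def solve_causal_ilp(scores, parent):
    """Brute-force a tiny 0/1 program over ordered article pairs.

    scores: dict mapping (i, j) -> hypothetical local score for "i causes j".
    parent: dict mapping article -> its parent in a hypothetical story tree
            (the root maps to None).
    Returns a dict mapping each pair to 0 or 1.
    """
    pairs = sorted(scores)

    def is_ancestor(a, b):
        # Walk up from b; True if a is met on the way to the root.
        node = parent.get(b)
        while node is not None:
            if node == a:
                return True
            node = parent.get(node)
        return False

    best_value, best_assignment = float("-inf"), None
    for bits in product((0, 1), repeat=len(pairs)):
        x = dict(zip(pairs, bits))
        # Illustrative constraint 1 (assumption): causality is not mutual.
        if any(x[(i, j)] and x.get((j, i), 0) for (i, j) in pairs):
            continue
        # Illustrative constraint 2 (assumption): a descendant in the
        # story tree cannot cause one of its ancestors.
        if any(x[(i, j)] and is_ancestor(j, i) for (i, j) in pairs):
            continue
        # Objective: total score of the selected causal links.
        value = sum(x[p] * scores[p] for p in pairs)
        if value > best_value:
            best_value, best_assignment = value, x
    return best_assignment


# Toy story tree: article 0 is the root, 1 is its child, 2 is a child of 1.
toy_parent = {0: None, 1: 0, 2: 1}
toy_scores = {(0, 1): 0.6, (1, 0): 0.8, (1, 2): -0.2, (2, 1): 0.4, (0, 2): 0.1}
best = solve_causal_ilp(toy_scores, toy_parent)
links = [pair for pair, chosen in sorted(best.items()) if chosen]
print(links)  # [(0, 1), (0, 2)]
```

In this toy case the pair (1, 0) has the highest local score, but the tree constraint rules it out, so the global solution keeps (0, 1) instead. Enumeration is exponential in the number of pairs, so a real system would hand the same objective and constraints to an ILP solver.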




Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 71531001 and 61421003).
Appendix A. News articles in case study
In Section 5.6, we explain how story trees work in our model based on cases, and use numbers to represent news articles. In this section, we present the text of some of the articles mentioned in Section 5.6. For very long news articles, we only show a portion of the articles.
A.1. Article 0
Kobe Bryant’s Daughter Natalia, 17, Pays Tribute to Late Dad and Sister Gianna at Winter Formal Kobe Bryant’s eldest daughter Natalia stopped to pose with a mural honoring her late dad and little sister Gianna as she headed to her winter formal. On Sunday, Vanessa Bryant shared a photograph of her 17-year-old all dressed up and ready to attend her high school dance. Ahead of the formal, Natalia posed for photographs in front of a tribute mural painted to honor Kobe, 41, and 13-year-old sister Gigi, both who died in the tragic Jan. 26 helicopter crash in Calabasas, California. "my babies. Natalia. #winterformal," the mom of four captioned the photograph, which featured a smiling Natalia, who was dressed in a blue and white polka dot dress. Fans flooded Bryant’s comments section with compliments for Natalia. NBA star Dwyane Wade commented to heart emojis, while WNBA star Candace Parker wrote, "BEAUTIFUL" with several heart emojis. "She’s beautiful and so is that mural," one fan wrote. Another added, "This warms my heart and at the same time saddens it. Good to see you girls pushing through." RELATED: Vanessa Bryant ‘Devastated’ by Claims of L.A. Deputies Sharing Photos of Helicopter Crash Site The post comes just one week after Vanessa’s legal team spoke out about the allegations that Los Angeles County Sheriff’s deputies shared graphic photos of the helicopter crash site where Kobe, Gigi and seven others were killed on Jan. 26. In the statement, they denounced the "inexcusable" acts "of injustice" and called on "an Internal Affairs investigation of these alleged incidents." The Los Angeles County Sheriff’s office also released a statement, claiming an investigation surrounding the allegations was underway. Last month, Vanessa, 37, also filed a wrongful death lawsuit against the helicopter company that owned the aircraft in the tragic crash.
RELATED: Powerful Kobe & Gianna Bryant Fan Art Created to Honor Their Legacies In a complaint obtained by PEOPLE that lists herself and her daughters as plaintiffs, the NBA star’s widow is suing Island Express Helicopters and claims that pilot Ara Zobayan of Huntington Beach, California, who was piloting the flight at the time of the crash and died, "failed to properly monitor and assess the weather prior to takeoff," "failed to abort the flight when he knew of the cloudy conditions" and "failed to properly and safely operate the helicopter resulting in a crash." The complaint also claims that Island Express Helicopters "knew or should have known" that Zobayan had been previously cited by the FAA for violating "the visual flight rules minimums by flying into an area of reduced visibility from weather conditions." Vanessa and her daughters are seeking general, economic and punitive damages. In response to the lawsuit, a spokesperson for Island Express Helicopters told PEOPLE, "This was a tragic accident. We will have no comment on the pending litigation."
A.2. Article 2
KISS Pay Elaborate Tribute to Kobe Bryant at Staples Center: Watch The post KISS Pay Elaborate Tribute to Kobe Bryant at Staples Center: Watch appeared first on Consequence of Sound. KISS paid tribute to late NBA legend Kobe Bryant during their show at Staples Center in Los Angeles on Wednesday night. Bryant, 41, and his 13-year-old daughter, Gianna, died in a tragic helicopter crash along with seven others in late January, and KISS’ Paul Stanley took a moment to show his respect. Donning Bryant’s No. 24 Lakers jersey, Stanley gave a brief monologue during the band’s set before playing Destroyer track "Do You Love Me". "We’re in the house that Kobe built," Stanley said of the Lakers’ home arena. "None of us would be here if this place wasn’t really like a memorial to somebody who was so much more than a basketball player, somebody who’s been a role model. And tonight, I think we dedicate this show not only to Kobe and his daughter Gigi, but to all the people who perished on that helicopter." Bryant’s retired numbers - No. 8 and 24 - were displayed on the band’s stage screens as they performed "Do You Love Me". As the song concluded, Lakers-colored purple and gold balloons flooded the stage and Stanley dribbled them like basketballs. Editors’ Picks Stanley previously expressed his sadness after Bryant’s death on Twitter. Posted with a picture of Bryant and himself shaking hands courtside, Stanley wrote: "WOW! Kobe. Such A Shock. My Condolences To His Wife And Children. Very, very sad. #KobeBryant". KISS will continue the U.S. leg of their "End of the Road Tour" with Van Halen’s David Lee Roth through mid-March before heading across Europe this summer. After that, they’ll return to the States for another leg in the fall. Get tickets to KISS’ upcoming shows here. Watch KISS’ tribute to Kobe Bryant below.
Popular Posts Subscribe to Consequence of Sound’s email digest and get the latest breaking news in music, film, and television, tour updates, access to exclusive giveaways, and more straight to your inbox.
A.3. Article 3
Kobe Bryant’s death leaves sports world stunned Kobe Bryant was killed in a helicopter crash in Calabasas Sunday morning, a source confirms to PEOPLE. The NBA legend, 41, was reportedly traveling with at least three other people in his private helicopter when it went down, according to TMZ. Emergency personnel responded but nobody on board survived. Five people are confirmed dead, TMZ reported. The outlet says that Bryant’s wife, Vanessa Bryant, was not onboard. Spokespersons for LA county sheriff’s office and LAPD did not immediately respond to PEOPLE’s request for comment. Bryant is survived by Vanessa, 37, and their four children together: daughters Natalia, 17, Gianna, 13, Bianka, 3, and son Capri, 7 months. Since the start of his basketball career, Bryant was one of the most accomplished men in the NBA, having played all 20 seasons with the Los Angeles Lakers. Until yesterday, he was the third-leading scorer in NBA history with 33,643 points but was surpassed by LeBron James. James paid tribute to Bryant with special Nike shoes during the game against the Philadelphia 76ers.
A.4. Article 8
Dwyane Wade Reflects on Kobe Bryant’s Wish to Inspire Others in Jersey Retirement Speech Whenever Kemba Walker looks down at the No. 8 on his jersey, he’ll be reminded of the Mamba Mentality. While others in the NBA switched their jersey numbers in the wake of Kobe Bryant’s tragic death, Walker instead decided to honor the Los Angeles Lakers legend by keeping his No. 8. The Boston Celtics guard spoke with ESPN’s Rachel Nichols about what it means to wear that number going forward. "Now, that number means even more. So every time I step on the court, I just want to give 100 percent for him," Walker told Nichols. "That’s my goal for the rest of the year and for the rest of my career." LIVE stream the Celtics all season. In the immediate aftermath of Bryant’s passing, Walker considered a number change. Evidently, he decided the best way for him to honor Bryant was to continue wearing No. 8 while putting 100-percent effort into every game, just as Kobe would. "I had a talk about it with some close people in my circle," Walker said. "I definitely thought about giving it up but then I thought, I think Kobe would want me and allow me to wear it. We want to keep his legacy going. I know of a few of us that’s kept it. We’re all just going to go out there and do what we can to play as hard as possible for Kobe." Watch below: Not enough Kemba Walker in your life right now? He and I sat down to talk about what it was like replacing Kyrie, wearing No. 8 for Kobe, and what he thinks the Celtics ceiling is this postseason: pic.twitter.com/VyNBM24AZs - Rachel Nichols (Rachel__Nichols) February 27, 2020 In 46 games this season, Walker has averaged 21.8 points, 4.1 rebounds, and 5.0 assists while earning his fourth All-Star selection. A knee injury has kept Walker out of commission for the last few games, but the Celtics are hopeful they’ll have him back in the lineup sooner rather than later. Don’t miss NBC Sports Boston’s coverage of Rockets-Celtics, which begins Saturday at 7:30 p.m. with Celtics Pregame Live. You can also Kemba Walker explains keeping No. 8 to honor Kobe Bryant: ’We want to keep his legacy going’.
A.5. Article 9
Beyonce opens Kobe Bryant’s memorial by singing ’XO,’ one of Bryant’s favorite songs The memorial to Kobe and Gianna Bryant began in an inspiring way Monday. The memorial opened with Beyonce, who told attendees, "I’m here because I love Kobe," before launching into one of Bryant’s favorite songs. Beyonce then began singing "XO." Beyonce opens Kobe & Gianna’s Celebration of Life with one of his favorite songs. (via SpectrumSN) February 24, 2020, Beyonce followed that up with "Halo." Beyonce performs "Halo" at Kobe and Gianna’s memorial. #KobeFarewell - Entertainment Tonight (etnow) February 24, 2020 Beyonce was backed up by a chorus and an orchestra. After Beyonce opened the event, Vanessa Bryant eulogized Kobe and Gianna in a moving speech. Bryant and his 13-year-old daughter Gianna were among the nine people killed in a helicopter crash in January. Fans gathered to put together a make-shift memorial outside the Staples Center in the days after Bryant’s death. The Staples Center decided to host a memorial for Kobe, Gianna and the seven other victims of the crash. February 24 - or 224 - was chosen as the date of the memorial. Gianna Bryant wore No. 2. Kobe Bryant wore No. 24. More from Yahoo Sports: Eisenberg: How Kobe touched the lives of 10 everyday people Iole: Wilder assistant was right to throw in towel vs. Fury Keyser: How do Astros fans feel about sign-stealing scandal? Bucks clinch playoff spot faster than anyone in at least 15 years.
A.6. Article 10
Dwyane Wade Says Friend Kobe Bryant Was ’in the Process of Building’ His Next Legacy Before Death Dwyane Wade says Kobe Bryant was just beginning his second act ahead of his shocking death in a helicopter crash in January. Wade tells PEOPLE in an interview ahead of the release of his documentary, D. Wade: Life Unexpected, that Bryant’s legacy is so much more than his illustrious career in the NBA. "His legacy is what he was in the process of building that we all got a chance to watch, right?" says Wade, 38. "We’ve seen what he did for basketball. We’ve seen that legacy." Continues the former Miami Heat star, "But the legacy he was building outside of there was being there for the players, being a voice for the next generation. Working them out, being on the court with them, being there in his kids’ lives, being a real all-star, superstar parent. Being an amazing husband." Bryant, 41, was married to Vanessa Bryant, 37. The couple shares four daughters, including Gianna, 13, who was also killed in the crash. "And I think the one thing Kobe told us along the way is that no one is perfect in this, but at some point in his life - I said this recently - he mastered all of it," Wade tells PEOPLE. "He started mastering all of this. And then he showed us, too, that, ’Listen, we can do anything we want.’" RELATED: Dwyane Wade Had to Rank Himself and His Former Heat Teammates LeBron James and Chris Bosh on Wade calls his friend’s legacy "so huge," adding, "and I think the thing that hurt more so than anything is that we all feel that we lost a loved one when Kobe passed". "And that’s powerful - for someone that a lot of people haven’t even met or didn’t even know, still are mourning and trying to get over it, trying to move on with life," the retired athlete reflects. "That’s when you know that you’ve built something, you’ve created something special."
RELATED: Dwyane Wade on How He and LeBron James Are Different as Basketball Dads: I ’Have More Self-Talk’ Immediately after Bryant’s death, Wade released a video of himself crying on Instagram, admitting, "Today is one of the saddest days in my lifetime". "It seems like a bad dream that you just wanna wake up from. It’s a nightmare." D. Wade: Life Unexpected from ESPN Films and Imagine Documentaries and directed by Bob Metelus premieres Sunday at 9 p.m. EST on ESPN.
A.7. Article 42
Person in Washington State Is First in U.S. to Die From Coronavirus, Authorities Say A middle-aged patient in Washington state became the first person to die from the 2019 novel coronavirus inside the United States, officials said on Saturday as they announced additional cases, including a nursing home that could become the next hot zone. At least 69 people on American soil have had confirmed cases of the novel 2019 coronavirus, which is believed to have originated in a large seafood and live animal market in Wuhan, China, where it killed thousands before spreading to dozens of other countries. One American also died in China earlier this month. The U.S. outbreak seemed to reach a new stage over the weekend, with the number of confirmed patients who contracted it locally, not from traveling abroad, creeping up. California announced Saturday that it had recorded a third such "community spread" case, a patient who was apparently infected by a Santa Clara County woman diagnosed a day earlier. The person who died in Washington state overnight was a man in his 50s considered at high risk, said Dr. Jeff Duchin, health officer for Seattle and King County. Dr. Robert Redfield, director of Centers for Disease Control and Prevention (CDC), said there was currently "no evidence" that the person who died had traveled recently to China or had any contact with someone who had, making it another case of "community spread" or unknown origin. "It’s a tough one, but a lot of progress has been made," President Trump said at a press conference Saturday, stressing that the risk to the general population remained low. "We’re doing really well," he added, "under incredibly adverse circumstances...
About this article
Cite this article
Zhang, C., Lyu, J. & Xu, K. A storytree-based model for inter-document causal relation extraction from news articles. Knowl Inf Syst 65, 827–853 (2023). https://doi.org/10.1007/s10115-022-01781-7
DOI: https://doi.org/10.1007/s10115-022-01781-7