Multi-document Summarization Based on Atomic Semantic Events and Their Temporal Relationships

  • Conference paper
Advances in Information Retrieval (ECIR 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9626)

Abstract

Automatic multi-document summarization (MDS) is the process of extracting the most important information, such as events and entities, from multiple natural language texts focused on the same topic. In this paper, we experiment with the effects of different groups of information, such as events and named entities, in the domain of generic and update MDS. Our generic MDS system outperforms the best recent generic MDS systems on DUC 2004 in terms of ROUGE-1 recall and \(f_1\)-measure. Update summarization is a new form of MDS, where novel yet salient sentences are chosen as summary sentences based on the assumption that the user has already read a given set of documents. We present an event-based update summarization approach in which novelty is detected from the temporal ordering of events and saliency is ensured by the event and entity distribution. To our knowledge, no other study has experimented in depth with the novelty information acquired from the temporal ordering of events (assuming that a sentence contains one or more events) in the domain of update multi-document summarization. Our update MDS system outperforms the state-of-the-art update MDS system in terms of ROUGE-2 and ROUGE-SU4 recall. All our MDS systems also generate quality summaries, which are manually evaluated based on popular evaluation criteria.
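For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of ROUGE-N recall, precision, and \(f_1\) against a single reference summary. It is an illustration only, not the official ROUGE toolkit used in the paper: the function names `ngram_counts` and `rouge_n` are hypothetical, tokenization is plain whitespace splitting, and stemming, multiple references, and length truncation are all omitted.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f1) of n-gram overlap with one reference.

    Simplifying assumptions: lowercased whitespace tokens, no stemming,
    a single reference summary.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs
    # in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return recall, precision, f1
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)` matches five of the six reference unigrams, so recall, precision, and \(f_1\) are all 5/6; with `n=2`, three of the five reference bigrams match and recall is 0.6.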

Notes

  1. Here ‘22 years’ is a time period. Time periods do not carry important information for detecting novelty.

  2. http://duc.nist.gov/duc2007/tasks.html.

  3. http://nlp.stanford.edu/software/corenlp.shtml.

  4. http://code.google.com/p/cleartk/.

  5. The Document Creation Time (DCT) can be calculated from the document name.

  6. A total of 4 topics are taken into account, i.e., K is 4.

  7. ROUGE runtime arguments for DUC 2004: ROUGE -a -c 95 -b 665 -m -n 4 -w 1.2.

  8. We do not compare our system with the recent topic-model-based system [14] because that system is significantly outperformed by Lin and Bilmes’s [23] system in terms of both ROUGE-1 recall and \(f_1\)-measure.
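Note 5 says that the Document Creation Time can be derived from the document name. A minimal sketch follows, assuming the name embeds an eight-digit `YYYYMMDD` date, as in newswire identifiers such as `APW19981016.0240`. The function name `dct_from_name` and the regular expression are illustrative assumptions, not the authors' code.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumption: the creation date appears as exactly eight consecutive digits
# (YYYYMMDD) somewhere in the document name.
_DATE_RE = re.compile(r"(?<!\d)(\d{8})(?!\d)")


def dct_from_name(doc_name: str) -> Optional[date]:
    """Return the date embedded in a document name, or None if absent."""
    match = _DATE_RE.search(doc_name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
```

Under this assumption, `dct_from_name("APW19981016.0240")` returns `date(1998, 10, 16)`, and a name without an eight-digit run returns `None`.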

References

  1. Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–843 (1983)

  2. Bethard, S.: ClearTK-TimeML: a minimalist approach to TempEval. In: Second Joint Conference on Lexical and Computational Semantics (*SEM), vol. 2, pp. 10–14 (2013)

  3. Boudin, F., El-Bèze, M., Torres-Moreno, J.M.: A scalable MMR approach to sentence scoring for multi-document update summarization. In: COLING (2008)

  4. Cer, D.M., De Marneffe, M.-C., Jurafsky, D., Manning, C.D.: Parsing to Stanford dependencies: trade-offs between speed and accuracy. In: LREC (2010)

  5. Chang, A.X., Manning, C.D.: SUTime: a library for recognizing and normalizing time expressions. In: Language Resources and Evaluation (2012)

  6. Christensen, J., Mausam, Soderland, S., Etzioni, O.: Towards coherent multi-document summarization. In: Proceedings of NAACL-HLT, pp. 1163–1173 (2013)

  7. Delort, J.-Y., Alfonseca, E.: DualSum: a topic-model based approach for update summarization. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 214–223 (2012)

  8. Denis, P., Muller, P.: Predicting globally-coherent temporal structures from texts via endpoint inference and graph decomposition. In: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, vol. 3, pp. 1788–1793. AAAI Press (2011)

  9. Du, P., Guo, J., Zhang, J., Cheng, X.: Manifold ranking with sink points for update summarization. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1757–1760 (2010)

  10. Erkan, G., Radev, D.R.: LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res. (JAIR) 22(1), 457–479 (2004)

  11. Filatova, E., Hatzivassiloglou, V.: Event-based extractive summarization. In: Proceedings of ACL Workshop on Summarization, vol. 111 (2004)

  12. Fisher, S., Roark, B.: Query-focused supervised sentence ranking for update summaries. In: Proceedings of TAC 2008 (2008)

  13. Gillick, D., Favre, B., Hakkani-Tur, D., Bohnet, B., Liu, Y., Xie, S.: The ICSI/UTD summarization system at TAC. In: Proceedings of the Second Text Analysis Conference, Gaithersburg, Maryland, USA. NIST (2009)

  14. Haghighi, A., Vanderwende, L.: Exploring content models for multi-document summarization. In: Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 362–370. Association for Computational Linguistics (2009)

  15. Kullback, S.: The Kullback-Leibler distance (1987)

  16. Li, J., Li, S., Wang, X., Tian, Y., Chang, B.: Update summarization using a multi-level hierarchical Dirichlet process model. In: COLING (2012)

  17. Li, L., Heng, W., Jia, Y., Liu, Y., Wan, S.: CIST system report for ACL MultiLing 2013-track 1: multilingual multi-document summarization. In: MultiLing 2013, p. 39 (2013)

  18. Li, P., Wang, Y., Gao, W., Jiang, J.: Generating aspect-oriented multi-document summarization with event-aspect model. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1137–1146 (2011)

  19. Li, W., Wu, M., Lu, Q., Xu, W., Yuan, C.: Extractive summarization using inter- and intra-event relevance. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pp. 369–376. Association for Computational Linguistics (2006)

  20. Li, X., Du, L., Shen, Y.-D.: Graph-based marginal ranking for update summarization. In: SDM, pp. 486–497. SIAM (2011)

  21. Li, X., Du, L., Shen, Y.-D.: Update summarization via graph-based sentence ranking. IEEE Trans. Knowl. Data Eng. 25(5), 1162–1174 (2013)

  22. Lin, C.-Y.: ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out: Proceedings of the ACL-2004 Workshop, pp. 74–81 (2004)

  23. Lin, H., Bilmes, J.: A class of submodular functions for document summarization. In: ACL, pp. 510–520 (2011)

  24. Mani, I.: Automatic Summarization, vol. 3. John Benjamins Publishing, Amsterdam (2001)

  25. Mani, I., Schiffman, B., Zhang, J.: Inferring temporal ordering of events in news. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Companion Volume of the Proceedings of HLT-NAACL 2003-Short Papers, vol. 2, pp. 55–57. Association for Computational Linguistics (2003)

  26. Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Proceedings of EMNLP, vol. 4, p. 275, Barcelona, Spain (2004)

  27. Ng, J.-P., Kan, M.-Y.: Improved temporal relation classification using dependency parses and selective crowdsourced annotations. In: COLING, pp. 2109–2124 (2012)

  28. Ng, J.-P., Kan, M.-Y., Lin, Z., Feng, W., Chen, B., Jian, S., Tan, C.L.: Exploiting discourse analysis for article-wide temporal classification. In: EMNLP, pp. 12–23 (2013)

  29. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web (1999)

  30. Porter, M.F.: An algorithm for suffix stripping. Program Electr. Libr. Inf. Syst. 14(3), 130–137 (1980)

  31. Pustejovsky, J., Castano, J.M., Ingria, R., Sauri, R., Gaizauskas, R.J., Setzer, A., Katz, G., Radev, D.R.: TimeML: robust specification of event and temporal expressions in text. In: New Directions in Question Answering, vol. 3, pp. 28–34 (2003)

  32. Steinberger, J., Ježek, K.: Update summarization based on novel topic distribution. In: Proceedings of the 9th ACM Symposium on Document Engineering, pp. 205–213 (2009)

  33. Steinberger, J., Kabadjov, M., Steinberger, R., Tanev, H., Turchi, M., Zavarella, V.: JRC's participation at TAC: guided and multilingual summarization tasks. In: Proceedings of the Text Analysis Conference (TAC) (2011)

  34. Takamura, H., Okumura, M.: Text summarization model based on maximum coverage problem and its variant. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pp. 781–789. Association for Computational Linguistics (2009)

  35. Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101, 1566–1581 (2006)

  36. Li, W., Wei, F., Lu, Q., He, Y.: PNR2: ranking sentences with positive and negative reinforcement for query-oriented update summarization. In: Proceedings of the 22nd International Conference on Computational Linguistics, pp. 489–496 (2008)

  37. Zhang, R., Li, W., Lu, Q.: Sentence ordering with event-enriched semantics and two-layered clustering for multi-document news summarization. In: Proceedings of the 23rd International Conference on Computational Linguistics, pp. 1489–1497 (2010)

Author information

Corresponding author

Correspondence to Yllias Chali.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Chali, Y., Uddin, M. (2016). Multi-document Summarization Based on Atomic Semantic Events and Their Temporal Relationships. In: Ferro, N., et al. Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol 9626. Springer, Cham. https://doi.org/10.1007/978-3-319-30671-1_27

  • DOI: https://doi.org/10.1007/978-3-319-30671-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30670-4

  • Online ISBN: 978-3-319-30671-1

  • eBook Packages: Computer Science (R0)
