Multi-Document Viewpoint Summarization Focused on Facts, Opinion and Knowledge

  • Chapter

Part of the book series: The Information Retrieval Series (INRE, volume 20)

Abstract

An interactive information retrieval system that provides different types of summaries of retrieved documents according to each user’s information needs, situation, or purpose of search can help users understand document content. The purpose of this study is to build a multi-document summarizer, “Viewpoint Summarizer With Interactive clustering on Multidocuments (v-SWIM)”, which produces summaries according to such viewpoints. We tested its effectiveness on a new test collection, ViewSumm30, which contains human-made reference summaries of three different summary types for each of its 30 document sets. Once a set of documents on a topic (e.g., documents retrieved by a search engine) is provided to v-SWIM, it returns a list of topics discussed in the given document set. The user then selects one or more topics of interest and a summary type (fact-reporting, opinion-oriented, or knowledge-focused), and v-SWIM produces a summary from the viewpoint of the selected topics and summary type. We assume that sentence types and document genres are related to the types of information included in the source documents and are useful for selecting appropriate information for each of the summary types. “Sentence type” defines the type of information in a sentence. “Document genre” defines the type of information in a document. The results of the experiments showed that the proposed system, using automatically identified sentence types and document genres of the source documents, improved the coverage of the system-produced fact-reporting, opinion-oriented, and knowledge-focused summaries by 13.14%, 34.23%, and 15.89%, respectively, compared with our baseline system, which did not differentiate sentence types or document genres.
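The interaction described in the abstract (the user picks topics and a summary type, and sentences of the matching type are selected) can be illustrated with a minimal sketch. This is not the v-SWIM implementation: the `Sentence` record, the `PREFERRED_TYPES` mapping, the `summarize` function, and the assumption that every sentence already carries a topic label, a sentence type, and a salience score are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from summary type to the sentence types it prefers.
# The chapter identifies sentence types automatically; here they are given.
PREFERRED_TYPES = {
    "fact-reporting": {"fact"},
    "opinion-oriented": {"opinion"},
    "knowledge-focused": {"knowledge"},
}


@dataclass(frozen=True)
class Sentence:
    text: str
    topic: str          # topic label from clustering (assumed precomputed)
    sentence_type: str  # "fact", "opinion", or "knowledge" (assumed precomputed)
    score: float        # salience score (assumed precomputed)


def summarize(sentences, topics, summary_type, max_sentences=2):
    """Keep sentences on the selected topics whose type matches the
    requested summary type, then return the highest-scoring ones."""
    wanted = PREFERRED_TYPES[summary_type]
    candidates = [
        s for s in sentences
        if s.topic in topics and s.sentence_type in wanted
    ]
    ranked = sorted(candidates, key=lambda s: s.score, reverse=True)
    return [s.text for s in ranked[:max_sentences]]


# Toy input standing in for a retrieved document set.
sentences = [
    Sentence("The law passed in March.", "legislation", "fact", 0.9),
    Sentence("Critics call the law too weak.", "legislation", "opinion", 0.8),
    Sentence("A bill becomes law after signing.", "legislation", "knowledge", 0.6),
    Sentence("Markets fell the next day.", "economy", "fact", 0.7),
]

print(summarize(sentences, {"legislation"}, "opinion-oriented"))
# prints ['Critics call the law too weak.']
```

Filtering before ranking keeps the summary type a hard constraint. A softer design could instead add a type-dependent weight to each sentence score, so that sentences of other types can still appear when few matching ones exist.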

Copyright information

© 2006 Springer

About this chapter

Cite this chapter

Seki, Y., Eguchi, K., Kando, N. (2006). Multi-Document Viewpoint Summarization Focused on Facts, Opinion and Knowledge. In: Shanahan, J.G., Qu, Y., Wiebe, J. (eds) Computing Attitude and Affect in Text: Theory and Applications. The Information Retrieval Series, vol 20. Springer, Dordrecht. https://doi.org/10.1007/1-4020-4102-0_24

  • DOI: https://doi.org/10.1007/1-4020-4102-0_24

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-4026-9

  • Online ISBN: 978-1-4020-4102-0

  • eBook Packages: Computer Science, Computer Science (R0)
