Multi-Document Viewpoint Summarization Focused on Facts, Opinion and Knowledge

Seki, Yohei; Eguchi, Koji; Kando, Noriko

doi:10.1007/1-4020-4102-0_24

Part of the book series: The Information Retrieval Series (INRE, volume 20)

Abstract

An interactive information retrieval system that provides different types of summaries of retrieved documents according to each user’s information needs, situation, or purpose of search can help users understand document content. The purpose of this study is to build a multi-document summarizer, “Viewpoint Summarizer With Interactive clustering on Multidocuments (v-SWIM)”, which produces summaries according to such viewpoints. We tested its effectiveness on a new test collection, ViewSumm30, which contains human-made reference summaries of three different summary types for each of 30 document sets. Given a set of documents on a topic (e.g., documents retrieved by a search engine), v-SWIM returns a list of the topics discussed in that set. The user then selects one or more topics of interest and a summary type (fact-reporting, opinion-oriented, or knowledge-focused), and v-SWIM produces a summary from the viewpoint of the selected topics and summary type. We assume that sentence types and document genres are related to the types of information included in the source documents and are useful for selecting appropriate information for each summary type. “Sentence type” defines the type of information in a sentence; “document genre” defines the type of information in a document. In the experiments, the proposed system, which uses automatically identified sentence types and document genres of the source documents, improved the coverage of the system-produced fact-reporting, opinion-oriented, and knowledge-focused summaries by 13.14%, 34.23%, and 15.89%, respectively, compared with our baseline system, which did not differentiate sentence types or document genres.
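
The selection idea in the abstract (pick topics and a summary type, then favor sentences whose sentence type and document genre suit that summary type) can be illustrated with a minimal sketch. This is not the v-SWIM implementation: the label names, the mapping from summary types to preferred sentence types and genres, the `base_score` field, and the bonus weights are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass

# Hypothetical label sets: the abstract does not list the actual
# sentence types or document genres used by v-SWIM.
PREFERRED_SENTENCE_TYPES = {
    "fact": {"fact"},
    "opinion": {"opinion"},
    "knowledge": {"definition"},
}
PREFERRED_GENRES = {
    "fact": {"news_report"},
    "opinion": {"editorial"},
    "knowledge": {"feature"},
}


@dataclass(frozen=True)
class Sentence:
    text: str
    topic: str           # topic label, e.g. from clustering (assumed given)
    sentence_type: str   # automatically identified sentence type (assumed given)
    genre: str           # genre of the source document (assumed given)
    base_score: float    # generic salience score (hypothetical)


def select_sentences(sentences, topics, summary_type, max_sentences=2,
                     type_bonus=1.0, genre_bonus=0.5):
    """Rank sentences on the chosen topics, boosting those whose sentence
    type and document genre match the requested summary type."""
    wanted_types = PREFERRED_SENTENCE_TYPES[summary_type]
    wanted_genres = PREFERRED_GENRES[summary_type]

    def score(s):
        bonus = 0.0
        if s.sentence_type in wanted_types:
            bonus += type_bonus
        if s.genre in wanted_genres:
            bonus += genre_bonus
        return s.base_score + bonus

    # Keep only sentences about the topics the user selected.
    candidates = [s for s in sentences if s.topic in topics]
    ranked = sorted(candidates, key=score, reverse=True)
    return [s.text for s in ranked[:max_sentences]]


docs = [
    Sentence("The bill passed on Friday.", "budget", "fact", "news_report", 0.6),
    Sentence("The bill is a mistake.", "budget", "opinion", "editorial", 0.5),
    Sentence("A budget bill sets annual spending.", "budget", "definition", "feature", 0.4),
    Sentence("The team won the final.", "sports", "fact", "news_report", 0.9),
]

fact_summary = select_sentences(docs, {"budget"}, "fact", max_sentences=1)
opinion_summary = select_sentences(docs, {"budget"}, "opinion", max_sentences=1)
knowledge_summary = select_sentences(docs, {"budget"}, "knowledge", max_sentences=1)
```

With the same document set and topic, the three calls return different sentences, which is the behavior the abstract describes. A baseline that ignores sentence types and genres corresponds to setting both bonuses to zero, in which case every summary type would return the same top-scoring sentence.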

Author information

Authors and Affiliations

Department of Informatics, The Graduate University for Advanced Studies (Sokendai), 2-1-2, Hitotsubashi, Tokyo, 101-8430, Japan
Yohei Seki, Koji Eguchi & Noriko Kando
National Institute of Informatics, 2-1-2, Hitotsubashi, Tokyo, 101-8430, Japan
Yohei Seki, Koji Eguchi & Noriko Kando

Editor information

Editors and Affiliations

Clairvoyance Corporation, Pittsburgh, PA, USA
James G. Shanahan & Yan Qu
University of Pittsburgh, PA, USA
Janyce Wiebe

Cite this chapter

Seki, Y., Eguchi, K., Kando, N. (2006). Multi-Document Viewpoint Summarization Focused on Facts, Opinion and Knowledge. In: Shanahan, J.G., Qu, Y., Wiebe, J. (eds) Computing Attitude and Affect in Text: Theory and Applications. The Information Retrieval Series, vol 20. Springer, Dordrecht. https://doi.org/10.1007/1-4020-4102-0_24

DOI: https://doi.org/10.1007/1-4020-4102-0_24
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-4026-9
Online ISBN: 978-1-4020-4102-0
eBook Packages: Computer Science, Computer Science (R0)
