SimSum: an empirically founded simulation of summarizing

https://doi.org/10.1016/S0306-4573(99)00066-7

Abstract

SimSum (Simulation of Summarizing) simulates 20 real-world working steps of expert summarizers. It presents an empirically founded cognitive model of summarizing and demonstrates that human summarization strategies can be simulated. The cognitive model operationalizes the discourse processing model developed by Kintsch and van Dijk. Knowledge engineering followed the KADS approach; empirical modeling used methods of grounded theory development. The observed strategies of expert summarizers have given rise to cooperating object-oriented agents that communicate through dedicated blackboards. Each agent is implemented as a CLOS object with an assigned actor at the multimedia user interface. The interface is realized with Macromedia Director. Communication between CLOS and Macromedia Director is mediated by Apple Events. According to first evaluation results in an educational environment, SimSum transmits summarization know-how effectively. It is, however, not designed as a tutorial system and serves active and curious users best. We are beginning to extend it to summarizing on the WWW.
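The agent-and-blackboard architecture named in the abstract can be illustrated with a minimal sketch. SimSum itself is implemented in CLOS; the Python below is only an illustration of the general blackboard pattern. The class names (`Blackboard`, `KeywordSpotter`, `NoteTaker`), the keyword-matching rule, and the sample sentences are all hypothetical and are not taken from SimSum.

```python
from dataclasses import dataclass, field


@dataclass
class Blackboard:
    """Shared workspace that agents read from and post to."""
    name: str
    entries: list = field(default_factory=list)

    def post(self, author, item):
        self.entries.append((author, item))

    def items_from(self, author):
        return [item for who, item in self.entries if who == author]


class Agent:
    """Base class: an agent inspects a blackboard and may post one contribution."""
    name = "agent"

    def step(self, board):
        raise NotImplementedError


class KeywordSpotter(Agent):
    """Hypothetical agent: marks source sentences containing a query term."""
    name = "keyword_spotter"

    def __init__(self, query_terms):
        self.query_terms = {t.lower() for t in query_terms}

    def step(self, board):
        already = set(board.items_from(self.name))
        for sentence in board.items_from("source"):
            words = {w.strip(".,;:").lower() for w in sentence.split()}
            if words & self.query_terms and sentence not in already:
                board.post(self.name, sentence)
                return True  # contributed something this round
        return False


class NoteTaker(Agent):
    """Hypothetical agent: copies marked sentences into the draft summary."""
    name = "note_taker"

    def step(self, board):
        noted = set(board.items_from(self.name))
        for sentence in board.items_from("keyword_spotter"):
            if sentence not in noted:
                board.post(self.name, sentence)
                return True
        return False


def run(agents, board, max_rounds=100):
    """Let agents take turns until no agent has anything left to contribute."""
    for _ in range(max_rounds):
        progressed = [agent.step(board) for agent in agents]
        if not any(progressed):
            break
    return board.items_from("note_taker")


board = Blackboard("working_memory")
for s in ["Summarizing reduces content to the most relevant items.",
          "The weather was fine.",
          "Expert summarizers apply many strategies."]:
    board.post("source", s)

summary = run([KeywordSpotter(["summarizing", "summarizers"]), NoteTaker()], board)
```

In this sketch the agents never call each other directly; they cooperate only through entries on the shared blackboard, which is the property the abstract attributes to SimSum's agents.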

Section snippets

Introduction and purpose of the paper

The main achievement of summarizing is the reduction of available contents to the most relevant items. Often, but not always, the task includes the goal-oriented acquisition of information from external sources such as texts. Otherwise, memory may supply the material for the summary. Most of the time, the resulting summary (or abstract) is a short text.

The SimSum (Simulation of Summarizing) system does what its name promises: it simulates summarizing of human experts and thus produces a

Current research in automatic summarization

The best place to start an exploration of current international research in summarizing is the webpage of the Ottawa Text Summarization Project (http://www.site.uottawa.ca/tanka/ts.html). It provides information about the local projects and it links to bibliographies, webpages of active researchers, text corpora, industrial systems, conferences and new papers in summarizing.

Summarization is currently a busy field. The increase in activity is mostly motivated by the Internet. Users need

SimSum development

The SimSum simulation has been added to the empirical model of summarizing for scientific and presentational purposes:

  • As usual, the computer model serves to explain and check the empirical cognitive model which is its foundation.

  • It prepares a cognitively grounded approach to automatic summarizing, for example agents on the WWW that accept a user’s query and return a reasonably short statement (a summary) of the knowledge available in response to it.

  • To its users of today, SimSum shows in a

The test setting

During my regular introductory course on content analysis, polytechnic freshmen tested SimSum as an instrument for learning basic concepts about summarizing. Students did not enrol in a specific experiment. SimSum simply replaced the combined media I had used in earlier courses: a tape recorder and the written thinking-aloud protocol, the source paper, the summarizer’s notes and the summary in different stages of progress. This makeshift multimedia show had provoked ‘aha’-experiences in students

Conclusion

Advancing the scientific frontiers of text summarization presupposes more knowledge about the way summarization works. The main fruit of the empirical investigation behind SimSum is an image of the summarization process which is detailed enough to lay the foundations for a simulation. Since the resulting summarization model incorporates the know-how of human experts, it has good prospects of yielding powerful techniques. Summarizing by cooperating cognitive agents seems to be one such technique.

Acknowledgements

The SimSum development was funded under grant F 916.00 by the German Federal Ministry of Education and Research. The German Science Foundation supported the empirical investigation under grant En 186/1-3.
