Sentence Extraction Using Asymmetric Word Similarity and Topic Similarity

  • Conference paper
Applied Soft Computing Technologies: The Challenge of Complexity

Part of the book series: Advances in Soft Computing ((AINSC,volume 34))

Abstract

We propose MySum, a text summarization system that scores the significance of sentences in order to produce a summary, based on asymmetric word similarity and topic similarity. We use mass assignment theory to compute the similarity between words on the basis of their contexts. The algorithm is incremental, so words or documents can be added or removed without massive re-computation. Words are considered similar if they appear in similar contexts; they do not have to be synonyms. We also compute the similarity of a sentence to the topic using the frequency of overlapping words. We compare the summaries produced with human-written summaries and with those of another system known as TF.ISF (term frequency-inverse sentence frequency). Our method generates summaries that are up to 60% similar to the manually created summaries taken from the DUC 2002 test collection.
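
The three scoring ideas named in the abstract can be illustrated with a minimal Python sketch. Nothing below is taken from the paper itself: the function names, the tokenizer, the averaging of TF.ISF over distinct words, the normalisation of the topic overlap by sentence length, and the particular conversion from context frequencies to fuzzy memberships are all assumptions made for illustration, and the MySum formulation may differ.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()") for w in sentence.lower().split())
    return [w for w in words if w]


def tf_isf_scores(sentences):
    """Baseline: mean TF.ISF weight over the distinct words of each sentence.

    Assumed form: isf(w) = log(N / sf(w)), where N is the number of
    sentences and sf(w) is the number of sentences containing w.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    sf = Counter(w for toks in tokenized for w in set(toks))
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        total = sum(tf[w] * math.log(n / sf[w]) for w in tf)
        scores.append(total / len(tf))
    return scores


def topic_overlap(sentence, topic):
    """Fraction of sentence tokens that also occur in the topic text."""
    toks = tokenize(sentence)
    topic_words = set(tokenize(topic))
    if not toks:
        return 0.0
    return sum(1 for w in toks if w in topic_words) / len(toks)


def fuzzy_memberships(freqs):
    """Convert context counts to fuzzy memberships (assumed conversion).

    With probabilities sorted in descending order p_1 >= ... >= p_n,
    membership i is i * p_i + sum of p_j for j > i, so the most frequent
    context always gets membership 1.
    """
    total = sum(freqs.values())
    ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    probs = [count / total for _, count in ranked]
    return {
        ctx: (i + 1) * probs[i] + sum(probs[i + 1:])
        for i, (ctx, _) in enumerate(ranked)
    }


def asymmetric_similarity(freqs_a, freqs_b):
    """Support for word a given word b: sum over contexts of mu_a(x) * p_b(x).

    Swapping the arguments generally gives a different value.
    """
    mu_a = fuzzy_memberships(freqs_a)
    total_b = sum(freqs_b.values())
    return sum(mu_a.get(ctx, 0.0) * c / total_b for ctx, c in freqs_b.items())
```

In this sketch, `asymmetric_similarity(a, b)` and `asymmetric_similarity(b, a)` differ whenever the two words have different context distributions, which is the sense in which the similarity is asymmetric. Because the inputs are plain count dictionaries, adding a document only requires updating the counts, which loosely mirrors the incremental property the abstract describes.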


References

  • Baldwin JF, Martin TP, Pilsworth BW (1995) Fril-Fuzzy and Evidential Reasoning in Artificial Intelligence. Research Studies Press, England

  • Baldwin JF, Martin TP, Pilsworth BW (1996) “A Mass Assignment Theory of the Probability of Fuzzy Events,” Fuzzy Sets and Systems, (83), pp 353–367

  • DUC (2002) DUC-Document Understanding Conferences, http://duc.nist.gov

  • Harris Z (1985) Distributional Structure. In: Katz JJ (ed) The Philosophy of Linguistics. New York: Oxford University Press, pp 26–47

  • Larocca Neto J, Santos AD, Kaestner CAA, Freitas AA (2000b) Document clustering and text summarization. In Proc. 4th Int. Conf. Practical Applications of Knowledge Discovery and Data Mining (PADD-2000), London: The Practical Application Company, pp 41–55

  • Lo SH, Meng H, Lam W (2002) “Automatic Bilingual Text Document Summarization,” Proceedings of the Sixth World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA

  • Luhn H (1958) The automatic creation of literature abstracts. IBM Journal of Research and Development, 2(2):159–165

  • Mani I, Maybury MT (eds) (1999) Advances in Automatic Text Summarization, Cambridge, MA: The MIT Press

  • Pantel P, Lin D (2002) “Discovering Word Senses from Text,” In Conference on Knowledge Discovery and Data Mining, Alberta, Canada

  • Salton G, Buckley C (1988) Term-weighting approaches in automatic text retrieval. Information Processing and Management 24, pp 513–523. Reprinted in: Sparck Jones K. and Willett P. (eds) (1997) Readings in Information Retrieval, Morgan Kaufmann, pp 323–328

  • Yohei S (2002) “Sentence Extraction by tf/idf and Position Weighting from Newspaper Articles (TSC-8),” NTCIR Workshop 3 Meeting TSC, pp 55–59

  • Zadeh LA (1965) “Fuzzy Sets,” Information and Control, vol. 8, pp 338–353

Copyright information

© 2006 Springer

About this paper

Cite this paper

Azmi-Murad, M., Martin, T. (2006). Sentence Extraction Using Asymmetric Word Similarity and Topic Similarity. In: Abraham, A., de Baets, B., Köppen, M., Nickolay, B. (eds) Applied Soft Computing Technologies: The Challenge of Complexity. Advances in Soft Computing, vol 34. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31662-0_39

  • DOI: https://doi.org/10.1007/3-540-31662-0_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31649-7

  • Online ISBN: 978-3-540-31662-6

  • eBook Packages: Engineering, Engineering (R0)