An Introduction to Text Mining

Mining Text Data

Abstract

Text mining has gained increasing attention in recent years because of the large amounts of text data created in social network, web, and other information-centric applications. Unstructured data is the easiest form of data to create in any application scenario. As a result, there is a pressing need for methods and algorithms that can effectively process text across a wide variety of applications. This book provides an overview of the methods and algorithms common in the text domain, with a particular focus on mining methods.

Corresponding author

Correspondence to Charu C. Aggarwal.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Aggarwal, C.C., Zhai, C. (2012). An Introduction to Text Mining. In: Aggarwal, C., Zhai, C. (eds) Mining Text Data. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-3223-4_1

  • DOI: https://doi.org/10.1007/978-1-4614-3223-4_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4614-3222-7

  • Online ISBN: 978-1-4614-3223-4

  • eBook Packages: Computer Science, Computer Science (R0)
