Blog Mining for the Fortune 500

Geller, James; Parikh, Sapankumar; Krishnan, Sriram

doi:10.1007/978-3-540-73499-4_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4571)

Included in the following conference series:

International Workshop on Machine Learning and Data Mining in Pattern Recognition

Abstract

In recent years the number of users maintaining blogs on the Internet has increased tremendously. Companies in particular have become aware of this medium of communication and have taken a keen interest in what is being said about them in personal blogs. This has given rise to a new field of research directed towards mining useful information from the large amount of unformatted data present in online blogs and forums. We discuss an implementation of such a blog mining application. The application is broadly divided into two parts: the indexing process and the search module. In the indexing process, blogs pertaining to different organizations are fetched from a particular blog domain on the Internet. The textual content of each blog is analyzed and the blog is assigned a sentiment rating. Specific data from each blog, along with its sentiment rating, are then indexed on the physical hard drive. At run time, the search module searches these indexes for the input organization name and produces a list of blogs conveying positive and negative sentiments about the organization.
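
The two-part design described in the abstract (an indexing process that attaches a sentiment rating to each blog, and a search module that looks up an organization name) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the word lists, the names `BlogIndex`, `rate_sentiment`, and `tokenize`, and the example URLs are hypothetical, the index is held in memory rather than written to disk, and the word-counting sentiment score merely stands in for whatever rating method the chapter actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical word lists; the chapter's real sentiment method is not given here.
POSITIVE = {"great", "love", "excellent", "reliable"}
NEGATIVE = {"terrible", "hate", "broken", "awful"}


@dataclass
class BlogEntry:
    url: str
    text: str
    sentiment: int  # > 0 positive, < 0 negative, 0 neutral


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(".,!?;:\"'()").lower() for w in text.split())
    return [w for w in words if w]


def rate_sentiment(text):
    """Assumed rating: count of positive words minus count of negative words."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


class BlogIndex:
    """In-memory inverted index; a real system would persist this to disk."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> ids of entries containing it
        self._entries = []

    def add(self, url, text):
        """Indexing step: rate the blog, store it, and index its tokens."""
        entry_id = len(self._entries)
        self._entries.append(BlogEntry(url, text, rate_sentiment(text)))
        for token in tokenize(text):
            self._postings[token].add(entry_id)

    def search(self, organization):
        """Search step: return URLs of blogs mentioning every word of the name,
        split into positively and negatively rated lists."""
        result = {"positive": [], "negative": []}
        terms = tokenize(organization)
        if not terms:
            return result
        ids = set.intersection(*(self._postings.get(t, set()) for t in terms))
        for entry_id in sorted(ids):
            entry = self._entries[entry_id]
            if entry.sentiment > 0:
                result["positive"].append(entry.url)
            elif entry.sentiment < 0:
                result["negative"].append(entry.url)
        return result


index = BlogIndex()
index.add("http://example.com/a", "I love my new Acme phone, it is great!")
index.add("http://example.com/b", "Acme support was terrible and the app is broken.")
index.add("http://example.com/c", "Globex is excellent")
result = index.search("Acme")
```

In this sketch the rating is computed once at indexing time and stored with the entry, so the search step only has to look up the organization name and partition the matches by the sign of the stored rating.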

Author information

Authors and Affiliations

College of Computing Sciences, Department of Computer Sciences, New Jersey Institute of Technology, Newark, NJ 07102,
James Geller, Sapankumar Parikh & Sriram Krishnan

Editor information

Petra Perner

Cite this paper

Geller, J., Parikh, S., Krishnan, S. (2007). Blog Mining for the Fortune 500. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2007. Lecture Notes in Computer Science, vol 4571. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73499-4_29

Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73498-7
Online ISBN: 978-3-540-73499-4
eBook Packages: Computer Science, Computer Science (R0)