Identifying Event-Specific Sources from Social Media

Online Social Media Analysis and Visualization

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

Social media has become an indispensable resource for coordinating real-life events because it offers instant access to a huge audience. Its participatory nature creates an environment highly conducive to sharing information, voicing opinions, and engaging in discussion. In contrast to the mainstream media, social media platforms often carry novel, specific information with intimate details about an event. This makes social media a valuable source for event analysis studies, and identifying quality sources on these sites is therefore of utmost importance for understanding and exploring an event. However, because of the power-law distribution of the Internet, social media sources get buried in the Long Tail, and their overwhelming number makes the valuable ones even harder to identify. We propose an evolutionary mutual reinforcement model for identifying and ranking highly ‘specific’ social media sources and ‘close’ entities related to an event. Because no ground truth is available, we provide a novel evaluation strategy for validating the model. Considering the top-ranked sources according to our model, we observe a substantial information gain (ranging between 25% and 130%) compared with the baselines (viz., Google search and Icerocket blog search). Moreover, our model ranks highly informative sources much higher than these widely used baselines do, putting a spotlight on social media sources that could otherwise be easily overlooked. Our model further affords an apparatus for analyzing events at micro and macro scales. Data for the research were collected from various blogging platforms, such as Blogger (hosted at blogspot), LiveJournal, WordPress, and Typepad, and will be made publicly available to researchers.

Acknowledgments

This research is funded in part by the National Science Foundation’s Social Computational Systems (SoCS) and Human-Centered Computing (HCC) research programs within the Directorate for Computer and Information Science and Engineering’s (CISE) Division of Information and Intelligent Systems (IIS) (Award Numbers: IIS-1110868 and IIS-1110649) and by the US Office of Naval Research (Grant Numbers: N000141010091 and N000141410489). We thank the Advances in Social Network Analysis and Mining (ASONAM) 2013 conference chairs for inviting us to develop our research further and submit it to this publication. We are also grateful to the anonymous reviewers for their invaluable comments.

Author information

Corresponding author

Correspondence to Nitin Agarwal.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Mahata, D., Agarwal, N. (2014). Identifying Event-Specific Sources from Social Media. In: Kawash, J. (eds) Online Social Media Analysis and Visualization. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-13590-8_1

  • DOI: https://doi.org/10.1007/978-3-319-13590-8_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13589-2

  • Online ISBN: 978-3-319-13590-8

  • eBook Packages: Computer Science, Computer Science (R0)
