Assessing the Coverage of Data Collection Campaigns on Twitter: A Case Study

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2013 Workshops (OTM 2013)

Abstract

Online social networks provide a unique opportunity to access and analyze the reactions of people as real-world events unfold. The quality of any analysis task, however, depends on the appropriateness and quality of the collected data. Hence, given the spontaneous nature of user-generated content, as well as the high speed and large volume of data, it is important to carefully define a data-collection campaign about a topic or an event, in order to maximize its coverage (recall). Motivated by the development of a social-network data management platform, in this work we evaluate the coverage of data collection campaigns on Twitter. Using an adaptive language model, we estimate the coverage of a campaign with respect to the total number of relevant tweets. Our findings support the development of adaptive methods to account for unexpected real-world developments, and hence, to increase the recall of the data collection processes.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Plachouras, V., Stavrakas, Y., Andreou, A. (2013). Assessing the Coverage of Data Collection Campaigns on Twitter: A Case Study. In: Demey, Y.T., Panetto, H. (eds) On the Move to Meaningful Internet Systems: OTM 2013 Workshops. OTM 2013. Lecture Notes in Computer Science, vol 8186. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41033-8_76

  • DOI: https://doi.org/10.1007/978-3-642-41033-8_76

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41032-1

  • Online ISBN: 978-3-642-41033-8

  • eBook Packages: Computer Science, Computer Science (R0)
