High Quality Microblog Extraction Based on Multiple Features Fusion and Time-Frequency Transformation

  • Conference paper
Web Information Systems Engineering – WISE 2013 (WISE 2013)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8181))

Abstract

Online social media carries a massive number of messages relevant to social events. Some of them contain useful and meaningful information, while others might not be worth reading. In this paper, for a given social event, we focus on extracting high quality information from massive social media messages, where high quality information has valuable textual content, is widely propagated, and is posted by authoritative users. We propose an extraction framework that obtains high quality information by considering different social media features globally. Specifically, in order to reduce computing time and improve extraction precision, some important social media features are transformed into the wavelet domain and then fused to obtain a weighted ensemble value. A large-scale Sina microblog dataset is used to evaluate the framework's performance. Experimental results show that the proposed framework is effective at extracting high quality information.
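The fusion step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the abstract does not state them: the choice of a one-level Haar wavelet, the three feature names (taken loosely from the abstract's mention of content, propagation, and authority), the feature values, and the weights. It is not the authors' implementation.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """One-level orthonormal Haar transform of an even-length sequence."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the sequence from its coefficients."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


def fuse_features(features, weights):
    """Weighted fusion of per-message feature signals in the wavelet domain.

    `features` maps a feature name to one value per message;
    `weights` maps the same names to fusion weights (hypothetical values).
    """
    if set(features) != set(weights):
        raise ValueError("features and weights must use the same names")
    lengths = {len(values) for values in features.values()}
    if len(lengths) != 1:
        raise ValueError("all feature signals must have the same length")
    half = lengths.pop() // 2
    fused_approx = [0.0] * half
    fused_detail = [0.0] * half
    for name, values in features.items():
        approx, detail = haar_dwt(values)
        w = weights[name]
        for i in range(half):
            fused_approx[i] += w * approx[i]
            fused_detail[i] += w * detail[i]
    # Map the fused coefficients back to one ensemble score per message.
    return haar_idwt(fused_approx, fused_detail)


# Hypothetical feature values for four messages about one event.
features = {
    "content": [0.9, 0.2, 0.6, 0.1],
    "propagation": [0.8, 0.1, 0.3, 0.4],
    "authority": [0.7, 0.3, 0.9, 0.2],
}
weights = {"content": 0.5, "propagation": 0.3, "authority": 0.2}

scores = fuse_features(features, weights)
# Message indices ordered from highest to lowest ensemble score.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Because the Haar transform is linear and this sketch leaves the coefficients unmodified, the fused scores equal a plain weighted sum of the raw features. The sketch therefore shows only the data flow (transform, weight, fuse, invert); any processing of the wavelet coefficients that the paper applies before fusion is not described in the abstract and is not modeled here.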

This material is based on work supported by the National Science Foundation of China (NSFC) under Award 61070083, as well as the Key Technologies R&D Program of Wuhan under Award 201210421135.




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Peng, M. et al. (2013). High Quality Microblog Extraction Based on Multiple Features Fusion and Time-Frequency Transformation. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds) Web Information Systems Engineering – WISE 2013. WISE 2013. Lecture Notes in Computer Science, vol 8181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41154-0_14

  • DOI: https://doi.org/10.1007/978-3-642-41154-0_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41153-3

  • Online ISBN: 978-3-642-41154-0

  • eBook Packages: Computer Science, Computer Science (R0)
