Challenges in Applying Machine Learning to Media Monitoring

  • Conference paper

Abstract

The Gorkana Group provides high-quality media monitoring services to its clients. This paper describes an ongoing project aimed at increasing the amount of automation in Gorkana Group’s workflow through the application of machine learning and language processing technologies. It is important that Gorkana Group’s clients have a very high level of confidence that, if an article relevant to one of their briefs has been published, they will be shown it. However, delivering this high-quality service means that humans must read through very large quantities of data, only a small portion of which is typically deemed relevant. The challenge addressed by the work reported in this paper is how to achieve such high-quality media monitoring efficiently in the face of huge increases in the amount of data that needs to be monitored. This paper discusses some of the findings that have emerged during the early stages of the project. We show that, while machine learning can be applied successfully to this real-world business problem, the distinctive constraints of the task give rise to a number of interesting challenges.

Author information

Corresponding author

Correspondence to Matti Lyra.

Copyright information

© 2012 Springer-Verlag London

About this paper

Cite this paper

Lyra, M., Clarke, D., Morgan, H., Reffin, J., Weir, D. (2012). Challenges in Applying Machine Learning to Media Monitoring. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_32

  • DOI: https://doi.org/10.1007/978-1-4471-4739-8_32

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4738-1

  • Online ISBN: 978-1-4471-4739-8

  • eBook Packages: Computer Science, Computer Science (R0)
