Skip to main content

WISE 2014 Challenge: Multi-label Classification of Print Media Articles to Topics

  • Conference paper
Book cover Web Information Systems Engineering – WISE 2014 (WISE 2014)

Abstract

The WISE 2014 challenge was concerned with the task of multi-label classification of articles coming from Greek print media. Raw data comes from the scanning of print media, article segmentation, and optical character segmentation, and therefore is quite noisy. Each article is examined by a human annotator and categorized to one or more of the topics being monitored. Topics range from specific persons, products, and companies that can be easily categorized based on keywords, to more general semantic concepts, such as environment or economy. Building multi-label classifiers for the automated annotation of articles into topics can support the work of human annotators by suggesting a list of all topics by order of relevance, or even automate the annotation process for media and/or categories that are easier to predict. This saves valuable time and allows a media monitoring company to expand the portfolio of media being monitored. This paper summarizes the approaches of the top 4 among the 121 teams that participated in the competition.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Tsoumakas, G., Katakis, I., Vlahavas, I.: Mining multi-label data. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, 2nd edn., pp. 667–685. Springer, Heidelberg (2010)

    Google Scholar 

  2. Wolpert, D.H.: Stacked generalization. Neural Networks 5, 241–259 (1992)

    Article  Google Scholar 

  3. Lesk, M.E.: Word-word associations in document retrieval systems. American Documentation 20(1), 27–38 (1969)

    Article  Google Scholar 

  4. Sill, J., Takács, G., Mackey, L., Lin, D.: Feature-weighted linear stacking. CoRR abs/0911.0460 (2009)

    Google Scholar 

  5. Puurula, A., Bifet, A.: Ensembles of sparse multinomial classifiers for scalable text classification. In: ECML/PKDD - PASCAL Workshop on Large-Scale Hierarchical Classification (2012)

    Google Scholar 

  6. Zhang, T.: Solving large scale linear prediction problems using stochastic gradient descent algorithms. In: Proceedings of the Twenty-First International Conference on Machine Learning, ICML 2004, p. 116. ACM, New York (2004)

    Google Scholar 

  7. Zadrozny, B., Elkan, C.: Transforming classifier scores into accurate multiclass probability estimates. In: Proceedings of the eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2002, pp. 694–699 (2002)

    Google Scholar 

  8. Puurula, A., Read, J., Bifet, A.: Kaggle LSHTC4 winning solution. CoRR abs/1405.0546 (2014)

    Google Scholar 

  9. Elisseeff, A., Weston, J.: A kernel method for multi-labelled classification. In: Advances in Neural Information Processing Systems 14 (2002)

    Google Scholar 

  10. Nam, J., Kim, J., Gurevych, I., Fürnkranz, J.: Large-scale multi-label text classification - revisiting neural networks. CoRR abs/1312.5419 (2013)

    Google Scholar 

  11. Domingos, P.: The role of occam’s razor in knowledge discovery. Data Min. Knowl. Discov. 3(4), 409–425 (1999)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Tsoumakas, G. et al. (2014). WISE 2014 Challenge: Multi-label Classification of Print Media Articles to Topics. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8787. Springer, Cham. https://doi.org/10.1007/978-3-319-11746-1_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-11746-1_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11745-4

  • Online ISBN: 978-3-319-11746-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics