Text Quantification

  • Conference paper
Advances in Information Retrieval (ECIR 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8416)

Abstract

In recent years it has been pointed out that, in a number of applications involving classification, the final goal is not determining which class (or classes) individual unlabelled data items belong to, but determining the prevalence (or “relative frequency”) of each class in the unlabelled data. The latter task has come to be known as quantification [1, 3, 5–10, 15, 18, 19].
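
To make the distinction concrete, the following minimal sketch computes class prevalence for a small batch of items. It is an illustration only and is not taken from the paper: the function name `class_prevalence`, the class names, and the sample labels are all hypothetical. It uses the naive approach of labelling each item and then counting, which assumes the per-item labels are reliable; the quantification literature cited above studies estimators for the case where they are not.

```python
from collections import Counter


def class_prevalence(labels, classes):
    """Return the relative frequency of each class among `labels`.

    `labels` holds one (predicted) class label per unlabelled item;
    `classes` lists every class of interest, so that classes which
    never occur still get an explicit prevalence of 0.0.
    """
    if not labels:
        raise ValueError("need at least one item to estimate prevalence")
    counts = Counter(labels)
    total = len(labels)
    return {c: counts.get(c, 0) / total for c in classes}


# Hypothetical predicted labels for five unlabelled documents.
predicted = ["pos", "neg", "pos", "pos", "neu"]
prevalence = class_prevalence(predicted, classes=["pos", "neg", "neu"])
print(prevalence)
```

A classifier answers "which class does this document belong to?" for each item, whereas the dictionary returned here answers the aggregate question "what fraction of the documents belongs to each class?", which is the quantity of interest in quantification.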

References

  1. Baccianella, S., Esuli, A., Sebastiani, F.: Variable-constraint classification and quantification of radiology reports under the ACR Index. Expert Systems with Applications 40(9), 3441–3449 (2013)

  2. Bella, A., Ferri, C., Hernández-Orallo, J., Ramírez-Quintana, M.J.: Quantification via probability estimators. In: Proceedings of the 10th IEEE International Conference on Data Mining (ICDM 2010), pp. 737–742 (2010)

  3. Bella, A., Ferri, C., Hernández-Orallo, J., Ramírez-Quintana, M.J.: Aggregative quantification for regression. Data Mining and Knowledge Discovery 28(2), 475–518 (2014)

  4. Esuli, A., Sebastiani, F.: Machines that learn how to code open-ended survey data. International Journal of Market Research 52(6), 775–800 (2010)

  5. Esuli, A., Sebastiani, F.: Sentiment quantification. IEEE Intelligent Systems 25(4), 72–75 (2010)

  6. Esuli, A., Sebastiani, F.: Optimizing text quantifiers for multivariate loss functions. Technical Report 2013-TR-005, Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, IT (2013)

  7. Forman, G.: Counting positives accurately despite inaccurate classification. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 564–575. Springer, Heidelberg (2005)

  8. Forman, G.: Quantifying trends accurately despite classifier error and class imbalance. In: Proceedings of the 12th ACM International Conference on Knowledge Discovery and Data Mining (KDD 2006), Philadelphia, US, pp. 157–166 (2006)

  9. Forman, G.: Quantifying counts and costs via classification. Data Mining and Knowledge Discovery 17(2), 164–206 (2008)

  10. Forman, G., Kirshenbaum, E., Suermondt, J.: Pragmatic text mining: Minimizing human effort to quantify many issues in call logs. In: Proceedings of the 12th ACM International Conference on Knowledge Discovery and Data Mining (KDD 2006), Philadelphia, US, pp. 852–861 (2006)

  11. Gamon, M.: Sentiment classification on customer feedback data: Noisy data, large feature vectors, and the role of linguistic analysis. In: Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, CH, pp. 841–847 (2004)

  12. Giorgetti, D., Sebastiani, F.: Automating survey coding by multiclass text categorization techniques. Journal of the American Society for Information Science and Technology 54(14), 1269–1277 (2003)

  13. Hopkins, D.J., King, G.: A method of automated nonparametric content analysis for social science. American Journal of Political Science 54(1), 229–247 (2010)

  14. Kelly, M.G., Hand, D.J., Adams, N.M.: The impact of changing populations on classifier performance. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 1999), San Diego, US, pp. 367–371 (1999)

  15. Milli, L., Monreale, A., Rossetti, G., Giannotti, F., Pedreschi, D., Sebastiani, F.: Quantification trees. In: Proceedings of the 13th IEEE International Conference on Data Mining (ICDM 2013), Dallas, US, pp. 528–536 (2013)

  16. Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A., Lawrence, N.D.: Dataset shift in machine learning. The MIT Press, Cambridge (2009)

  17. Sammut, C., Harries, M.: Concept drift. In: Sammut, C., Webb, G.I. (eds.) Encyclopedia of Machine Learning, pp. 202–205. Springer, Heidelberg (2011)

  18. Tang, L., Gao, H., Liu, H.: Network quantification despite biased labels. In: Proceedings of the 8th Workshop on Mining and Learning with Graphs (MLG 2010), Washington, US, pp. 147–154 (2010)

  19. Xue, J.C., Weiss, G.M.: Quantification and semi-supervised classification methods for handling changes in class distribution. In: Proceedings of the 15th ACM International Conference on Knowledge Discovery and Data Mining (KDD 2009), Paris, FR, pp. 897–906 (2009)

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Sebastiani, F. (2014). Text Quantification. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_104

  • DOI: https://doi.org/10.1007/978-3-319-06028-6_104

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06027-9

  • Online ISBN: 978-3-319-06028-6

  • eBook Packages: Computer Science, Computer Science (R0)
