DOI: 10.1145/3459637.3482040

Learning to Quantify: Methods and Applications (LQ 2021)

Published: 30 October 2021

Abstract

Learning to Quantify (LQ) is the task of training class prevalence estimators via supervised learning. Given an unlabelled set of data items D and a set of classes C = {c_1, ..., c_|C|}, these estimators must estimate the prevalence (i.e., relative frequency) of each class c_i in D. LQ is of interest in all applications of classification in which the final goal is not to determine which class (or classes) individual unlabelled data items belong to, but to estimate the distribution of the unlabelled data items across the classes of interest. Disciplines whose interest in labelling data items lies at the aggregate level (rather than at the individual level) include the social sciences, political science, market research, ecological modelling, and epidemiology. While LQ may in principle be solved by classifying each data item in D and counting how many items have been labelled with c_i, it has been shown that this "classify and count" (CC) method yields suboptimal quantification accuracy. As a result, quantification is no longer considered a mere byproduct of classification and has evolved into a task of its own. The goal of this workshop is to bring together researchers interested in methods, algorithms, evaluation measures, and evaluation methodologies for LQ, as well as practitioners interested in applying LQ to the management of large quantities of data.
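
The "classify and count" baseline mentioned above can be illustrated with a minimal Python sketch. This is not code from the workshop; the function names (`classify_and_count`, `expected_cc_estimate`) and the error rates used are hypothetical, chosen only for illustration. The second function assumes a binary classifier whose true-positive rate and false-positive rate stay fixed while the class prevalence in D changes. Under that assumption, the expected CC estimate is tpr * p + fpr * (1 - p), which generally differs from the true prevalence p.

```python
from collections import Counter


def classify_and_count(predicted_labels, classes):
    """CC: estimate each class prevalence as the fraction of items assigned to it."""
    if not predicted_labels:
        raise ValueError("cannot estimate prevalence on an empty set")
    counts = Counter(predicted_labels)
    n = len(predicted_labels)
    return {c: counts[c] / n for c in classes}


def expected_cc_estimate(true_prevalence, tpr, fpr):
    """Expected CC estimate for the positive class of a binary classifier.

    Assumes (for illustration) that tpr and fpr do not change when the
    prevalence of the positive class changes.
    """
    return tpr * true_prevalence + fpr * (1.0 - true_prevalence)


# Hypothetical classifier output for ten unlabelled items.
predictions = ["pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg", "neg"]
cc = classify_and_count(predictions, classes=["pos", "neg"])

# Bias of CC at two different true prevalences, with assumed tpr=0.8, fpr=0.1.
bias_low = expected_cc_estimate(0.1, tpr=0.8, fpr=0.1) - 0.1
bias_high = expected_cc_estimate(0.9, tpr=0.8, fpr=0.1) - 0.9
```

With these assumed error rates, CC overestimates a rare positive class and underestimates a frequent one, which is one way to see why counting classifier decisions is not in general an accurate quantifier.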


Cited By

  • (2024) Matching Distributions Algorithms Based on the Earth Mover’s Distance for Ordinal Quantification. IEEE Transactions on Neural Networks and Learning Systems 35:1 (1050-1061). DOI: 10.1109/TNNLS.2022.3179355. Online publication date: Jan-2024
  • (2022) A Concise Overview of LeQua@CLEF 2022: Learning to Quantify. Experimental IR Meets Multilinguality, Multimodality, and Interaction (362-381). DOI: 10.1007/978-3-031-13643-6_23. Online publication date: 5-Sep-2022
  • (2022) LeQua@CLEF2022: Learning to Quantify. Advances in Information Retrieval (374-381). DOI: 10.1007/978-3-030-99739-7_47. Online publication date: 10-Apr-2022

    Published In

    CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
    October 2021
    4966 pages
    ISBN:9781450384469
    DOI:10.1145/3459637
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. dataset shift
    2. quantification

    Qualifiers

    • Abstract

    Funding Sources

    • H2020 Programme ICT-48-202
    • Ministerio de Economía y Competitividad
    • H2020 Programme INFRAIA-2019-1

    Conference

    CIKM '21
    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
