ABSTRACT
Measuring the quality of research work is an essential component of the scientific process. With ever-growing submission rates at top-tier conferences, and the consistency and bias issues in peer review identified by the scientific community, automatically evaluating submissions is both necessary and challenging. Existing work mainly explores relevant factors and applies machine learning models that aim solely at accurately predicting the acceptance of a given academic paper, ignoring the interpretability required by a wide range of applications. In this paper, we propose a framework that constructs decision sets, collections of unordered if-then rules, for predicting paper acceptance. We formalize the decision set learning problem via a joint objective function that simultaneously optimizes the accuracy and interpretability of the rules, rather than organizing them in a hierarchy. We evaluate the effectiveness of the proposed framework on a public dataset of scientific peer reviews. Experimental results demonstrate that the interpretable decision sets learned by our framework perform on par with state-of-the-art classification algorithms that optimize exclusively for predictive accuracy, while being far more interpretable than existing rule-based methods.
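To make the notion of an unordered decision set concrete, the following is a minimal sketch of how such a model predicts acceptance. The rules, feature names, and default label below are hypothetical illustrations for exposition only, not rules learned by the proposed framework: every rule whose condition holds casts a vote, and the majority label (or a default, when no rule fires) is returned.

```python
# Minimal sketch of prediction with an unordered decision set.
# Rules and feature names are hypothetical, chosen only to illustrate
# the "unordered if-then rules" structure described in the abstract.

rules = [
    # (condition over paper features, predicted label)
    (lambda p: p["num_references"] >= 30 and p["abstract_length"] >= 150, "accept"),
    (lambda p: p["num_equations"] == 0 and p["num_figures"] <= 1, "reject"),
]
DEFAULT = "reject"  # fallback label when no rule applies

def predict(paper):
    # Unlike a decision list, rules are not ordered: all applicable
    # rules vote, and the most frequent label wins.
    votes = [label for cond, label in rules if cond(paper)]
    if not votes:
        return DEFAULT
    return max(set(votes), key=votes.count)

paper = {"num_references": 35, "abstract_length": 180,
         "num_equations": 4, "num_figures": 5}
print(predict(paper))  # → accept (only the first rule fires)
```

Because each rule is independently human-readable and no rule's meaning depends on the rules before it, the model stays interpretable; the learning problem is then to select a small set of such rules under the joint accuracy-interpretability objective.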
- Predicting Paper Acceptance via Interpretable Decision Sets