Causal Learning in Question Quality Improvement

Li, Yichuan; Guo, Ruocheng; Wang, Weiying; Liu, Huan

doi:10.1007/978-3-030-49556-5_20

Conference paper
First Online: 09 June 2020

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12093)

Abstract

To improve the quality of questions asked in community-based question answering forums, we create a new dataset from the website Stack Overflow. It contains three components: (1) context: the text features of questions; (2) treatment: categories of revision suggestions; and (3) outcome: the measure of question quality (e.g., the number of questions, upvotes or clicks). This dataset helps researchers develop causal inference models for two problems: (i) estimating the causal effects of the aforementioned treatments on the outcome and (ii) finding the optimal treatment for a question. Empirically, we performed experiments with three state-of-the-art causal effect estimation methods on the contributed dataset. In particular, we evaluated the optimal treatments recommended by these approaches by comparing them with the ground truth labels, i.e., the treatments (suggestions) provided by experts.
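The two problems named in the abstract can be illustrated with a minimal sketch. It is not one of the three estimation methods evaluated in the paper: it fits one simple linear outcome model per treatment and compares their predictions. The scalar context feature, the treatment names (`no_revision`, `add_code`, `clarify_title`), and the toy numbers are all hypothetical, and the approach assumes no unmeasured confounding, which may not hold for observational forum data.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return mean_y - slope * mean_x, slope


def fit_outcome_models(records):
    """Fit one outcome model per treatment from (context, treatment, outcome) triples."""
    by_treatment = defaultdict(lambda: ([], []))
    for x, t, y in records:
        by_treatment[t][0].append(x)
        by_treatment[t][1].append(y)
    return {t: fit_line(xs, ys) for t, (xs, ys) in by_treatment.items()}


def predict(models, x, t):
    """Predicted outcome for context x under treatment t."""
    intercept, slope = models[t]
    return intercept + slope * x


def effect(models, x, t, baseline):
    """Problem (i): estimated effect of treatment t relative to a baseline, at context x."""
    return predict(models, x, t) - predict(models, x, baseline)


def best_treatment(models, x):
    """Problem (ii): the treatment with the highest predicted outcome at context x."""
    return max(models, key=lambda t: predict(models, x, t))


# Hypothetical toy data: (context feature, treatment category, outcome).
records = (
    [(x, "no_revision", 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
    + [(x, "add_code", x) for x in (0.0, 1.0, 2.0, 3.0)]
    + [(x, "clarify_title", 3.0 - x) for x in (0.0, 1.0, 2.0, 3.0)]
)
models = fit_outcome_models(records)

print(best_treatment(models, 0.5))
print(effect(models, 2.5, "add_code", "no_revision"))
```

In this sketch the recommended treatment depends on the context, which is the reason to estimate effects per question rather than a single average effect. Recommendations produced this way could then be compared against expert-provided suggestions, as the abstract describes.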

Acknowledgement

This material is based upon work supported by ARO/ARL and the National Science Foundation (NSF) Grant #1610282, NSF #1909555.

Author information

Authors and Affiliations

Arizona State University, Tempe, AZ, 85281, USA
Yichuan Li, Ruocheng Guo, Weiying Wang & Huan Liu

Corresponding author

Correspondence to Yichuan Li.

Editor information

Editors and Affiliations

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Wanling Gao
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Jianfeng Zhan
School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
Geoffrey Fox
Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
Xiaoyi Lu
Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX, USA
Dan Stanzione

Cite this paper

Li, Y., Guo, R., Wang, W., Liu, H. (2020). Causal Learning in Question Quality Improvement. In: Gao, W., Zhan, J., Fox, G., Lu, X., Stanzione, D. (eds) Benchmarking, Measuring, and Optimizing. Bench 2019. Lecture Notes in Computer Science, vol 12093. Springer, Cham. https://doi.org/10.1007/978-3-030-49556-5_20

DOI: https://doi.org/10.1007/978-3-030-49556-5_20
Published: 09 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49555-8
Online ISBN: 978-3-030-49556-5
eBook Packages: Computer Science, Computer Science (R0)
