Causal Learning in Question Quality Improvement

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12093)

Abstract

To improve the quality of questions asked in community-based question answering forums, we create a new dataset from the website Stack Overflow. The dataset contains three components: (1) context: the text features of questions; (2) treatment: categories of revision suggestions; and (3) outcome: the measure of question quality (e.g., the number of questions, upvotes or clicks). This dataset helps researchers develop causal inference models for solving two problems: (i) estimating the causal effects of the aforementioned treatments on the outcome and (ii) finding the optimal treatment for the questions. Empirically, we performed experiments with three state-of-the-art causal effect estimation methods on the contributed dataset. In particular, we evaluated the optimal treatments recommended by these approaches by comparing them with the ground-truth labels, that is, the treatments (suggestions) provided by experts.
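The two problems named in the abstract can be illustrated with a minimal sketch. It is not one of the methods evaluated in the paper: it fits one ridge regression per treatment (a simple approach sometimes called a T-learner) on synthetic data, then reads off (i) an estimated effect as the difference between predicted outcomes and (ii) a recommended treatment as the one with the highest predicted outcome. All names, the synthetic data, and the randomized treatment assignment are assumptions made for illustration; real observational data would also require handling confounding.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression; the last weight is a bias term."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(w, X):
    """Predict outcomes for contexts X with weights w (bias included)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def fit_outcome_models(X, t, y, n_treatments, lam=1.0):
    """Fit one outcome model per treatment on the units that received it."""
    return [fit_ridge(X[t == k], y[t == k], lam) for k in range(n_treatments)]


def potential_outcomes(models, X):
    """Predicted outcome of every unit under every treatment, shape (n, k)."""
    return np.column_stack([predict(w, X) for w in models])


def recommend_treatment(models, X):
    """Index of the treatment with the highest predicted outcome per unit."""
    return potential_outcomes(models, X).argmax(axis=1)


# Synthetic stand-in for (context, treatment, outcome); purely hypothetical.
rng = np.random.default_rng(0)
n, d, k = 600, 5, 3
X = rng.normal(size=(n, d))                 # context: question features
t = rng.integers(0, k, size=n)              # treatment: suggestion category
true_w = rng.normal(size=(k, d))            # hidden per-treatment response
y = np.einsum("ij,ij->i", X, true_w[t]) + 0.1 * rng.normal(size=n)  # outcome

models = fit_outcome_models(X, t, y, k)
y_hat = potential_outcomes(models, X)

# (i) Estimated individual effect of treatment 1 relative to treatment 0.
effect_1_vs_0 = y_hat[:, 1] - y_hat[:, 0]

# (ii) Recommended treatment, compared with the synthetic ground truth.
best = recommend_treatment(models, X)
oracle = (X @ true_w.T).argmax(axis=1)
accuracy = (best == oracle).mean()
```

Comparing `best` with `oracle` mirrors, on toy data, the evaluation described in the abstract, where recommended treatments are compared with expert-provided suggestions.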

Acknowledgement

This material is based upon work supported by ARO/ARL and the National Science Foundation (NSF) Grant #1610282, NSF #1909555.

Author information

Corresponding author

Correspondence to Yichuan Li.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Y., Guo, R., Wang, W., Liu, H. (2020). Causal Learning in Question Quality Improvement. In: Gao, W., Zhan, J., Fox, G., Lu, X., Stanzione, D. (eds) Benchmarking, Measuring, and Optimizing. Bench 2019. Lecture Notes in Computer Science, vol 12093. Springer, Cham. https://doi.org/10.1007/978-3-030-49556-5_20

  • DOI: https://doi.org/10.1007/978-3-030-49556-5_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49555-8

  • Online ISBN: 978-3-030-49556-5

  • eBook Packages: Computer Science, Computer Science (R0)
