Selecting the most helpful answers in online health question answering communities

Journal of Intelligent Information Systems

Abstract

Online question answering (QA) communities have become popular in recent years. In this paper, we focus on online health question answering (HQA) communities, which provide a platform for health consumers to seek health information. There are two ways to use such a platform. One is to post a question and wait for answers from authenticated doctors. The other is to search for relevant questions that already have answers. For the latter, health consumers may prefer an answer that the original asker marked as accepted. However, a large proportion of questions have no accepted answer, which is inconvenient for people searching for relevant questions. To address this issue, we aim to select high-quality answers for questions that have no marked accepted answer. We propose a deep learning approach to achieve this goal. To train a model that predicts answer quality, we first treat the accepted answer as the positive answer and propose a method to label negative answers. Next, we capture the semantic information of the question and the answer with a deep learning structure. We then combine this information to predict the quality score of the answer. We collect data from one of the biggest Chinese HQA communities and divide the data into groups by medical department for detailed analysis. Finally, we conduct experiments to show the effectiveness of the categorization and the labeling method. The results show that our approach outperforms those of other studies, and we further investigate the differences among the results for different categories.
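As a rough illustration of the data preparation described in the abstract, the sketch below builds labeled (question, answer) pairs grouped by medical department. The abstract does not specify how negative answers are labeled, so this sketch substitutes a simple stand-in rule: within a question that has an accepted answer, every non-accepted answer is treated as negative. The class names, field names, and function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Answer:
    text: str
    accepted: bool = False  # True if the asker marked this answer as accepted


@dataclass
class Question:
    text: str
    department: str  # medical department used to group the data
    answers: List[Answer]


def build_training_pairs(
    questions: List[Question],
) -> Dict[str, List[Tuple[str, str, int]]]:
    """Return {department: [(question_text, answer_text, label), ...]}.

    Label 1 = accepted (positive) answer.
    Label 0 = non-accepted answer in the same thread. This negative rule
    is an ASSUMPTION for illustration, not the paper's labeling method.
    """
    pairs = defaultdict(list)
    for q in questions:
        # Only questions with an accepted answer can supply training labels.
        if not any(a.accepted for a in q.answers):
            continue
        for a in q.answers:
            pairs[q.department].append((q.text, a.text, 1 if a.accepted else 0))
    return dict(pairs)


def questions_to_score(questions: List[Question]) -> List[Question]:
    """Questions with no accepted answer: the targets for answer selection."""
    return [q for q in questions if not any(a.accepted for a in q.answers)]
```

A trained quality model would then score each answer of the questions returned by `questions_to_score`, and the highest-scoring answer would be surfaced as the most helpful one.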

Notes

  1. https://au.answers.yahoo.com/

  2. https://www.quora.com/

  3. https://stackoverflow.com/

  4. https://www.120ask.com/

  5. http://club.xywy.com/

  6. http://ask.39.net/

  7. https://zixun.haodf.com/

  8. https://github.com/fxsjy/jieba

Acknowledgements

This work was partially supported by the Ministry of Science and Technology, ROC (Grant Number: 109-2221-E-468-014-MY3).

Author information

Corresponding author

Correspondence to Arbee L. P. Chen.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

About this article

Cite this article

Lin, C.Y., Wu, YH. & Chen, A.L.P. Selecting the most helpful answers in online health question answering communities. J Intell Inf Syst 57, 271–293 (2021). https://doi.org/10.1007/s10844-021-00640-1
