ABSTRACT
Community Question-Answering (CQA), where questions and answers are generated by peers, has become a popular method of information seeking in online environments. While the content repositories created through CQA sites have been widely used to support general-purpose tasks, using them as online digital libraries that support educational needs is an emerging practice. Horizontal CQA services, such as Yahoo! Answers, and vertical CQA services, such as Brainly, aim to help students improve their learning process by answering their educational questions. In these services, receiving a high-quality answer to a question is critical not only for user satisfaction, but also for supporting learning. However, questions are not necessarily answered by experts, and askers may lack the knowledge and skill to evaluate the quality of the answers they receive. This is problematic when students build their own knowledge base on inaccurate information or knowledge acquired from online sources. Using moderators could alleviate this problem, but a moderator's evaluation of answer quality may be inconsistent because it rests on subjective assessment. Employing human assessors may also be insufficient given the large amount of content available on a CQA site. To address these issues, we propose a framework for automatically assessing the quality of answers. We integrate four groups of features - personal, community-based, textual, and contextual - to build a classification model and determine what constitutes answer quality. To test this evaluation framework, we collected more than 10 million educational answers posted by more than 3 million users on Brainly's United States and Poland sites. Experiments on these datasets show that a model using Random Forest (RF) achieves more than 83% accuracy in identifying high-quality answers.
In addition, the findings indicate that personal and community-based features have more predictive power in assessing answer quality. Our approach also achieves high values on other key metrics, such as F1-score and area under the ROC curve. The work reported here can be useful in many other contexts where providing automatic quality assessment in a digital repository of textual information is paramount.
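The classification setup described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' actual pipeline: the feature names and the synthetic data below are assumptions standing in for the paper's personal, community-based, textual, and contextual feature groups and its binary answer-quality label.

```python
# Illustrative sketch of a Random Forest answer-quality classifier.
# All feature names and the synthetic data are assumptions; the paper's
# real features and Brainly data are not reproduced here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

rng = np.random.default_rng(42)
n = 2000

# Hypothetical feature groups (column blocks).
X = np.column_stack([
    rng.normal(size=(n, 3)),  # personal: e.g. answerer rank, tenure, points
    rng.normal(size=(n, 3)),  # community-based: e.g. thanks, comments, ratings
    rng.normal(size=(n, 2)),  # textual: e.g. answer length, readability
    rng.normal(size=(n, 2)),  # contextual: e.g. subject, response delay
])
# Synthetic binary "high-quality" label correlated with two features.
y = (X[:, 0] + X[:, 3] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

pred = clf.predict(X_te)
proba = clf.predict_proba(X_te)[:, 1]
print("accuracy:", accuracy_score(y_te, pred))
print("F1-score:", f1_score(y_te, pred))
print("ROC AUC: ", roc_auc_score(y_te, proba))
# Per-feature importances hint at which groups carry predictive power,
# mirroring the paper's comparison of feature-group contributions.
print("importances:", clf.feature_importances_.round(3))
```

On real data, the per-group importances would be the place to look for the paper's finding that personal and community-based features dominate.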
Index Terms: Evaluating the Quality of Educational Answers in Community Question-Answering