Research Article
DOI: 10.1145/3522749.3523085

Subjective Prediction of Questions in Q & A System based on the Open Domain of Daily Life

Published: 13 April 2022

Abstract

People and computers understand questions differently, and people have different needs for answers. For some questions, people may not want an objective answer but rather an open-ended, constructive opinion. This paper analyzes long and difficult questions in an open-domain question answering system and provides the system with effective information through subjective prediction. It uses pseudo-labeling and a blend of multiple pre-trained language models to improve the understanding of long and difficult question text. In addition, by designing a variety of subjective labels, the model's predictions of how subjective or objective a question is can supply useful information to the question answering system. Because there is currently no standard definition of subjective labels or of long and difficult question text, we conduct a subjective analysis of long-text questions based on 30 subjective labels for question sentences and on questions longer than 512 characters, using Spearman's rank correlation coefficient as the evaluation metric for model predictions. This work is the first to implement subjective prediction of long and difficult text in the open domain by designing 30 subjective labels.
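The abstract names pseudo-labeling as one of the two techniques. As a rough illustration of how such a step can work for a 30-way multi-label task, the Python sketch below binarizes only the confident sigmoid outputs of a model on unlabeled questions and masks the rest; the thresholds, array shapes, and function names are illustrative assumptions, not details taken from the paper.

```python
# Minimal pseudo-labeling sketch for a 30-label subjectivity task.
# Assumes `scores` are per-label sigmoid outputs of a model already
# trained on the labeled data; thresholds are illustrative.
import numpy as np

NUM_LABELS = 30  # the paper's 30 subjective question labels

def make_pseudo_labels(scores: np.ndarray, low: float = 0.1, high: float = 0.9):
    """Binarize confident entries; return pseudo-labels and a trust mask.

    scores: (num_unlabeled, NUM_LABELS) sigmoid outputs in [0, 1].
    The mask marks entries confident enough to train on; uncertain
    entries would get zero weight in the loss.
    """
    mask = (scores <= low) | (scores >= high)      # per-entry confidence
    pseudo = (scores >= high).astype(np.float32)   # confident positives -> 1
    return pseudo, mask

# Toy usage: U-shaped random scores stand in for real model outputs.
rng = np.random.default_rng(0)
scores = rng.beta(0.2, 0.2, size=(200, NUM_LABELS))
pseudo, mask = make_pseudo_labels(scores)
print(f"{mask.mean():.0%} of label entries are confident enough to reuse")
```

The confident entries would then be mixed into the labeled training set for another fine-tuning round, which is the usual pseudo-label loop.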
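The abstract also describes blending several pre-trained language models and scoring with Spearman's correlation. The sketch below shows one plausible reading: a weighted average of each model's per-label scores, evaluated by the mean column-wise Spearman coefficient. The weights, shapes, and the choice of a simple weighted average are assumptions, since the paper's exact blending scheme is not given here.

```python
# Hypothetical blending-and-evaluation sketch: average the per-label
# scores of several fine-tuned models, then report the mean Spearman
# rank correlation over the 30 label columns.
import numpy as np
from scipy.stats import spearmanr

NUM_LABELS = 30

def blend(model_scores: list, weights=None) -> np.ndarray:
    """Weighted average over models of (num_questions, NUM_LABELS) scores."""
    stacked = np.stack(model_scores)                    # (M, N, 30)
    if weights is None:
        weights = np.full(len(model_scores), 1.0 / len(model_scores))
    return np.tensordot(np.asarray(weights), stacked, axes=1)  # (N, 30)

def mean_spearman(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Average Spearman coefficient across label columns (NaN-safe)."""
    cols = [spearmanr(y_true[:, i], y_pred[:, i])[0]
            for i in range(y_true.shape[1])]
    return float(np.nanmean(cols))

# Toy usage: random arrays stand in for two models' sigmoid outputs.
rng = np.random.default_rng(1)
y_true = rng.random((100, NUM_LABELS))
bert_like = rng.random((100, NUM_LABELS))
roberta_like = rng.random((100, NUM_LABELS))
print(f"mean Spearman: {mean_spearman(y_true, blend([bert_like, roberta_like])):.3f}")
```

In practice the blend weights would be tuned on a validation split against the same mean-Spearman metric.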



Published In

CCEAI '22: Proceedings of the 6th International Conference on Control Engineering and Artificial Intelligence
March 2022
130 pages
ISBN: 9781450385916
DOI: 10.1145/3522749

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. natural language processing
  2. pre-trained language model
  3. question analysis
  4. subjective prediction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CCEAI 2022
