Abstract
The increasingly rapid spread of information about COVID-19 on the web calls for automatic credibility assessment [18]. If large parts of the population are expected to act responsibly during a pandemic, they need information they can trust [20].
In that context, we model the credibility of texts using 25 linguistic phenomena, such as spelling, sentiment, and lexical diversity. We integrate these measures into a graphical interface and present two empirical studies that evaluate its usability for credibility assessment of COVID-19 news. Raw data for the studies, including all questions and responses, has been made publicly available under an open license: https://github.com/konstantinschulz/credible-covid-ux. The user interface prominently features three sub-scores and an aggregated score for a quick overview. In addition, metadata about the concept, authorship, and infrastructure of the underlying algorithm is provided explicitly.
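To make this kind of phenomenon-based scoring concrete, the following minimal Python sketch computes three toy sub-scores (spelling, sentiment neutrality, lexical diversity) and a weighted aggregate. The heuristics, weights, and function names are our own illustrative assumptions, not the implementation evaluated in the paper.

```python
import re

def tokenize(text: str) -> list[str]:
    return re.findall(r"\w+", text.lower())

def spelling_score(text: str, vocabulary: set[str]) -> float:
    """Share of tokens found in a reference vocabulary (toy spell check)."""
    tokens = tokenize(text)
    return sum(t in vocabulary for t in tokens) / len(tokens) if tokens else 0.0

def sentiment_neutrality(text: str, polarity: dict[str, float]) -> float:
    """1.0 for neutral wording, lower for strongly polarized wording."""
    scores = [polarity[t] for t in tokenize(text) if t in polarity]
    mean = sum(scores) / len(scores) if scores else 0.0
    return 1.0 - abs(mean)  # polarity values assumed to lie in [-1, 1]

def lexical_diversity(text: str) -> float:
    """Type-token ratio: unique tokens divided by all tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def credibility(text: str, vocabulary: set[str], polarity: dict[str, float],
                weights: tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    """Weighted aggregate of the three sub-scores, in [0, 1]."""
    subs = (spelling_score(text, vocabulary),
            sentiment_neutrality(text, polarity),
            lexical_diversity(text))
    return sum(w * s for w, s in zip(weights, subs))
```

An aggregate of 0.83 from such a model would correspond to the "83%" overall score surfaced in the interface.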
Our working definition of credibility is operationalized in terms of trustworthiness, understandability, transparency, and relevance. Each of these builds on well-established scientific notions [41, 65, 68] and is explained to participants orally or through Likert scales.
In a moderated qualitative interview with six participants, we introduce information transparency for news about COVID-19 as the general goal of a prototypical platform, made accessible through an interface in the form of a wireframe [43]. The participants’ answers are transcribed in excerpts. We then triangulate inductive and deductive coding methods [19] to analyze their content. As a result, we identify the rating scale, sub-criteria, and algorithm authorship as important predictors of usability.
In a subsequent quantitative online survey, we present a questionnaire with wireframes to 50 crowdworkers. The question formats include Likert scales, multiple choice, and open-ended questions; this way, we aim to strike a balance between the known strengths and weaknesses of open and closed questions [11]. The answers reveal a conflict between transparency and conciseness in interface design: users tend to ask for more information, but do not necessarily make explicit use of it when it is given. This discrepancy is influenced by the capacity constraints of human working memory [38]. Moreover, a perceived hierarchy of metadata becomes apparent: the authorship of a news text is more important than the authorship of the algorithm used to assess its credibility.
From the first to the second study, we observe improved usability of the aggregated credibility score’s scale. We attribute that change to the conceptual introduction given before participants see the actual interface, as well as to the simplified binary indicators with direct visual support. Sub-scores need to be handled similarly if they are to contribute meaningfully to the overall credibility assessment.
By integrating detailed information about the employed algorithm, we are able to dispel the users’ doubts about its anonymity and possible hidden agendas. However, the overall transparency can only be increased if other, more important factors, such as the source of the news article, are provided as well. Knowledge about this interaction enables software designers to build useful prototypes with a strong focus on the most important elements of credibility: the source of both the text and the algorithm, as well as the algorithm’s distribution and composition.
All in all, the understandability of our interface was rated as acceptable (78% of responses being neutral or positive), while transparency (70%) and relevance (72%) still lag behind. This discrepancy is closely related to the missing article metadata and the lack of more meaningful, visually supported explanations of the credibility sub-scores.
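The reported shares can be recomputed directly from the raw Likert data in the repository. A minimal sketch, assuming 5-point scales where 3 marks the neutral midpoint; the function name and sample ratings are hypothetical:

```python
def share_neutral_or_positive(ratings: list[int], midpoint: int = 3) -> float:
    """Fraction of Likert ratings at or above the neutral midpoint."""
    return sum(r >= midpoint for r in ratings) / len(ratings)

# Hypothetical data: 50 understandability ratings on a 1-5 scale.
ratings = [5, 4, 4, 3, 2, 5, 3, 4, 1, 4] * 5
print(f"{share_neutral_or_positive(ratings):.0%}")  # -> 80%
```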
The insights from our studies lead to a better understanding of the amount, sequence and relation of information that needs to be provided in interfaces for credibility assessment. In particular, our integration of software metadata contributes to the more holistic notion of credibility [47, 72] that has become popular in recent years. Moreover, it paves the way for a more thoroughly informed interaction between humans and machine-generated assessments, anticipating users’ doubts and concerns [39] in the early stages of the software design process [37].
Finally, we make suggestions for future research, such as proactively documenting credibility-related metadata for Natural Language Processing and Language Technology services and establishing an explicit hierarchical taxonomy of usability predictors for automatic credibility assessment.
Notes
9. See the file ‘210702_Panqura_Testdesign_P02.pdf’ at https://github.com/konstantinschulz/credible-covid-ux/blob/main/1st-usability-study/210702_Panqura_Testdesign_P02.pdf.
References
Aksenov, D., Bourgonje, P., Zaczynska, K., Ostendorff, M., Moreno-Schneider, J., Rehm, G.: Fine-grained classification of political bias in German news: a data set and initial experiments. In: Mostafazadeh Davani, A., Kiela, D., Lambert, M., Vidgen, B., Prabhakaran, V., Waseem, Z. (eds.) Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021), pp. 121–131. Association for Computational Linguistics, Bangkok, Thailand (2021)
Allred, S.R., Crawford, L.E., Duffy, S., Smith, J.: Working memory and spatial judgments: cognitive load increases the central tendency bias. Psychon. Bull. Rev. 23(6), 1825–1831 (2016). https://doi.org/10.3758/s13423-016-1039-0
Amit Aharon, A., Ruban, A., Dubovi, I.: Knowledge and information credibility evaluation strategies regarding COVID-19: a cross-sectional study. Nurs. Outlook 69(1), 22–31 (2021)
Atanasova, P., Simonsen, J.G., Lioma, C., Augenstein, I.: Generating fact checking explanations. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7352–7364. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.656
Augenstein, I.: Determining the credibility of science communication. In: Proceedings of the Second Workshop on Scholarly Document Processing, pp. 1–6. Association for Computational Linguistics (2021)
Bannon, L.J., Ehn, P.: Design matters in participatory design. In: Simonsen, J., Robertson, T. (eds.) Routledge International Handbook of Participatory Design, vol. 711, pp. 37–63. Routledge, London & New York (2013)
Berndt, E., Furniss, D., Blandford, A.: Learning contextual inquiry and distributed cognition: a case study on technology use in anaesthesia. Cogn. Technol. Work 17(3), 431–449 (2015)
Budiu, R., Moran, K.: How many participants for quantitative usability studies: a summary of sample-size recommendations (2021). www.nngroup.com/articles/summary-quant-sample-sizes/
Chen, Z., Freire, J.: Discovering and measuring malicious URL redirection campaigns from fake news domains. In: 2021 IEEE Security and Privacy Workshops (SPW), pp. 1–6. IEEE, San Francisco (2021)
Cohn, M.: Succeeding with agile: software development using scrum. Pearson Education, Ann Arbor (2010)
Connor Desai, S., Reimers, S.: Comparing the use of open and closed questions for Web-based measures of the continued-influence effect. Behav. Res. Methods 51(3), 1426–1440 (2018). https://doi.org/10.3758/s13428-018-1066-z
Crosetto, P., Filippin, A., Katuščák, P., Smith, J.: Central tendency bias in belief elicitation. J. Econ. Psychol. 78, 102273 (2020)
Das, S.D., Basak, A., Dutta, S.: A heuristic-driven ensemble framework for COVID-19 fake news detection. In: Chakraborty, T., Shu, K., Bernard, H.R., Liu, H., Akhtar, M.S. (eds.) CONSTRAINT 2021. CCIS, vol. 1402, pp. 164–176. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-73696-5_16
De Grandis, M., Pasi, G., Viviani, M.: Multi-criteria decision making and supervised learning for fake news detection in microblogging. In: Workshop on Reducing Online Misinformation Exposure, pp. 1–8. ACM, Paris, France (2019)
DeVerna, M.R., et al.: CoVaxxy: a collection of English-language Twitter posts about COVID-19 vaccines. In: Proceedings of the Fifteenth International AAAI Conference on Web and Social Media (ICWSM 2021), pp. 992–999. AAAI, Virtual (2021)
Dutta, B., DeBellis, M.: CODO: an ontology for collection and analysis of COVID-19 data. In: Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, pp. 76–85. SCITEPRESS - Science and Technology Publications, Budapest, Hungary (2020). https://doi.org/10.5220/0010112500760085
Elias, S.M., Smith, W.L., Barney, C.E.: Age as a moderator of attitude towards technology in the workplace: work motivation and overall job satisfaction. Behav. Inf. Technol. 31(5), 453–467 (2012)
Fairbanks, J., Fitch, N., Knauf, N., Briscoe, E.: Credibility assessment in the news: do we need to read? In: Proceedings of the MIS2 Workshop Held in Conjunction with 11th International Conference on Web Search and Data Mining, pp. 1–8. ACM, Marina Del Rey (2018)
Fereday, J., Muir-Cochrane, E.: Demonstrating rigor using thematic analysis: a hybrid approach of inductive and deductive coding and theme development. Int. J. Qual. Methods 5(1), 80–92 (2006)
Gallotti, R., Valle, F., Castaldo, N., Sacco, P., De Domenico, M.: Assessing the risks of ‘infodemics’ in response to COVID-19 epidemics. Nat. Hum. Behav. 4(12), 1285–1293 (2020)
Giachanou, A., Rosso, P., Crestani, F.: The impact of emotional signals on credibility assessment. J. Am. Soc. Inf. Sci. 72(9), 1117–1132 (2021). https://doi.org/10.1002/asi.24480
Gothelf, J., Seiden, J.: Lean UX: designing great products with agile teams. O’Reilly Media Inc, Sebastopol (2016)
He, Y., et al.: CIDO, a community-based ontology for coronavirus disease knowledge and data integration, sharing, and analysis. Scientific Data 7(1), 181 (2020)
Hettrick, S.: Research software sustainability: report on a knowledge exchange workshop. Tech. rep., The Software Sustainability Institute (2016)
Houy, C., Fettke, P., Loos, P.: Understanding understandability of conceptual models-what are we actually talking about? In: Atzeni, P., Cheung, D., Ram, S. (eds.) ER 2012. LNCS, vol. 7532, pp. 64–77. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34002-4_5
Jahanbakhsh, F., Zhang, A.X., Berinsky, A.J., Pennycook, G., Rand, D.G., Karger, D.R.: Exploring lightweight interventions at posting time to reduce the sharing of misinformation on social media. Proc. ACM Hum.-Comput. Interact. 5(CSCW1), 1–42 (2021)
Jiang, Y., Bordia, S., Zhong, Z., Dognin, C., Singh, M., Bansal, M.: HoVer: a dataset for many-hop fact extraction and claim verification. In: Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 3441–3460. Association for Computational Linguistics, Online (2020). https://doi.org/10.18653/v1/2020.findings-emnlp.309
Jureta, I.J., Herssens, C., Faulkner, S.: A comprehensive quality model for service-oriented systems. Software Qual. J. 17(1), 65–98 (2009)
Kagolovsky, Y., Möhr, J.R.: A new approach to the concept of “relevance” in information retrieval (IR). In: MEDINFO 2001, pp. 348–352. IOS Press, Amsterdam (2001)
Kakol, M., Nielek, R., Wierzbicki, A.: Understanding and predicting Web content credibility using the Content Credibility Corpus. Inf. Process. Manage. 53(5), 1043–1061 (2017)
Kang, H., Yang, J.: Quantifying perceived political bias of newspapers through a document classification technique. J. Quant. Linguist. 29(2), 1–24 (2020)
Karray, F., Alemzadeh, M., Abou Saleh, J., Arab, M.N.: Human-computer interaction: overview on state of the art. Int. J. Smart Sens. Intell. Syst. 1(1), 137–159 (2017)
Kautz, K.: Investigating the design process: participatory design in agile software development. Inf. Technol. People 24(3), 217–235 (2011)
Keller, F.B., Schoch, D., Stier, S., Yang, J.: Political astroturfing on Twitter: how to coordinate a disinformation campaign. Polit. Commun. 37(2), 256–280 (2020)
Kuusinen, K., Mikkonen, T., Pakarinen, S.: Agile user experience development in a large software organization: good expertise but limited impact. In: Winckler, M., Forbrig, P., Bernhaupt, R. (eds.) HCSE 2012. LNCS, vol. 7623, pp. 94–111. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34347-6_6
Labropoulou, P., et al.: Making metadata fit for next generation language technology platforms: the metadata schema of the European Language Grid. In: Proceedings of the 12th Language Resources and Evaluation Conference, pp. 3428–3437. European Language Resources Association, Marseille, France (2020)
Lee, G., Xia, W.: Toward agile: an integrated analysis of quantitative and qualitative field data on software development agility. MIS Q. 34(1), 87–114 (2010)
Ma, W.J., Husain, M., Bays, P.M.: Changing concepts of working memory. Nat. Neurosci. 17(3), 347–356 (2014). https://doi.org/10.1038/nn.3655
MacKenzie, I.S.: Human-computer interaction: an empirical research perspective. Newnes, Waltham (2012)
McGrew, S., Breakstone, J., Ortega, T., Smith, M., Wineburg, S.: Can students evaluate online sources? Learning from assessments of civic online reasoning. Theor. Res. Soc. Educ. 46(2), 165–193 (2018)
Michener, G., Bersch, K.: Identifying transparency. Inf. Polity 18(3), 233–242 (2013)
Nielsen, J.: Estimating the number of subjects needed for a thinking aloud test. Int. J. Hum Comput Stud. 41(3), 385–397 (1994)
Ozenc, F.K., Kim, M., Zimmerman, J., Oney, S., Myers, B.: How to support designers in getting hold of the immaterial material of software. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2513–2522. ACM, Atlanta (2010)
Pankovska, E., Schulz, K., Rehm, G.: Suspicious sentence detection and claim verification in the COVID-19 domain. In: Proceedings of the Workshop Reducing Online Misinformation through Credible Information Retrieval (ROMCIR 2022), CEUR-WS, Stavanger (2022)
Pasi, G., De Grandis, M., Viviani, M.: Decision making over multiple criteria to assess news credibility in microblogging sites. In: 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1–8. IEEE, Glasgow (2020)
Patwa, P., et al.: Fighting an Infodemic: COVID-19 fake news dataset. arXiv:2011.03327 (2021)
Przybyła, P., Soto, A.J.: When classification accuracy is not enough: explaining news credibility assessment. Inf. Process. Manage. 58(5), 102653 (2021)
Raison, C., Schmidt, S.: Keeping user centred design (UCD) alive and well in your organisation: taking an agile approach. In: Marcus, A. (ed.) DUXU 2013. LNCS, vol. 8012, pp. 573–582. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39229-0_61
Rehm, G.: An infrastructure for empowering internet users to handle fake news and other online media phenomena. In: Rehm, G., Declerck, T. (eds.) GSCL 2017. LNCS (LNAI), vol. 10713, pp. 216–231. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73706-5_19
Rehm, G., et al.: European Language Grid: an overview. In: Proceedings of the 12th Language Resources and Evaluation Conference, pp. 3366–3380. European Language Resources Association, Marseille, France (2020)
Rehm, G., et al.: European Language Grid: a joint platform for the European language technology community. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pp. 221–230 (2021)
Rehm, G., Schneider, J.M., Bourgonje, P.: Automatic and manual web annotations in an infrastructure to handle fake news and other online media phenomena. In: Calzolari, N., et al. (eds.) Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), pp. 2416–2422. European Language Resources Association (ELRA), Miyazaki, Japan (2018)
Rieger, J., von Nordheim, G.: Corona100d: German-language Twitter dataset of the first 100 days after Chancellor Merkel addressed the coronavirus outbreak on TV. Tech. rep., DoCMA Working Paper (2021)
Rieger, M.O., He-Ulbricht, Y.: German and Chinese dataset on attitudes regarding COVID-19 policies, perception of the crisis, and belief in conspiracy theories. Data Brief 33, 106384 (2020)
Rieh, S.Y.: Credibility assessment of online information in context. J. Inf. Sci. Theory Pract. 2(3), 6–17 (2014)
Rogers, A., Gardner, M., Augenstein, I.: QA dataset explosion: a taxonomy of NLP resources for question answering and reading comprehension. arXiv:2107.12708 (2021)
Saltz, E., Barari, S., Leibowicz, C., Wardle, C.: Misinformation interventions are common, divisive, and poorly understood. Harvard Kennedy School Misinf. Rev. 2(5), 1–25 (2021). https://doi.org/10.37016/mr-2020-81
Samimi, H., Hicks, R., Fogel, A., Millstein, T.: Declarative mocking. In: Proceedings of the 2013 International Symposium on Software Testing and Analysis, pp. 246–256. ACM, New York, NY (2013)
Sass, J., et al.: The German Corona Consensus Dataset (GECCO): a standardized dataset for COVID-19 research in university medicine and beyond. BMC Med. Inform. Decis. Mak. 20(1), 341 (2020)
Sauro, J., Lewis, J.R.: Quantifying the user experience: practical statistics for user research. Morgan Kaufmann, Cambridge, MA (2016)
Solis, C., Wang, X.: A study of the characteristics of behaviour driven development. In: Proceedings of the 37th EUROMICRO Conference on Software Engineering and Advanced Application, pp. 383–387. IEEE, Los Alamitos (2011)
Su, Q., Wan, M., Liu, X., Huang, C.R.: Motivations, methods and metrics of misinformation detection: an NLP perspective. Nat. Lang. Process. Res. 1(1–2), 1–13 (2020)
Teyssou, D., et al.: The InVID plug-in: web video verification on the browser. In: Proceedings of the First International Workshop on Multimedia Verification (MuVer 2017), pp. 23–30. Association for Computing Machinery, New York, NY, USA (2017)
Thakur, N., Reimers, N., Rücklé, A., Srivastava, A., Gurevych, I.: BEIR: a heterogeneous benchmark for zero-shot evaluation of information retrieval models. In: Thirty-Fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), pp. 1–16. NeurIPS, Virtual (2021)
Tu, Y.C.: Transparency in software engineering. Ph.D. thesis, The University of Auckland, Auckland (2014)
Tu, Y.C., Tempero, E., Thomborson, C.: An experiment on the impact of transparency on the effectiveness of requirements documents. Empir. Softw. Eng. 21(3), 1035–1066 (2016)
Vargas, L., Emami, P., Traynor, P.: On the detection of disinformation campaign activity with network analysis. In: Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop, pp. 133–146. ACM, Virtual (2020)
Viviani, M., Pasi, G.: Credibility in social media: opinions, news, and health information–a survey. Wiley Interdisc. Rev. Data Min. Knowl. Discov. 7(5), e1209 (2017)
Wautelet, Y., Heng, S., Kolp, M., Mirbel, I.: Unifying and extending user story models. In: Jarke, M., et al. (eds.) CAiSE 2014. LNCS, vol. 8484, pp. 211–225. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07881-6_15
Williams, E.: Experimental comparisons of face-to-face and mediated communication: a review. Psychol. Bull. 84(5), 963 (1977)
Wobbrock, J.O., Hattatoglu, L., Hsu, A.K., Burger, M.A., Magee, M.J.: The goldilocks zone: young adults’ credibility perceptions of online news articles based on visual appearance. New Rev. Hypermedia and Multimedia 27, 1–46 (2021)
Zhou, X., Mulay, A., Ferrara, E., Zafarani, R.: ReCOVery: a multimodal repository for COVID-19 news credibility research. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 3205–3212. ACM, Virtual Event, Ireland (2020)
Acknowledgements
The research presented in this paper is funded by the German Federal Ministry of Education and Research (BMBF) through the project PANQURA (http://qurator.ai/panqura; grant no. 03COV03E).
We are grateful to Yuewen Röder (3pc GmbH Neue Kommunikation, Germany) for assisting in the research; to León Viktor Avilés Podgurski for his research on and implementation of credibility signals.
Appendices
A Newspaper Article (Translation)
Spanish youths receive cultural vouchers
The culture industry around the world suffered from the corona pandemic. In Spain, young people now receive EUR 400 vouchers to take advantage of cultural offers - but one type of event is excluded.
To cushion the hardships of the corona pandemic, young people in Spain receive a cultural voucher. Everyone who will turn 18 next year is to receive a voucher worth 400 euros from the government. But it cannot be used without restriction: the recipients cannot buy tickets for bullfights with it.
The decision was one of the politically controversial measures the government included in the 2022 draft state budget. The vouchers are intended to help the country’s culture and events industry recover from the loss of income during the corona lockdowns. According to the government, eligible teenagers can spend their 400 euros on cinema and theater tickets, books and concerts, for example.
The Ministry of Culture announced in a written communication to the state news agency Efe that “not everything that our legislation regards as culture will fall under this cultural support.” Bullfighting is now rejected by a large part of Spanish society, especially young city dwellers.
B Questions from the Online Survey (Study 2)
1. Which sources do you use most frequently to inform yourself about various issues regarding the current COVID-19 crisis?
2. Which 3 attributes describe your chosen sources best?
3. How accessible for you is information about transparency and reliability of online sources?
4. How credible is that article for you?
5. Which aspects do you consider most important when assessing credibility?
6. What further information would you like to obtain in order to better assess the article’s credibility?
7. How understandable is the information on the right side?
8. How relevant is the information on the right side?
9. How transparent is the information on the right side?
10. Which influence does the information on the right side have on your assessment of the article?
11. How do you interpret the meaning of the percentage (83%)?
12. Who created the credibility score?
13. Which information do you lack with regard to the credibility score? Do you have any open questions or comments?
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Schulz, K., Rauenbusch, J., Fillies, J., Rutenburg, L., Karvelas, D., Rehm, G. (2022). User Experience Design for Automatic Credibility Assessment of News Content About COVID-19. In: Meiselwitz, G., et al. HCI International 2022 - Late Breaking Papers. Interaction in New Media, Learning and Games. HCII 2022. Lecture Notes in Computer Science, vol 13517. Springer, Cham. https://doi.org/10.1007/978-3-031-22131-6_11
DOI: https://doi.org/10.1007/978-3-031-22131-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22130-9
Online ISBN: 978-3-031-22131-6