DOI: 10.1145/3429360.3468172
research-article

Word-Copying on a Website as a Word Complexity Indicator and the Relation to Web Users’ Preferred Languages

Published: 07 September 2021

Abstract

The first step toward improving accessibility in Human-Computer Interaction (HCI) is identifying potential barriers. A recent study shows that words that web users frequently copy to the clipboard are relatively complex. A plausible reason is that users copy challenging words in order to search for a translation or for more information. Accordingly, tracking the word-copying operations of web users may help identify complex words. This study focuses on the users who perform word-copying. It shows significant differences in the frequency of word-copying operations among different user populations. On the examined website, whose content is in English, users whose preferred language is not English copied single words to the clipboard significantly more frequently than users whose preferred language is English. Further analysis of the data also shows that word-copying was more frequent among users whose preferred languages have low proximity to English, such as Asian languages, than among users whose preferred languages are Western European. These results support the observation that word-copying indicates complexity, as it is reasonable to expect that native speakers of other languages (and especially of languages that differ considerably from the website’s language) are more likely to need help with complex words. Word complexity is subjective and audience-dependent. This study contributes to the understanding of which users tend to use word-copying and, accordingly, in which contexts word-copying data can be used as a word complexity indicator. It also introduces a new practical approach for detecting language barriers in order to improve language accessibility on global websites.
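The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the record layout, the function names, the two-group split (English vs. other), and the toy data are all assumptions made for illustration. The sketch computes, for each group, the share of users who copied at least one single word.

```python
from collections import defaultdict


def is_single_word(text: str) -> bool:
    """True when the copied text is exactly one whitespace-delimited token."""
    return len(text.split()) == 1


def primary_language(tag: str) -> str:
    """Reduce a language tag such as 'en-US' to its primary subtag, 'en'."""
    return tag.replace("_", "-").split("-")[0].lower()


def word_copy_rates(visits, copies):
    """Share of users per group with at least one single-word copy.

    visits: iterable of (user_id, preferred_language_tag)  [hypothetical layout]
    copies: iterable of (user_id, copied_text)             [hypothetical layout]
    """
    # Assign every user to "english" or "other" by preferred language.
    group_of = {}
    for user_id, tag in visits:
        is_english = primary_language(tag) == "en"
        group_of[user_id] = "english" if is_english else "other"

    # Users who copied at least one single word.
    copiers = {
        user_id
        for user_id, text in copies
        if user_id in group_of and is_single_word(text)
    }

    totals = defaultdict(int)
    hits = defaultdict(int)
    for user_id, group in group_of.items():
        totals[group] += 1
        if user_id in copiers:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Toy data, invented for illustration only.
toy_visits = [("a", "en-US"), ("b", "ja"), ("c", "zh-CN"), ("d", "en"), ("e", "fr")]
toy_copies = [("b", "ubiquitous"), ("c", "idempotent"), ("c", "two words")]
toy_rates = word_copy_rates(toy_visits, toy_copies)
```

A real analysis would also need a significance test on the group difference and a finer grouping by language proximity to English; neither is attempted here.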

Information

Published In

Asian CHI '21: Proceedings of the Asian CHI Symposium 2021
May 2021
228 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Accessibility
  2. Browser
  3. Clipboard
  4. Complex Word Identification (CWI)
  5. Copy
  6. Language Barriers
  7. Text Simplification
  8. User Behavior
  9. Web Analytics

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '21
