Exploring Effects of Auditory Stimuli on CAPTCHA Performance

  • Conference paper
Financial Cryptography and Data Security (FC 2020)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12063)

Abstract

CAPTCHAs have been widely used as an anti-bot measure for well over a decade. Unfortunately, they are often hard and annoying to use, and human errors have been blamed mainly on overly complex challenges or poor challenge design. However, errors can also occur because of ambient sensory distractions, and the performance impact of these distractions has not been thoroughly examined.

The goal of our work is to explore the impact of auditory distractions on CAPTCHA performance. To this end, we conducted a comprehensive user study. Its results, discussed in this paper, show that various types of auditory stimuli impact performance differently. Generally, simple and less dynamic stimuli sometimes improve subject performance, while highly dynamic stimuli have a negative impact. This is troublesome since CAPTCHAs are often used to protect web sites offering tickets for limited-quantity events that sell out very quickly, i.e., within seconds. In such settings, introduction of even a small delay can make the difference between obtaining tickets from the primary source and being forced to use a secondary market. Our study was conducted in a fully automated experimental environment to foster uniform and scalable experiments. We discuss both benefits and limitations of the unattended automated experiment paradigm.

Notes

  1. With the volume knob physically disabled.

  2. OSHA requires all employers to implement a Hearing Conservation Program where workers are exposed to a time-weighted average noise level of 85 dB or higher over an 8 h work shift. Our noise levels lasted for a much shorter duration, and only the very loudest was within the regulated range. See: https://www.osha.gov/SLTC/noisehearingconservation/.

  3. Although it would have been possible to detect non-compliance automatically, e.g., via an inactivity timeout, non-compliant subject data would still be discarded.

  4. See secondlife.com.

References

  1. Benignus, V.A., Otto, D.A., Knelson, J.H.: Effect of low-frequency random noises on performance of a numeric monitoring task. Percept. Motor Skills 40(1), 231–239 (1975)

  2. Berg, B.G., Kaczmarek, T., Kobsa, A., Tsudik, G.: An exploration of the effects of sensory stimuli on the completion of security tasks. IEEE Secur. Privacy 15(6), 52–60 (2017)

  3. Bursztein, E., Bethard, S., Fabry, C., Mitchell, J.C., Jurafsky, D.: How good are humans at solving captchas? A large scale evaluation. In: 2010 IEEE Symposium on Security and Privacy (SP), pp. 399–413. IEEE (2010)

  4. Bursztein, E., Moscicki, A., Fabry, C., Bethard, S., Mitchell, J.C., Jurafsky, D.: Easy does it: more usable captchas. In: Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems, pp. 2637–2646. ACM (2014)

  5. Chang, R., Shmatikov, V.: Formal analysis of authentication in Bluetooth device pairing. In: FCS-ARSPA 2007, p. 45 (2007)

  6. Chellapilla, K., Larson, K., Simard, P., Czerwinski, M.: Designing human friendly human interaction proofs (HIPS). In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 711–720. ACM (2005)

  7. Cohen, R.A.: Yerkes-Dodson law. In: Kreutzer, J.S., DeLuca, J., Caplan, B. (eds.) Encyclopedia of Clinical Neuropsychology, pp. 2737–2738. Springer, Heidelberg (2011)

  8. El Ahmad, A.S., Yan, J., Ng, W.-Y.: Captcha design: color, usability, and security. IEEE Internet Comput. 16(2), 44–51 (2012)

  9. Harris, W.: Stress and perception: the effects of intense noise stimulation and noxious stimulation upon perceptual performance. Ph.D. thesis, University of Southern California (1960)

  10. Hockey, G.R.J.: Effect of loud noise on attentional selectivity. Q. J. Exp. Psychol. 22(1), 28–36 (1970)

  11. Kaiser, E., Feng, W.-C.: Helping ticketmaster: changing the economics of ticket robots with geographic proof-of-work. In: INFOCOM IEEE Conference on Computer Communications Workshops, pp. 1–6. IEEE (2010)

  12. Khalil, A., Abdallah, S., Ahmed, S., Hajjdiab, H.: Script familiarity and its effect on CAPTCHA usability: an experiment with Arab participants. Int. J. Web Portals (IJWP) 4(2), 74–87 (2012)

  13. Kolias, C., Kambourakis, G., Stavrou, A., Voas, J.: DDoS in the IoT: Mirai and other botnets. Computer 50(7), 80–84 (2017)

  14. Lazem, S., Gracanin, D.: Social traps in second life. In: 2010 Second International Conference on Games and Virtual Worlds for Serious Applications (VS-GAMES), pp. 133–140, March 2010

  15. MacLeod, C.M.: Half a century of research on the stroop effect: an integrative review. Psychol. Bull. 109(2), 163 (1991)

  16. Ollesch, H., Heineken, E., Schulte, F.P.: Physical or virtual presence of the experimenter: psychological online-experiments in different settings. Int. J. Internet Sci. 1(1), 71–81 (2006)

  17. Olmedo, E.L., Kirk, R.E.: Maintenance of vigilance by non-task-related stimulation in the monitoring environment. Percept. Motor Skills 44(3), 715–723 (1977)

  18. O’Malley, J.J., Poplawsky, A.: Noise-induced arousal and breadth of attention. Percept. Motor Skills 33(3), 887–890 (1971)

  19. Riva, G., Teruzzi, T., Anolli, L.: The use of the internet in psychological research: comparison of online and offline questionnaires. CyberPsychol. Behav. 6(1), 73–80 (2003)

  20. Rogers, R.D., Monsell, S.: Costs of a predictible switch between simple cognitive tasks. J. Exp. Psychol.: General 124(2), 207 (1995)

  21. Söderlund, G., et al.: Positive effects of noise on cognitive performance: explaining the moderate brain arousal model. In: The 9th Congress of the International Commission on the Biological Effects of Noise, Leibniz Gemeinschaft, pp. 378–386 (2008)

  22. Von Ahn, L., Blum, M., Hopper, N.J., Langford, J.: Captcha: using hard AI problems for security. In: International Conference on the Theory and Applications of Cryptographic Techniques, pp. 294–311. Springer, Heidelberg (2003)

  23. Von Ahn, L., Maurer, B., McMillen, C., Abraham, D., Blum, M.: reCAPTCHA: human-based character recognition via web security measures. Science 321(5895), 1465–1468 (2008)

  24. Whelan, R.: Effective analysis of reaction time data. Psychol. Rec. 58(3), 475–482 (2008)

Author information

Corresponding author

Correspondence to Gene Tsudik.

Appendices

A: Background and Related Work

This section overviews related work in automated experiments, and human-assisted security methods. We also provide psychological background theory related to effects of sensory arousal on subject task performance.

A.1 Automated Experiments

A prior study examined the effects of visual and auditory stimuli on completion of a specific security-critical task, Bluetooth pairing [2]. It showed that introduction of unexpected stimuli has a spectrum of beneficial and detrimental effects on subject performance. That initial result motivates a more thorough examination of the space of security-critical tasks, since Bluetooth pairing is a very simple (and infrequent) cognitive task that only requires a single button press to confirm matching codes [5].

Some prior work focused on evaluating virtually-attended remote experiments and unattended online surveys in comparison with those conducted in the traditional lab setting. Ollesch et al. [16] collected psychometric data in a physically attended experimental lab setting and its virtually attended remote counterpart. No significant differences were found. This is further reinforced by Riva et al. [19], who compared data collected from unattended online and attended offline questionnaires. Finally, Lazem and Gracanin [14] replicated two classical social psychology experiments where both the participants and the experimenter were represented by avatars in Second Life (see Note 4), instead of being physically co-present. Here too, no significant differences were observed.

A.2 User Studies of Text-Based CAPTCHAs

Given the ubiquity of CAPTCHAs, it is surprising that only a few usability studies have been conducted.

Chellapilla et al. [6] performed the first usability evaluation of CAPTCHAs, by examining character-based CAPTCHAs and evaluating Robustness/Usability tradeoffs. Results showed that sophisticated segmentation algorithms can violate robustness goals of popular, currently deployed text-based CAPTCHAs. However, service providers are hesitant to switch to more difficult CAPTCHAs for fear of low user acceptability.

Bursztein et al. [3] conducted a large-scale evaluation of user performance with several CAPTCHA schemes. Performance varied widely from scheme to scheme, with users’ success rates ranging from \(70\%\) to \(91\%\). This contradicted self-reported statistics, e.g., from eBay, which claimed a \(98\%\) successful completion rate. Audio-only CAPTCHAs were found to be extremely difficult for most users, with success rates as low as \(35\%\). These results motivate guidelines for user-friendly text-based CAPTCHAs, as well as further study of audio-only CAPTCHAs.

El Ahmad et al. [8] examine what makes CAPTCHAs usable and non-intrusive. Color is identified as the primary culprit in intrusiveness, as clashing color schemes can interfere with presentation of the site itself. Furthermore, coloring a CAPTCHA lowers robustness, since it gives an easy target for segmentation, i.e., separating the image by color. Surprisingly, inclusion of color in a CAPTCHA is claimed to be a benefit for both usability and robustness if done correctly. However, what constitutes correct color usage is left as an open problem.

Khalil et al. examine the impact of alphabet familiarity on CAPTCHA performance using different character sets [12]. Familiarity with the alphabet used to construct a text-based CAPTCHA does not impact error rates. However, users’ satisfaction is positively correlated with their familiarity level with the alphabet being used.

Bursztein et al. [4] parameterized CAPTCHA features to find the most usable combination. This was done with particular focus on low-security CAPTCHAs that could sacrifice robustness and allow bots to achieve a success rate above \(0.01\%\). Subjects were found to prefer CAPTCHAs composed of English-language words with positive connotations (such as “cutest”) with simple global distortions, and very few intersecting or occluding lines. The study concluded with a candidate CAPTCHA design that showed a \(95.4\%\) success rate.

To date, there has been no evaluation of user performance with CAPTCHAs in a noisy environment.

A.3 Effects of Sensory Stimulation

Sensory stimulation has variable impact on task performance. This is due to many factors, including the subject’s current level of arousal. The Yerkes-Dodson Law posits an inverse quadratic (inverted-U) relationship between arousal and task performance [7]. It implies that, across all contributing stimulants, subjects who are at either a very low or a very high level of arousal are unlikely to perform well, and there exists an optimal level of arousal for correct task completion.

An extension to this law is the notion that completion of less complex tasks that produce lower levels of initial arousal in subjects benefits from inclusion of external stimuli with low to medium arousal. At the same time, completion of complex tasks that produce a high level of initial arousal suffers from inclusion of external stimuli. Hockey [10] and Benignus et al.  [1] classified this causal relationship by defining task complexity as a function of the task’s event rate (i.e., how many subtasks must be completed in a given time-frame) and the number of sources that originate these subtasks. External stimulation can serve to sharpen the focus of a subject at a low arousal level, improving task performance [17]. Conversely, it can overload subjects that are already at a high level of arousal, and induce errors in task completion [9].
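As a rough illustration of the inverted-U relationship and its extension described above, the following Python sketch uses a toy quadratic model. The function names, the optimum of 0.5, the curvature of 4.0, the additive effect of the stimulus, and the baseline arousal values are all illustrative assumptions. None of them are quantities taken from [7], [10], or [1].

```python
def performance(arousal: float, optimum: float = 0.5, curvature: float = 4.0) -> float:
    """Toy inverted-U curve: 1.0 at the optimum, falling off quadratically.

    All parameters are hypothetical and chosen only so that the curve
    reaches 0.0 at arousal levels 0.0 and 1.0.
    """
    return max(0.0, 1.0 - curvature * (arousal - optimum) ** 2)


def performance_with_stimulus(baseline: float, stimulus: float) -> float:
    """Assume, for illustration only, that a stimulus simply adds to baseline arousal."""
    return performance(baseline + stimulus)


# Hypothetical baselines: a simple task leaves the subject under-aroused,
# while a complex task already places the subject slightly past the optimum.
SIMPLE_TASK_BASELINE = 0.2
COMPLEX_TASK_BASELINE = 0.6
STIMULUS = 0.2

for name, baseline in [("simple", SIMPLE_TASK_BASELINE), ("complex", COMPLEX_TASK_BASELINE)]:
    before = performance(baseline)
    after = performance_with_stimulus(baseline, STIMULUS)
    print(f"{name} task: {before:.2f} -> {after:.2f}")
```

In this toy model, the same stimulus moves the under-aroused subject toward the optimum and pushes the already aroused subject further past it, which mirrors the qualitative claim rather than any measured effect size.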

O’Malley and Poplawsky [18] argued that sensory noise affects behavioral selectivity. Specifically, while a consistent positive or negative effect on task completion may not occur, a consistent negative effect was observed for tasks that require subjects to react to signals on their periphery. Meanwhile, a consistent positive effect on task completion was observed for tasks that require subjects to react to signals in the center of their field of attention. This leads the authors to claim that sensory stimulation has the effect of narrowing the subject’s area of attention.

B: Study Shortcomings

This section discusses some shortcomings of the study.

Homogeneous Subjects: Our subject group consisted of young and tech-savvy college students. This is a consequence of the experiment’s location and recruitment methods. Replication of this experiment in a non-academic setting would be useful. However, recruiting an appropriately diverse set of subjects is still difficult, even in a public setting. Ideal venues might be stadiums, concert halls, fairgrounds or shopping malls. Unfortunately, deployment of the unattended setup in such public locations is logistically infeasible. Since such public areas are already full of other sensory stimuli, reliable adjustment of subjects’ arousal level in a consistent manner would be very hard. Furthermore, it would be very difficult to secure expensive experimental equipment.

Synthetic Environment: Even though we attempted to provide a realistic environment for CAPTCHAs, our setup was obviously a contrived, artificial and controlled space. Typically, people encounter CAPTCHAs while using their own devices from their own homes or offices. As such, it would be intuitive to conduct a study remotely over the Internet. However, this would introduce many confounding and potentially dangerous variables. First, there would be no way of knowing ahead of time the exact nature of the potential subjects’ auditory environment. This could lead to complications ranging from trivial nullification of collected data (e.g., if the subject’s audio output is muted) all the way to potential harm to the subject’s hearing (e.g., in-ear headphones turned to a dangerously high volume).

This further complicates measurement of any effects of auditory stimuli, as it becomes unclear whether any two subjects encounter the stimuli in the same way. For example, a subject using headphones at a high volume is going to have a drastically different experience than a subject using speakers at a low volume. These differences will confound the actual impact of the stimuli, making it extremely difficult to quantify any meaningful effect on task performance. Because of the need for homogeneity in presentation of the stimuli, it is easy to see how such an online experiment would be ineffective in practice.

C: Ethical Consideration

Experiments described in this paper were fully authorized by the Institutional Review Board (IRB) of the university, well before the study. The level of review was: Exempt, Category II. Further IRB-related details are available upon request. No sensitive data was harvested during the experiments and minimal identifying information was retained. In particular:

  1. No names, addresses, phone numbers or other identifying information was collected from the participants.

  2. Although email addresses were solicited in order to confirm participation, they were erased very soon thereafter.

  3. Video recordings of the experiments were kept for study integrity purposes. However, they were erased before the IRB expiration time.

Finally, with regard to safety, sound levels were kept between 70 and 88 dB, which is generally considered safe, especially for an exposure of only 2:15 min.
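To put the stated exposure in perspective, the sketch below computes an occupational noise dose using the formula from Appendix A of OSHA's noise standard (29 CFR 1910.95), which uses a 90 dBA criterion level and a 5 dB exchange rate. It assumes, as a worst case, that a subject hears the loudest reported level (88 dB) for the full 2:15 min and that the reported levels are A-weighted, which the text does not state. The function and constant names are illustrative.

```python
def permissible_hours(level_dba: float) -> float:
    """Permissible daily exposure at a constant level: T = 8 / 2**((L - 90) / 5)."""
    return 8.0 / 2.0 ** ((level_dba - 90.0) / 5.0)


def noise_dose_percent(level_dba: float, minutes: float) -> float:
    """Dose D = 100 * C / T, where C is the actual exposure time in hours."""
    return 100.0 * (minutes / 60.0) / permissible_hours(level_dba)


# Worst-case assumption: the loudest reported level for the whole exposure.
LOUDEST_LEVEL_DB = 88.0
EXPOSURE_MINUTES = 2.25  # 2:15 min

dose = noise_dose_percent(LOUDEST_LEVEL_DB, EXPOSURE_MINUTES)
print(f"{dose:.2f}% of the daily permissible dose")
```

A dose of 100% corresponds to an 8 h time-weighted average of 90 dBA, so under these assumptions the exposure in the experiment amounts to well under 1% of that limit.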

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Berg, B., Kaczmarek, T., Kobsa, A., Tsudik, G. (2020). Exploring Effects of Auditory Stimuli on CAPTCHA Performance. In: Bernhard, M., et al. Financial Cryptography and Data Security. FC 2020. Lecture Notes in Computer Science, vol 12063. Springer, Cham. https://doi.org/10.1007/978-3-030-54455-3_11

  • DOI: https://doi.org/10.1007/978-3-030-54455-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-54454-6

  • Online ISBN: 978-3-030-54455-3

  • eBook Packages: Computer Science, Computer Science (R0)
