Going Beyond One-Size-Fits-All Image Descriptions to Satisfy the Information Wants of People Who are Blind or Have Low Vision

ABSTRACT
Image descriptions are how people who are blind or have low vision (BLV) access information depicted within images. To our knowledge, no prior work has examined how a description for an image should be designed for different scenarios in which users encounter images. Scenarios consist of the information goal the person has when seeking information from or about an image, paired with the source where the image is found. To address this gap, we interviewed 28 people who are BLV to learn how the scenario impacts what image content (information) should go into an image description. We offer our findings as a foundation for considering how to design next-generation image description technologies that can both (A) support a departure from one-size-fits-all image descriptions to context-aware descriptions, and (B) reveal what content to include in minimum viable descriptions for a large range of scenarios.