DOI: 10.1145/3460112.3471963
research-article

Costs and Benefits of Conducting Voice-based Surveys Versus Keypress-based Surveys on Interactive Voice Response Systems

Published: 23 September 2021

Abstract

Recent machine-learning-driven improvements in speech technologies and natural language processing have prompted active interest in developing conversational agents for various tasks. We look at data collection in low-resource settings among rural women in North India, and explore the feasibility of voice-based surveys conducted through Interactive Voice Response (IVR) systems, where users may speak their responses conversationally in natural speech. Through an iterative design process and detailed user feedback, we describe several nuances in running voice-based surveys, and compare their data collection accuracy with that of equivalent keypress-based surveys. We find strong user preferences for voice-based surveys, and performance comparable to keypress-based surveys for most types of questions. Our results suggest that voice-based conversational interfaces may hold significant potential for building interactive applications for low-income and less-literate populations. Our findings are likely to be useful for other researchers and practitioners using Information and Communication Technologies (ICTs) in developing regions.

Supplementary Material

Analysis Rubric (3460112.3471963.pdf)

Cited By

  • (2024) A Multi-Process System for Investigating Inclusive Design in User Interfaces for Low-Income Countries. Algorithms 17:6 (232). DOI: 10.3390/a17060232. Online publication date: 27-May-2024
  • (2024) A Design Vocabulary for Scaffolding Group Interaction Archetypes through Synchronous Telephony. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-22). DOI: 10.1145/3637289. Online publication date: 26-Apr-2024

Published In

COMPASS '21: Proceedings of the 4th ACM SIGCAS Conference on Computing and Sustainable Societies
June 2021
462 pages
ISBN:9781450384537
DOI:10.1145/3460112

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Interactive Voice Response systems
  2. data collection
  3. entity extraction
  4. natural language processing
  5. speech recognition
  6. surveys

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

COMPASS '21

Acceptance Rates

Overall Acceptance Rate 25 of 50 submissions, 50%
