Evaluating Intelligibility and Battery Drain of Mobile Sign Language Video Transmitted at Low Frame Rates and Bit Rates

Published: 14 November 2015

Abstract

Mobile sign language video conversations can become unintelligible if high video transmission rates cause network congestion and delayed video. In an effort to understand the perceived lower limits of intelligible sign language video intended for mobile communication, we evaluated sign language video transmitted at four low frame rates (1, 5, 10, and 15 frames per second [fps]) and four low fixed bit rates (15, 30, 60, and 120 kilobits per second [kbps]) at a constant spatial resolution of 320 × 240 pixels. We discovered an “intelligibility ceiling effect,” in which increasing the frame rate above 10fps did not improve perceived intelligibility, and increasing the bit rate above 60kbps produced diminishing returns. Given the study parameters, our findings suggest that relaxing the recommended frame rate and bit rate to 10fps at 60kbps will provide intelligible video conversations while reducing total bandwidth consumption to 25% of the ITU-T standard (at least 25fps and 100kbps). As part of this work, we developed the Human Signal Intelligibility Model, a new conceptual model useful for informing evaluations of video intelligibility and our methodology for creating linguistically accessible web surveys for deaf people. We also conducted a battery-savings experiment quantifying battery drain when sign language video is transmitted at the lower frame rates and bit rates. Results confirmed that increasing the transmission rates monotonically decreased the battery life.
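
To give a rough sense of scale for the bit rates studied, the sketch below converts a nominal video bit rate into approximate data volume per minute of one-way video. This is an illustrative back-of-the-envelope calculation, not a figure from the paper: it counts only nominal video payload bits and ignores packet and protocol overhead, audio, and the return video stream, so it will not reproduce the paper's total-bandwidth comparisons.

```python
# Back-of-the-envelope data-usage estimate for one-way mobile video.
# Illustrative only: counts nominal video payload bits at the stated bit rate;
# ignores packet/protocol overhead and the return video stream.

def megabytes_per_minute(bit_rate_kbps: float) -> float:
    """Convert a nominal video bit rate (kilobits per second) to megabytes per minute."""
    kilobits_per_minute = bit_rate_kbps * 60      # 60 seconds per minute
    return kilobits_per_minute / 8 / 1000         # kilobits -> kilobytes -> megabytes

if __name__ == "__main__":
    for kbps in (15, 30, 60, 120):                # bit rates evaluated in the study
        print(f"{kbps:>4} kbps  ~ {megabytes_per_minute(kbps):.2f} MB per minute of video")
```

At 60 kbps, for example, this works out to roughly 0.45 MB of video payload per minute of one-way transmission.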

    • Published in

      ACM Transactions on Accessible Computing, Volume 7, Issue 3
      Special Issue (Part 2) of Papers from ASSETS 2013
      November 2015
      79 pages
      ISSN: 1936-7228
      EISSN: 1936-7236
      DOI: 10.1145/2836329

      Copyright © 2015 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 14 November 2015
      • Revised: 1 June 2015
      • Accepted: 1 June 2015
      • Received: 1 June 2014

      Qualifiers

      • research-article
      • Research
      • Refereed
