EmoAssist: emotion enabled assistive tool to enhance dyadic conversation for the blind

Multimedia Tools and Applications

Abstract

This paper presents the design and implementation of EmoAssist: a smartphone-based system to assist in dyadic conversations. The main goal of the system is to provide access to more non-verbal communication options to people who are blind or visually impaired. The key functionalities of the system are to predict behavioral expressions (such as a yawn, a closed-lip smile, an open-lip smile, looking away, sleepiness, etc.) and 3-D affective dimensions (valence, arousal, and dominance) from visual cues in order to provide the appropriate auditory feedback or response. A number of challenges related to the data communication protocols, efficient tracking of the face, modeling of behavioral expressions/affective dimensions, the feedback mechanism, and system integration were addressed to build an effective and functional system. In addition, orientation-sensor information from the smartphone was used to correct image alignment and improve robustness for real-world application. Empirical studies show that EmoAssist can predict affective dimensions with acceptable accuracy (maximum correlation coefficient of 0.76 for valence, 0.78 for arousal, and 0.76 for dominance) in natural dyadic conversation. The overall minimum and maximum response times are 64.61 milliseconds and 128.22 milliseconds, respectively. Integrating sensor information to correct orientation improved the accuracy of recognizing behavioral expressions by 16 % on average. A usability study with ten blind participants in social interaction shows that EmoAssist is highly acceptable, with an average acceptability rating of 6.0 on a Likert scale (where 1 and 7 are the lowest and highest possible ratings, respectively).
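
The abstract notes that orientation-sensor information from the smartphone is used to correct image alignment before face tracking. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes an OpenCV Haar-cascade detector and a roll_degrees value read from the device's rotation sensor, both of which are placeholders here.

    # Minimal sketch (not the EmoAssist implementation): undo the camera roll
    # reported by the phone's orientation sensor so the face appears upright
    # before running a stock face detector.
    import cv2
    import numpy as np

    def upright_frame(frame: np.ndarray, roll_degrees: float) -> np.ndarray:
        """Rotate the frame by the negative device roll around its center."""
        h, w = frame.shape[:2]
        rot = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), -roll_degrees, 1.0)
        return cv2.warpAffine(frame, rot, (w, h))

    def detect_face(frame: np.ndarray, roll_degrees: float):
        """Detect faces on the orientation-corrected frame (hypothetical pipeline step)."""
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        gray = cv2.cvtColor(upright_frame(frame, roll_degrees), cv2.COLOR_BGR2GRAY)
        return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

In a client-server setup like the one described, such a correction could run either on the phone before a frame is sent or on the server before tracking; the abstract reports that adding the sensor-based correction improved expression-recognition accuracy by 16 % on average.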

Notes

  1. Yawn, closed-lip smile, looking away, open-lip smile, sleepy, etc.

  2. Valence, Arousal, and Dominance (VAD); an illustrative regression sketch for these dimensions follows these notes.

  3. http://umdrive.memphis.edu/arahman/public/EmoAssistVideoDemo.mp4

  4. http://www.icta.ufl.edu/projects_nih/data/chewingV1.htm

  5. http://www.clovernook.org/
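
Footnote 2 defines the three continuous affective dimensions (valence, arousal, and dominance) that EmoAssist predicts from visual cues. The sketch below is purely illustrative and is not the paper's model: it assumes pre-extracted facial-feature vectors and annotated VAD labels, and fits one support vector regressor per dimension with scikit-learn.

    # Illustrative only: regress valence, arousal, and dominance (VAD) from
    # pre-extracted facial-feature vectors. The features, labels, and
    # hyperparameters are assumptions, not values from the paper.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    def train_vad_models(features: np.ndarray, vad_labels: np.ndarray):
        """Fit one SVR per column of vad_labels (valence, arousal, dominance)."""
        models = []
        for dim in range(vad_labels.shape[1]):
            model = make_pipeline(StandardScaler(),
                                  SVR(kernel="rbf", C=1.0, epsilon=0.1))
            model.fit(features, vad_labels[:, dim])
            models.append(model)
        return models

    def predict_vad(models, feature_vector: np.ndarray) -> np.ndarray:
        """Return [valence, arousal, dominance] for one feature vector."""
        return np.array([m.predict(feature_vector.reshape(1, -1))[0] for m in models])

Prediction quality for regressors of this kind is typically reported as the correlation between predicted and annotated values per dimension, which is the form of the figures quoted in the abstract (0.76, 0.78, and 0.76 for valence, arousal, and dominance).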


Acknowledgments

We are grateful to the participants of our study, especially the “Design Team,” for actively helping us in our research and for their valuable feedback. Any opinions, findings, and conclusions or recommendations expressed in this material are our own and do not necessarily reflect the views of the funding institution. We also thank our lab colleague Md Iftekhar Tanveer for sharing his code to extract facial features and head pose from the face tracker.

Author information

Corresponding author

Correspondence to AKM Mahbubur Rahman.

Additional information

This work was partially funded by the National Science Foundation (NSF-IIS-0746790), USA.


About this article

Cite this article

Rahman, A., Anam, A.I. & Yeasin, M. EmoAssist: emotion enabled assistive tool to enhance dyadic conversation for the blind. Multimed Tools Appl 76, 7699–7730 (2017). https://doi.org/10.1007/s11042-016-3295-4
